When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents Paper • 2510.11695 • Published Oct 13, 2025
FinCriticalED: A Visual Benchmark for Financial Fact-Level OCR Evaluation Paper • 2511.14998 • Published Nov 19, 2025
The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models Paper • 2601.03425 • Published 13 days ago • 15
All That Glisters Is Not Gold: A Benchmark for Reference-Free Counterfactual Financial Misinformation Detection Paper • 2601.04160 • Published 12 days ago • 4
Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection Paper • 2601.05403 • Published 11 days ago • 9
FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making Paper • 2407.06567 • Published Jul 9, 2024
MMAFFBen: A Multilingual and Multimodal Affective Analysis Benchmark for Evaluating LLMs and VLMs Paper • 2505.24423 • Published May 30, 2025 • 1
FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs Paper • 2510.08886 • Published Oct 10, 2025 • 19
Me LLaMA: Foundation Large Language Models for Medical Applications Paper • 2402.12749 • Published Feb 20, 2024 • 2
INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent Paper • 2412.18174 • Published Dec 24, 2024
Retrieval-augmented Large Language Models for Financial Time Series Forecasting Paper • 2502.05878 • Published Feb 9, 2025 • 40
FinTagging: An LLM-ready Benchmark for Extracting and Structuring Financial Information Paper • 2505.20650 • Published May 27, 2025 • 17
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation Paper • 2506.14028 • Published Jun 16, 2025 • 93
FinAudio: A Benchmark for Audio Large Language Models in Financial Applications Paper • 2503.20990 • Published Mar 26, 2025 • 19
FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading Paper • 2502.11433 • Published Feb 17, 2025 • 36
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models Paper • 2410.14059 • Published Oct 17, 2024 • 62
Back to the Future: Towards Explainable Temporal Reasoning with Large Language Models Paper • 2310.01074 • Published Oct 2, 2023 • 2
No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks Paper • 2403.06249 • Published Mar 10, 2024 • 3
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications Paper • 2408.11878 • Published Aug 20, 2024 • 63
Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models Paper • 2310.00566 • Published Oct 1, 2023 • 1