A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code Paper • 2508.18106 • Published Aug 25 • 344
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5, 2024 • 70
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents Paper • 2507.04009 • Published Jul 5 • 51
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm Paper • 2506.05218 • Published Jun 5 • 2
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published Jul 1 • 79
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks Paper • 2507.01955 • Published Jul 2 • 35