Collections including paper arxiv:2510.26692

-
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper • 2412.13303 • Published • 72
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 115
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 53
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Paper • 2509.12201 • Published • 103
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 61
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
-
Selective Attention Improves Transformer
Paper • 2410.02703 • Published • 24
Differential Transformer
Paper • 2410.05258 • Published • 180
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Paper • 2410.05076 • Published • 8
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
Paper • 2410.13276 • Published • 29
-
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Paper • 2404.15653 • Published • 29
MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published • 15
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 33
-
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 155
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper • 2403.13372 • Published • 173
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
Paper • 2508.14704 • Published • 42
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Paper • 2508.09834 • Published • 53
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 6.66k • 1.22k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 134 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 302
Lizard: An Efficient Linearization Framework for Large Language Models
Paper • 2507.09025 • Published • 18
On the Expressiveness of Softmax Attention: A Recurrent Neural Network Perspective
Paper • 2507.23632 • Published • 6
Causal Attention with Lookahead Keys
Paper • 2509.07301 • Published • 21
-
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Paper • 2404.12253 • Published • 55
Kimi Linear: An Expressive, Efficient Attention Architecture
Paper • 2510.26692 • Published • 113
Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs
Paper • 2508.06601 • Published • 6
-
LoRA+: Efficient Low Rank Adaptation of Large Models
Paper • 2402.12354 • Published • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models
Paper • 2402.12659 • Published • 23
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Paper • 2402.13249 • Published • 13
TrustLLM: Trustworthiness in Large Language Models
Paper • 2401.05561 • Published • 69