Collections

Discover the best community collections!

Collections including paper arxiv:2403.02884

KAN or MLP: A Fairer Comparison

Paper • 2407.16674 • Published Jul 23, 2024 • 43
MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Paper • 2403.02884 • Published Mar 5, 2024 • 17
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Paper • 2402.13064 • Published Feb 20, 2024 • 50
DPTDR: Deep Prompt Tuning for Dense Passage Retrieval

Paper • 2208.11503 • Published Aug 24, 2022

Datasets - Math

introspector/unimath

Updated Feb 12, 2024 • 900 • 7
MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms

Paper • 1905.13319 • Published May 30, 2019 • 2
Measuring Mathematical Problem Solving With the MATH Dataset

Paper • 2103.03874 • Published Mar 5, 2021 • 5
MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Paper • 2403.02884 • Published Mar 5, 2024 • 17

How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15, 2024 • 42
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 81
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 189
MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Paper • 2403.02884 • Published Mar 5, 2024 • 17

System 2 Attention (is something you might need too)

Paper • 2311.11829 • Published Nov 20, 2023 • 44
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems

Paper • 2311.11315 • Published Nov 19, 2023 • 8
Alignment for Honesty

Paper • 2312.07000 • Published Dec 12, 2023 • 16
Steering Llama 2 via Contrastive Activation Addition

Paper • 2312.06681 • Published Dec 9, 2023 • 15

MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Paper • 2403.02884 • Published Mar 5, 2024 • 17
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 131
Improving Small Language Models' Mathematical Reasoning via Mix Thoughts Distillation

Paper • 2401.11864 • Published Jan 22, 2024 • 2
Common 7B Language Models Already Possess Strong Math Capabilities

Paper • 2403.04706 • Published Mar 7, 2024 • 20

Orca-Math: Unlocking the potential of SLMs in Grade School Math

Paper • 2402.14830 • Published Feb 16, 2024 • 25
MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Paper • 2403.02884 • Published Mar 5, 2024 • 17
meta-math/MetaMath-Mistral-7B

Text Generation • Updated Dec 21, 2023 • 1.65k • 96
meta-math/MetaMath-13B-V1.0

Text Generation • Updated Dec 21, 2023 • 757 • 13

Rethinking Optimization and Architecture for Tiny Language Models

Paper • 2402.02791 • Published Feb 5, 2024 • 13
More Agents Is All You Need

Paper • 2402.05120 • Published Feb 3, 2024 • 57
Scaling Laws for Forgetting When Fine-Tuning Large Language Models

Paper • 2401.05605 • Published Jan 11, 2024
Aligning Large Language Models with Counterfactual DPO

Paper • 2401.09566 • Published Jan 17, 2024 • 2
