TinyGSM

community

AI & ML interests

None defined yet.

Recent Activity

abhishekpanigrahi

authored 4 papers 22 days ago

On the SDEs and Scaling Rules for Adaptive Gradient Algorithms

Paper • 2205.10287 • Published May 20, 2022

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?

Paper • 2501.02669 • Published Jan 5, 2025 • 1

AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models

Paper • 2505.00147 • Published Apr 30, 2025 • 4

Skill-Targeted Adaptive Training

Paper • 2510.10023 • Published Oct 11, 2025 • 9

delip

authored 2 papers 11 months ago

Faithful Chain-of-Thought Reasoning

Paper • 2301.13379 • Published Jan 31, 2023

WithdrarXiv: A Large-Scale Dataset for Retraction Study

Paper • 2412.03775 • Published Dec 4, 2024

ClaraBing

authored 3 papers over 1 year ago

Exposing Attention Glitches with Flip-Flop Language Modeling

Paper • 2306.00946 • Published Jun 1, 2023 • 2

TinyGSM: achieving >80% on GSM8k with small language models

Paper • 2312.09241 • Published Dec 14, 2023 • 40

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression

Paper • 2306.00788 • Published Jun 1, 2023

sjelassi

authored a paper almost 2 years ago

Repeat After Me: Transformers are Better than State Space Models at Copying

Paper • 2402.01032 • Published Feb 1, 2024 • 24

ClaraBing

updated a dataset almost 2 years ago

TinyGSM/TinyGSM

Viewer • Updated Jan 11, 2024 • 11.8M • 854 • 9

alexli

authored a paper about 2 years ago

Your Diffusion Model is Secretly a Zero-Shot Classifier

Paper • 2303.16203 • Published Mar 28, 2023

abhishekpanigrahi

authored a paper over 2 years ago

Task-Specific Skill Localization in Fine-tuned Language Models

Paper • 2302.06600 • Published Feb 13, 2023

YuchenLi01

authored a paper over 2 years ago

How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding

Paper • 2303.04245 • Published Mar 7, 2023

sjelassi

authored a paper over 2 years ago

Length Generalization in Arithmetic Transformers

Paper • 2306.15400 • Published Jun 27, 2023 • 4

Team members 7
