Papers
- MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources (paper 2509.25531, published Sep 29)
- Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks (paper 2508.18672, published Aug 26)

Collections
- open-sci-ref releases: Open-sci-ref reference baseline releases (1 item, updated Jun 23)
- Optimal Sparsity Code: Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks (65 items, updated Aug 21)
- Optimal Sparsity Math: Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks (67 items, updated Aug 19)
- open-sci-ref-0.01: research baseline models trained on various open reference datasets (12 items, updated Jul 23)
- SwallowCode: Rewriting Pre-Training Data Boosts LLM Performance in Math and Code (66 items, updated May 7)
- SwallowMath: Rewriting Pre-Training Data Boosts LLM Performance in Math and Code (11 items, updated May 7)
- LLM-jp-3 Pre-trained Models: pre-trained models in the LLM-jp-3 model series (10 items, updated May 28)
- LLM-jp-3 Fine-tuned Models: fine-tuned models in the LLM-jp-3 model series (25 items, updated May 28)