Collections
Collections including paper arxiv:2504.18415

- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 5
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper • 2210.17323 • Published • 8
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
  Paper • 2211.10438 • Published • 6
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 57

- CoRAG: Collaborative Retrieval-Augmented Generation
  Paper • 2504.01883 • Published • 9
- VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
  Paper • 2504.08837 • Published • 43
- Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
  Paper • 2504.10068 • Published • 30
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
  Paper • 2504.10481 • Published • 85

- Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
  Paper • 2501.18585 • Published • 62
- RWKV-7 "Goose" with Expressive Dynamic State Evolution
  Paper • 2503.14456 • Published • 154
- DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
  Paper • 2503.15265 • Published • 46
- Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
  Paper • 2503.15558 • Published • 50

- How to Synthesize Text Data without Model Collapse?
  Paper • 2412.14689 • Published • 52
- SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
  Paper • 2412.12094 • Published • 11
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
  Paper • 2306.07691 • Published • 12
- iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
  Paper • 2203.02395 • Published • 1

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 5.83k • 1.22k
- microsoft/bitnet-b1.58-2B-4T-bf16
  Text Generation • 2B • Updated • 3.89k • 33
- microsoft/bitnet-b1.58-2B-4T-gguf
  Text Generation • 2B • Updated • 3.07k • 211
- BitNet b1.58 2B4T Technical Report
  Paper • 2504.12285 • Published • 75

- Tensor Product Attention Is All You Need
  Paper • 2501.06425 • Published • 90
- Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
  Paper • 2501.11873 • Published • 66
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
  Paper • 2502.11089 • Published • 165
- MoBA: Mixture of Block Attention for Long-Context LLMs
  Paper • 2502.13189 • Published • 17

- Tensor Product Attention Is All You Need
  Paper • 2501.06425 • Published • 90
- TransMLA: Multi-head Latent Attention Is All You Need
  Paper • 2502.07864 • Published • 58
- Union of Experts: Adapting Hierarchical Routing to Equivalently Decomposed Transformer
  Paper • 2503.02495 • Published • 9
- BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
  Paper • 2504.18415 • Published • 47

- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 6
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 23
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 13
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 625
- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 53
- BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
  Paper • 2504.18415 • Published • 47
- Kijai/PrecompiledWheels
  Updated • 57