Collections
Collections that include the paper arXiv:2501.16372 (Low-Rank Adapters Meet Neural Architecture Search for LLM Compression)
Collection 1
- Low-Rank Adapters Meet Neural Architecture Search for LLM Compression (Paper • 2501.16372 • Published • 12)
- TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models (Paper • 2501.16937 • Published • 7)
- Matryoshka Quantization (Paper • 2502.06786 • Published • 32)
- Identifying Sensitive Weights via Post-quantization Integral (Paper • 2503.01901 • Published • 8)
Collection 2
- Shears: Unstructured Sparsity with Neural Low-rank Adapter Search (Paper • 2404.10934 • Published)
- Low-Rank Adapters Meet Neural Architecture Search for LLM Compression (Paper • 2501.16372 • Published • 12)
- IntelLabs/shears-llama-7b-50-math-super-adapter (Updated • 2 • 3)
- IntelLabs/shears-llama-7b-50-math-heuristic-adapter (Updated • 4 • 3); a minimal loading sketch for these adapters follows below
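The two IntelLabs repositories above are Shears adapters rather than standalone checkpoints, so they are meant to be applied on top of a matching base model. The sketch below shows one plausible way to attach such an adapter with transformers and peft; the base-model repo name (`IntelLabs/shears-llama-7b-50-base`) is an assumption, not something listed on this page.

```python
# Hedged sketch: loading one of the Shears adapters listed above with PEFT.
# The base repo name below is an assumption; substitute the sparsified base
# checkpoint that the adapter card actually points to.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "IntelLabs/shears-llama-7b-50-base"  # assumed sparsified base repo
adapter_id = "IntelLabs/shears-llama-7b-50-math-heuristic-adapter"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)

# Attach the low-rank adapter weights on top of the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Question: What is 12 * 7? Answer:"
inputs = tokenizer(prompt, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```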
Collection 3
- A LoRA-Based Approach to Fine-Tuning LLMs for Educational Guidance in Resource-Constrained Settings (Paper • 2504.15610 • Published • 1)
- Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models (Paper • 2502.13533 • Published • 12)
- LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models (Paper • 2403.08822 • Published)
- LoRA-Pro: Are Low-Rank Adapters Properly Optimized? (Paper • 2407.18242 • Published)
Collection 4
- SQFT: Low-cost Model Adaptation in Low-precision Sparse Foundation Models (Paper • 2410.03750 • Published • 2)
- Low-Rank Adapters Meet Neural Architecture Search for LLM Compression (Paper • 2501.16372 • Published • 12)
- IntelLabs/sqft-phi-3-mini-4k-50-base (Text Generation • 4B • Updated • 307 • 2)
- IntelLabs/sqft-phi-3-mini-4k-50-base-gptq (Text Generation • 0.7B • Updated • 308 • 2)
Collection 5
- Self-Rewarding Language Models (Paper • 2401.10020 • Published • 151)
- Orion-14B: Open-source Multilingual Large Language Models (Paper • 2401.12246 • Published • 14)
- MambaByte: Token-free Selective State Space Model (Paper • 2401.13660 • Published • 60)
- MM-LLMs: Recent Advances in MultiModal Large Language Models (Paper • 2401.13601 • Published • 48)