ezelikman/quietstar-8-ahead

Text Generation
Transformers
Safetensors
mistral
text-generation-inference

Mistral-7B, further pretrained with Quiet-STaR (https://arxiv.org/abs/2403.09629) to generate 8 thought tokens before each output token.
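A checkpoint like this is typically loaded through the Transformers library, and the sketch below shows one way that might look. It rests on two assumptions this page does not confirm: that the repository ships custom modeling code (hence `trust_remote_code=True`) and that it includes tokenizer files. `load_quietstar` is a hypothetical helper name, not part of any library.

```python
MODEL_ID = "ezelikman/quietstar-8-ahead"  # repository name from this page


def load_quietstar(model_id: str = MODEL_ID):
    """Hypothetical helper: load the model and tokenizer in bfloat16."""
    # Imported lazily so the heavy dependencies are only needed on actual load.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo provides tokenizer files; if it does not, the
    # base Mistral-7B tokenizer would have to be loaded instead.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # the page lists the tensor type as BF16
        trust_remote_code=True,      # assumption: custom Quiet-STaR model code
    )
    return model, tokenizer
```

Calling `load_quietstar()` downloads the full 7B-parameter checkpoint, so it is best run on a machine with enough disk space and accelerator memory. Review any remote code before enabling `trust_remote_code`.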

Downloads last month
16
Safetensors
Model size: 7B params
Tensor type: BF16
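The parameter count and tensor type together give a rough lower bound on the memory needed just to hold the weights. The sketch below works through that arithmetic, assuming exactly 7 billion parameters (the page's rounded figure) and 2 bytes per BF16 value. `weight_memory_gb` is a hypothetical helper, and activations and caches are not counted.

```python
BYTES_PER_BF16 = 2  # bfloat16 is a 16-bit floating-point format


def weight_memory_gb(n_params: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumption: "7B params" is taken as exactly 7e9 for this estimate.
print(f"{weight_memory_gb(7e9):.1f} GB")  # prints "14.0 GB"
```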

Model tree for ezelikman/quietstar-8-ahead

Merges: 2 models
Quantizations: 1 model

Dataset used to train ezelikman/quietstar-8-ahead

open-web-math/open-web-math

Viewer • Updated Oct 17, 2023 • 6.32M • 10.3k • 329

Spaces using ezelikman/quietstar-8-ahead 13

💻 FallnAI/Quantize-HF-Models
🏃 openfree/LLM_Quantization
🏃 seawolf2357/LLM_Quantization
🏃 K00B404/LLM_Quantization
💻 exploitz3r0bbd/Quantize-HF-Models
💻 GTOMA83/Quantize-HF-Models
💻 KBaba7/Quant
🏃 bhaskartripathi/LLM_Quantization
🔥 ruslanmv/convert_to_gguf
💻 totolook/Quant
💻 Oss11/Quantize-HF-Models
💻 Xlnk/Quantize-HF-Models

Paper for ezelikman/quietstar-8-ahead

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 79