ss_bridges_d4096_f0.002

Weight-sparse transformer with bridges, trained with the procedure from Gao et al. (2025).

Model Details (Sparse Model)

Layers: 2
Model Dimension: 4096
Context Length: 512
Head Dimension: 16
Vocabulary Size: 4096
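
The number of attention heads is not listed above. As a rough sketch, assuming the heads evenly partition the model dimension (an assumption, not something the card states), it can be derived from the two values given:

```python
# Values copied from the model details above.
d_model = 4096
head_dim = 16

# Assumption: n_heads * head_dim == d_model (standard multi-head layout).
n_heads, remainder = divmod(d_model, head_dim)

print(f"heads per layer: {n_heads} (remainder {remainder})")
```

Under that assumption each layer would have 256 heads. The `config.json` file in this repo is the authoritative source if it records the head count directly.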

Bridges

Dense Model: jacobcd52/ss_d128_f1
Encoder Activation Fraction: 0.25

Sparsity

Weight Sparsity: True
Target L0 Fraction: 0.002
Activation Sparsity: True
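
To give a feel for what a target L0 fraction of 0.002 implies, here is a back-of-the-envelope sketch. It assumes the fraction is the share of nonzero weights and that it applies uniformly to a hypothetical square `d_model x d_model` weight matrix; the actual per-matrix allocation is not specified in this card.

```python
# Values copied from this card.
d_model = 4096
target_l0_fraction = 0.002

# Hypothetical square weight matrix of shape (d_model, d_model).
entries = d_model * d_model

# Assumption: the L0 fraction is the share of nonzero entries.
expected_nonzero = round(entries * target_l0_fraction)
nonzero_per_row = d_model * target_l0_fraction

print(f"{expected_nonzero} of {entries} entries nonzero")
print(f"about {nonzero_per_row:.1f} nonzero weights per row")
```

Under these assumptions, such a matrix keeps roughly 33.5 thousand of its 16.8 million entries, or about 8 nonzero weights per row.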

Training

Dataset: data/simplestories-tokenized
Tokenizer: SimpleStories/SimpleStories-1.25M
Total Tokens: 2,000,000,000
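
For scale, the token budget can be converted into a sequence count. This sketch assumes every training sequence fills the full 512-token context, which the card does not state.

```python
# Values copied from this card.
total_tokens = 2_000_000_000
context_length = 512

# Assumption: all sequences are packed to the full context length.
num_sequences = total_tokens // context_length

print(f"about {num_sequences:,} sequences of {context_length} tokens")
```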

Training Run

W&B Run: https://wandb.ai/training-saes/bridges_training/runs/c81q8p5q

Usage

import json

import torch
from huggingface_hub import hf_hub_download

REPO_ID = "jacobcd52/ss_bridges_d4096_f0.002"

# Download the sparse model weights, the bridge weights, and the config
sparse_model_path = hf_hub_download(repo_id=REPO_ID, filename="sparse_model.bin")
bridges_path = hf_hub_download(repo_id=REPO_ID, filename="bridges.bin")
config_path = hf_hub_download(repo_id=REPO_ID, filename="config.json")

# Load the state dicts on CPU (building the model itself requires the
# SparseGPT and BridgeSet classes from this repo)
sparse_state_dict = torch.load(sparse_model_path, map_location="cpu")
bridges_state_dict = torch.load(bridges_path, map_location="cpu")

# Read the model configuration
with open(config_path) as f:
    config = json.load(f)
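
Once a state dict is loaded, one quick sanity check is to measure how sparse the weights actually are. The helper below is a minimal sketch, not part of this repo: `nonzero_fraction` is a hypothetical name, and it assumes sparsity is stored as exact zeros in tensors with two or more dimensions. It is demonstrated on a small synthetic state dict so that it runs without downloading anything.

```python
import torch


def nonzero_fraction(state_dict, min_dim=2):
    """Fraction of nonzero entries across tensors with at least `min_dim` dims.

    Hypothetical helper: assumes pruned weights are stored as exact zeros.
    """
    total = 0
    nonzero = 0
    for tensor in state_dict.values():
        # Skip non-tensor entries and low-dimensional tensors such as biases.
        if not torch.is_tensor(tensor) or tensor.dim() < min_dim:
            continue
        total += tensor.numel()
        nonzero += int(torch.count_nonzero(tensor))
    return nonzero / total if total else 0.0


# Synthetic stand-in for a real state dict: 2 of 8 matrix entries are nonzero.
demo_state_dict = {
    "w": torch.tensor([[0.0, 1.5, 0.0, 0.0], [0.0, 0.0, 0.0, -2.0]]),
    "b": torch.zeros(4),  # ignored because it is 1-dimensional
}
frac = nonzero_fraction(demo_state_dict)
print(f"nonzero fraction: {frac:.3f}")
```

Calling `nonzero_fraction(sparse_state_dict)` on the weights loaded above should give a value that can be compared against the target L0 fraction of 0.002, though embeddings or other dense tensors in the checkpoint may push the overall figure higher.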
