Zenith-7B V1

A 7B-parameter language model optimized for consumer GPUs, combining strong code generation with emotional-intelligence capabilities.

Features

  • 7B Parameter Model: Efficient for consumer GPUs (8-16GB VRAM)
  • Code Generation: Fine-tuned on Qwen2.5-Coder base for exceptional programming abilities
  • Emotional Intelligence: EQ adapter for recognizing and responding to emotions
  • OpenThoughts Integration: Trained on high-quality reasoning data
  • LoRA/QLoRA Support: Efficient fine-tuning with 4-bit quantization
  • Ollama Compatible: Ready-to-use Modelfile for easy deployment

Quick Start

Installation

# Clone and setup
cd Zenith/V1/7B
pip install -r requirements.txt

Training

# Full fine-tuning
python train.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data path/to/train.json \
  --epochs 3 \
  --batch_size 4 \
  --learning_rate 2e-5

# LoRA fine-tuning (recommended for most users)
python train.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data path/to/train.json \
  --use_lora \
  --lora_r 16 \
  --lora_alpha 32 \
  --epochs 3 \
  --batch_size 8

Inference

# Interactive mode
python inference.py --checkpoint ./outputs/checkpoint-final

# Single prompt
python inference.py \
  --checkpoint ./outputs/checkpoint-final \
  --prompt "Write a Python function to reverse a linked list" \
  --max_new_tokens 512

Ollama Deployment

# Build and run with Ollama
ollama create zenith-7b -f Modelfile
ollama run zenith-7b "Explain quantum computing in simple terms"

Project Structure

Zenith/V1/7B/
├── configs/              # Configuration files
│   ├── zenith_config.py  # Model architecture config
│   ├── data_config.py    # Data processing config
│   └── training_config.py # Training hyperparameters
├── data/                 # Data processing modules
│   ├── openthoughts_processor.py
│   ├── quality_filter.py
│   ├── curriculum_sampler.py
│   ├── advanced_tokenizer.py
│   └── preprocessing.py
├── src/                  # Source code
│   ├── models/
│   │   ├── zenith_model.py
│   │   ├── dense_layer.py
│   │   └── moe_layer.py
│   └── utils/
├── scripts/              # Utility scripts
├── tests/                # Test suite
├── train.py              # Main training script
├── inference.py          # Inference and generation
├── test_model.py         # Model validation tests
├── finetune_qwen.py      # Qwen fine-tuning guide
├── Modelfile             # Ollama configuration
├── requirements.txt      # Python dependencies
└── README.md             # This file

Configuration

The model uses a unified configuration system in configs/zenith_config.py:

from configs.zenith_config import get_7b_config

config = get_7b_config()
# Parameters:
# - hidden_size: 4096
# - num_layers: 32
# - num_heads: 32
# - num_experts: 0 (dense only, set >1 for MoE)
# - use_eq_adapter: True (emotional intelligence)
# - max_seq_len: 8192
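As a sanity check on these numbers, a back-of-envelope estimate (the usual ~12·h² parameters per dense transformer layer, plus token embeddings) lands close to 7B. The vocabulary size below is an assumption borrowed from the Qwen2.5 family, not read from the config:

```python
def approx_params(hidden: int, layers: int, vocab: int) -> int:
    """Back-of-envelope dense transformer size:
    ~12*h^2 per layer (attention ~4h^2 + 4x-wide MLP ~8h^2) plus token embeddings."""
    return 12 * hidden**2 * layers + vocab * hidden

# The 7B config above (hidden 4096, 32 layers; vocab 152_064 is an assumption):
n = approx_params(4096, 32, 152_064)  # ~7.07e9, consistent with the "7B" name
```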

Data Processing

OpenThoughts Integration

The data pipeline supports the OpenThoughts3-1.2M dataset:

from data.openthoughts_processor import OpenThoughtsProcessor, OpenThoughtsConfig

config = OpenThoughtsConfig(
    dataset_name="open-thoughts/OpenThoughts3-1.2M",
    streaming=True,
    quality_filtering=True,
    curriculum_learning=True,
    augmentation=True
)
processor = OpenThoughtsProcessor(config)
dataset = processor.load_dataset()

Quality Filtering

Multi-dimensional quality assessment:

  • Length appropriateness
  • Language detection (English only)
  • Repetition detection
  • Coherence scoring
  • Structure validation
  • Thought quality (for CoT data)

Curriculum Learning

Progressive training stages:

  1. Foundation: High-quality, well-structured samples
  2. Reasoning: Chain-of-thought and problem-solving
  3. Code: Programming and technical content
  4. Full: Complete dataset with all samples
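One simple way to drive the four stages above is to map training progress to a stage name. The 25/50/75% boundaries here are illustrative, not the repo's actual schedule (see data/curriculum_sampler.py):

```python
def curriculum_stage(progress: float) -> str:
    """Map training progress in [0, 1] to a curriculum stage.
    Boundaries are illustrative, not the project's actual schedule."""
    stages = [(0.25, "foundation"), (0.50, "reasoning"), (0.75, "code"), (1.00, "full")]
    for boundary, name in stages:
        if progress <= boundary:
            return name
    return "full"
```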

Advanced Features

MoE (Mixture of Experts)

Enable sparse expert activation for extra model capacity at similar per-token compute:

python train.py --use_moe --num_experts 8

  • Top-2 routing with load balancing
  • 60% of layers use MoE (middle layers)
  • Shared router groups for efficiency
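The top-2 routing mentioned above can be sketched as: softmax over the expert logits, keep the two largest, renormalize their weights (illustrative, not the code in src/models/moe_layer.py):

```python
import math

def top2_route(logits: list[float]) -> list[tuple[int, float]]:
    """Standard top-2 MoE gating: softmax over expert logits,
    select the two highest-probability experts, renormalize their weights."""
    probs = [math.exp(l) for l in logits]
    total = sum(probs)
    probs = [p / total for p in probs]
    top2 = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]
```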

EQ Adapter

Emotional intelligence module:

python train.py --use_eq_adapter --eq_loss_weight 0.1

  • Frustration detection (regression)
  • 8-emotion classification
  • Fused with the attention mechanism
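The --eq_loss_weight flag suggests the auxiliary EQ losses are mixed into the language-modeling loss as a weighted sum, the usual pattern for multi-task adapters (the exact formula is an assumption, not taken from the source):

```python
def eq_total_loss(lm_loss: float, eq_loss: float, eq_loss_weight: float = 0.1) -> float:
    """Plausible multi-task objective: LM loss plus a down-weighted EQ loss.
    The weighted-sum form is an assumption inferred from --eq_loss_weight."""
    return lm_loss + eq_loss_weight * eq_loss
```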

LoRA/QLoRA

Efficient fine-tuning with low-rank adaptation:

# LoRA
python train.py --use_lora --lora_r 16 --lora_alpha 32

# QLoRA (4-bit quantization)
python train.py --use_qlora --use_lora --lora_r 8
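To see why LoRA is so much cheaper than full fine-tuning: the adapter factorizes each weight update as B·A with rank r, so for an adapted d_out×d_in matrix it trains only r·(d_in + d_out) parameters (standard LoRA arithmetic; the 4096×4096 projection size is taken from the config above):

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one weight matrix:
    A is (r x d_in) and B is (d_out x r)."""
    return r * (d_in + d_out)

# A 4096x4096 attention projection with --lora_r 16:
full = 4096 * 4096                        # 16,777,216 frozen weights
lora = lora_param_count(4096, 4096, 16)   # 131,072 trainable (~0.8%)
```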

Testing

Run the test suite:

python test_model.py

Tests include:

  • Model creation and initialization
  • Forward pass and gradient flow
  • Text generation
  • Multi-task outputs (EQ adapter)
  • Loss computation

Requirements

See requirements.txt for full dependencies. Key packages:

  • torch>=2.0.0
  • transformers>=4.35.0
  • datasets>=2.14.0
  • accelerate>=0.24.0
  • peft>=0.6.0 (for LoRA)
  • bitsandbytes>=0.41.0 (for QLoRA)
  • tensorboard>=2.14.0

Performance Tips

  1. Mixed Precision: Use --mixed_precision bf16 for faster training (Ampere+ GPUs)
  2. Gradient Checkpointing: Enabled by default to reduce memory
  3. Batch Size: Adjust based on VRAM (4-8 for 7B full, 16-32 for LoRA)
  4. Sequence Length: Longer sequences use more memory; adjust --max_seq_length
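Tips 2 and 3 interact through gradient accumulation: the optimizer sees the product of per-device batch size, accumulation steps, and GPU count, so a small per-device batch can still yield a large effective batch (a sketch of the arithmetic; flag names vary by script):

```python
def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Effective (global) batch size seen by the optimizer per update step."""
    return per_device * grad_accum * num_gpus

# e.g. batch 4 with 8 accumulation steps behaves like batch 32 for the optimizer,
# at roughly the memory cost of batch 4
print(effective_batch_size(4, 8))
```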

Troubleshooting

Out of Memory

  • Reduce batch size
  • Use gradient accumulation
  • Enable LoRA/QLoRA
  • Use mixed precision
  • Reduce sequence length

Slow Training

  • Increase batch size if possible
  • Use more gradient accumulation steps
  • Ensure data loading is not the bottleneck
  • Use mixed precision

Poor Quality Outputs

  • Train longer (more epochs)
  • Use higher quality data
  • Adjust learning rate (try 1e-5 to 5e-5)
  • Enable curriculum learning
  • Use quality filtering

Citation

If you use Zenith-7B in your research, please cite:

@misc{zenith-7b-2025,
  title={Zenith-7B: A Hybrid MoE Model for Code and Emotional Intelligence},
  year={2025},
  publisher={Zenith Project}
}

License

[Specify your license here]

Contact

For issues and questions, please open an issue on the project repository.
