🧠 Humigence CLI

Your AI. Your pipeline. Zero code.

A complete MLOps suite built for makers, teams, and enterprises. Humigence provides wizard-driven, GPU-aware fine-tuning with surgical precision and complete reproducibility.

✨ Key Features

  • 🎯 Interactive Wizard: Step-by-step configuration with Basic/Advanced modes
  • 🖥️ Smart GPU Detection: Automatic detection and selection of available GPUs
  • 🚀 Dual-GPU Training: Multi-GPU support with Unsloth + TorchRun
  • 🧪 Training Recipes: QLoRA (4-bit), LoRA (FP16/BF16), Full Fine-tuning
  • 📊 Intelligent Batching: Auto-fit batch size to available VRAM
  • 🔄 Complete Reproducibility: Config snapshots and reproduce scripts
  • 📈 Built-in Evaluation: Curated prompts and quality gates
  • 📦 Artifact Export: Structured outputs with run summaries

🚀 Quick Start

Prerequisites

  • GPU: NVIDIA GPU with CUDA support (RTX 5090, RTX 4080, etc.)
  • RAM: 16GB+ system RAM (see Hardware Requirements below)
  • Storage: 20GB+ free space for models and datasets
  • Python: 3.8+ with PyTorch

Installation

# Clone the repository
git clone https://github.com/your-username/humigence.git
cd humigence

# Install dependencies
pip install -r requirements.txt

# Set up Unsloth (required for training)
python3 training/unsloth/setup_humigence_unsloth.py

# Launch the interactive wizard
python3 cli/main.py

Basic Usage

# Launch the interactive wizard
python3 cli/main.py

# The wizard will guide you through:
# 1. Model selection
# 2. Dataset configuration  
# 3. Training parameters
# 4. GPU selection (single or multi-GPU)
# 5. Launch training

🎯 Training Workflow

1. Interactive Setup

The Humigence wizard guides you through:

  • Setup Mode: Basic (essential config) or Advanced (full control)
  • Hardware Detection: Automatic GPU, CPU, and memory detection
  • Model Selection: Choose from supported models or custom paths
  • Dataset Loading: Auto-detection from ~/humigence_data/ or custom paths
  • Training Recipe: QLoRA, LoRA, or Full Fine-tuning
  • GPU Selection: Single-GPU auto-selection or multi-GPU prompting

2. GPU Selection

Humigence intelligently handles GPU selection:

  • Single GPU: Automatically selects and uses the available GPU
  • Multiple GPUs: Prompts you to choose:
    🔧 Training Mode:
    > Multi-GPU Training (all available GPUs)
      Single GPU Training (choose specific GPU)
    

3. Training Execution

🚀 Humigence Training Starting...
✅ Configuration Loaded: [all settings]
🖥️ GPU Detection: 2x RTX 5090 detected
🔧 Training Mode: Multi-GPU Training
📦 Loading model: Qwen/Qwen2.5-0.5B
✅ LoRA adapters applied
📚 Loading dataset: wikitext2 (10,000 samples)
🚀 Starting training with TorchRun...
✅ Training complete - adapters saved.

📊 Supported Models

  • Qwen/Qwen2.5-0.5B: ~0.5B parameters (recommended for testing)
  • microsoft/Phi-2: 2.7B parameters
  • TinyLlama/TinyLlama-1.1B-Chat-v1.0: 1.1B parameters
  • Custom Models: Any HuggingFace model or local path

๐Ÿ—‚๏ธ Dataset Support

  • JSONL Format: Line-by-line JSON with instruction/output pairs
  • Auto-Detection: Scans ~/humigence_data/ directory
  • Custom Paths: Specify any local dataset file
  • Sample Datasets: Includes demo datasets for testing

Dataset Format

{"instruction": "What is machine learning?", "output": "Machine learning is a subset of artificial intelligence..."}
{"instruction": "Explain quantum computing", "output": "Quantum computing uses quantum mechanical phenomena..."}

๐Ÿ–ฅ๏ธ Hardware Requirements

Minimum Requirements

  • GPU: NVIDIA GPU with 8GB+ VRAM
  • RAM: 16GB+ system RAM
  • Storage: 20GB+ free space

Recommended Setup

  • GPU: RTX 4080/4090/5090 or better
  • RAM: 32GB+ system RAM
  • Storage: 50GB+ free space

Multi-GPU Support

  • Dual-GPU: RTX 5090 + RTX 5090 (tested)
  • Memory: 16GB+ VRAM per GPU recommended
  • Training: Automatic TorchRun distribution
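
Multi-GPU runs are distributed via TorchRun. For reference, a manual two-GPU launch of the training script might look like the sketch below; the exact flags the wizard passes are not guaranteed to match:

# Launch the multi-GPU training script directly across 2 local GPUs
torchrun --nproc_per_node=2 training/unsloth/train_lora_dual.py \
    --config config/default_config.json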

๐Ÿ“ Project Structure

humigence/
├── cli/
│   ├── main.py              # Main CLI entry point
│   ├── config_wizard.py     # Interactive configuration wizard
│   └── lora_wizard.py       # LoRA-specific wizard
├── training/
│   └── unsloth/            # Unsloth integration
│       ├── wizard.py       # Unsloth training wizard
│       └── train_lora_dual.py  # Multi-GPU training script
├── pipelines/
│   └── lora_trainer.py     # Training pipeline
├── utils/
│   ├── device.py           # Hardware detection
│   ├── dataset_loader.py   # Dataset utilities
│   └── validators.py       # Data validation
├── config/
│   └── default_config.json # Default configuration
└── runs/                   # Training outputs
    └── humigence/
        ├── config.snapshot.json
        ├── adapters/       # LoRA weights
        └── artifacts.zip   # Complete export

🔧 Configuration

Basic Mode (Recommended)

Essential configuration with sensible defaults:

  • Learning Rate: 2e-4
  • Epochs: 1
  • Gradient Accumulation: 4
  • LoRA Rank: 16
  • LoRA Alpha: 32
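
These defaults should correspond to config/default_config.json from the project tree; to review them outside the wizard:

# Pretty-print the default configuration the wizard starts from
python3 -m json.tool config/default_config.json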

Advanced Mode

Full control over all parameters:

  • LoRA configuration (rank, alpha, dropout)
  • Training hyperparameters
  • Data processing options
  • Evaluation settings

🚀 Training Modes

Single-GPU Training

# Automatically selected when 1 GPU detected
🔧 Single GPU detected - using GPU 0: RTX 5090
🚀 Launching single-GPU training...

Multi-GPU Training

# Prompts when multiple GPUs detected
🔧 2 GPUs detected - choose training mode
> Multi-GPU Training (all available GPUs)
  Single GPU Training (choose specific GPU)

📈 Evaluation & Monitoring

Built-in Evaluation

  • Curated Prompts: 5 diverse evaluation questions
  • Model Inference: Generation with temperature sampling
  • Quality Gates: Loss thresholds and evaluation metrics
  • Status Tracking: ACCEPTED.txt or REJECTED.txt files
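
The ACCEPTED.txt / REJECTED.txt markers are plain files, so gate status is easy to script against. A quick check (sketch; assumes the marker is written to the run directory shown under Project Structure):

# Report whether the last run cleared its quality gates
test -f runs/humigence/ACCEPTED.txt && echo "quality gates passed" || echo "rejected or incomplete"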

Run Monitoring

# View training progress
tail -f runs/humigence/training.log

# Check evaluation results
cat runs/humigence/eval_results.jsonl

# View run summary
cat runs/humigence/run_summary.json

🔄 Reproducibility

Every training run generates:

  • Config Snapshot: Complete configuration in JSON
  • Reproduce Script: One-click rerun capability
  • Artifact Archive: Complete export of all outputs
  • Run Summary: Structured metadata for tracking

# Rerun any training
./runs/humigence/reproduce.sh

# Or use the config directly
python3 training/unsloth/train_lora_dual.py --config runs/humigence/config.snapshot.json

๐Ÿ› ๏ธ Development

Dependencies

Core dependency versions are constrained for stability:

transformers>=4.41.0,<5.0.0
torch>=2.1.0
unsloth @ git+https://github.com/unslothai/unsloth.git
rich>=13.0.0
inquirer>=3.1.0

Local Development

# Install in development mode
pip install -e .

# Run tests
python3 -m pytest tests/

# Run specific test
python3 test_gpu_selection.py

๐Ÿค Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

Quick Contribution Guide

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Add tests if applicable
  5. Commit your changes: git commit -m 'Add amazing feature'
  6. Push to the branch: git push origin feature/amazing-feature
  7. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Unsloth for fast LoRA training
  • HuggingFace for the Transformers and PEFT libraries
  • Microsoft for the original LoRA method
  • The open-source ML community

🆚 Comparison with Other Tools

Feature          Humigence CLI        Other Tools
Setup            Interactive wizard   Manual config
GPU Detection    Automatic            Manual
Multi-GPU        Built-in TorchRun    Complex setup
Reproducibility  Complete snapshots   Partial
Evaluation       Built-in prompts     External tools
Artifacts        Structured export    Manual collection

๐Ÿ› Troubleshooting

Common Issues

GPU not detected:

# Check CUDA installation
python3 -c "import torch; print(torch.cuda.is_available())"

# Check GPU visibility
nvidia-smi

Out of memory:

# Reduce batch size in config
# Or use QLoRA for memory efficiency
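
Beyond config changes, one PyTorch-level mitigation worth trying is the allocator's expandable-segments mode (a standard PyTorch environment variable, not a Humigence flag):

# May reduce fragmentation-related OOMs on recent PyTorch builds
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True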

Training fails:

# Check logs
cat runs/humigence/training.log

# Verify dataset format
head -5 ~/humigence_data/your_dataset.jsonl

Getting Help

If the steps above don't resolve the issue, open a GitHub issue and include runs/humigence/training.log and your config.snapshot.json so the problem can be reproduced.

๐Ÿ—บ๏ธ Roadmap

Current Features ✅

  • Interactive configuration wizard
  • Single and multi-GPU training
  • QLoRA and LoRA support
  • Built-in evaluation
  • Complete reproducibility

Coming Soon 🚧

  • RAG implementation
  • EnterpriseGPT integration
  • Batch inference
  • Context length optimization
  • Web UI interface
  • Model serving

Future Features 🔮

  • Distributed training across nodes
  • Advanced evaluation metrics
  • Model compression
  • Deployment automation

Built with ❤️ for the AI community

Humigence - Your AI. Your pipeline. Zero code.
