---
license: mit
tags:
- physical-ai-interpretability-sae
- LeRobot
- Robotics
datasets:
- villekuosmanen/drop_footbag_into_dice_tower
- villekuosmanen/drop_footbag_into_dice_tower_continuous
- villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.0.0
- villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.1.0
- villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.2.0
- villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.3.0
- villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.4.0
- villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.5.0
- villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.6.0
- villekuosmanen/eval_footbag_11Sep
library_name: physical-ai-interpretability
---

# Sparse Autoencoder (SAE) Model

This model is a Sparse Autoencoder trained for interpretability analysis of robotics policies using the LeRobot framework.

## Model Details

- **Architecture**: Multi-modal Sparse Autoencoder
- **Training Datasets**:
  - `villekuosmanen/drop_footbag_into_dice_tower`
  - `villekuosmanen/drop_footbag_into_dice_tower_continuous`
  - `villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.0.0`
  - `villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.1.0`
  - `villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.2.0`
  - `villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.3.0`
  - `villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.4.0`
  - `villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.5.0`
  - `villekuosmanen/dAgger_drop_footbag_into_dice_tower_1.6.0`
  - `villekuosmanen/eval_footbag_11Sep`
- **Base Policy**: LeRobot ACT policy
- **Layer Target**: `model.encoder.layers.3.norm2`
- **Tokens**: 77
- **Token Dimension**: 128
- **Feature Dimension**: 12320
- **Expansion Factor**: 1.25

## Training Configuration

- **Learning Rate**: 0.0001
- **Batch Size**: 16
- **L1 Penalty**: 0.3
- **Epochs**: 15
- **Optimizer**: adam

## Usage

```python
from physical_ai_interpretability.sae import load_sae_from_hub

# Load model from Hub
model = load_sae_from_hub("villekuosmanen/drop_footbag_into_dice_tower_ood_sae_success")

# Or load using builder
from physical_ai_interpretability.sae import SAEBuilder

builder = SAEBuilder(device='cuda')
model = builder.load_from_hub("villekuosmanen/drop_footbag_into_dice_tower_ood_sae_success")
```

## Out-of-Distribution Detection

This SAE model can be used for OOD detection with LeRobot policies:

```python
from physical_ai_interpretability.ood import OODDetector

# Create OOD detector with Hub-loaded SAE
ood_detector = OODDetector(
    policy=your_policy,
    sae_hub_repo_id="villekuosmanen/drop_footbag_into_dice_tower_ood_sae_success"
)

# Fit threshold and use for detection
ood_detector.fit_ood_threshold_to_validation_dataset(validation_dataset)
is_ood, error = ood_detector.is_out_of_distribution(observation)
```

## Files

- `model.safetensors`: The trained SAE model weights
- `config.json`: Training and model configuration
- `training_state.pt`: Complete training state (optimizer, scheduler, metrics)
- `ood_params.json`: OOD detection parameters (if fitted)

## Framework

This model was trained using the [physical-ai-interpretability](https://github.com/your-repo/physical-ai-interpretability) framework with [LeRobot](https://github.com/huggingface/lerobot).
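
The numbers listed under Model Details are mutually consistent if one assumes the SAE sees all 77 tokens flattened into a single input vector, with the feature dimension obtained by scaling that input size by the expansion factor. This reading is an assumption inferred from the arithmetic, not something this card confirms. The helper `sae_dimensions` below is hypothetical and is not part of the `physical_ai_interpretability` package; it is only a quick sanity check in plain Python:

```python
# Values copied from the "Model Details" section of this card.
NUM_TOKENS = 77
TOKEN_DIM = 128
EXPANSION_FACTOR = 1.25


def sae_dimensions(num_tokens: int, token_dim: int, expansion_factor: float) -> tuple[int, int]:
    """Return (input_dim, feature_dim).

    Assumption: the SAE input is every token's activation concatenated into
    one vector, and the feature dimension is that size times the expansion
    factor. This helper is illustrative only.
    """
    input_dim = num_tokens * token_dim
    feature_dim = round(input_dim * expansion_factor)
    return input_dim, feature_dim


input_dim, feature_dim = sae_dimensions(NUM_TOKENS, TOKEN_DIM, EXPANSION_FACTOR)
print(f"input_dim={input_dim}, feature_dim={feature_dim}")
```

Under this assumption the flattened input has 9856 dimensions and the computed feature dimension matches the 12320 reported above.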