---
language: en
tags:
- catboost
- regression
- machine-learning
- tabular-data
- gradient-boosting
library_name: catboost
widget:
- tabular:
  example_title: "Sample Prediction"
  data:
    initially_infected: [4, 6]
    lowest_immunity: [0.2, 0.1]
    highest_immunity: [0.75, 0.75]
    mask_beta_penalty: [0.5, 0.5]
    pollutant_immunity_reduction: [0.2, 0.1]
---

# Agentic Disease Spread CatBoost Regressor Model for Pollutant Effects with Beta

## Model Description

This is a CatBoost regressor trained on tabular data generated by the [Agent-based Implementations for Infectious Disease Transmission Models](https://github.com/AlekseiAgarkov/AgenticInfectiousDiseaseTransmissionModels) simulator. CatBoost (Categorical Boosting) is a gradient boosting library developed by Yandex that handles categorical features natively, without extensive preprocessing. All features used by this model are numeric.

- **Model type:** Gradient Boosting Decision Trees
- **Task:** Regression
- **License:** MIT
- **Repository:** https://github.com/AlekseiAgarkov/AgenticInfectiousDiseaseTransmissionModels

## Intended Uses & Limitations

### Intended Use

- Regression on structured/tabular data from agent-based disease spread simulations
- Scenarios with pollutant effects

### Limitations

- Designed primarily for studying pollutant effects
- Not suitable for unstructured data (images, text, audio)

## How to Use

### Installation

```bash
pip install catboost
```

### Basic Usage

```python
import pickle

import pandas as pd
from catboost import CatBoostRegressor  # catboost must be installed to unpickle the model

# Load the pickled model
with open('catboost_model.pkl', 'rb') as f:
    model = pickle.load(f)

# Prepare your data as a pandas DataFrame.
# Column names and order must match the training data.
# The values below are placeholders; replace them with your own.
data = pd.DataFrame({
    'beta': [0.3],  # illustrative value, not taken from the training data
    'initially_infected': [4],
    'lowest_immunity': [0.2],
    'highest_immunity': [0.75],
    'mask_beta_penalty': [0.5],
    'pollutant_immunity_reduction': [0.2]
})

# Make a prediction (the target is infected_90d)
prediction = model.predict(data)
```

### Using with CatBoost directly

```python
from catboost import CatBoostRegressor

# Load the model saved in CatBoost's native format
model = CatBoostRegressor()
model.load_model('catboost_model.cbm')

# Make predictions; `data` is a DataFrame built as in Basic Usage
predictions = model.predict(data)
```

## Training Procedure

### Training Data

Data details:

- Source: https://raw.githubusercontent.com/AlekseiAgarkov/MIFIML-2-Sem1-M25-525-Project-Practice/refs/heads/main/data/sim_data_metrics_20251214.csv
- Features:
  - beta: float - infectivity coefficient (`beta`)
  - initially_infected: int - number of initially infected agents
  - lowest_immunity: float - lowest possible immunity in simulation
  - highest_immunity: float - highest possible immunity in simulation
  - mask_beta_penalty: float - beta reduction coefficient for a mask worn during contact
  - pollutant_immunity_reduction: float - immunity reduction coefficient for pollutant
- Target variable: 'infected_90d'
- Samples: 2000
- Preprocessing: None

### Training Hyperparameters

```yaml
iterations: 10000
learning_rate: 0.025
depth: 5
loss_function: 'RMSE'
cat_features: None
verbose: False
early_stopping_rounds: 500
random_seed: 42
```

### Evaluation Results

| Metric            | Value  |
|-------------------|--------|
| Train RMSE        | 476.41 |
| Validation RMSE   | 535.55 |

## Feature Information

| Feature Name                 | Type    | Description                                               | Importance |
|------------------------------|---------|-----------------------------------------------------------|------------|
| beta                         | Numeric | infectivity coefficient (`beta`)                          | 80.79      |
| initially_infected           | Numeric | number of initially infected agents                       | 17.94      |
| lowest_immunity              | Numeric | lowest possible immunity in simulation                    | 0.17       |
| highest_immunity             | Numeric | highest possible immunity in simulation                   | 0.42       |
| mask_beta_penalty            | Numeric | beta reduction coefficient for a mask worn during contact | 0.53       |
| pollutant_immunity_reduction | Numeric | immunity reduction coefficient for pollutant              | 0.15       |

## Model Architecture

- **Algorithm:** Gradient Boosting on Decision Trees
- **Number of trees:** 188
- **Tree depth:** 5
- **Learning rate:** 0.025
- **Loss function:** RMSE
- **Feature importance type:** default

## Model Card Authors

Aleksei Agarkov / MEPhI

## Model Card Contact

agarkov.aleksei1@yandex.ru

## Disclaimer

This model is provided "as is" without warranty of any kind. Users should evaluate the model's suitability for their specific use case and perform appropriate testing before deployment in production environments.
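
## Example: Checking Input Features

The usage notes above ask that input features match the training data format. A minimal sketch of one way to check this before calling `model.predict` follows. The helper `build_feature_row` and the constant `FEATURE_ORDER` are hypothetical names, not part of this model or of CatBoost. The column order is assumed to be the order in which the features are listed in this card, and the check that `lowest_immunity` does not exceed `highest_immunity` is an assumption based on the feature descriptions.

```python
# Hypothetical helper, not part of the published model or of CatBoost.
# Assumption: the training column order is the order listed in this card.
FEATURE_ORDER = [
    "beta",
    "initially_infected",
    "lowest_immunity",
    "highest_immunity",
    "mask_beta_penalty",
    "pollutant_immunity_reduction",
]


def build_feature_row(params):
    """Return the feature values as a list of floats in FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in params]
    extra = [name for name in params if name not in FEATURE_ORDER]
    if missing or extra:
        raise ValueError(
            f"missing features: {missing}; unexpected features: {extra}"
        )
    # Assumption: the lower immunity bound should not exceed the upper bound.
    if params["lowest_immunity"] > params["highest_immunity"]:
        raise ValueError("lowest_immunity must not exceed highest_immunity")
    return [float(params[name]) for name in FEATURE_ORDER]


# Example input: beta is an illustrative value; the other values are
# the first sample from the widget data in this card.
row = build_feature_row({
    "initially_infected": 4,
    "beta": 0.3,
    "lowest_immunity": 0.2,
    "highest_immunity": 0.75,
    "mask_beta_penalty": 0.5,
    "pollutant_immunity_reduction": 0.2,
})
```

When the input is a plain list rather than a DataFrame, there are no column names to compare against, so fixing the order explicitly guards against silently swapped features. The resulting `row` can be wrapped as `[row]` and passed to `model.predict`.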