---
language: en
license: mit
library_name: sklearn
tags:
- trading
- finance
- gold
- xauusd
- forex
- algorithmic-trading
- smart-money-concepts
- smc
- xgboost
- lightgbm
- machine-learning
- backtesting
- technical-analysis
- multi-timeframe
- intraday-trading
- high-frequency-trading
- ensemble-model
- keras
- tensorflow
datasets:
- yahoo-finance-gc-f
metrics:
- accuracy
- precision
- recall
- f1
- sharpe
- max_drawdown
- cagr
- win_rate
model-index:
- name: romeo-v5-daily
  results:
  - task:
      type: binary-classification
      name: Daily Price Direction Prediction
    dataset:
      type: yahoo-finance-gc-f
      name: Gold Futures (GC=F)
    metrics:
    - type: accuracy
      value: 49.47
      name: Win Rate
    - type: sharpe
      value: 0.3119
      name: Sharpe Ratio
    - type: max_drawdown
      value: -47.66
      name: Max Drawdown (%)
    - type: cagr
      value: 0.0444
      name: CAGR
---

# Romeo V5: Ensemble Trading Model for XAUUSD

## Model Details

### Model Description

Romeo V5 is an ensemble machine learning model designed for predicting price movements in XAUUSD (Gold vs US Dollar) futures. It combines tree-based models (XGBoost and LightGBM) with an optional Keras neural network head to generate trading signals. The model outputs a probability score for long (up) trades, and the backtester handles entry/exit logic, position sizing, and risk management.

- **Model Type**: Ensemble Classifier (XGBoost + LightGBM + optional Keras NN)
- **Asset**: XAUUSD (Gold Futures)
- **Strategy**: Smart Money Concepts (SMC) with technical indicators
- **Prediction Horizon**: Daily timeframe (5-day ahead direction)
- **Framework**: Scikit-learn, XGBoost, LightGBM, TensorFlow/Keras

### Model Architecture

- **Ensemble Components**:
  - XGBoost Classifier: Gradient boosting on decision trees.
  - LightGBM Classifier: Efficient gradient boosting with leaf-wise growth.
  - Optional Keras Neural Network: Dense layers with custom `SumAxis1Layer` to replace anonymous Lambda for serialization.
- **Features**: 31 canonical features including technical indicators (SMA, EMA, RSI, Bollinger Bands) and SMC elements (order blocks, volume profiles).
- **Serialization**: Tree models saved in joblib `.pkl` format; Keras model in native `.keras` format.
- **Weights**: Ensemble weights stored in artifact for weighted probability averaging.

### Intended Use

- **Primary Use**: Research, backtesting, and evaluation on historical XAUUSD data.
- **Secondary Use**: Educational purposes for understanding ensemble trading models.
- **Out-of-Scope**: Not financial advice. Do not use for live trading without proper validation, risk controls, and regulatory compliance.

### Factors

- **Relevant Factors**: Market volatility, economic indicators affecting gold prices (e.g., USD strength, inflation data).
- **Evaluation Factors**: Tested on unseen data; robustness scanned across slippage, commission, and threshold parameters.

### Metrics

- **Evaluation Data**: Unseen daily data (out-of-sample).
- **Metrics**:
  - Initial Capital: 100
  - Final Capital: 484.82
  - CAGR: 0.0444
  - Annual Volatility: 0.4118
  - Sharpe Ratio: 0.3119
  - Max Drawdown: -47.66%
  - Total Trades: 3610
  - Win Rate: 49.47%
  - Avg PnL per Trade: 0.1066

### Training Data

- **Source**: Yahoo Finance (GC=F) historical data.
- **Preprocessing**: Feature engineering with technical indicators and SMC concepts.
- **Split**: Trained on historical data; evaluated on unseen fresh dataset.

### Quantitative Analyses

- **Robustness Scan**: Coarse grid sweep (slippage: 0-1 pips, commission: 0-0.0005, threshold: 0.5-0.6). Best scenarios: low friction, threshold ~0.5. Worst: high commission/threshold.
- **M2M Equity**: Per-bar mark-to-market equity calculation for accurate risk metrics.

### Ethical Considerations

- **Bias**: Model trained on historical data; may not account for future market changes or black swan events.
- **Risk**: High volatility in forex; potential for significant losses.
- **Transparency**: Full disclosure of assumptions, limitations, and evaluation.

### Caveats and Recommendations

- **Limitations**: Simplified position sizing; small-account behavior may differ under margin rules. Historical backtest results are not indicative of future results.
- **Recommendations**: Use stop-losses, diversify, and consult a financial advisor. Validate on your own data before use.

## Usage

### Loading the Model

```python
import joblib

artifact = joblib.load('trading_model_romeo_daily.pkl')
features = artifact['features']  # Canonical feature list
models = artifact['models']      # Dict of XGBoost/LightGBM models
weights = artifact['weights']    # Ensemble weights
```

### Making Predictions

```python
import pandas as pd

# Prepare df (a pandas DataFrame) with columns matching artifact['features']
X = df[features].fillna(0)  # Fill missing feature values with 0

# Weighted average of each model's probability for the long (up) class
probabilities = sum(
    weight * model.predict_proba(X)[:, 1]
    for model, weight in zip(models.values(), weights.values())
) / sum(weights.values())

threshold = 0.5  # e.g. 0.5
signals = (probabilities > threshold).astype(int)
```

### Backtesting

Use `v5/backtest_v5.py` with `--data ` to run on custom data. It aligns features automatically.

### Requirements

- Python 3.8+
- scikit-learn, xgboost, lightgbm, tensorflow, joblib

## Files

- `trading_model_romeo_daily.pkl`: Main artifact.
- `romeo_keras_daily.keras`: Optional Keras model.
- `README.md`: This model card.
- `metadata.json`: Structured metadata.

## Contact

For issues or contributions: https://github.com/JonusNattapong/AITradings-samsam
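
## Appendix: Risk Metrics Sketch

The Metrics section reports CAGR, annual volatility, Sharpe ratio, and max drawdown derived from a per-bar mark-to-market equity curve. The following is a minimal sketch of how such figures are commonly computed, not the backtester's actual code. The function name `risk_metrics`, the 252 bars-per-year annualization, the zero risk-free rate, and the toy equity values are all assumptions made for illustration.

```python
import math
import statistics


def risk_metrics(equity, periods_per_year=252):
    """Summarize a per-bar mark-to-market equity curve.

    Assumptions (illustrative, not taken from the project's backtester):
    daily bars, simple returns, and a zero risk-free rate.
    """
    # Simple return from each bar to the next
    returns = [curr / prev - 1.0 for prev, curr in zip(equity, equity[1:])]
    years = len(returns) / periods_per_year

    # Compound annual growth rate from first to last equity value
    cagr = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0

    # Annualized volatility and Sharpe ratio from per-bar returns
    per_bar_vol = statistics.stdev(returns)
    annual_vol = per_bar_vol * math.sqrt(periods_per_year)
    sharpe = statistics.mean(returns) / per_bar_vol * math.sqrt(periods_per_year)

    # Max drawdown: the deepest fractional drop below the running peak
    peak = equity[0]
    max_drawdown = 0.0
    for value in equity:
        peak = max(peak, value)
        max_drawdown = min(max_drawdown, value / peak - 1.0)

    return {
        "cagr": cagr,
        "annual_vol": annual_vol,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }


# Toy equity curve with made-up numbers (not model output)
toy_equity = [100.0, 110.0, 99.0, 120.0]
metrics = risk_metrics(toy_equity)
```

In the toy curve the only dip is from 110 to 99, so the max drawdown is about -10%. Because the conventions above are assumed, results from this sketch may not match the figures reported in this card exactly.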