---
language: en
license: mit
library_name: sklearn
tags:
  - trading
  - finance
  - gold
  - xauusd
  - forex
  - algorithmic-trading
  - smart-money-concepts
  - smc
  - xgboost
  - lightgbm
  - machine-learning
  - backtesting
  - technical-analysis
  - multi-timeframe
  - intraday-trading
  - high-frequency-trading
  - ensemble-model
  - keras
  - tensorflow
datasets:
  - yahoo-finance-gc-f
metrics:
  - accuracy
  - precision
  - recall
  - f1
  - sharpe
  - max_drawdown
  - cagr
  - win_rate
model-index:
  - name: romeo-v5-daily
    results:
      - task:
          type: binary-classification
          name: Daily Price Direction Prediction
        dataset:
          type: yahoo-finance-gc-f
          name: Gold Futures (GC=F)
        metrics:
          - type: accuracy
            value: 49.47
            name: Win Rate
          - type: sharpe
            value: 0.3119
            name: Sharpe Ratio
          - type: max_drawdown
            value: -47.66
            name: Max Drawdown (%)
          - type: cagr
            value: 0.0444
            name: CAGR
---

# Romeo V5 — Ensemble Trading Model for XAUUSD

## Model Details

### Model Description
Romeo V5 is an ensemble machine learning model for predicting price direction in gold against the US dollar (XAUUSD), using gold futures (GC=F) data as the price series. It combines two tree-based models (XGBoost and LightGBM) with an optional Keras neural network head to generate trading signals. The model outputs a probability score for a long (up) trade; the backtester handles entry/exit logic, position sizing, and risk management.

- **Model Type**: Ensemble Classifier (XGBoost + LightGBM + optional Keras NN)
- **Asset**: XAUUSD (Gold Futures)
- **Strategy**: Smart Money Concepts (SMC) with technical indicators
- **Prediction Horizon**: Daily bars; the target is price direction 5 days ahead
- **Framework**: Scikit-learn, XGBoost, LightGBM, TensorFlow/Keras

### Model Architecture
- **Ensemble Components**:
  - XGBoost Classifier: Gradient boosting on decision trees.
  - LightGBM Classifier: Efficient gradient boosting with leaf-wise growth.
  - Optional Keras Neural Network: Dense layers, with a custom `SumAxis1Layer` used in place of an anonymous `Lambda` layer so the model can be serialized.
- **Features**: 31 canonical features including technical indicators (SMA, EMA, RSI, Bollinger Bands) and SMC elements (order blocks, volume profiles).
- **Serialization**: Tree models saved in joblib `.pkl` format; Keras model in native `.keras` format.
- **Weights**: Ensemble weights stored in artifact for weighted probability averaging.

### Intended Use
- **Primary Use**: Research, backtesting, and evaluation on historical XAUUSD data.
- **Secondary Use**: Educational purposes for understanding ensemble trading models.
- **Out-of-Scope**: Not financial advice. Do not use for live trading without proper validation, risk controls, and regulatory compliance.

### Factors
- **Relevant Factors**: Market volatility, economic indicators affecting gold prices (e.g., USD strength, inflation data).
- **Evaluation Factors**: Tested on unseen data; robustness scanned across slippage, commission, and threshold parameters.

### Metrics
- **Evaluation Data**: Unseen daily data (out-of-sample).
- **Metrics**:
  - Initial Capital: 100
  - Final Capital: 484.82
  - CAGR: 0.0444 (4.44%)
  - Annual Volatility: 0.4118 (41.18%)
  - Sharpe Ratio: 0.3119
  - Max Drawdown: -47.66%
  - Total Trades: 3610
  - Win Rate: 49.47%
  - Avg PnL per Trade: 0.1066
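
The backtester's exact formulas are not stated in this card. As a rough guide to how metrics of this kind are commonly derived from a per-bar equity curve, here is a minimal sketch. The function name `equity_metrics`, the 252 trading periods per year, the zero risk-free rate, and the use of the sample standard deviation are all assumptions made for illustration, not the project's confirmed implementation.

```python
import math


def equity_metrics(equity, periods_per_year=252):
    """Summarise a per-bar equity curve (a list of account values).

    Assumptions (illustrative only): zero risk-free rate, simple per-bar
    returns, sample standard deviation, 252 bars per year by default.
    """
    # Simple return from each bar to the next.
    returns = [equity[i] / equity[i - 1] - 1.0 for i in range(1, len(equity))]
    years = len(returns) / periods_per_year

    # Compound annual growth rate from first to last equity value.
    cagr = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0

    # Annualised volatility of per-bar returns.
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    annual_vol = math.sqrt(variance) * math.sqrt(periods_per_year)

    # Sharpe ratio: annualised mean return over annualised volatility.
    sharpe = (mean * periods_per_year) / annual_vol if annual_vol > 0 else 0.0

    # Max drawdown: largest fractional drop from a running peak (negative).
    peak = equity[0]
    max_drawdown = 0.0
    for value in equity:
        peak = max(peak, value)
        max_drawdown = min(max_drawdown, value / peak - 1.0)

    return {
        "cagr": cagr,
        "annual_vol": annual_vol,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }


# Tiny synthetic curve, purely to show the call shape.
print(equity_metrics([100.0, 110.0, 99.0, 120.0], periods_per_year=3))
```

Because the Sharpe ratio here uses the arithmetic mean of returns while CAGR is geometric, the two can diverge noticeably when volatility is high.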

### Training Data
- **Source**: Yahoo Finance (GC=F) historical data.
- **Preprocessing**: Feature engineering with technical indicators and SMC concepts.
- **Split**: Trained on historical data; evaluated on a separate, unseen dataset.

### Quantitative Analyses
- **Robustness Scan**: Coarse grid sweep (slippage: 0-1 pips, commission: 0-0.0005, threshold: 0.5-0.6). Best scenarios: low friction, threshold ~0.5. Worst: high commission/threshold.
- **M2M Equity**: Per-bar mark-to-market equity calculation for accurate risk metrics.
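
To illustrate why per-bar mark-to-market equity matters for risk metrics, the sketch below revalues an open position at every close, so a drawdown that occurs inside a winning trade still shows up in the curve. The function name `mark_to_market_equity` and the convention that `positions[i]` is the number of units held from the close of bar `i` to the close of bar `i + 1` are assumptions for this example; slippage and commission are ignored, and the backtester's actual accounting may differ.

```python
def mark_to_market_equity(closes, positions, initial_capital=100.0):
    """Return per-bar equity, revaluing the open position at each close.

    closes[i]    : closing price of bar i
    positions[i] : units held from the close of bar i to the close of
                   bar i + 1 (assumed convention; positive = long)
    Costs (slippage, commission) are deliberately left out of this sketch.
    """
    equity = [initial_capital]
    for i in range(1, len(closes)):
        # Unrealised and realised PnL are treated identically: the position
        # held over the bar is marked to the new close.
        bar_pnl = positions[i - 1] * (closes[i] - closes[i - 1])
        equity.append(equity[-1] + bar_pnl)
    return equity


# One long unit held over two bars: the trade ends up +5, but the
# per-bar curve also records the interim dip to 95.
curve = mark_to_market_equity([100.0, 95.0, 105.0], [1, 1, 0])
print(curve)
```

A trade-level equity curve for the same example would only record the final gain, which understates the drawdown experienced while the position was open.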

### Ethical Considerations
- **Bias**: Model trained on historical data; may not account for future market changes or black swan events.
- **Risk**: Gold is a volatile market; significant losses are possible, as the backtest's -47.66% max drawdown shows.
- **Transparency**: Full disclosure of assumptions, limitations, and evaluation.

### Caveats and Recommendations
- **Limitations**: Position sizing is simplified; small-account behavior may differ under real margin rules. Historical backtest results are not indicative of future performance.
- **Recommendations**: Use with stop-loss, diversify, and consult financial advisors. Validate on your own data before use.

## Usage

### Loading the Model
```python
import joblib
artifact = joblib.load('trading_model_romeo_daily.pkl')
features = artifact['features']  # Canonical feature list
models = artifact['models']      # Dict of XGBoost/LightGBM models
weights = artifact['weights']    # Ensemble weights
```

### Making Predictions
```python
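import pandas as pd

# df: a pandas DataFrame you have prepared with the columns in artifact['features']
threshold = 0.5  # decision threshold; the robustness scan covered 0.5-0.6
X = df[features].fillna(0)  # fill missing feature values with 0
# Weighted average of each model's long-class probability.
# Assumes `models` and `weights` are dicts that list the models in the same order.
weighted = sum(w * m.predict_proba(X)[:, 1] for m, w in zip(models.values(), weights.values()))
probabilities = weighted / sum(weights.values())
signals = (probabilities > threshold).astype(int)  # 1 = long signal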
```

### Backtesting
Use `v5/backtest_v5.py` with `--data <path>` to run on custom data. It aligns features automatically.
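
The alignment logic inside `v5/backtest_v5.py` is not reproduced in this card. If you call the models directly instead, one possible way to align a custom DataFrame to the canonical feature list is sketched below. The helper name `align_features` and the example column names are hypothetical; filling gaps with 0 mirrors the `fillna(0)` used in the prediction snippet, which may or may not match what the backtester does.

```python
import pandas as pd


def align_features(df, features):
    """Return (aligned_frame, missing_names).

    The aligned frame has exactly the columns in `features`, in that order.
    Columns absent from `df` are created and filled with 0; extra columns
    are dropped. Hypothetical helper, not part of the project's code.
    """
    missing = [name for name in features if name not in df.columns]
    aligned = df.reindex(columns=features).fillna(0)
    return aligned, missing


# Example with made-up column names.
raw = pd.DataFrame({"rsi": [50.0, None], "extra": [1, 2]})
X, missing = align_features(raw, ["sma", "rsi"])
print(X.columns.tolist(), missing)
```

Returning the list of missing names makes it easy to warn when many canonical features are absent, since a model fed mostly zero-filled columns is unlikely to produce meaningful probabilities.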

### Requirements
- Python 3.8+
- scikit-learn, xgboost, lightgbm, joblib, pandas
- tensorflow (only needed for the optional Keras model)

## Files
- `trading_model_romeo_daily.pkl`: Main artifact.
- `romeo_keras_daily.keras`: Optional Keras model.
- `README.md`: This model card.
- `metadata.json`: Structured metadata.

## Contact
For issues or contributions: https://github.com/JonusNattapong/AITradings-samsam