---

language: en
license: mit
library_name: sklearn
tags:
  - trading
  - finance
  - gold
  - xauusd
  - forex
  - algorithmic-trading
  - smart-money-concepts
  - smc
  - xgboost
  - machine-learning
  - backtesting
  - technical-analysis
  - multi-timeframe
  - intraday-trading
  - high-frequency-trading
datasets:
  - yahoo-finance-gc-f
metrics:
  - accuracy
  - precision
  - recall
  - f1
model-index:
  - name: xauusd-trading-ai-smc-daily
    results:
      - task:
          type: binary-classification
          name: Daily Price Direction Prediction
        dataset:
          type: yahoo-finance-gc-f
          name: Gold Futures (GC=F)
        metrics:
          - type: accuracy
            value: 80.3
            name: Accuracy
          - type: precision
            value: 71
            name: Precision (Class 1)
          - type: recall
            value: 81
            name: Recall (Class 1)
          - type: f1
            value: 76
            name: F1-Score
  - name: xauusd-trading-ai-smc-15m
    results:
      - task:
          type: binary-classification
          name: 15-Minute Price Direction Prediction
        dataset:
          type: yahoo-finance-gc-f
          name: Gold Futures (GC=F)
        metrics:
          - type: accuracy
            value: 77.0
            name: Accuracy
          - type: precision
            value: 76
            name: Precision (Class 1)
          - type: recall
            value: 77
            name: Recall (Class 1)
          - type: f1
            value: 76
            name: F1-Score
---

---

# XAUUSD Multi-Timeframe Trading AI Model

## Files Included

### Core Models
- `trading_model.pkl` - Original daily timeframe XGBoost model (85.4% win rate)
- `trading_model_15m.pkl` - 15-minute intraday model (77% validation accuracy)
- `trading_model_1m.pkl` - 1-minute intraday model (partially trained)
- `trading_model_30m.pkl` - 30-minute intraday model (ready for training)

### Documentation
- `README.md` - This comprehensive model card
- `XAUUSD_Trading_AI_Paper.md` - **Research paper with academic structure, literature review, and methodology**
- `XAUUSD_Trading_AI_Paper.docx` - **Word document version (professional format)**
- `XAUUSD_Trading_AI_Paper.html` - **HTML web version (styled and readable)**
- `XAUUSD_Trading_AI_Paper.tex` - **LaTeX source (for academic publishing)**
- `XAUUSD_Trading_AI_Technical_Whitepaper.md` - **Technical whitepaper with mathematical formulations and implementation details**
- `XAUUSD_Trading_AI_Technical_Whitepaper.docx` - **Word document version (professional format)**
- `XAUUSD_Trading_AI_Technical_Whitepaper.html` - **HTML web version (styled and readable)**
- `XAUUSD_Trading_AI_Technical_Whitepaper.tex` - **LaTeX source (for academic publishing)**

### Performance & Analysis
- `backtest_report.csv` - Daily model yearly backtesting performance results
- `backtest_multi_timeframe_results.csv` - Intraday model backtesting results
- `feature_importance_15m.csv` - 15-minute model feature importance analysis

### Scripts & Tools
- `train_multi_timeframe.py` - Multi-timeframe model training script
- `backtest_multi_timeframe.py` - Intraday model backtesting framework
- `multi_timeframe_summary.py` - Comprehensive performance analysis tool
- `fetch_data.py` - Enhanced data acquisition for multiple timeframes

### Dataset Files
- **Daily Data**: `daily_data.csv`, `processed_daily_data.csv`, `smc_features_dataset.csv`, `X_features.csv`, `y_target.csv`
- **Intraday Data**: `1m_data.csv` (5,204 samples), `15m_data.csv` (3,814 samples), `30m_data.csv` (1,910 samples)

## Recent Enhancements (v2.0)

### Visual Documentation
- **Dataset Flow Diagram**: Complete data processing pipeline from raw Yahoo Finance data to model training
- **Model Architecture Diagram**: XGBoost ensemble structure with decision flow visualization
- **Buy/Sell Workflow Diagram**: End-to-end trading execution process with risk management

### Advanced Formulas & Techniques
- **Position Sizing Formula**: Risk-adjusted position calculation with Kelly Criterion adaptation
- **Risk Metrics**: Sharpe Ratio, Sortino Ratio, Calmar Ratio, and Maximum Drawdown calculations
- **SMC Techniques**: Advanced Order Block detection with volume profile analysis
- **Dynamic Thresholds**: Market volatility-based prediction threshold adjustment
- **Ensemble Signals**: Multi-source signal confirmation (ML + Technical + SMC)
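
The risk metrics named above follow standard definitions. A minimal sketch, assuming per-period simple returns, a zero risk-free rate, and 252 periods per year (none of these choices are stated in this card, and the function names are illustrative only):

```python
import math


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate (assumption).

    Needs at least two returns, because it uses the sample standard deviation.
    """
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    std = math.sqrt(var)
    return 0.0 if std == 0 else mean / std * math.sqrt(periods_per_year)


def sortino_ratio(returns, periods_per_year=252):
    """Like Sharpe, but only returns below zero count as risk."""
    n = len(returns)
    mean = sum(returns) / n
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / n)
    return 0.0 if downside == 0 else mean / downside * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def calmar_ratio(returns, periods_per_year=252):
    """Annualized compound return divided by the absolute max drawdown."""
    equity = 1.0
    for r in returns:
        equity *= 1.0 + r
    annual = equity ** (periods_per_year / len(returns)) - 1.0
    dd = abs(max_drawdown(returns))
    return 0.0 if dd == 0 else annual / dd
```

Each function takes a plain list of returns, so the same helpers can be applied to daily or intraday backtest output by changing `periods_per_year`.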

### Performance Analytics
- **Monthly Performance Heatmap**: Visual representation of returns across all test years
- **Risk-Return Scatter Plot**: Performance comparison across different risk levels
- **Market Regime Analysis**: Performance breakdown by trending vs sideways markets

### Documentation Updates
- **Enhanced Technical Whitepaper**: Added comprehensive visual diagrams and mathematical formulations
- **Enhanced Research Paper**: Added Mermaid diagrams, advanced algorithms, and detailed performance analysis
- **Professional Exports**: Both documents now available in HTML, Word, and LaTeX formats

## Multi-Timeframe Trading System (Latest Addition)

### Overview
The system has been extended to support intraday trading across multiple timeframes, enabling higher-frequency trading strategies while keeping the daily model's SMC + technical indicator approach.

### Supported Timeframes
- **1-minute (1m)**: Ultra-short-term scalping opportunities
- **15-minute (15m)**: Short-term swing trading
- **30-minute (30m)**: Medium-term position trading
- **Daily (1d)**: Original baseline model (85.4% win rate)

### Data Acquisition
- **Source**: Yahoo Finance API with enhanced intraday data fetching
- **Limitations**: Historical intraday data restricted (recent periods only)
- **Current Datasets**:
  - 1m: 5,204 samples (7 days of recent data)
  - 15m: 3,814 samples (60 days of recent data)
  - 30m: 1,910 samples (60 days of recent data)

### Model Architecture
- **Base Algorithm**: XGBoost Classifier (same as daily model)
- **Features**: 23 features (technical indicators + SMC elements)
- **Training**: Grid search hyperparameter optimization
- **Validation**: 80/20 train/test split with stratification
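
To illustrate what an 80/20 split with stratification does, here is a dependency-free sketch. In practice scikit-learn's `train_test_split(..., stratify=y)` is the usual tool; the function name and seed below are illustrative assumptions, not taken from `train_multi_timeframe.py`.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) keeping class proportions in both parts."""
    rng = random.Random(seed)
    # Group sample indices by class label
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    # Take the same fraction of each class for the test set
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)
```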

### Training Results
- **15m Model**: Successfully trained with 77% validation accuracy
- **Feature Importance**: Technical indicators dominant (SMA_50, EMA_12, BB_lower)
- **Training Status**: 1m model partially trained, 30m model interrupted (available for completion)

### Backtesting Performance

- **Framework**: Backtrader with realistic commission modeling
- **Risk Management**: Fixed stake sizing ($1,000 per trade)
- **15m Results**: -0.83% return with 1 trade (conservative strategy)
- **Analysis**: Models show conservative behavior to avoid overtrading
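
As a rough illustration of fixed-stake sizing with commission, the sketch below computes the net profit of a single round-trip trade. The 0.1% commission rate, the cost model, and the function name are assumptions for illustration; the settings actually used in `backtest_multi_timeframe.py` are not reproduced here.

```python
def fixed_stake_pnl(entry, exit_price, stake=1000.0, commission_rate=0.001, long=True):
    """Net profit in dollars for one round-trip trade with a fixed stake.

    commission_rate is charged on the notional at entry and again at exit
    (an assumed cost model, not this card's actual settings).
    """
    direction = 1.0 if long else -1.0
    # Gross profit: stake times the signed relative price change
    gross = stake * direction * (exit_price / entry - 1.0)
    # Commission on the entry notional plus the exit notional
    exit_notional = stake * exit_price / entry
    commission = commission_rate * (stake + exit_notional)
    return gross - commission
```

With a $1,000 stake, a 1% favorable move earns about $10 before costs, so even small commissions take a visible share of each trade.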



### Key Insights

- ✅ Successfully scaled daily model architecture to intraday timeframes
- ✅ Technical indicators remain most important across all timeframes
- ✅ Conservative prediction thresholds prevent excessive trading
- ⚠️ Limited historical data affects backtesting statistical significance
- ⚠️ Yahoo Finance API constraints limit comprehensive validation



### Files Added

- `train_multi_timeframe.py` - Multi-timeframe model training script
- `backtest_multi_timeframe.py` - Intraday model backtesting framework
- `multi_timeframe_summary.py` - Comprehensive performance analysis
- `trading_model_15m.pkl` - Trained 15-minute model
- `feature_importance_15m.csv` - Feature importance analysis
- `backtest_multi_timeframe_results.csv` - Backtesting performance data

### Next Steps
1. Complete 30m model training
2. Implement walk-forward optimization
3. Add extended historical data sources
4. Deploy best performing intraday model
5. Compare intraday vs daily performance

## Model Description

This is an AI-powered trading model for XAUUSD (Gold vs US Dollar), trained on gold futures data (GC=F) using Smart Money Concepts (SMC) strategy elements. The model uses machine learning to predict price direction 5 days ahead and generate trading signals; the daily model reports an 85.4% win rate in backtesting (2015-2020).

### Key Features
- **Asset**: XAUUSD (Gold Futures)
- **Strategy**: Smart Money Concepts (SMC) with technical indicators
- **Prediction Horizon**: 5-day ahead price direction
- **Model Type**: XGBoost Classifier

## Romeo (V5) — Ensemble model

Romeo (codename V5) is the latest ensemble model combining tree-based learners (XGBoost / LightGBM) and an optional Keras head. The artifacts live in `models_romeo/` and include a canonical feature list used by the backtester to align unseen data.

Artifacts
- `models_romeo/trading_model_romeo_daily.pkl` — ensemble artifact (joblib) with `models`, `weights`, and `features` keys.
- `models_romeo/romeo_keras_daily.keras` — optional Keras model file when included in training.
- `models_romeo/MODEL_CARD.md` — this model's card with evaluation and transparency notes.

Evaluation (selected run on unseen daily data)
- Initial capital: 100
- Final capital: 484.8199
- CAGR: 0.0444
- Annual volatility: 0.4118
- Sharpe: 0.3119
- Max Drawdown: -47.66%
- Total trades: 3610
- Win rate: 49.47%

Uploading to Hugging Face
-------------------------
There is a helper script to upload the model artifacts to Hugging Face Hub:

1. Install dependencies:
```bash
pip install huggingface_hub
```

2. Set your HF token in the environment (Windows cmd.exe):
```cmd
set HF_TOKEN=hf_YourTokenHere
```

3. Upload:
```cmd
python v5\upload_model_v5_to_hf.py --repo-name your-username/romeo-v5 --model-dir models_romeo
```

The script will create the repo (if it doesn't exist) and upload all files from `models_romeo/`.

Usage example
-------------
Load the artifact and run predictions:

```python
import joblib

artifact = joblib.load('models_romeo/trading_model_romeo_daily.pkl')
features = artifact['features']

# prepare X matching features
# model usage depends on artifact['models'] layout; check MODEL_CARD.md for details
```
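
One plausible way to combine the artifact's `models`, `weights`, and `features` keys is a weighted average of class-1 probabilities. The sketch below is hedged accordingly: it assumes `models` and `weights` are parallel lists and that every model exposes `predict_proba`, and the helper names `align_row` and `ensemble_proba` are hypothetical. Check `MODEL_CARD.md` for the real layout before relying on it.

```python
def align_row(row, features, fill=0.0):
    """Order one row (a dict) by the canonical feature list; missing features get `fill`."""
    return [float(row.get(name, fill)) for name in features]


def ensemble_proba(artifact, rows):
    """Weighted average of each model's class-1 probability (assumed layout)."""
    features = artifact["features"]
    # With real models, convert X to a NumPy array or DataFrame first
    X = [align_row(r, features) for r in rows]
    total = sum(artifact["weights"])
    out = [0.0] * len(X)
    for model, weight in zip(artifact["models"], artifact["weights"]):
        proba = model.predict_proba(X)
        for i, p in enumerate(proba):
            out[i] += weight * float(p[1]) / total
    return out
```

Extra columns in the input are ignored and missing ones are filled, which mirrors the stated purpose of the canonical feature list: aligning unseen data with what the models were trained on.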

Notes & Next Steps
------------------
- Position sizing is simplified in the backtester; consider implementing fixed-risk sizing before live use.
- Consider re-running the robustness scan using the M2M metric as primary evaluation (recommended).
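
Fixed-risk sizing, as suggested in the first note, sizes each position so that a stop-out loses a set fraction of equity. A minimal sketch, where the function name and parameters are illustrative and not part of the backtester:

```python
def fixed_risk_size(equity, risk_fraction, entry, stop):
    """Units to trade so that hitting the stop loses risk_fraction of equity."""
    risk_per_unit = abs(entry - stop)
    if risk_per_unit == 0:
        raise ValueError("stop must differ from entry")
    return equity * risk_fraction / risk_per_unit
```

For example, risking 1% of a $10,000 account with an entry at 2000 and a stop at 1990 gives a size of 10 units, so a stop-out costs $100.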

Headline figures for the original daily XGBoost model:
- **Accuracy**: 80.3% on test data
- **Win Rate**: 85.4% in backtesting

## Intended Use

This model is designed for:
- Educational purposes in algorithmic trading
- Research on SMC strategies
- Backtesting trading strategies
- Understanding ML applications in financial markets

**⚠️ Warning**: This is not financial advice. Trading involves risk of loss. Use at your own discretion.

## Training Data

- **Source**: Yahoo Finance (GC=F - Gold Futures)
- **Period**: 2000-2020 (excluding recent months for efficiency)
- **Features**: 23 features including:
  - Price data (Open, High, Low, Close, Volume)
  - Technical indicators (SMA, EMA, RSI, MACD, Bollinger Bands)
  - SMC features (Fair Value Gaps, Order Blocks, Recovery patterns)
  - Lag features (Close prices from previous days)
- **Target**: Binary classification (1 if price rises in 5 days, 0 otherwise)
- **Dataset Size**: 8,816 samples
- **Class Distribution**: 54% down, 46% up (balanced with scale_pos_weight)
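
The target definition and the class weighting can be sketched as follows. The helper names are illustrative, and the weighting uses the common negatives-over-positives convention for XGBoost's `scale_pos_weight`; the last lines reproduce the 54% / 46% split reported above.

```python
def make_target(closes, horizon=5):
    """1 if the close `horizon` bars ahead is higher than the current close, else 0.

    The last `horizon` bars have no future close and get no label.
    """
    return [1 if closes[i + horizon] > closes[i] else 0
            for i in range(len(closes) - horizon)]


def scale_pos_weight(labels):
    """Ratio of negative to positive samples."""
    pos = sum(labels)
    neg = len(labels) - pos
    return neg / pos


labels = [0] * 54 + [1] * 46               # the 54% / 46% split reported above
print(round(scale_pos_weight(labels), 2))  # 1.17
```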

## Performance Metrics

### Model Performance
- **Accuracy**: 80.3%
- **Precision (Class 1)**: 71%
- **Recall (Class 1)**: 81%
- **F1-Score**: 76%
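
The F1-score is the harmonic mean of precision and recall, so the reported figures can be cross-checked with a few lines (the function name is illustrative):

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


print(round(f1_from_precision_recall(0.71, 0.81), 2))  # 0.76
```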

### Backtesting Results (2015-2020)
- **Overall Win Rate**: 85.4%
- **Total Return**: 18.2%
- **Sharpe Ratio**: 1.41
- **Yearly Win Rates**:
  - 2015: 62.5%
  - 2016: 100.0%
  - 2017: 100.0%
  - 2018: 72.7%
  - 2019: 76.9%
  - 2020: 94.1%

## Limitations

- Trained on historical data only (2000-2020)
- May not perform well in unprecedented market conditions
- Requires proper risk management
- No consideration of transaction costs, slippage, or market impact
- Model predictions are probabilistic, not guaranteed

## Usage

### Prerequisites
```bash
pip install joblib scikit-learn pandas numpy xgboost
```

### Loading the Model
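```python
import joblib
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Load the trained daily model (unpickling an XGBoost model requires xgboost)
model = joblib.load('trading_model.pkl')

# Load scalers (you need to recreate or save them)
# ... preprocessing code ...

# Prepare features: prepare_features is a placeholder for your own pipeline
# and must return the 23 features listed below, in the same order
features = prepare_features(your_data)

prediction = model.predict(features)         # predicted class labels (0 or 1)
probability = model.predict_proba(features)  # per-class probabilities
```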

### Features Required
The model expects 23 features in this order:
1. Close
2. High
3. Low
4. Open
5. Volume
6. SMA_20
7. SMA_50
8. EMA_12
9. EMA_26
10. RSI
11. MACD
12. MACD_signal
13. MACD_hist
14. BB_upper
15. BB_middle
16. BB_lower
17. FVG_Size
18. FVG_Type_Encoded
19. OB_Type_Encoded
20. Recovery_Type_Encoded
21. Close_lag1
22. Close_lag2
23. Close_lag3
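
Several of these columns are standard indicators. The dependency-free sketch below shows how the moving-average, MACD, and lag features could be derived from a list of closes. The MACD signal span of 9 and the EMA seeding (first value, matching pandas `ewm(adjust=False)`) are conventional assumptions; the exact settings used to build the training data are not stated in this card.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [float(values[0])]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (MACD, MACD_signal, MACD_hist) as three lists."""
    line = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    sig = ema(line, signal)
    hist = [m - s for m, s in zip(line, sig)]
    return line, sig, hist


def lag(values, k):
    """Values shifted k bars into the past (Close_lag1 is lag(closes, 1))."""
    return [None] * k + list(values[:len(values) - k])
```

Under these assumptions, `sma(closes, 20)` and `ema(closes, 12)` correspond to `SMA_20` and `EMA_12`, and `macd(closes)` yields the three MACD columns.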



## Training Details



- **Algorithm**: XGBoost Classifier
- **Hyperparameters**:
  - n_estimators: 200
  - max_depth: 7
  - learning_rate: 0.2
  - scale_pos_weight: 1.17 (for class balancing)
- **Cross-validation**: 3-fold
- **Optimization**: Grid search on hyperparameters

## SMC Strategy Elements

The model incorporates Smart Money Concepts:
- **Fair Value Gaps (FVG)**: Price imbalances between candles
- **Order Blocks (OB)**: Areas of significant buying/selling
- **Recovery Patterns**: Pullbacks in trending markets
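
As an illustration of the first element, a Fair Value Gap is commonly defined on three consecutive candles: the third candle's low sits above the first candle's high (bullish), or its high sits below the first candle's low (bearish). The sketch below uses that common definition; the rule actually used to build `FVG_Size` and `FVG_Type_Encoded` is not given in this card, and the +1 / -1 encoding is an assumption.

```python
def find_fvgs(highs, lows):
    """Scan three-candle windows for fair value gaps.

    Returns (index, type, size) tuples, where index is the third candle,
    type is +1 (bullish) or -1 (bearish), and size is the gap in price units.
    """
    gaps = []
    for i in range(2, len(highs)):
        if lows[i] > highs[i - 2]:      # gap up: candle 3 low above candle 1 high
            gaps.append((i, 1, lows[i] - highs[i - 2]))
        elif highs[i] < lows[i - 2]:    # gap down: candle 3 high below candle 1 low
            gaps.append((i, -1, lows[i - 2] - highs[i]))
    return gaps
```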

## Upload to Hugging Face

To share this model on Hugging Face:

1. Create a Hugging Face account at https://huggingface.co/join
2. Generate an access token at https://huggingface.co/settings/tokens with "Write" permissions
3. Test your token: `python test_token.py YOUR_TOKEN`
4. Upload: `python upload_to_hf.py YOUR_TOKEN`

The script will upload:
- `trading_model.pkl` - The trained XGBoost model
- `README.md` - This model card with metadata
- All dataset files (CSV format)

## Citation

If you use this model in your research, please cite:

```
@misc{xauusd-trading-ai,
  title={XAUUSD Trading AI Model with SMC Strategy},
  author={AI Trading System},
  year={2025},
  url={https://huggingface.co/JonusNattapong/xauusd-trading-ai-smc}
}
```

### Academic Paper
For the complete academic research paper with methodology, results, and analysis:

**arXiv Paper**: [XAUUSD Trading AI: A Machine Learning Approach Using Smart Money Concepts](https://arxiv.org/abs/XXXX.XXXXX)

## License

This model is released under the MIT License. See LICENSE file for details.

## Contact

For questions or issues, please open an issue on the Hugging Face repository.