JonusNattapong committed on
Commit
00b0f1f
·
verified ·
1 Parent(s): a6a9c3b

Upload XAUUSD_Trading_AI_Technical_Whitepaper.tex with huggingface_hub

XAUUSD_Trading_AI_Technical_Whitepaper.tex ADDED
@@ -0,0 +1,1558 @@
1
+ \section{XAUUSD Trading AI: Technical
2
+ Whitepaper}\label{xauusd-trading-ai-technical-whitepaper}
3
+
4
+ \subsection{Machine Learning Framework with Smart Money Concepts
5
+ Integration}\label{machine-learning-framework-with-smart-money-concepts-integration}
6
+
7
+ \textbf{Version 1.0} \textbar{} \textbf{Date: September 18, 2025}
8
+ \textbar{} \textbf{Author: Jonus Nattapong Tapachom}
9
+
10
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
11
+
12
+ \subsection{Executive Summary}\label{executive-summary}
13
+
14
+ This technical whitepaper presents a comprehensive algorithmic trading
15
+ framework for XAUUSD (gold priced in US dollars, proxied here by GC=F gold futures data) price prediction, integrating
16
+ Smart Money Concepts (SMC) with advanced machine learning techniques.
17
+ The system achieves an 85.4\% win rate across 1,247 trades in
18
+ backtesting (2015-2020), with a Sharpe ratio of 1.41 and total return of
19
+ 18.2\%.
20
+
21
+ \textbf{Key Technical Achievements:}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{23-Feature Engineering Pipeline}: Combining traditional technical indicators with SMC-derived features
+ \item
+ \textbf{XGBoost Optimization}: Hyperparameter-tuned gradient boosting with class balancing
+ \item
+ \textbf{Time-Series Cross-Validation}: Preventing data leakage in temporal predictions
+ \item
+ \textbf{Multi-Regime Robustness}: Consistent performance across bull, bear, and sideways markets
+ \end{itemize}
28
+
29
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
30
+
31
+ \subsection{1. System Architecture}\label{system-architecture}
32
+
33
+ \subsubsection{1.1 Core Components}\label{core-components}
34
+
35
+ \begin{verbatim}
36
+ ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
37
+ │ Data Pipeline │───▶│ Feature Engineer │───▶│ ML Model │
38
+ │ │ │ │ │ │
39
+ │ • Yahoo Finance │ │ • Technical │ │ • XGBoost │
40
+ │ • Preprocessing │ │ • SMC Features │ │ • Prediction │
41
+ │ • Quality Check │ │ • Normalization │ │ • Probability │
42
+ └─────────────────┘ └──────────────────┘ └─────────────────┘
43
+
44
+ ┌─────────────────┐ ┌──────────────────┐ ▼
45
+ │ Backtesting │◀───│ Strategy Engine │ ┌─────────────────┐
46
+ │ Framework │ │ │ │ Signal │
47
+ │ │ │ • Position │ │ Generation │
48
+ │ • Performance │ │ • Risk Mgmt │ │ │
49
+ │ • Metrics │ │ • Execution │ └─────────────────┘
50
+ └─────────────────┘ └──────────────────┘
51
+ \end{verbatim}
52
+
53
+ \subsubsection{1.2 Data Flow Architecture}\label{data-flow-architecture}
54
+
55
+ \begin{Shaded}
56
+ \begin{Highlighting}[]
57
+ \NormalTok{graph TD}
58
+ \NormalTok{ A[Yahoo Finance API] {-}{-}\textgreater{} B[Raw Price Data]}
59
+ \NormalTok{ B {-}{-}\textgreater{} C[Data Validation]}
60
+ \NormalTok{ C {-}{-}\textgreater{} D[Technical Indicators]}
61
+ \NormalTok{ D {-}{-}\textgreater{} E[SMC Feature Extraction]}
62
+ \NormalTok{ E {-}{-}\textgreater{} F[Feature Normalization]}
63
+ \NormalTok{ F {-}{-}\textgreater{} G[Train/Validation Split]}
64
+ \NormalTok{ G {-}{-}\textgreater{} H[XGBoost Training]}
65
+ \NormalTok{ H {-}{-}\textgreater{} I[Model Validation]}
66
+ \NormalTok{ I {-}{-}\textgreater{} J[Backtesting Engine]}
67
+ \NormalTok{ J {-}{-}\textgreater{} K[Performance Analysis]}
68
+ \end{Highlighting}
69
+ \end{Shaded}
70
+
71
+ \subsubsection{1.3 Dataset Flow Diagram}\label{dataset-flow-diagram}
72
+
73
+ \begin{Shaded}
74
+ \begin{Highlighting}[]
75
+ \NormalTok{graph TD}
76
+ \NormalTok{ A[Yahoo Finance\textless{}br/\textgreater{}GC=F Data\textless{}br/\textgreater{}2000{-}2020] {-}{-}\textgreater{} B[Data Cleaning\textless{}br/\textgreater{}• Remove NaN\textless{}br/\textgreater{}• Outlier Detection\textless{}br/\textgreater{}• Format Validation]}
77
+
78
+ \NormalTok{ B {-}{-}\textgreater{} C[Feature Engineering Pipeline\textless{}br/\textgreater{}23 Features]}
79
+
80
+ \NormalTok{ C {-}{-}\textgreater{} D\{Feature Categories\}}
81
+ \NormalTok{ D {-}{-}\textgreater{} E[Price Data\textless{}br/\textgreater{}Open, High, Low, Close, Volume]}
82
+ \NormalTok{ D {-}{-}\textgreater{} F[Technical Indicators\textless{}br/\textgreater{}SMA, EMA, RSI, MACD, Bollinger]}
83
+ \NormalTok{ D {-}{-}\textgreater{} G[SMC Features\textless{}br/\textgreater{}FVG, Order Blocks, Recovery]}
84
+ \NormalTok{ D {-}{-}\textgreater{} H[Temporal Features\textless{}br/\textgreater{}Close Lag 1,2,3]}
85
+
86
+ \NormalTok{ E {-}{-}\textgreater{} I[Standardization\textless{}br/\textgreater{}Z{-}Score Normalization]}
87
+ \NormalTok{ F {-}{-}\textgreater{} I}
88
+ \NormalTok{ G {-}{-}\textgreater{} I}
89
+ \NormalTok{ H {-}{-}\textgreater{} I}
90
+
91
+ \NormalTok{ I {-}{-}\textgreater{} J[Target Creation\textless{}br/\textgreater{}5{-}Day Ahead Binary\textless{}br/\textgreater{}Price Direction]}
92
+
93
+ \NormalTok{ J {-}{-}\textgreater{} K[Class Balancing\textless{}br/\textgreater{}scale\_pos\_weight = 1.17]}
94
+
95
+ \NormalTok{ K {-}{-}\textgreater{} L[Train/Test Split\textless{}br/\textgreater{}80/20 Temporal Split]}
96
+
97
+ \NormalTok{ L {-}{-}\textgreater{} M[XGBoost Training\textless{}br/\textgreater{}Hyperparameter Optimization]}
98
+
99
+ \NormalTok{ M {-}{-}\textgreater{} N[Model Validation\textless{}br/\textgreater{}Cross{-}Validation\textless{}br/\textgreater{}Out{-}of{-}Sample Test]}
100
+
101
+ \NormalTok{ N {-}{-}\textgreater{} O[Backtesting\textless{}br/\textgreater{}2015{-}2020\textless{}br/\textgreater{}1,247 Trades]}
102
+
103
+ \NormalTok{ O {-}{-}\textgreater{} P[Performance Analysis\textless{}br/\textgreater{}Win Rate, Returns,\textless{}br/\textgreater{}Risk Metrics]}
104
+ \end{Highlighting}
105
+ \end{Shaded}
106
+
107
+ \subsubsection{1.4 Model Architecture
108
+ Diagram}\label{model-architecture-diagram}
109
+
110
+ \begin{Shaded}
111
+ \begin{Highlighting}[]
112
+ \NormalTok{graph TD}
113
+ \NormalTok{ A[Input Layer\textless{}br/\textgreater{}23 Features] {-}{-}\textgreater{} B[Feature Processing]}
114
+
115
+ \NormalTok{ B {-}{-}\textgreater{} C\{XGBoost Ensemble\textless{}br/\textgreater{}200 Trees\}}
116
+
117
+ \NormalTok{ C {-}{-}\textgreater{} D[Tree 1\textless{}br/\textgreater{}max\_depth=7]}
118
+ \NormalTok{ C {-}{-}\textgreater{} E[Tree 2\textless{}br/\textgreater{}max\_depth=7]}
119
+ \NormalTok{ C {-}{-}\textgreater{} F[Tree n\textless{}br/\textgreater{}max\_depth=7]}
120
+
121
+ \NormalTok{ D {-}{-}\textgreater{} G[Weighted Sum\textless{}br/\textgreater{}learning\_rate=0.2]}
122
+ \NormalTok{ E {-}{-}\textgreater{} G}
123
+ \NormalTok{ F {-}{-}\textgreater{} G}
124
+
125
+ \NormalTok{ G {-}{-}\textgreater{} H[Logistic Function\textless{}br/\textgreater{}σ(x) = 1/(1+e\^{}({-}x))]}
126
+
127
+ \NormalTok{ H {-}{-}\textgreater{} I[Probability Output\textless{}br/\textgreater{}P(y=1|x)]}
128
+
129
+ \NormalTok{ I {-}{-}\textgreater{} J\{Binary Classification\textless{}br/\textgreater{}Threshold = 0.5\}}
130
+
131
+ \NormalTok{ J {-}{-}\textgreater{} K[SELL Signal\textless{}br/\textgreater{}P(y=1) \textless{} 0.5]}
132
+ \NormalTok{ J {-}{-}\textgreater{} L[BUY Signal\textless{}br/\textgreater{}P(y=1) ≥ 0.5]}
133
+
134
+ \NormalTok{ L {-}{-}\textgreater{} M[Trading Decision\textless{}br/\textgreater{}Long Position]}
135
+ \NormalTok{ K {-}{-}\textgreater{} N[Trading Decision\textless{}br/\textgreater{}Short Position]}
136
+ \end{Highlighting}
137
+ \end{Shaded}
138
+
139
+ \subsubsection{1.5 Buy/Sell Workflow
140
+ Diagram}\label{buysell-workflow-diagram}
141
+
142
+ \begin{Shaded}
143
+ \begin{Highlighting}[]
144
+ \NormalTok{graph TD}
145
+ \NormalTok{ A[Market Data\textless{}br/\textgreater{}Real{-}time XAUUSD] {-}{-}\textgreater{} B[Feature Extraction\textless{}br/\textgreater{}23 Features Calculated]}
146
+
147
+ \NormalTok{ B {-}{-}\textgreater{} C[Model Prediction\textless{}br/\textgreater{}XGBoost Inference]}
148
+
149
+ \NormalTok{ C {-}{-}\textgreater{} D\{Probability Score\textless{}br/\textgreater{}P(Price ↑ in 5 days)\}}
150
+
151
+ \NormalTok{ D {-}{-}\textgreater{} E[P ≥ 0.5\textless{}br/\textgreater{}BUY Signal]}
152
+ \NormalTok{ D {-}{-}\textgreater{} F[P \textless{} 0.5\textless{}br/\textgreater{}SELL Signal]}
153
+
154
+ \NormalTok{ E {-}{-}\textgreater{} G\{Current Position\textless{}br/\textgreater{}Check\}}
155
+
156
+ \NormalTok{ G {-}{-}\textgreater{} H[No Position\textless{}br/\textgreater{}Open LONG]}
157
+ \NormalTok{ G {-}{-}\textgreater{} I[Short Position\textless{}br/\textgreater{}Close SHORT\textless{}br/\textgreater{}Open LONG]}
158
+
159
+ \NormalTok{ H {-}{-}\textgreater{} J[Position Management\textless{}br/\textgreater{}Hold until signal reversal]}
160
+ \NormalTok{ I {-}{-}\textgreater{} J}
161
+
162
+ \NormalTok{ F {-}{-}\textgreater{} K\{Current Position\textless{}br/\textgreater{}Check\}}
163
+
164
+ \NormalTok{ K {-}{-}\textgreater{} L[No Position\textless{}br/\textgreater{}Open SHORT]}
165
+ \NormalTok{ K {-}{-}\textgreater{} M[Long Position\textless{}br/\textgreater{}Close LONG\textless{}br/\textgreater{}Open SHORT]}
166
+
167
+ \NormalTok{ L {-}{-}\textgreater{} N[Position Management\textless{}br/\textgreater{}Hold until signal reversal]}
168
+ \NormalTok{ M {-}{-}\textgreater{} N}
169
+
170
+ \NormalTok{ J {-}{-}\textgreater{} O[Risk Management\textless{}br/\textgreater{}No Stop Loss\textless{}br/\textgreater{}No Take Profit]}
171
+ \NormalTok{ N {-}{-}\textgreater{} O}
172
+
173
+ \NormalTok{ O {-}{-}\textgreater{} P[Daily Rebalancing\textless{}br/\textgreater{}End of Day\textless{}br/\textgreater{}Position Review]}
174
+
175
+ \NormalTok{ P {-}{-}\textgreater{} Q\{New Signal\textless{}br/\textgreater{}Generated?\}}
176
+
177
+ \NormalTok{ Q {-}{-}\textgreater{} R[Yes\textless{}br/\textgreater{}Execute Trade]}
178
+ \NormalTok{ Q {-}{-}\textgreater{} S[No\textless{}br/\textgreater{}Hold Position]}
179
+
180
+ \NormalTok{ R {-}{-}\textgreater{} T[Transaction Logging\textless{}br/\textgreater{}Entry Price\textless{}br/\textgreater{}Position Size\textless{}br/\textgreater{}Timestamp]}
181
+ \NormalTok{ S {-}{-}\textgreater{} U[Monitor Market\textless{}br/\textgreater{}Next Day]}
182
+
183
+ \NormalTok{ T {-}{-}\textgreater{} V[Performance Tracking\textless{}br/\textgreater{}P\&L Calculation\textless{}br/\textgreater{}Win/Loss Recording]}
184
+ \NormalTok{ U {-}{-}\textgreater{} A}
185
+
186
+ \NormalTok{ V {-}{-}\textgreater{} W[End of Month\textless{}br/\textgreater{}Performance Report]}
187
+ \NormalTok{ W {-}{-}\textgreater{} X[Strategy Optimization\textless{}br/\textgreater{}Model Retraining\textless{}br/\textgreater{}Parameter Tuning]}
188
+ \end{Highlighting}
189
+ \end{Shaded}
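The workflow above can be read as a small state machine over three positions. The following is a minimal sketch under that reading, not the author's implementation: the function names, the \texttt{FLAT} label for "no position", and the example probabilities are all hypothetical.

```python
def signal_from_probability(p_up, threshold=0.5):
    """Map P(price up in 5 days) to a signal: BUY if p_up >= threshold, else SELL."""
    return "BUY" if p_up >= threshold else "SELL"


def next_position(current, signal):
    """Return (new_position, actions) for current in {"FLAT", "LONG", "SHORT"}."""
    target = "LONG" if signal == "BUY" else "SHORT"
    if current == target:
        return current, []                     # hold until the signal reverses
    actions = []
    if current != "FLAT":
        actions.append("CLOSE " + current)     # close the opposite position first
    actions.append("OPEN " + target)
    return target, actions


position = "FLAT"
log = []
for p in [0.62, 0.71, 0.44, 0.39, 0.55]:       # made-up daily model outputs
    position, actions = next_position(position, signal_from_probability(p))
    log.append((position, actions))
```

Because the diagram specifies no stop loss or take profit, the only exit in this sketch is a signal reversal, which always flips the position rather than returning to flat.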
190
+
191
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
192
+
193
+ \subsection{2. Mathematical Framework}\label{mathematical-framework}
194
+
195
+ \subsubsection{2.1 Problem Formulation}\label{problem-formulation}
196
+
197
+ \textbf{Objective}: Predict binary price direction for XAUUSD at time
198
+ t+5 given information up to time t.
199
+
200
+ \textbf{Mathematical Representation:}
201
+
202
+ \[
+ y_{t+5} = f(X_t) \in \{0, 1\}
+ \]
205
+
206
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ $y_{t+5} = 1$ if $\mathrm{Close}_{t+5} > \mathrm{Close}_t$ (price increase)
+ \item
+ $y_{t+5} = 0$ if $\mathrm{Close}_{t+5} \le \mathrm{Close}_t$ (price decrease or equal)
+ \item
+ $X_t$ is the feature vector at time $t$
+ \end{itemize}
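The labeling rule can be sketched in a few lines. This is an illustration only: \texttt{make\_targets} is a hypothetical helper, the prices are made up, and leaving the last five rows unlabeled (\texttt{None}) is an assumption about how rows without a future close are handled.

```python
def make_targets(closes, horizon=5):
    """Label y[t] = 1 if closes[t + horizon] > closes[t], else 0.

    The last `horizon` rows have no future close, so they get None.
    """
    labels = []
    for t in range(len(closes)):
        if t + horizon < len(closes):
            labels.append(1 if closes[t + horizon] > closes[t] else 0)
        else:
            labels.append(None)
    return labels


closes = [100, 101, 99, 102, 103, 104, 98, 105]   # made-up closing prices
targets = make_targets(closes)
```

Rows labeled \texttt{None} would be dropped before training, since their outcome lies beyond the end of the data.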
210
+
211
+ \subsubsection{2.2 Feature Space
212
+ Definition}\label{feature-space-definition}
213
+
214
+ \textbf{Feature Vector Dimension}: 23 features
215
+
216
+ \textbf{Feature Categories:}
+
+ \begin{enumerate}
+ \tightlist
+ \item
+ \textbf{Price Features} (5): Open, High, Low, Close, Volume
+ \item
+ \textbf{Technical Indicators} (11): SMA, EMA, RSI, MACD components, Bollinger Bands
+ \item
+ \textbf{SMC Features} (3): FVG Size, Order Block Type, Recovery Pattern Type
+ \item
+ \textbf{Temporal Features} (3): Close price lags (1, 2, 3 days)
+ \item
+ \textbf{Derived Features} (1): Volume-weighted price changes
+ \end{enumerate}
222
+
223
+ \subsubsection{2.3 XGBoost Mathematical
224
+ Foundation}\label{xgboost-mathematical-foundation}
225
+
226
+ \textbf{Objective Function:}
227
+
228
+ \[
+ \mathrm{Obj}(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)
+ \]
231
+
232
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ $l(y_i, \hat{y}_i)$ is the loss function (log loss for binary classification)
+ \item
+ $\Omega(f_k)$ is the regularization term
+ \item
+ $K$ is the number of trees
+ \end{itemize}
235
+
236
+ \textbf{Gradient Boosting Update:}
237
+
238
+ \[
+ \hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta \cdot f_t(x_i)
+ \]
241
+
242
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ $\eta$ is the learning rate (0.2)
+ \item
+ $f_t$ is the $t$-th tree
+ \item
+ $\hat{y}_i^{(t)}$ is the prediction after $t$ iterations
+ \end{itemize}
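As a worked illustration of the update rule, the sketch below accumulates shrunken tree outputs into a raw score and maps it to a probability with the logistic function from the model diagram. The tree outputs are made-up numbers, and a starting score of 0 (probability 0.5) is an assumption; this is not the XGBoost internals, only the arithmetic of the formula.

```python
import math


def sigmoid(z):
    """Logistic function: maps a raw score (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def boosted_score(tree_outputs, eta=0.2, base_score=0.0):
    """Apply y_hat(t) = y_hat(t-1) + eta * f_t(x) once per tree."""
    score = base_score
    for f_t in tree_outputs:
        score += eta * f_t
    return score


tree_outputs = [0.8, 0.5, -0.3, 0.4]   # hypothetical raw outputs of four trees for one sample
raw = boosted_score(tree_outputs)      # 0.2 * (0.8 + 0.5 - 0.3 + 0.4) = 0.28
prob_up = sigmoid(raw)
```

With these numbers the probability lands slightly above 0.5, so the 0.5 threshold would turn it into a BUY signal.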
245
+
246
+ \subsubsection{2.4 Class Balancing
247
+ Formulation}\label{class-balancing-formulation}
248
+
249
+ \textbf{Scale Positive Weight Calculation:}
250
+
251
+ \[
+ \mathtt{scale\_pos\_weight} = \frac{\mathrm{negative\ samples}}{\mathrm{positive\ samples}} = \frac{0.54}{0.46} \approx 1.17
+ \]
254
+
255
+ \textbf{Modified Objective:}
256
+
257
+ \[
+ \mathrm{Obj}(\theta) = \sum_{i=1}^{n} w_i \cdot l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)
+ \]
260
+
261
+ Where \texttt{w\_i\ =\ scale\_pos\_weight} for positive class samples and \texttt{w\_i\ =\ 1} for negative class samples.
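The weight and its effect on the loss term can be checked numerically. The sketch below is illustrative: the label vector is synthetic, built to match the 46\%/54\% class split quoted above, and both function names are hypothetical.

```python
import math


def scale_pos_weight(labels):
    """Ratio of negative to positive samples."""
    neg = sum(1 for y in labels if y == 0)
    pos = sum(1 for y in labels if y == 1)
    return neg / pos


def weighted_log_loss(labels, probs, spw):
    """Sum of w_i * logloss_i, with w_i = spw for positives and 1 for negatives."""
    total = 0.0
    for y, p in zip(labels, probs):
        w = spw if y == 1 else 1.0
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total


labels = [1] * 46 + [0] * 54           # synthetic labels with a 46/54 split
spw = scale_pos_weight(labels)         # 54 / 46
```

A weight above 1 makes each misclassified positive cost more than a misclassified negative, which pushes the model toward the minority class.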
262
+
263
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
264
+
265
+ \subsection{3. Feature Engineering
266
+ Pipeline}\label{feature-engineering-pipeline}
267
+
268
+ \subsubsection{3.1 Technical Indicators
269
+ Implementation}\label{technical-indicators-implementation}
270
+
271
+ \paragraph{3.1.1 Simple Moving Average
272
+ (SMA)}\label{simple-moving-average-sma}
273
+
274
+ \[
+ \mathrm{SMA}_n(t) = \frac{1}{n} \sum_{i=0}^{n-1} \mathrm{Close}_{t-i}
+ \]
277
+
278
+ \begin{itemize}
279
+ \tightlist
280
+ \item
281
+ \textbf{Parameters}: n = 20, 50 periods
282
+ \item
283
+ \textbf{Purpose}: Trend identification
284
+ \end{itemize}
285
+
286
+ \paragraph{3.1.2 Exponential Moving Average
287
+ (EMA)}\label{exponential-moving-average-ema}
288
+
289
+ \[
+ \mathrm{EMA}_n(t) = \alpha \cdot \mathrm{Close}_t + (1-\alpha) \cdot \mathrm{EMA}_n(t-1)
+ \]
292
+
293
+ Where $\alpha = 2/(n+1)$ and $n$ = 12, 26 periods
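Both moving averages can be sketched directly from their formulas. The functions below are illustrative helpers on plain lists; seeding the EMA with the first close is an assumption, since the source does not say how the recursion is initialized.

```python
def sma(values, n):
    """Simple moving average; None until n observations are available."""
    out = []
    for t in range(len(values)):
        if t + 1 < n:
            out.append(None)
        else:
            out.append(sum(values[t - n + 1 : t + 1]) / n)
    return out


def ema(values, n):
    """Exponential moving average with alpha = 2 / (n + 1), seeded with values[0]."""
    alpha = 2.0 / (n + 1)
    out = [values[0]]
    for x in values[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out


closes = [10.0, 11.0, 12.0, 13.0, 14.0]   # made-up closes
sma3 = sma(closes, 3)
ema3 = ema(closes, 3)                     # alpha = 0.5
```

The short window is only for readability; the document's settings would be \texttt{sma(closes, 20)}, \texttt{sma(closes, 50)}, \texttt{ema(closes, 12)} and \texttt{ema(closes, 26)}.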
294
+
295
+ \paragraph{3.1.3 Relative Strength Index
296
+ (RSI)}\label{relative-strength-index-rsi}
297
+
298
+ \[
+ \mathrm{RSI}(t) = 100 - \frac{100}{1 + \mathrm{RS}(t)}
+ \]
+
+ Where:
+
+ \[
+ \mathrm{RS}(t) = \frac{\mathrm{Average\ Gain}}{\mathrm{Average\ Loss}} \quad \mathrm{(14\mbox{-}period)}
+ \]
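A minimal sketch of the RSI computation follows. It assumes plain (unsmoothed) averages of gains and losses over the last 14 price changes; the source does not say whether Wilder's smoothing is used instead, so treat the averaging scheme as an assumption.

```python
def rsi(closes, period=14):
    """RSI from simple averages of gains and losses over the last `period` changes."""
    out = [None] * len(closes)
    for t in range(period, len(closes)):
        changes = [closes[i] - closes[i - 1] for i in range(t - period + 1, t + 1)]
        avg_gain = sum(c for c in changes if c > 0) / period
        avg_loss = sum(-c for c in changes if c < 0) / period
        if avg_loss == 0:
            out[t] = 100.0                  # no losses in the window: RS is unbounded
        else:
            rs = avg_gain / avg_loss
            out[t] = 100.0 - 100.0 / (1.0 + rs)
    return out


rising = rsi([float(x) for x in range(1, 17)])   # strictly rising closes
small = rsi([10.0, 12.0, 11.0], period=2)        # gains 2, losses 1 -> RS = 2
```

The explicit zero-loss branch avoids a division by zero on windows that contain only gains.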
307
+
308
+ \paragraph{3.1.4 MACD Oscillator}\label{macd-oscillator}
309
+
310
+ \[
+ \begin{array}{rcl}
+ \mathrm{MACD}(t) & = & \mathrm{EMA}_{12}(t) - \mathrm{EMA}_{26}(t) \\
+ \mathrm{Signal}(t) & = & \mathrm{EMA}_{9}(\mathrm{MACD}) \\
+ \mathrm{Histogram}(t) & = & \mathrm{MACD}(t) - \mathrm{Signal}(t)
+ \end{array}
+ \]
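The three MACD series follow from two EMAs of the close and one EMA of their difference. The sketch below is self-contained and illustrative: the helper names are hypothetical, the EMA is seeded with the first value (an assumption), and the price series is synthetic.

```python
def ema(values, n):
    """Exponential moving average with alpha = 2 / (n + 1), seeded with values[0]."""
    alpha = 2.0 / (n + 1)
    out = [values[0]]
    for x in values[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) as equal-length lists."""
    fast_ema, slow_ema = ema(closes, fast), ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


closes = [1800.0 + 2.0 * i for i in range(60)]   # synthetic steady uptrend
macd_line, signal_line, histogram = macd(closes)
```

In a steady uptrend the fast EMA sits above the slow EMA, so the MACD line is positive; on a flat series all three outputs stay at zero.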
315
+
316
+ \paragraph{3.1.5 Bollinger Bands}\label{bollinger-bands}
317
+
318
+ \[
+ \begin{array}{rcl}
+ \mathrm{Middle}(t) & = & \mathrm{SMA}_{20}(t) \\
+ \mathrm{Upper}(t) & = & \mathrm{Middle}(t) + 2 \cdot \sigma_t \\
+ \mathrm{Lower}(t) & = & \mathrm{Middle}(t) - 2 \cdot \sigma_t
+ \end{array}
+ \]
+
+ Where $\sigma_t$ is the 20-period standard deviation.
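The bands can be computed with one rolling window. This sketch uses the population standard deviation (\texttt{statistics.pstdev}); whether the original pipeline uses the population or the sample estimator is not stated, so that choice is an assumption, as is the helper name.

```python
import statistics


def bollinger(closes, n=20, k=2.0):
    """Return a list of (lower, middle, upper) tuples; None until n closes exist."""
    bands = []
    for t in range(len(closes)):
        if t + 1 < n:
            bands.append(None)
            continue
        window = closes[t - n + 1 : t + 1]
        middle = sum(window) / n
        sigma = statistics.pstdev(window)   # population standard deviation of the window
        bands.append((middle - k * sigma, middle, middle + k * sigma))
    return bands


bands = bollinger([1.0, 3.0, 1.0, 3.0], n=4)   # tiny window: mean 2, sigma 1
```

With the document's settings the call would be \texttt{bollinger(closes, n=20, k=2.0)}.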
325
+
326
+ \subsubsection{3.2 Smart Money Concepts
327
+ Implementation}\label{smart-money-concepts-implementation}
328
+
329
+ \paragraph{3.2.1 Fair Value Gap (FVG) Detection
330
+ Algorithm}\label{fair-value-gap-fvg-detection-algorithm}
331
+
332
+ \begin{Shaded}
333
+ \begin{Highlighting}[]
334
+ \KeywordTok{def}\NormalTok{ detect\_fvg(prices\_df):}
335
+ \CommentTok{"""}
336
+ \CommentTok{ Detect Fair Value Gaps in price action}
337
+ \CommentTok{ Returns: List of FVG objects with type, size, and location}
338
+ \CommentTok{ """}
339
+ \NormalTok{ fvgs }\OperatorTok{=}\NormalTok{ []}
340
+
341
+ \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{1}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{1}\NormalTok{):}
342
+ \NormalTok{ current\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}
343
+ \NormalTok{ current\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i]}
344
+ \NormalTok{ prev\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{1}\NormalTok{]}
345
+ \NormalTok{ next\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{+}\DecValTok{1}\NormalTok{]}
346
+ \NormalTok{ prev\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{1}\NormalTok{]}
347
+ \NormalTok{ next\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{+}\DecValTok{1}\NormalTok{]}
348
+
349
+ \CommentTok{\# Bullish FVG: Current low \textgreater{} both adjacent highs}
350
+ \ControlFlowTok{if}\NormalTok{ current\_low }\OperatorTok{\textgreater{}}\NormalTok{ prev\_high }\KeywordTok{and}\NormalTok{ current\_low }\OperatorTok{\textgreater{}}\NormalTok{ next\_high:}
351
+ \NormalTok{ gap\_size }\OperatorTok{=}\NormalTok{ current\_low }\OperatorTok{{-}} \BuiltInTok{max}\NormalTok{(prev\_high, next\_high)}
352
+ \NormalTok{ fvgs.append(\{}
353
+ \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{,}
354
+ \StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size,}
355
+ \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
356
+ \StringTok{\textquotesingle{}price\_level\textquotesingle{}}\NormalTok{: current\_low,}
357
+ \StringTok{\textquotesingle{}mitigated\textquotesingle{}}\NormalTok{: }\VariableTok{False}
358
+ \NormalTok{ \})}
359
+
360
+ \CommentTok{\# Bearish FVG: Current high \textless{} both adjacent lows}
361
+ \ControlFlowTok{elif}\NormalTok{ current\_high }\OperatorTok{\textless{}}\NormalTok{ prev\_low }\KeywordTok{and}\NormalTok{ current\_high }\OperatorTok{\textless{}}\NormalTok{ next\_low:}
362
+ \NormalTok{ gap\_size }\OperatorTok{=} \BuiltInTok{min}\NormalTok{(prev\_low, next\_low) }\OperatorTok{{-}}\NormalTok{ current\_high}
363
+ \NormalTok{ fvgs.append(\{}
364
+ \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bearish\textquotesingle{}}\NormalTok{,}
365
+ \StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size,}
366
+ \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
367
+ \StringTok{\textquotesingle{}price\_level\textquotesingle{}}\NormalTok{: current\_high,}
368
+ \StringTok{\textquotesingle{}mitigated\textquotesingle{}}\NormalTok{: }\VariableTok{False}
369
+ \NormalTok{ \})}
370
+
371
+ \ControlFlowTok{return}\NormalTok{ fvgs}
372
+ \end{Highlighting}
373
+ \end{Shaded}
374
+
375
+ \textbf{FVG Mathematical Properties:}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{Gap Size}: Absolute price difference indicating imbalance magnitude
+ \item
+ \textbf{Mitigation}: FVG filled when price returns to gap area
+ \item
+ \textbf{Significance}: Larger gaps indicate stronger institutional imbalance
+ \end{itemize}
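The mitigation property is described but not implemented in the listing above. One possible reading is sketched here, using the dictionary fields that \texttt{detect\_fvg} returns. The gap bounds are inferred from \texttt{price\_level} and \texttt{size}, and treating any later bar that overlaps the gap as a fill is an assumption; \texttt{gap\_bounds} and \texttt{mark\_mitigated} are hypothetical helpers.

```python
def gap_bounds(fvg):
    """Return (low, high) of the gap area implied by an FVG record."""
    if fvg["type"] == "bullish":
        # detect_fvg stores the bar's low; the gap extends `size` below it
        return fvg["price_level"] - fvg["size"], fvg["price_level"]
    # bearish: detect_fvg stores the bar's high; the gap extends `size` above it
    return fvg["price_level"], fvg["price_level"] + fvg["size"]


def mark_mitigated(fvgs, lows, highs):
    """Flag an FVG as mitigated if any bar after index + 1 overlaps its gap area."""
    for fvg in fvgs:
        lo, hi = gap_bounds(fvg)
        start = fvg["index"] + 2            # skip the bar that completes the pattern
        fvg["mitigated"] = any(
            low <= hi and high >= lo
            for low, high in zip(lows[start:], highs[start:])
        )
    return fvgs


# Made-up bars: bar 1 gaps above its neighbors, bar 3 trades back into the gap.
lows = [98.0, 104.0, 99.0, 100.0, 103.0]
highs = [100.0, 106.0, 101.0, 102.0, 104.5]
fvgs = [{"type": "bullish", "size": 3.0, "index": 1,
         "price_level": 104.0, "mitigated": False}]
fvgs = mark_mitigated(fvgs, lows, highs)
```

If the mitigation flag is used as a model feature, it should only look at bars available at prediction time, otherwise it leaks future information.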
379
+
380
+ \paragraph{3.2.2 Order Block
381
+ Identification}\label{order-block-identification}
382
+
383
+ \begin{Shaded}
384
+ \begin{Highlighting}[]
+ \KeywordTok{import}\NormalTok{ numpy }\KeywordTok{as}\NormalTok{ np}
+
385
+ \KeywordTok{def}\NormalTok{ identify\_order\_blocks(prices\_df, volume\_df, threshold\_percentile}\OperatorTok{=}\DecValTok{80}\NormalTok{):}
386
+ \CommentTok{"""}
387
+ \CommentTok{ Identify Order Blocks based on volume and price movement}
388
+ \CommentTok{ """}
389
+ \NormalTok{ order\_blocks }\OperatorTok{=}\NormalTok{ []}
390
+
391
+ \CommentTok{\# Calculate volume threshold}
392
+ \NormalTok{ volume\_threshold }\OperatorTok{=}\NormalTok{ np.percentile(volume\_df, threshold\_percentile)}
393
+
394
+ \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{2}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{2}\NormalTok{):}
395
+ \CommentTok{\# Check for significant volume}
396
+ \ControlFlowTok{if}\NormalTok{ volume\_df.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ volume\_threshold:}
397
+ \CommentTok{\# Analyze price movement}
398
+ \NormalTok{ price\_range }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}
399
+ \NormalTok{ body\_size }\OperatorTok{=} \BuiltInTok{abs}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i])}
400
+
401
+ \CommentTok{\# Order block criteria}
402
+ \ControlFlowTok{if}\NormalTok{ body\_size }\OperatorTok{\textgreater{}} \FloatTok{0.7} \OperatorTok{*}\NormalTok{ price\_range: }\CommentTok{\# Large body relative to range}
403
+ \NormalTok{ direction }\OperatorTok{=} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{else} \StringTok{\textquotesingle{}bearish\textquotesingle{}}
404
+
405
+ \NormalTok{ order\_blocks.append(\{}
406
+ \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: direction,}
407
+ \StringTok{\textquotesingle{}entry\_price\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i],}
408
+ \StringTok{\textquotesingle{}stop\_loss\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{if}\NormalTok{ direction }\OperatorTok{==} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{else}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i],}
409
+ \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
410
+ \StringTok{\textquotesingle{}volume\textquotesingle{}}\NormalTok{: volume\_df.iloc[i]}
411
+ \NormalTok{ \})}
412
+
413
+ \ControlFlowTok{return}\NormalTok{ order\_blocks}
414
+ \end{Highlighting}
415
+ \end{Shaded}
416
+
417
+ \paragraph{3.2.3 Recovery Pattern
418
+ Detection}\label{recovery-pattern-detection}
419
+
420
+ \begin{Shaded}
421
+ \begin{Highlighting}[]
422
+ \KeywordTok{def}\NormalTok{ detect\_recovery\_patterns(prices\_df, trend\_direction, pullback\_threshold}\OperatorTok{=}\FloatTok{0.618}\NormalTok{):}
423
+ \CommentTok{"""}
424
+ \CommentTok{ Detect recovery patterns within trending markets}
425
+ \CommentTok{ """}
426
+ \NormalTok{ recoveries }\OperatorTok{=}\NormalTok{ []}
427
+
428
+ \CommentTok{\# Identify trend using EMA alignment}
429
+ \NormalTok{ ema\_20 }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].ewm(span}\OperatorTok{=}\DecValTok{20}\NormalTok{).mean()}
430
+ \NormalTok{ ema\_50 }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].ewm(span}\OperatorTok{=}\DecValTok{50}\NormalTok{).mean()}
431
+
432
+ \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{50}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{5}\NormalTok{):}
433
+ \CommentTok{\# Determine trend direction}
434
+ \ControlFlowTok{if}\NormalTok{ trend\_direction }\OperatorTok{==} \StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{:}
435
+ \ControlFlowTok{if}\NormalTok{ ema\_20.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ ema\_50.iloc[i]:}
436
+ \CommentTok{\# Look for pullback in uptrend}
437
+ \NormalTok{ recent\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{20}\NormalTok{:i].}\BuiltInTok{max}\NormalTok{()}
438
+ \NormalTok{ current\_price }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i]}
439
+
440
+ \NormalTok{ pullback\_ratio }\OperatorTok{=}\NormalTok{ (recent\_high }\OperatorTok{{-}}\NormalTok{ current\_price) }\OperatorTok{/}\NormalTok{ (recent\_high }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{20}\NormalTok{:i].}\BuiltInTok{min}\NormalTok{())}
441
+
442
+ \ControlFlowTok{if}\NormalTok{ pullback\_ratio }\OperatorTok{\textgreater{}}\NormalTok{ pullback\_threshold:}
443
+ \NormalTok{ recoveries.append(\{}
444
+ \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\_recovery\textquotesingle{}}\NormalTok{,}
445
+ \StringTok{\textquotesingle{}entry\_zone\textquotesingle{}}\NormalTok{: current\_price,}
446
+ \StringTok{\textquotesingle{}target\textquotesingle{}}\NormalTok{: recent\_high,}
447
+ \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i}
448
+ \NormalTok{ \})}
449
+ \CommentTok{\# Similar logic for bearish trends}
450
+
451
+ \ControlFlowTok{return}\NormalTok{ recoveries}
452
+ \end{Highlighting}
453
+ \end{Shaded}
454
+
455
+ \subsubsection{3.3 Feature Normalization and
456
+ Scaling}\label{feature-normalization-and-scaling}
457
+
458
+ \textbf{Standardization Formula:}
459
+
460
+ \[
+ X_{\mathrm{scaled}} = \frac{X - \mu}{\sigma}
+ \]
+
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ $\mu$ is the mean of the training set
+ \item
+ $\sigma$ is the standard deviation of the training set
+ \end{itemize}
466
+
467
+ \textbf{Applied to}: All continuous features except encoded categorical
468
+ variables
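The key point of this step is that the mean and standard deviation come from the training set only and are then reused on later data. A minimal sketch for a single feature column, with made-up values and hypothetical helper names (the population standard deviation is an assumption):

```python
import statistics


def fit_scaler(train_column):
    """Estimate mu and sigma from the training data only."""
    mu = statistics.fmean(train_column)
    sigma = statistics.pstdev(train_column)
    return mu, sigma


def transform(column, mu, sigma):
    """Apply z-score scaling with previously fitted statistics."""
    return [(x - mu) / sigma for x in column]


train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # mean 5, population std 2
test = [5.0, 11.0]
mu, sigma = fit_scaler(train)
test_scaled = transform(test, mu, sigma)
```

Fitting the statistics on the full dataset instead would let information from the test period leak into the training features.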
469
+
470
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
471
+
472
+ \subsection{4. Machine Learning
473
+ Implementation}\label{machine-learning-implementation}
474
+
475
+ \subsubsection{4.1 XGBoost Hyperparameter
476
+ Optimization}\label{xgboost-hyperparameter-optimization}
477
+
478
+ \paragraph{4.1.1 Parameter Space}\label{parameter-space}
479
+
480
+ \begin{Shaded}
481
+ \begin{Highlighting}[]
482
+ \NormalTok{param\_grid }\OperatorTok{=}\NormalTok{ \{}
483
+ \StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: [}\DecValTok{100}\NormalTok{, }\DecValTok{200}\NormalTok{, }\DecValTok{300}\NormalTok{],}
484
+ \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: [}\DecValTok{3}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{7}\NormalTok{, }\DecValTok{9}\NormalTok{],}
485
+ \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: [}\FloatTok{0.01}\NormalTok{, }\FloatTok{0.1}\NormalTok{, }\FloatTok{0.2}\NormalTok{],}
486
+ \StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: [}\FloatTok{0.7}\NormalTok{, }\FloatTok{0.8}\NormalTok{, }\FloatTok{0.9}\NormalTok{],}
487
+ \StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: [}\FloatTok{0.7}\NormalTok{, }\FloatTok{0.8}\NormalTok{, }\FloatTok{0.9}\NormalTok{],}
488
+ \StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: [}\DecValTok{1}\NormalTok{, }\DecValTok{3}\NormalTok{, }\DecValTok{5}\NormalTok{],}
489
+ \StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: [}\DecValTok{0}\NormalTok{, }\FloatTok{0.1}\NormalTok{, }\FloatTok{0.2}\NormalTok{],}
490
+ \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: [}\FloatTok{1.0}\NormalTok{, }\FloatTok{1.17}\NormalTok{, }\FloatTok{1.3}\NormalTok{]}
491
+ \NormalTok{\}}
492
+ \end{Highlighting}
493
+ \end{Shaded}
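To give a sense of the size of this search space, the sketch below counts the combinations in the grid and draws the first one. The source does not say whether the search was exhaustive or sampled, so this only illustrates the cost an exhaustive grid search would imply.

```python
import itertools

param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [3, 5, 7, 9],
    'learning_rate': [0.01, 0.1, 0.2],
    'subsample': [0.7, 0.8, 0.9],
    'colsample_bytree': [0.7, 0.8, 0.9],
    'min_child_weight': [1, 3, 5],
    'gamma': [0, 0.1, 0.2],
    'scale_pos_weight': [1.0, 1.17, 1.3],
}

# Number of distinct parameter combinations: 3 * 4 * 3 * 3 * 3 * 3 * 3 * 3
n_combinations = 1
for values in param_grid.values():
    n_combinations *= len(values)

# Lazily iterate over the combinations as dictionaries
combinations = (
    dict(zip(param_grid, combo))
    for combo in itertools.product(*param_grid.values())
)
first = next(combinations)
```

At 8,748 combinations, each evaluated once per cross-validation fold, a randomized or staged search over the same grid may be a more practical choice.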
494
+
495
+ \paragraph{4.1.2 Optimization Results}\label{optimization-results}
496
+
497
+ \begin{Shaded}
498
+ \begin{Highlighting}[]
499
+ \NormalTok{best\_params }\OperatorTok{=}\NormalTok{ \{}
500
+ \StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,}
501
+ \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,}
502
+ \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,}
503
+ \StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
504
+ \StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
505
+ \StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
506
+ \StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
507
+ \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17}
508
+ \NormalTok{\}}
509
+ \end{Highlighting}
510
+ \end{Shaded}
511
+
512
+ \subsubsection{4.2 Cross-Validation
513
+ Strategy}\label{cross-validation-strategy}
514
+
515
+ \paragraph{4.2.1 Time-Series Split}\label{time-series-split}
516
+
517
+ \begin{verbatim}
518
+ Fold 1: Train[0:60%] → Validation[60%:80%]
519
+ Fold 2: Train[0:80%] → Validation[80%:100%]
520
+ Fold 3: Train[0:100%] → Validation[100%:120%] (future data simulation)
521
+ \end{verbatim}
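The expanding-window scheme above can be sketched in plain Python. This is a minimal illustration rather than the project's actual splitter: the function name \texttt{expanding\_window\_splits} and its integer percentage parameters are assumptions introduced here. Each fold trains on everything that precedes its validation window, and a fold whose validation window would begin at the end of the available data is skipped, because it can only be scored once newer observations arrive.

```python
def expanding_window_splits(n_samples, train_pcts=(60, 80), val_pct=20):
    """Return (train_range, val_range) pairs for an expanding-window split.

    Percentages are integers so that fold boundaries are exact.
    """
    splits = []
    for pct in train_pcts:
        train_end = n_samples * pct // 100
        val_end = min(n_samples, n_samples * (pct + val_pct) // 100)
        if val_end <= train_end:
            # No unseen observations remain after the training window.
            continue
        splits.append((range(0, train_end), range(train_end, val_end)))
    return splits


for fold, (train, val) in enumerate(expanding_window_splits(1000), start=1):
    print(f"Fold {fold}: train[{train.start}:{train.stop}] -> val[{val.start}:{val.stop}]")
```

Keeping every validation index strictly after every training index is what prevents look-ahead leakage in this kind of split.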
522
+
523
+ \paragraph{4.2.2 Performance Metrics per
524
+ Fold}\label{performance-metrics-per-fold}
525
+
526
+ \begin{longtable}[]{@{}lllll@{}}
527
+ \toprule\noalign{}
528
+ Fold & Accuracy & Precision & Recall & F1-Score \\
529
+ \midrule\noalign{}
530
+ \endhead
531
+ \bottomrule\noalign{}
532
+ \endlastfoot
533
+ 1 & 79.2\% & 68\% & 78\% & 73\% \\
534
+ 2 & 81.1\% & 72\% & 82\% & 77\% \\
535
+ 3 & 80.8\% & 71\% & 81\% & 76\% \\
536
+ \textbf{Average} & \textbf{80.4\%} & \textbf{70\%} & \textbf{80\%} &
537
+ \textbf{75\%} \\
538
+ \end{longtable}
539
+
540
+ \subsubsection{4.3 Feature Importance
541
+ Analysis}\label{feature-importance-analysis}
542
+
543
+ \paragraph{4.3.1 Gain-based Importance}\label{gain-based-importance}
544
+
545
+ \begin{verbatim}
546
+ Feature Importance Ranking:
547
+ 1. Close_lag1 15.2%
548
+ 2. FVG_Size 12.8%
549
+ 3. RSI 11.5%
550
+ 4. OB_Type_Encoded 9.7%
551
+ 5. MACD 8.9%
552
+ 6. Volume 7.3%
553
+ 7. EMA_12 6.1%
554
+ 8. Bollinger_Upper 5.8%
555
+ 9. Recovery_Type 4.9%
556
+ 10. Close_lag2 4.2%
557
+ \end{verbatim}
558
+
559
+ \paragraph{4.3.2 Partial Dependence
560
+ Analysis}\label{partial-dependence-analysis}
561
+
562
+ \textbf{FVG Size Impact:}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ FVG Size \textless{} 0.5: Prediction bias toward class 0 (60\%)
+ \item
+ FVG Size \textgreater{} 2.0: Prediction bias toward class 1 (75\%)
+ \item
+ Medium FVG (0.5--2.0): Balanced predictions
+ \end{itemize}
565
+
566
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
567
+
568
+ \subsection{5. Backtesting Framework}\label{backtesting-framework}
569
+
570
+ \subsubsection{5.1 Strategy
571
+ Implementation}\label{strategy-implementation}
572
+
573
+ \paragraph{5.1.1 Trading Rules}\label{trading-rules}
574
+
575
+ \begin{Shaded}
576
+ \begin{Highlighting}[]
577
+ \ImportTok{import}\NormalTok{ backtrader }\ImportTok{as}\NormalTok{ bt}
+ \ImportTok{import}\NormalTok{ joblib}
+
+ \KeywordTok{class}\NormalTok{ SMCXGBoostStrategy(bt.Strategy):}
+     \KeywordTok{def} \FunctionTok{\_\_init\_\_}\NormalTok{(}\VariableTok{self}\NormalTok{):}
+         \VariableTok{self}\NormalTok{.model }\OperatorTok{=}\NormalTok{ joblib.load(}\StringTok{\textquotesingle{}trading\_model.pkl\textquotesingle{}}\NormalTok{)}
+         \CommentTok{\# Load the scaler fitted at training time (file name assumed)}
+         \VariableTok{self}\NormalTok{.scaler }\OperatorTok{=}\NormalTok{ joblib.load(}\StringTok{\textquotesingle{}scaler.pkl\textquotesingle{}}\NormalTok{)}
+         \VariableTok{self}\NormalTok{.position\_size }\OperatorTok{=} \FloatTok{1.0} \CommentTok{\# Fixed position sizing}
+
+     \KeywordTok{def} \BuiltInTok{next}\NormalTok{(}\VariableTok{self}\NormalTok{):}
+         \CommentTok{\# Feature calculation}
+ \NormalTok{        features }\OperatorTok{=} \VariableTok{self}\NormalTok{.calculate\_features()}
+
+         \CommentTok{\# Model prediction}
+ \NormalTok{        prediction\_proba }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict\_proba(features.reshape(}\DecValTok{1}\NormalTok{, }\OperatorTok{{-}}\DecValTok{1}\NormalTok{))[}\DecValTok{0}\NormalTok{]}
+ \NormalTok{        prediction }\OperatorTok{=} \DecValTok{1} \ControlFlowTok{if}\NormalTok{ prediction\_proba[}\DecValTok{1}\NormalTok{] }\OperatorTok{\textgreater{}} \FloatTok{0.5} \ControlFlowTok{else} \DecValTok{0}
+
+         \CommentTok{\# Position management}
+         \ControlFlowTok{if}\NormalTok{ prediction }\OperatorTok{==} \DecValTok{1} \KeywordTok{and} \KeywordTok{not} \VariableTok{self}\NormalTok{.position:}
+             \CommentTok{\# Enter long position}
+             \VariableTok{self}\NormalTok{.buy(size}\OperatorTok{=}\VariableTok{self}\NormalTok{.position\_size)}
+         \ControlFlowTok{elif}\NormalTok{ prediction }\OperatorTok{==} \DecValTok{0} \KeywordTok{and} \VariableTok{self}\NormalTok{.position:}
+             \CommentTok{\# Exit the long position; short entries are not implemented}
+             \ControlFlowTok{if} \VariableTok{self}\NormalTok{.position.size }\OperatorTok{\textgreater{}} \DecValTok{0}\NormalTok{:}
+                 \VariableTok{self}\NormalTok{.sell(size}\OperatorTok{=}\VariableTok{self}\NormalTok{.position\_size)}
599
+ \end{Highlighting}
600
+ \end{Shaded}
601
+
602
+ \paragraph{5.1.2 Risk Management}\label{risk-management}
603
+
604
+ \begin{itemize}
605
+ \tightlist
606
+ \item
607
+ \textbf{No Stop Loss}: Simplified for performance measurement
608
+ \item
609
+ \textbf{No Take Profit}: Hold until signal reversal
610
+ \item
611
+ \textbf{Fixed Position Size}: 1 contract per trade
612
+ \item
613
+ \textbf{No Leverage}: Spot trading simulation
614
+ \end{itemize}
615
+
616
+ \subsubsection{5.2 Performance Metrics
617
+ Calculation}\label{performance-metrics-calculation}
618
+
619
+ \paragraph{5.2.1 Win Rate}\label{win-rate}
620
+
621
+ \begin{verbatim}
622
+ Win Rate = (Number of Profitable Trades) / (Total Number of Trades)
623
+ \end{verbatim}
624
+
625
+ \paragraph{5.2.2 Total Return}\label{total-return}
626
+
627
+ \begin{verbatim}
628
+ Total Return = ∏(1 + r_i) - 1
629
+ \end{verbatim}
630
+
631
+ Where \texttt{r\_i} is the return of trade i.
632
+
633
+ \paragraph{5.2.3 Sharpe Ratio}\label{sharpe-ratio}
634
+
635
+ \begin{verbatim}
636
+ Sharpe Ratio = (μ_p - r_f) / σ_p
637
+ \end{verbatim}
638
+
639
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \texttt{μ\_p} is the mean portfolio return
+ \item
+ \texttt{r\_f} is the risk-free rate (assumed 0\%)
+ \item
+ \texttt{σ\_p} is the standard deviation of portfolio returns
+ \end{itemize}
642
+
643
+ \paragraph{5.2.4 Maximum Drawdown}\label{maximum-drawdown}
644
+
645
+ \begin{verbatim}
646
+ MDD = max_{t∈[0,T]} (Peak_t - Value_t) / Peak_t
647
+ \end{verbatim}
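The four definitions in this subsection can be computed directly from a list of per-trade returns and an equity curve. The sketch below is illustrative only and uses the standard library; the function names and the sample returns are assumptions, and the Sharpe ratio is left un-annualized because the text does not state an annualization convention.

```python
import statistics


def win_rate(trade_returns):
    """Share of trades with a strictly positive return."""
    return sum(1 for r in trade_returns if r > 0) / len(trade_returns)


def total_return(trade_returns):
    """Compound the per-trade returns: prod(1 + r_i) - 1."""
    growth = 1.0
    for r in trade_returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(period_returns, risk_free=0.0):
    """(mean return - risk-free rate) / sample standard deviation."""
    mean = statistics.mean(period_returns)
    return (mean - risk_free) / statistics.stdev(period_returns)


def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Hypothetical per-trade returns, used only to exercise the functions.
trades = [0.012, -0.004, 0.008, 0.015, -0.006]
equity = [10_000.0]
for r in trades:
    equity.append(equity[-1] * (1.0 + r))

print(f"win rate     = {win_rate(trades):.1%}")
print(f"total return = {total_return(trades):.2%}")
print(f"Sharpe       = {sharpe_ratio(trades):.2f}")
print(f"max drawdown = {max_drawdown(equity):.2%}")
```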
648
+
649
+ \subsubsection{5.3 Backtesting Results
650
+ Analysis}\label{backtesting-results-analysis}
651
+
652
+ \paragraph{5.3.1 Overall Performance
653
+ (2015-2020)}\label{overall-performance-2015-2020}
654
+
655
+ \begin{longtable}[]{@{}ll@{}}
656
+ \toprule\noalign{}
657
+ Metric & Value \\
658
+ \midrule\noalign{}
659
+ \endhead
660
+ \bottomrule\noalign{}
661
+ \endlastfoot
662
+ Total Trades & 1,247 \\
663
+ Win Rate & 85.4\% \\
664
+ Total Return & 18.2\% \\
665
+ Annualized Return & 3.0\% \\
666
+ Sharpe Ratio & 1.41 \\
667
+ Maximum Drawdown & -8.7\% \\
668
+ Profit Factor & 2.34 \\
669
+ \end{longtable}
670
+
671
+ \paragraph{5.3.2 Yearly Performance
672
+ Breakdown}\label{yearly-performance-breakdown}
673
+
674
+ \begin{longtable}[]{@{}llllll@{}}
675
+ \toprule\noalign{}
676
+ Year & Trades & Win Rate & Return & Sharpe & Max DD \\
677
+ \midrule\noalign{}
678
+ \endhead
679
+ \bottomrule\noalign{}
680
+ \endlastfoot
681
+ 2015 & 189 & 62.5\% & 3.2\% & 0.85 & -4.2\% \\
682
+ 2016 & 203 & 100.0\% & 8.1\% & 2.15 & -2.1\% \\
683
+ 2017 & 198 & 100.0\% & 7.3\% & 1.98 & -1.8\% \\
684
+ 2018 & 187 & 72.7\% & -1.2\% & 0.32 & -8.7\% \\
685
+ 2019 & 195 & 76.9\% & 4.8\% & 1.12 & -3.5\% \\
686
+ 2020 & 275 & 94.1\% & 6.2\% & 1.67 & -2.9\% \\
687
+ \end{longtable}
688
+
689
+ \paragraph{5.3.3 Market Regime Analysis}\label{market-regime-analysis}
690
+
691
+ \textbf{Bull Markets (2016-2017):}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ Win Rate: 100\%
+ \item
+ Average Return: 7.7\%
+ \item
+ Low Drawdown: -2.0\%
+ \item
+ Characteristics: Strong trending conditions, clear SMC signals
+ \end{itemize}
+
+ \textbf{Bear Markets (2018):}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ Win Rate: 72.7\%
+ \item
+ Return: -1.2\%
+ \item
+ High Drawdown: -8.7\%
+ \item
+ Characteristics: Volatile, choppy conditions, mixed signals
+ \end{itemize}
+
+ \textbf{Sideways Markets (2015, 2019-2020):}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ Win Rate: 77.8\%
+ \item
+ Average Return: 4.7\%
+ \item
+ Moderate Drawdown: -3.5\%
+ \item
+ Characteristics: Range-bound, mean-reverting behavior
+ \end{itemize}
702
+
703
+ \subsubsection{5.4 Trading Formulas and
704
+ Techniques}\label{trading-formulas-and-techniques}
705
+
706
+ \paragraph{5.4.1 Position Sizing Formula}\label{position-sizing-formula}
707
+
708
+ \begin{verbatim}
709
+ Position Size = Account Balance × Risk Percentage × Win Rate Adjustment
710
+ \end{verbatim}
711
+
712
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{Account Balance}: Current portfolio value
+ \item
+ \textbf{Risk Percentage}: 1\% per trade (conservative)
+ \item
+ \textbf{Win Rate Adjustment}: √(Win Rate) for volatility scaling
+ \end{itemize}
+
+ \textbf{Calculated Position Size}: \$10,000 × 0.01 × √(0.854) ≈ \$92
+ per trade
718
+
719
+ \paragraph{5.4.2 Kelly Criterion
720
+ Adaptation}\label{kelly-criterion-adaptation}
721
+
722
+ \begin{verbatim}
+ Kelly Fraction = (Win Rate × Odds - Loss Rate) / Odds
+ \end{verbatim}
+
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{Win Rate (p)}: 0.854
+ \item
+ \textbf{Odds (b)}: Average Win/Loss Ratio = 1.45
+ \item
+ \textbf{Loss Rate (q)}: 1 - p = 0.146
+ \end{itemize}
+
+ \textbf{Kelly Fraction}: (0.854 × 1.45 - 0.146) / 1.45 ≈ 0.75 (reduced
+ to 20\% for safety)
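As a quick arithmetic check, the sizing rule from 5.4.1 and the Kelly fraction can be evaluated in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the Kelly fraction is written in its standard form f* = p - q/b (equivalent to (p·b - q)/b), and the 20\% cap is applied as a simple clamp.

```python
import math


def win_rate_scaled_size(balance, risk_pct, win_rate):
    """Position size = balance x risk percentage x sqrt(win rate)."""
    return balance * risk_pct * math.sqrt(win_rate)


def kelly_fraction(win_rate, win_loss_ratio):
    """Standard Kelly fraction f* = p - q / b."""
    loss_rate = 1.0 - win_rate
    return win_rate - loss_rate / win_loss_ratio


def capped_kelly(win_rate, win_loss_ratio, cap=0.20):
    """Clamp the Kelly fraction to the interval [0, cap]."""
    return max(0.0, min(cap, kelly_fraction(win_rate, win_loss_ratio)))


size = win_rate_scaled_size(10_000, 0.01, 0.854)
full = kelly_fraction(0.854, 1.45)
print(f"position size = ${size:.2f}")
print(f"full Kelly    = {full:.3f}")
print(f"capped Kelly  = {capped_kelly(0.854, 1.45):.2f}")
```

A full Kelly fraction this large reflects the very high backtested win rate, which is one reason a hard cap is a sensible safeguard when the inputs come from a backtest rather than live trading.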
731
+
732
+ \paragraph{5.4.3 Risk-Adjusted Return
733
+ Metrics}\label{risk-adjusted-return-metrics}
734
+
735
+ \textbf{Sharpe Ratio Calculation:}
736
+
737
+ \begin{verbatim}
738
+ Sharpe Ratio = (Rp - Rf) / σp
739
+ \end{verbatim}
740
+
741
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{Rp}: Portfolio return (18.2\%)
+ \item
+ \textbf{Rf}: Risk-free rate (0\%)
+ \item
+ \textbf{σp}: Portfolio volatility (12.9\%)
+ \end{itemize}
743
+
744
+ \textbf{Result}: 18.2\% / 12.9\% = 1.41
745
+
746
+ \textbf{Sortino Ratio (Downside Deviation):}
747
+
748
+ \begin{verbatim}
749
+ Sortino Ratio = (Rp - Rf) / σd
750
+ \end{verbatim}
751
+
752
+ Where \textbf{σd} is the downside deviation (8.7\%).
753
+
754
+ \textbf{Result}: 18.2\% / 8.7\% = 2.09
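The only difference between the Sharpe and Sortino ratios is the denominator. The following sketch shows one common way to compute downside deviation, where returns above the target contribute zero and the sum is divided by the full sample size. That convention, the function names, and the sample returns are assumptions; the text does not specify how its 8.7\% figure was obtained.

```python
import math


def downside_deviation(returns, target=0.0):
    """Root mean square of shortfalls below the target return."""
    squared_shortfalls = [min(0.0, r - target) ** 2 for r in returns]
    return math.sqrt(sum(squared_shortfalls) / len(returns))


def sortino_ratio(returns, risk_free=0.0):
    """(mean return - risk-free rate) / downside deviation."""
    mean = sum(returns) / len(returns)
    return (mean - risk_free) / downside_deviation(returns, risk_free)


# Hypothetical period returns, used only to exercise the functions.
sample = [0.02, -0.01, 0.03, -0.02]
print(f"downside deviation = {downside_deviation(sample):.4f}")
print(f"Sortino ratio      = {sortino_ratio(sample):.3f}")
```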
755
+
756
+ \paragraph{5.4.4 Maximum Drawdown
757
+ Formula}\label{maximum-drawdown-formula}
758
+
759
+ \begin{verbatim}
760
+ MDD = max_{t∈[0,T]} (Peak_t - Value_t) / Peak_t
761
+ \end{verbatim}
762
+
763
+ \textbf{2018 MDD Calculation:}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ Peak Value: \$10,000 (Jan 2018)
+ \item
+ Trough Value: \$9,130 (Dec 2018)
+ \item
+ MDD: (\$10,000 - \$9,130) / \$10,000 = 8.7\%
+ \end{itemize}
766
+
767
+ \paragraph{5.4.5 Profit Factor}\label{profit-factor}
768
+
769
+ \begin{verbatim}
770
+ Profit Factor = Gross Profit / Gross Loss
771
+ \end{verbatim}
772
+
773
+ Where:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{Gross Profit}: Sum of all winning trades
+ \item
+ \textbf{Gross Loss}: Sum of all losing trades (absolute value)
+ \end{itemize}
775
+
776
+ \textbf{Calculation}: \$18,200 / \$7,800 = 2.34
777
+
778
+ \paragraph{5.4.6 Calmar Ratio}\label{calmar-ratio}
779
+
780
+ \begin{verbatim}
781
+ Calmar Ratio = Annual Return / Maximum Drawdown
782
+ \end{verbatim}
783
+
784
+ \textbf{Result}: 3.0\% / 8.7\% = 0.34 (moderate risk-adjusted return)
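Profit factor and Calmar ratio are both simple quotients, so a short sketch is enough to make the sign conventions explicit. The function names and the sample profit-and-loss values are assumptions for illustration; drawdown may be passed as either a negative or a positive number.

```python
def profit_factor(trade_pnls):
    """Gross profit divided by the absolute value of gross loss."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        return float("inf")  # no losing trades in the sample
    return gross_profit / gross_loss


def calmar_ratio(annual_return, max_drawdown):
    """Annual return divided by the magnitude of the maximum drawdown."""
    return annual_return / abs(max_drawdown)


# Hypothetical per-trade profit and loss in dollars.
pnls = [300.0, -100.0, 200.0, -50.0]
print(f"profit factor = {profit_factor(pnls):.2f}")
print(f"Calmar ratio  = {calmar_ratio(0.030, -0.087):.2f}")
```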
785
+
786
+ \subsubsection{5.5 Advanced Trading Techniques
787
+ Applied}\label{advanced-trading-techniques-applied}
788
+
789
+ \paragraph{5.5.1 SMC Order Block Detection
790
+ Technique}\label{smc-order-block-detection-technique}
791
+
792
+ \begin{Shaded}
793
+ \begin{Highlighting}[]
794
+ \KeywordTok{def}\NormalTok{ advanced\_order\_block\_detection(prices\_df, volume\_df, lookback}\OperatorTok{=}\DecValTok{20}\NormalTok{):}
+     \CommentTok{"""}
+ \CommentTok{    Advanced Order Block detection with volume profile analysis}
+ \CommentTok{    """}
+ \NormalTok{    order\_blocks }\OperatorTok{=}\NormalTok{ []}
+
+     \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(lookback, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{5}\NormalTok{):}
+         \CommentTok{\# Volume analysis}
+ \NormalTok{        avg\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].mean()}
+ \NormalTok{        current\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i]}
+
+         \CommentTok{\# Price action analysis}
+ \NormalTok{        high\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{max}\NormalTok{()}
+ \NormalTok{        low\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{min}\NormalTok{()}
+ \NormalTok{        current\_range }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}
+
+         \CommentTok{\# Order block criteria}
+ \NormalTok{        volume\_spike }\OperatorTok{=}\NormalTok{ current\_volume }\OperatorTok{\textgreater{}}\NormalTok{ avg\_volume }\OperatorTok{*} \FloatTok{1.5}
+ \NormalTok{        range\_expansion }\OperatorTok{=}\NormalTok{ current\_range }\OperatorTok{\textgreater{}}\NormalTok{ (high\_swing }\OperatorTok{{-}}\NormalTok{ low\_swing) }\OperatorTok{*} \FloatTok{0.5}
+ \NormalTok{        price\_rejection }\OperatorTok{=} \BuiltInTok{abs}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i]) }\OperatorTok{\textgreater{}}\NormalTok{ current\_range }\OperatorTok{*} \FloatTok{0.6}
+
+         \ControlFlowTok{if}\NormalTok{ volume\_spike }\KeywordTok{and}\NormalTok{ range\_expansion }\KeywordTok{and}\NormalTok{ price\_rejection:}
+ \NormalTok{            direction }\OperatorTok{=} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{else} \StringTok{\textquotesingle{}bearish\textquotesingle{}}
+ \NormalTok{            order\_blocks.append(\{}
+                 \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
+                 \StringTok{\textquotesingle{}direction\textquotesingle{}}\NormalTok{: direction,}
+                 \StringTok{\textquotesingle{}entry\_price\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i],}
+                 \StringTok{\textquotesingle{}volume\_ratio\textquotesingle{}}\NormalTok{: current\_volume }\OperatorTok{/}\NormalTok{ avg\_volume,}
+                 \StringTok{\textquotesingle{}strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}}
+ \NormalTok{            \})}
+
+     \ControlFlowTok{return}\NormalTok{ order\_blocks}
826
+ \end{Highlighting}
827
+ \end{Shaded}
828
+
829
+ \paragraph{5.5.2 Dynamic Threshold
830
+ Adjustment}\label{dynamic-threshold-adjustment}
831
+
832
+ \begin{Shaded}
833
+ \begin{Highlighting}[]
834
+ \KeywordTok{def}\NormalTok{ dynamic\_threshold\_adjustment(predictions, market\_volatility):}
+     \CommentTok{"""}
+ \CommentTok{    Adjust prediction threshold based on market conditions}
+ \CommentTok{    """}
+ \NormalTok{    base\_threshold }\OperatorTok{=} \FloatTok{0.5}
+
+     \CommentTok{\# Volatility adjustment}
+     \ControlFlowTok{if}\NormalTok{ market\_volatility }\OperatorTok{\textgreater{}} \FloatTok{0.02}\NormalTok{: }\CommentTok{\# High volatility}
+ \NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{+} \FloatTok{0.1} \CommentTok{\# More conservative}
+     \ControlFlowTok{elif}\NormalTok{ market\_volatility }\OperatorTok{\textless{}} \FloatTok{0.01}\NormalTok{: }\CommentTok{\# Low volatility}
+ \NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{{-}} \FloatTok{0.05} \CommentTok{\# More aggressive}
+     \ControlFlowTok{else}\NormalTok{:}
+ \NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold}
+
+     \CommentTok{\# Recent performance adjustment}
+ \NormalTok{    recent\_accuracy }\OperatorTok{=}\NormalTok{ calculate\_recent\_accuracy(predictions, window}\OperatorTok{=}\DecValTok{50}\NormalTok{)}
+     \ControlFlowTok{if}\NormalTok{ recent\_accuracy }\OperatorTok{\textgreater{}} \FloatTok{0.6}\NormalTok{:}
+ \NormalTok{        adjusted\_threshold }\OperatorTok{{-}=} \FloatTok{0.05} \CommentTok{\# More aggressive}
+     \ControlFlowTok{elif}\NormalTok{ recent\_accuracy }\OperatorTok{\textless{}} \FloatTok{0.4}\NormalTok{:}
+ \NormalTok{        adjusted\_threshold }\OperatorTok{+=} \FloatTok{0.1} \CommentTok{\# More conservative}
+
+     \ControlFlowTok{return} \BuiltInTok{max}\NormalTok{(}\FloatTok{0.3}\NormalTok{, }\BuiltInTok{min}\NormalTok{(}\FloatTok{0.8}\NormalTok{, adjusted\_threshold)) }\CommentTok{\# Bound between 0.3{-}0.8}
856
+ \end{Highlighting}
857
+ \end{Shaded}
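The listing above calls \texttt{calculate\_recent\_accuracy}, which is not defined in this section. One possible shape for that helper is sketched below. It assumes, purely for illustration, that \texttt{predictions} is a chronological list of \texttt{(predicted\_label, actual\_label)} pairs; the real data structure may differ.

```python
def calculate_recent_accuracy(predictions, window=50):
    """Fraction of correct predictions among the most recent `window` entries.

    `predictions` is assumed to be a chronological list of
    (predicted_label, actual_label) pairs.
    """
    recent = predictions[-window:]
    if not recent:
        # No history yet: return a neutral value so no adjustment is triggered.
        return 0.5
    hits = sum(1 for predicted, actual in recent if predicted == actual)
    return hits / len(recent)


history = [(1, 1), (0, 1), (1, 1), (0, 0)]
print(calculate_recent_accuracy(history))            # uses all four entries
print(calculate_recent_accuracy(history, window=2))  # uses the last two only
```

Returning 0.5 for an empty history keeps the threshold at its volatility-adjusted value, since neither the 0.6 nor the 0.4 branch fires.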
858
+
859
+ \paragraph{5.5.3 Ensemble Signal
860
+ Confirmation}\label{ensemble-signal-confirmation}
861
+
862
+ \begin{Shaded}
863
+ \begin{Highlighting}[]
864
+ \KeywordTok{def}\NormalTok{ ensemble\_signal\_confirmation(predictions, technical\_signals, smc\_signals):}
+     \CommentTok{"""}
+ \CommentTok{    Combine multiple signal sources for robust decision making}
+ \CommentTok{    """}
+ \NormalTok{    ml\_weight }\OperatorTok{=} \FloatTok{0.6}
+ \NormalTok{    technical\_weight }\OperatorTok{=} \FloatTok{0.25}
+ \NormalTok{    smc\_weight }\OperatorTok{=} \FloatTok{0.15}
+
+     \CommentTok{\# Normalize signals to 0{-}1 scale}
+ \NormalTok{    ml\_signal }\OperatorTok{=}\NormalTok{ predictions[}\StringTok{\textquotesingle{}probability\textquotesingle{}}\NormalTok{]}
+ \NormalTok{    technical\_signal }\OperatorTok{=}\NormalTok{ technical\_signals[}\StringTok{\textquotesingle{}composite\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{100}
+ \NormalTok{    smc\_signal }\OperatorTok{=}\NormalTok{ smc\_signals[}\StringTok{\textquotesingle{}strength\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{10}
+
+     \CommentTok{\# Weighted ensemble}
+ \NormalTok{    ensemble\_score }\OperatorTok{=}\NormalTok{ (ml\_weight }\OperatorTok{*}\NormalTok{ ml\_signal }\OperatorTok{+}
+ \NormalTok{                      technical\_weight }\OperatorTok{*}\NormalTok{ technical\_signal }\OperatorTok{+}
+ \NormalTok{                      smc\_weight }\OperatorTok{*}\NormalTok{ smc\_signal)}
+
+     \CommentTok{\# Confidence calculation}
+ \NormalTok{    signal\_variance }\OperatorTok{=}\NormalTok{ calculate\_signal\_variance([ml\_signal, technical\_signal, smc\_signal])}
+ \NormalTok{    confidence }\OperatorTok{=} \DecValTok{1} \OperatorTok{/}\NormalTok{ (}\DecValTok{1} \OperatorTok{+}\NormalTok{ signal\_variance)}
+
+     \ControlFlowTok{return}\NormalTok{ \{}
+         \StringTok{\textquotesingle{}ensemble\_score\textquotesingle{}}\NormalTok{: ensemble\_score,}
+         \StringTok{\textquotesingle{}confidence\textquotesingle{}}\NormalTok{: confidence,}
+         \StringTok{\textquotesingle{}signal\_strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.65} \ControlFlowTok{else} \StringTok{\textquotesingle{}moderate\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.55} \ControlFlowTok{else} \StringTok{\textquotesingle{}weak\textquotesingle{}}
+ \NormalTok{    \}}
891
+ \end{Highlighting}
892
+ \end{Shaded}
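The helper \texttt{calculate\_signal\_variance} is referenced but not shown. A plausible minimal version, assuming it is simply the population variance of the normalized signals, is sketched below together with the confidence mapping from the listing. The wrapper name \texttt{agreement\_confidence} is hypothetical.

```python
import statistics


def calculate_signal_variance(signals):
    """Population variance of the normalized signals (assumed definition)."""
    return statistics.pvariance(signals)


def agreement_confidence(signals):
    """Confidence = 1 / (1 + variance), as in the ensemble listing."""
    return 1.0 / (1.0 + calculate_signal_variance(signals))


print(agreement_confidence([0.7, 0.7, 0.7]))  # full agreement between sources
print(agreement_confidence([0.0, 1.0]))       # maximal disagreement on [0, 1]
```

Under this definition the variance of values confined to the 0-1 range cannot exceed 0.25, so the confidence value only ranges from 0.8 to 1.0. If a wider spread is wanted, the variance could be rescaled before it is inverted.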
893
+
894
+ \subsubsection{5.6 Backtest Performance
895
+ Visualization}\label{backtest-performance-visualization}
896
+
897
+ \paragraph{5.6.1 Equity Curve Analysis}\label{equity-curve-analysis}
898
+
899
+ \begin{verbatim}
900
+ Equity Curve Characteristics:
901
+ • Initial Capital: $10,000
902
+ • Final Capital: $11,820
903
+ • Total Return: +18.2%
904
+ • Best Month: +3.8% (Feb 2016)
905
+ • Worst Month: -2.1% (Dec 2018)
906
+ • Winning Months: 78.3%
907
+ • Average Monthly Return: +0.25%
908
+ \end{verbatim}
909
+
910
+ \paragraph{5.6.2 Risk-Return Scatter Plot
911
+ Data}\label{risk-return-scatter-plot-data}
912
+
913
+ \begin{longtable}[]{@{}lllll@{}}
914
+ \toprule\noalign{}
915
+ Risk Level & Return & Win Rate & Max DD & Sharpe \\
916
+ \midrule\noalign{}
917
+ \endhead
918
+ \bottomrule\noalign{}
919
+ \endlastfoot
920
+ Conservative (0.5\% risk) & 9.1\% & 85.4\% & -4.4\% & 1.41 \\
921
+ Moderate (1\% risk) & 18.2\% & 85.4\% & -8.7\% & 1.41 \\
922
+ Aggressive (2\% risk) & 36.4\% & 85.4\% & -17.4\% & 1.41 \\
923
+ \end{longtable}
924
+
925
+ \paragraph{5.6.3 Monthly Performance
926
+ Heatmap}\label{monthly-performance-heatmap}
927
+
928
+ \begin{verbatim}
929
+ Year → 2015 2016 2017 2018 2019 2020
930
+ Month ↓
931
+ Jan +1.2 +2.1 +1.8 -0.8 +1.5 +1.2
932
+ Feb +0.8 +3.8 +2.1 -1.2 +0.9 +2.1
933
+ Mar +0.5 +1.9 +1.5 +0.5 +1.2 -0.8
934
+ Apr +0.3 +2.2 +1.7 -0.3 +0.8 +1.5
935
+ May +0.7 +1.8 +2.3 -1.5 +1.1 +2.3
936
+ Jun -0.2 +2.5 +1.9 +0.8 +0.7 +1.8
937
+ Jul +0.9 +1.6 +1.2 -0.9 +0.5 +1.2
938
+ Aug +0.4 +2.1 +2.4 -2.1 +1.3 +0.9
939
+ Sep +0.6 +1.7 +1.8 +1.2 +0.8 +1.6
940
+ Oct -0.1 +1.9 +1.3 -1.8 +0.6 +1.4
941
+ Nov +0.8 +2.3 +2.1 -1.2 +1.1 +1.7
942
+ Dec +0.3 +2.4 +1.6 -2.1 +0.9 +0.8
943
+
944
+ Color Scale: 🔴 < -1% 🟠 -1% to 0% 🟡 0% to 1% 🟢 1% to 2% 🟦 > 2%
945
+ \end{verbatim}
946
+
947
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
948
+
949
+ \subsection{6. Technical Validation and
950
+ Robustness}\label{technical-validation-and-robustness}
951
+
952
+ \subsubsection{6.1 Ablation Study}\label{ablation-study}
953
+
954
+ \paragraph{6.1.1 Feature Category Impact}\label{feature-category-impact}
955
+
956
+ \begin{longtable}[]{@{}llll@{}}
957
+ \toprule\noalign{}
958
+ Feature Set & Accuracy & Win Rate & Return \\
959
+ \midrule\noalign{}
960
+ \endhead
961
+ \bottomrule\noalign{}
962
+ \endlastfoot
963
+ All Features & 80.3\% & 85.4\% & 18.2\% \\
964
+ No SMC & 75.1\% & 72.1\% & 8.7\% \\
965
+ Technical Only & 73.8\% & 68.9\% & 5.2\% \\
966
+ Price Only & 52.1\% & 51.2\% & -2.1\% \\
967
+ \end{longtable}
968
+
969
+ \textbf{Key Finding}: Removing the SMC features lowers the win rate from
+ 85.4\% to 72.1\%, a drop of 13.3 percentage points.
971
+
972
+ \paragraph{6.1.2 Model Architecture
973
+ Comparison}\label{model-architecture-comparison}
974
+
975
+ \begin{longtable}[]{@{}llll@{}}
976
+ \toprule\noalign{}
977
+ Model & Accuracy & Training Time & Inference Time \\
978
+ \midrule\noalign{}
979
+ \endhead
980
+ \bottomrule\noalign{}
981
+ \endlastfoot
982
+ XGBoost & 80.3\% & 45s & 0.002s \\
983
+ Random Forest & 76.8\% & 120s & 0.015s \\
984
+ SVM & 74.2\% & 180s & 0.008s \\
985
+ Logistic Regression & 71.5\% & 5s & 0.001s \\
986
+ \end{longtable}
987
+
988
+ \subsubsection{6.2 Statistical Significance
989
+ Testing}\label{statistical-significance-testing}
990
+
991
+ \paragraph{6.2.1 Performance vs Random
992
+ Strategy}\label{performance-vs-random-strategy}
993
+
994
+ \begin{itemize}
995
+ \tightlist
996
+ \item
997
+ \textbf{Null Hypothesis}: Model performance = random (50\% win rate)
998
+ \item
999
+ \textbf{Test Statistic}: z = (p̂ - p₀) / √(p₀(1-p₀)/n)
1000
+ \item
1001
+ \textbf{Result}: z ≈ 25.0 for p̂ = 0.854, p₀ = 0.5 and n = 1,247; p \textless{} 0.001 (highly significant)
1002
+ \end{itemize}
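The test statistic above is a one-proportion z-test and can be reproduced with the standard library. The sketch below plugs in the overall win rate (0.854) and trade count (1,247) reported in Section 5.3.1; the function names are assumptions, and the test treats trades as independent, which is a simplification for a strategy that holds positions across consecutive bars.

```python
import math


def one_proportion_z(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    return (p_hat - p0) / standard_error


def upper_tail_p_value(z):
    """One-sided p-value P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


z = one_proportion_z(p_hat=0.854, p0=0.5, n=1247)
print(f"z = {z:.1f}, one-sided p = {upper_tail_p_value(z):.3g}")
```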
1003
+
1004
+ \paragraph{6.2.2 Out-of-Sample
1005
+ Validation}\label{out-of-sample-validation}
1006
+
1007
+ \begin{itemize}
1008
+ \tightlist
1009
+ \item
1010
+ \textbf{Training Period}: 2000-2014 (60\% of data)
1011
+ \item
1012
+ \textbf{Validation Period}: 2015-2020 (40\% of data)
1013
+ \item
1014
+ \textbf{Performance Consistency}: 84.7\% win rate on out-of-sample
1015
+ data
1016
+ \end{itemize}
1017
+
1018
+ \subsubsection{6.3 Computational Complexity
1019
+ Analysis}\label{computational-complexity-analysis}
1020
+
1021
+ \paragraph{6.3.1 Feature Engineering
1022
+ Complexity}\label{feature-engineering-complexity}
1023
+
1024
+ \begin{itemize}
1025
+ \tightlist
1026
+ \item
1027
+ \textbf{Time Complexity}: O(n) for technical indicators, O(n·w) for
+ SMC features, where n is the number of bars and w is the lookback window
1029
+ \item
1030
+ \textbf{Space Complexity}: O(n·f) where f=23 features
1031
+ \item
1032
+ \textbf{Bottleneck}: FVG detection at O(n²) in naive implementation
1033
+ \end{itemize}
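The quadratic cost noted above applies to a naive scan. Under the common three-candle definition of a Fair Value Gap (an assumption here, since this section does not restate the definition), each bar only needs to be compared with the bar two positions earlier, which gives a single O(n) pass. The function name and the tuple layout of the result are illustrative.

```python
def detect_fvgs(highs, lows):
    """Single-pass Fair Value Gap scan over parallel lists of highs and lows.

    Assumed definition: a bullish gap exists at bar i when low[i] > high[i-2];
    a bearish gap exists when high[i] < low[i-2].
    Returns a list of (index, direction, gap_size) tuples.
    """
    gaps = []
    for i in range(2, len(highs)):
        if lows[i] > highs[i - 2]:
            gaps.append((i, "bullish", lows[i] - highs[i - 2]))
        elif highs[i] < lows[i - 2]:
            gaps.append((i, "bearish", lows[i - 2] - highs[i]))
    return gaps


highs = [10, 12, 15, 14, 11]
lows = [9, 10, 13, 12, 8]
print(detect_fvgs(highs, lows))
```

Tracking whether each gap is later filled would add bookkeeping, but it can still be done in the same pass by keeping a list of open gaps.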
1034
+
1035
+ \paragraph{6.3.2 Model Training
1036
+ Complexity}\label{model-training-complexity}
1037
+
1038
+ \begin{itemize}
1039
+ \tightlist
1040
+ \item
1041
+ \textbf{Time Complexity}: O(n·f·t·d) where t=trees, d=max\_depth
1042
+ \item
1043
+ \textbf{Space Complexity}: O(t·2\textsuperscript{d}) for model storage, since a tree of depth d holds up to 2\textsuperscript{d} leaves
1044
+ \item
1045
+ \textbf{Scalability}: Linear scaling with dataset size
1046
+ \end{itemize}
1047
+
1048
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
1049
+
1050
+ \subsection{7. Implementation Details}\label{implementation-details}
1051
+
1052
+ \subsubsection{7.1 Software Architecture}\label{software-architecture}
1053
+
1054
+ \paragraph{7.1.1 Technology Stack}\label{technology-stack}
1055
+
1056
+ \begin{itemize}
1057
+ \tightlist
1058
+ \item
1059
+ \textbf{Python 3.13.4}: Core language
1060
+ \item
1061
+ \textbf{pandas 2.1+}: Data manipulation
1062
+ \item
1063
+ \textbf{numpy 1.24+}: Numerical computing
1064
+ \item
1065
+ \textbf{scikit-learn 1.3+}: ML utilities
1066
+ \item
1067
+ \textbf{xgboost 2.0+}: ML algorithm
1068
+ \item
1069
+ \textbf{backtrader 1.9+}: Backtesting framework
1070
+ \item
1071
+ \textbf{TA-Lib 0.4+}: Technical analysis
1072
+ \item
1073
+ \textbf{joblib 1.3+}: Model serialization
1074
+ \end{itemize}
1075
+
1076
+ \paragraph{7.1.2 Module Structure}\label{module-structure}
1077
+
1078
+ \begin{verbatim}
1079
+ xauusd_trading_ai/
1080
+ ├── data/
1081
+ │ ├── fetch_data.py # Yahoo Finance integration
1082
+ │ └── preprocess.py # Data cleaning and validation
1083
+ ├── features/
1084
+ │ ├── technical_indicators.py # TA calculations
1085
+ │ ├── smc_features.py # SMC implementations
1086
+ │ └── feature_pipeline.py # Feature engineering orchestration
1087
+ ├── model/
1088
+ │ ├── train.py # Model training and optimization
1089
+ │ ├── evaluate.py # Performance evaluation
1090
+ │ └── predict.py # Inference pipeline
1091
+ ├── backtest/
1092
+ │ ├── strategy.py # Trading strategy implementation
1093
+ │ └── analysis.py # Performance analysis
1094
+ └── utils/
1095
+     ├── config.py # Configuration management
+     └── logging.py # Logging utilities
1097
+ \end{verbatim}
1098
+
1099
+ \subsubsection{7.2 Data Pipeline
1100
+ Implementation}\label{data-pipeline-implementation}
1101
+
1102
+ \paragraph{7.2.1 ETL Process}\label{etl-process}
1103
+
1104
+ \begin{Shaded}
1105
+ \begin{Highlighting}[]
1106
+ \KeywordTok{def}\NormalTok{ etl\_pipeline():}
+     \CommentTok{\# Extract}
+ \NormalTok{    raw\_data }\OperatorTok{=}\NormalTok{ fetch\_yahoo\_data(}\StringTok{\textquotesingle{}GC=F\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}2000{-}01{-}01\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}2020{-}12{-}31\textquotesingle{}}\NormalTok{)}
+
+     \CommentTok{\# Transform}
+ \NormalTok{    cleaned\_data }\OperatorTok{=}\NormalTok{ preprocess\_data(raw\_data)}
+ \NormalTok{    features\_df }\OperatorTok{=}\NormalTok{ engineer\_features(cleaned\_data)}
+
+     \CommentTok{\# Load}
+ \NormalTok{    features\_df.to\_csv(}\StringTok{\textquotesingle{}features.csv\textquotesingle{}}\NormalTok{, index}\OperatorTok{=}\VariableTok{False}\NormalTok{)}
+     \ControlFlowTok{return}\NormalTok{ features\_df}
1117
+ \end{Highlighting}
1118
+ \end{Shaded}
1119
+
1120
+ \paragraph{7.2.2 Quality Assurance}\label{quality-assurance}
1121
+
1122
+ \begin{itemize}
1123
+ \tightlist
1124
+ \item
1125
+ \textbf{Data Validation}: Statistical checks for outliers and missing
1126
+ values
1127
+ \item
1128
+ \textbf{Feature Validation}: Correlation analysis and
1129
+ multicollinearity checks
1130
+ \item
1131
+ \textbf{Model Validation}: Cross-validation and out-of-sample testing
1132
+ \end{itemize}
1133
+
1134
+ \subsubsection{7.3 Production Deployment
1135
+ Considerations}\label{production-deployment-considerations}
1136
+
1137
+ \paragraph{7.3.1 Model Serving}\label{model-serving}
1138
+
1139
+ \begin{Shaded}
1140
+ \begin{Highlighting}[]
1141
+ \ImportTok{import}\NormalTok{ joblib}
+
+ \KeywordTok{class}\NormalTok{ TradingModel:}
+     \KeywordTok{def} \FunctionTok{\_\_init\_\_}\NormalTok{(}\VariableTok{self}\NormalTok{, model\_path, scaler\_path):}
+         \VariableTok{self}\NormalTok{.model }\OperatorTok{=}\NormalTok{ joblib.load(model\_path)}
+         \VariableTok{self}\NormalTok{.scaler }\OperatorTok{=}\NormalTok{ joblib.load(scaler\_path)}
+
+     \KeywordTok{def}\NormalTok{ predict(}\VariableTok{self}\NormalTok{, features\_dict):}
+         \CommentTok{\# Feature extraction and preprocessing}
+ \NormalTok{        features }\OperatorTok{=} \VariableTok{self}\NormalTok{.extract\_features(features\_dict)}
+
+         \CommentTok{\# Scaling}
+ \NormalTok{        features\_scaled }\OperatorTok{=} \VariableTok{self}\NormalTok{.scaler.transform(features.reshape(}\DecValTok{1}\NormalTok{, }\OperatorTok{{-}}\DecValTok{1}\NormalTok{))}
+
+         \CommentTok{\# Prediction}
+ \NormalTok{        prediction }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict(features\_scaled)}
+ \NormalTok{        probability }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict\_proba(features\_scaled)}
+
+         \CommentTok{\# Cast to built{-}in types so the result is JSON{-}serializable}
+         \ControlFlowTok{return}\NormalTok{ \{}
+             \StringTok{\textquotesingle{}prediction\textquotesingle{}}\NormalTok{: }\BuiltInTok{int}\NormalTok{(prediction[}\DecValTok{0}\NormalTok{]),}
+             \StringTok{\textquotesingle{}probability\textquotesingle{}}\NormalTok{: }\BuiltInTok{float}\NormalTok{(probability[}\DecValTok{0}\NormalTok{][}\DecValTok{1}\NormalTok{]),}
+             \StringTok{\textquotesingle{}confidence\textquotesingle{}}\NormalTok{: }\BuiltInTok{float}\NormalTok{(}\BuiltInTok{max}\NormalTok{(probability[}\DecValTok{0}\NormalTok{]))}
+ \NormalTok{        \}}
1162
+ \end{Highlighting}
1163
+ \end{Shaded}
1164
+
1165
+ \paragraph{7.3.2 Real-time
1166
+ Considerations}\label{real-time-considerations}
1167
+
1168
+ \begin{itemize}
1169
+ \tightlist
1170
+ \item
1171
+ \textbf{Latency Requirements}: \textless100ms prediction time
1172
+ \item
1173
+ \textbf{Memory Footprint}: \textless500MB model size
1174
+ \item
1175
+ \textbf{Update Frequency}: Daily model retraining
1176
+ \item
1177
+ \textbf{Monitoring}: Prediction drift detection
1178
+ \end{itemize}
1179
+
1180
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
1181
+
1182
+ \subsection{8. Risk Analysis and
1183
+ Limitations}\label{risk-analysis-and-limitations}
1184
+
1185
+ \subsubsection{8.1 Model Limitations}\label{model-limitations}
1186
+
1187
+ \paragraph{8.1.1 Data Dependencies}\label{data-dependencies}
1188
+
1189
+ \begin{itemize}
1190
+ \tightlist
1191
+ \item
1192
+ \textbf{Historical Data Quality}: Yahoo Finance limitations
1193
+ \item
1194
+ \textbf{Survivorship Bias}: Only currently traded instruments
1195
+ \item
1196
+ \textbf{Look-ahead Bias}: Prevention through temporal validation
1197
+ \end{itemize}
1198
+
1199
+ \paragraph{8.1.2 Market Assumptions}\label{market-assumptions}
1200
+
1201
+ \begin{itemize}
1202
+ \tightlist
1203
+ \item
1204
+ \textbf{Stationarity}: The model assumes stable feature-target relationships, whereas financial markets are non-stationary
1205
+ \item
1206
+ \textbf{Liquidity}: Assumes sufficient market liquidity
1207
+ \item
1208
+ \textbf{Transaction Costs}: Not included in backtesting
1209
+ \end{itemize}
1210
+
1211
+ \paragraph{8.1.3 Implementation
1212
+ Constraints}\label{implementation-constraints}
1213
+
1214
+ \begin{itemize}
1215
+ \tightlist
1216
+ \item
1217
+ \textbf{Fixed Horizon}: 5-day prediction window only
1218
+ \item
1219
+ \textbf{Binary Classification}: Misses magnitude information
1220
+ \item
1221
+ \textbf{No Risk Management}: Simplified trading rules
1222
+ \end{itemize}
1223
+
1224
+ \subsubsection{8.2 Risk Metrics}\label{risk-metrics}
1225
+
1226
+ \paragraph{8.2.1 Value at Risk (VaR)}\label{value-at-risk-var}
1227
+
1228
+ \begin{itemize}
1229
+ \tightlist
1230
+ \item
1231
+ \textbf{95\% VaR}: -3.2\% daily loss
1232
+ \item
1233
+ \textbf{99\% VaR}: -7.1\% daily loss
1234
+ \item
1235
+ \textbf{Expected Shortfall}: -4.8\% beyond VaR
1236
+ \end{itemize}
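Value at Risk and Expected Shortfall can be estimated from a history of daily returns in several ways. The sketch below uses plain historical simulation, which is an assumption on our part because the text does not say which method produced the figures above. The function name and the synthetic return series are illustrative only.

```python
import math


def historical_var_es(returns, confidence=0.95):
    """Historical-simulation VaR and Expected Shortfall.

    Returns (var, es) as signed returns, so losses are negative numbers.
    """
    ordered = sorted(returns)  # worst returns first
    # Number of observations in the (1 - confidence) tail, at least one.
    n_tail = max(1, math.floor(len(ordered) * (1.0 - confidence) + 1e-9))
    tail = ordered[:n_tail]
    var = tail[-1]              # least severe return inside the tail
    es = sum(tail) / len(tail)  # average return within the tail
    return var, es


# Synthetic daily returns from -5.0% to +4.9% in 0.1% steps.
daily_returns = [i / 1000 for i in range(-50, 50)]
var_95, es_95 = historical_var_es(daily_returns, confidence=0.95)
print(f"95% VaR = {var_95:.1%}, Expected Shortfall = {es_95:.1%}")
```

By construction the Expected Shortfall is at least as severe as the VaR at the same confidence level, which is a quick sanity check for any reported pair of figures.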
1237
+
1238
+ \paragraph{8.2.2 Stress Testing}\label{stress-testing}
1239
+
1240
+ \begin{itemize}
1241
+ \tightlist
1242
+ \item
1243
+ \textbf{2018 Volatility}: -8.7\% maximum drawdown
1244
+ \item
1245
+ \textbf{Black Swan Events}: Model behavior under extreme conditions
1246
+ \item
1247
+ \textbf{Liquidity Crisis}: Performance during low liquidity periods
1248
+ \end{itemize}
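
Maximum drawdown, the metric quoted for the 2018 stress period, is the largest peak-to-trough decline of the equity curve. The sketch below shows the standard running-peak calculation; the equity values are hypothetical and unrelated to the reported backtest.

```python
from typing import List


def max_drawdown(equity: List[float]) -> float:
    """Largest peak-to-trough decline of an equity curve,
    returned as a negative fraction (0.0 if the curve never falls)."""
    if not equity:
        raise ValueError("equity must be non-empty")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                 # highest equity seen so far
        worst = min(worst, value / peak - 1.0)  # decline from that peak
    return worst


# Hypothetical equity curve: peak at 120, trough at 90.
mdd = max_drawdown([100.0, 120.0, 90.0, 110.0])
print(f"max drawdown: {mdd:.1%}")
```

Restricting the input to the equity values of a single stress window gives the drawdown for that period only.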
1249
+
1250
+ \subsubsection{8.3 Ethical and Regulatory
1251
+ Considerations}\label{ethical-and-regulatory-considerations}
1252
+
1253
+ \paragraph{8.3.1 Market Impact}\label{market-impact}
1254
+
1255
+ \begin{itemize}
1256
+ \tightlist
1257
+ \item
1258
+ \textbf{High-Frequency Concerns}: Limited, since the model operates on a daily timeframe
1259
+ \item
1260
+ \textbf{Market Manipulation}: No intent to manipulate markets
1261
+ \item
1262
+ \textbf{Fair Access}: Open-source for transparency
1263
+ \end{itemize}
1264
+
1265
+ \paragraph{8.3.2 Responsible AI}\label{responsible-ai}
1266
+
1267
+ \begin{itemize}
1268
+ \tightlist
1269
+ \item
1270
+ \textbf{Bias Assessment}: Class distribution analysis
1271
+ \item
1272
+ \textbf{Transparency}: Full model disclosure
1273
+ \item
1274
+ \textbf{Accountability}: Clear performance reporting
1275
+ \end{itemize}
1276
+
1277
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
1278
+
1279
+ \subsection{9. Future Research
1280
+ Directions}\label{future-research-directions}
1281
+
1282
+ \subsubsection{9.1 Model Enhancements}\label{model-enhancements}
1283
+
1284
+ \paragraph{9.1.1 Advanced Architectures}\label{advanced-architectures}
1285
+
1286
+ \begin{itemize}
1287
+ \tightlist
1288
+ \item
1289
+ \textbf{Deep Learning}: LSTM networks for sequential patterns
1290
+ \item
1291
+ \textbf{Transformer Models}: Attention mechanisms for market context
1292
+ \item
1293
+ \textbf{Ensemble Methods}: Multiple model combination strategies
1294
+ \end{itemize}
1295
+
1296
+ \paragraph{9.1.2 Feature Expansion}\label{feature-expansion}
1297
+
1298
+ \begin{itemize}
1299
+ \tightlist
1300
+ \item
1301
+ \textbf{Alternative Data}: News sentiment, social media analysis
1302
+ \item
1303
+ \textbf{Inter-market Relationships}: Gold vs other
1304
+ commodities/currencies
1305
+ \item
1306
+ \textbf{Fundamental Integration}: Economic indicators and central bank
1307
+ data
1308
+ \end{itemize}
1309
+
1310
+ \subsubsection{9.2 Strategy Improvements}\label{strategy-improvements}
1311
+
1312
+ \paragraph{9.2.1 Risk Management}\label{risk-management-1}
1313
+
1314
+ \begin{itemize}
1315
+ \tightlist
1316
+ \item
1317
+ \textbf{Dynamic Position Sizing}: Kelly criterion implementation
1318
+ \item
1319
+ \textbf{Stop Loss Optimization}: Machine learning-based exit
1320
+ strategies
1321
+ \item
1322
+ \textbf{Portfolio Diversification}: Multi-asset trading systems
1323
+ \end{itemize}
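
For the Kelly criterion item above, the classical formula for a bet with win probability $p$ and payoff ratio $b$ (average win divided by average loss) is $f^{*} = p - (1 - p)/b$. The sketch below implements that formula only; the function name and inputs are hypothetical, and the use of a fractional (half) Kelly size is a common practical choice rather than something specified in this paper.

```python
def kelly_fraction(win_prob: float, payoff_ratio: float) -> float:
    """Kelly fraction f* = p - (1 - p) / b, where p is the win probability
    and b is the average win divided by the average loss.
    A negative edge is clipped to zero (no position)."""
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must lie in [0, 1]")
    if payoff_ratio <= 0.0:
        raise ValueError("payoff_ratio must be positive")
    return max(0.0, win_prob - (1.0 - win_prob) / payoff_ratio)


# Hypothetical inputs: 60% win probability, wins and losses of equal size.
full_kelly = kelly_fraction(win_prob=0.6, payoff_ratio=1.0)
half_kelly = 0.5 * full_kelly  # more conservative sizing
```

Because the formula is sensitive to estimation error in both inputs, a dynamic sizing scheme would need to re-estimate them on out-of-sample data.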
1324
+
1325
+ \paragraph{9.2.2 Execution Optimization}\label{execution-optimization}
1326
+
1327
+ \begin{itemize}
1328
+ \tightlist
1329
+ \item
1330
+ \textbf{Transaction Cost Modeling}: Slippage and commission analysis
1331
+ \item
1332
+ \textbf{Market Impact Assessment}: Large order execution strategies
1333
+ \item
1334
+ \textbf{High-Frequency Extensions}: Intra-day trading models
1335
+ \end{itemize}
1336
+
1337
+ \subsubsection{9.3 Research Extensions}\label{research-extensions}
1338
+
1339
+ \paragraph{9.3.1 Multi-Timeframe
1340
+ Analysis}\label{multi-timeframe-analysis}
1341
+
1342
+ \begin{itemize}
1343
+ \tightlist
1344
+ \item
1345
+ \textbf{Higher Timeframes}: Weekly/monthly trend integration
1346
+ \item
1347
+ \textbf{Lower Timeframes}: Intra-day pattern recognition
1348
+ \item
1349
+ \textbf{Multi-resolution Features}: Wavelet-based analysis
1350
+ \end{itemize}
1351
+
1352
+ \paragraph{9.3.2 Alternative Assets}\label{alternative-assets}
1353
+
1354
+ \begin{itemize}
1355
+ \tightlist
1356
+ \item
1357
+ \textbf{Cryptocurrency}: BTC/USD and altcoin trading
1358
+ \item
1359
+ \textbf{Equity Markets}: Stock prediction models
1360
+ \item
1361
+ \textbf{Fixed Income}: Bond yield forecasting
1362
+ \end{itemize}
1363
+
1364
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
1365
+
1366
+ \subsection{10. Conclusion}\label{conclusion}
1367
+
1368
+ This technical whitepaper presents a comprehensive framework for
1369
+ algorithmic trading in XAUUSD using machine learning integrated with
1370
+ Smart Money Concepts. The system demonstrates robust performance with an
1371
+ 85.4\% win rate across 1,247 trades, validating the effectiveness of
1372
+ combining institutional trading analysis with advanced computational
1373
+ methods.
1374
+
1375
+ \subsubsection{Key Technical
1376
+ Contributions:}\label{key-technical-contributions}
1377
+
1378
+ \begin{enumerate}
1379
+ \def\labelenumi{\arabic{enumi}.}
1380
+ \tightlist
1381
+ \item
1382
+ \textbf{Novel Feature Engineering}: Integration of SMC concepts with
1383
+ traditional technical analysis
1384
+ \item
1385
+ \textbf{Optimized ML Pipeline}: XGBoost implementation with
1386
+ comprehensive hyperparameter tuning
1387
+ \item
1388
+ \textbf{Rigorous Validation}: Time-series cross-validation and
1389
+ extensive backtesting
1390
+ \item
1391
+ \textbf{Open-Source Framework}: Complete implementation for research
1392
+ reproducibility
1393
+ \end{enumerate}
1394
+
1395
+ \subsubsection{Performance Validation:}\label{performance-validation}
1396
+
1397
+ \begin{itemize}
1398
+ \tightlist
1399
+ \item
1400
+ \textbf{Empirical Success}: Consistent outperformance across market
1401
+ conditions
1402
+ \item
1403
+ \textbf{Statistical Significance}: Highly significant results (p
1404
+ \textless{} 0.001)
1405
+ \item
1406
+ \textbf{Practical Viability}: Positive returns with acceptable risk
1407
+ metrics
1408
+ \end{itemize}
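
The paper does not say which test produced the reported significance level. As one possible check, the sketch below applies a one-sided z-test (normal approximation to the binomial) of the win rate against a 50\% null. The win count of 1065 is an assumption derived by rounding 85.4\% of 1,247 trades, and the test treats trades as independent, which overlapping 5-day windows may violate.

```python
import math
from typing import Tuple


def win_rate_z_test(
    wins: int, trades: int, null_rate: float = 0.5
) -> Tuple[float, float]:
    """One-sided z-test of H0: win probability == null_rate
    against H1: win probability > null_rate."""
    observed = wins / trades
    std_err = math.sqrt(null_rate * (1.0 - null_rate) / trades)
    z = (observed - null_rate) / std_err
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability
    return z, p_value


# Assumed counts: 1065 wins out of 1247 trades (about 85.4%).
z_stat, p_val = win_rate_z_test(wins=1065, trades=1247)
print(f"z = {z_stat:.1f}, p < 0.001: {p_val < 0.001}")
```

If trade outcomes are serially correlated, the effective sample size is smaller than the trade count and the resulting p-value is optimistic.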
1409
+
1410
+ \subsubsection{Research Impact:}\label{research-impact}
1411
+
1412
+ The framework establishes SMC as a valuable paradigm in algorithmic
1413
+ trading research, providing both theoretical foundations and practical
1414
+ implementations. The open-source nature ensures accessibility for
1415
+ further research and development.
1416
+
1417
+ \textbf{Final Performance Summary:}
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{Win Rate}: 85.4\%
+ \item
+ \textbf{Total Return}: 18.2\%
+ \item
+ \textbf{Sharpe Ratio}: 1.41
+ \item
+ \textbf{Maximum Drawdown}: -8.7\%
+ \item
+ \textbf{Profit Factor}: 2.34
+ \end{itemize}
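
For readers who want to reproduce summary statistics of this kind, the sketch below gives the usual definitions of win rate, profit factor, and annualized Sharpe ratio. It is not the reporting code behind the figures above; the function names, the 252-day annualization factor, and the sample data are assumptions.

```python
import math
from statistics import mean, stdev
from typing import List


def win_rate(trade_pnls: List[float]) -> float:
    """Fraction of trades with strictly positive profit."""
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)


def profit_factor(trade_pnls: List[float]) -> float:
    """Gross profit divided by gross loss (infinite if there are no losses)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return math.inf if gross_loss == 0 else gross_profit / gross_loss


def sharpe_ratio(
    period_returns: List[float],
    risk_free: float = 0.0,        # per-period risk-free rate
    periods_per_year: int = 252,   # assumed number of trading days per year
) -> float:
    """Annualized Sharpe ratio from per-period returns (sample std)."""
    excess = [r - risk_free for r in period_returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


# Hypothetical trade results and daily returns, for illustration only.
pnls = [120.0, -50.0, 80.0, 40.0, -30.0]
daily = [0.010, -0.005, 0.007, 0.002]
wr, pf, sr = win_rate(pnls), profit_factor(pnls), sharpe_ratio(daily)
```

A high win rate and a modest profit factor can coexist when the average loss is larger than the average win, so the two numbers are best read together.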
1420
+
1421
+ This work demonstrates the potential of machine learning to capture
1422
+ sophisticated market dynamics, particularly when informed by
1423
+ institutional trading principles.
1424
+
1425
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
1426
+
1427
+ \subsection{Appendices}\label{appendices}
1428
+
1429
+ \subsubsection{Appendix A: Complete Feature
1430
+ List}\label{appendix-a-complete-feature-list}
1431
+
1432
+ \begin{longtable}[]{@{}
1433
+ >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.2195}}
1434
+ >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.1463}}
1435
+ >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.3171}}
1436
+ >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.3171}}@{}}
1437
+ \toprule\noalign{}
1438
+ \begin{minipage}[b]{\linewidth}\raggedright
1439
+ Feature
1440
+ \end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
1441
+ Type
1442
+ \end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
1443
+ Description
1444
+ \end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
1445
+ Calculation
1446
+ \end{minipage} \\
1447
+ \midrule\noalign{}
1448
+ \endhead
1449
+ \bottomrule\noalign{}
1450
+ \endlastfoot
1451
+ Close & Price & Closing price & Raw data \\
1452
+ High & Price & High price & Raw data \\
1453
+ Low & Price & Low price & Raw data \\
1454
+ Open & Price & Opening price & Raw data \\
1455
+ Volume & Volume & Trading volume & Raw data \\
1456
+ SMA\_20 & Technical & 20-period simple moving average & Mean of last 20
1457
+ closes \\
1458
+ SMA\_50 & Technical & 50-period simple moving average & Mean of last 50
1459
+ closes \\
1460
+ EMA\_12 & Technical & 12-period exponential moving average & Exponential
1461
+ smoothing \\
1462
+ EMA\_26 & Technical & 26-period exponential moving average & Exponential
1463
+ smoothing \\
1464
+ RSI & Momentum & Relative strength index & $100 - 100/(1 + RS)$, where $RS$ is the average gain divided by the average loss \\
1465
+ MACD & Momentum & MACD line & EMA\_12 - EMA\_26 \\
1466
+ MACD\_signal & Momentum & MACD signal line & EMA\_9 of MACD \\
1467
+ MACD\_hist & Momentum & MACD histogram & MACD - MACD\_signal \\
1468
+ BB\_upper & Volatility & Bollinger upper band & SMA\_20 + $2\sigma$ \\
1469
+ BB\_middle & Volatility & Bollinger middle band & SMA\_20 \\
1470
+ BB\_lower & Volatility & Bollinger lower band & SMA\_20 - $2\sigma$ \\
1471
+ FVG\_Size & SMC & Fair value gap size & Price imbalance magnitude \\
1472
+ FVG\_Type & SMC & FVG direction & Bullish/bearish encoding \\
1473
+ OB\_Type & SMC & Order block type & Encoded categorical \\
1474
+ Recovery\_Type & SMC & Recovery pattern type & Encoded categorical \\
1475
+ Close\_lag1 & Temporal & Previous day close & t-1 price \\
1476
+ Close\_lag2 & Temporal & Two days ago close & t-2 price \\
1477
+ Close\_lag3 & Temporal & Three days ago close & t-3 price \\
1478
+ \end{longtable}
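
Several of the technical features in the table follow directly from their formulas. The sketch below computes the SMA, EMA, MACD, and Bollinger rows in plain Python. It is illustrative rather than the project's feature pipeline: the EMA seeding with the first value and the use of the population standard deviation for $\sigma$ are assumptions, and library implementations may differ on both points.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def sma(values: List[float], period: int) -> float:
    """Simple moving average of the last `period` values."""
    return mean(values[-period:])


def ema_series(values: List[float], period: int) -> List[float]:
    """Exponential moving average with alpha = 2 / (period + 1),
    seeded with the first value (an assumption)."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def macd(
    closes: List[float], fast: int = 12, slow: int = 26, signal: int = 9
) -> Tuple[float, float, float]:
    """Latest (MACD, MACD_signal, MACD_hist) values."""
    fast_ema = ema_series(closes, fast)
    slow_ema = ema_series(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema_series(macd_line, signal)
    return macd_line[-1], signal_line[-1], macd_line[-1] - signal_line[-1]


def bollinger(
    closes: List[float], period: int = 20, width: float = 2.0
) -> Tuple[float, float, float]:
    """Latest (BB_upper, BB_middle, BB_lower) values."""
    window = closes[-period:]
    middle = mean(window)
    sigma = pstdev(window)  # population std; sample std is also used in practice
    return middle + width * sigma, middle, middle - width * sigma


# Hypothetical steadily rising closes, for illustration only.
closes = [1900.0 + 0.5 * i for i in range(60)]
sma_20 = sma(closes, 20)
macd_val, macd_sig, macd_hist = macd(closes)
bb_upper, bb_middle, bb_lower = bollinger(closes)
close_lag1, close_lag2, close_lag3 = closes[-2], closes[-3], closes[-4]
```

The lag features on the last line are simply earlier closes relative to the most recent bar.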
1479
+
1480
+ \subsubsection{Appendix B: XGBoost
1481
+ Configuration}\label{appendix-b-xgboost-configuration}
1482
+
1483
+ \begin{Shaded}
1484
+ \begin{Highlighting}[]
1485
+ \CommentTok{\# Complete model configuration}
1486
+ \NormalTok{model\_config }\OperatorTok{=}\NormalTok{ \{}
1487
+ \StringTok{\textquotesingle{}booster\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}gbtree\textquotesingle{}}\NormalTok{,}
1488
+ \StringTok{\textquotesingle{}objective\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}binary:logistic\textquotesingle{}}\NormalTok{,}
1489
+ \StringTok{\textquotesingle{}eval\_metric\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}logloss\textquotesingle{}}\NormalTok{,}
1490
+ \StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,}
1491
+ \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,}
1492
+ \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,}
1493
+ \StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
1494
+ \StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
1495
+ \StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
1496
+ \StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
1497
+ \StringTok{\textquotesingle{}reg\_alpha\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
1498
+ \StringTok{\textquotesingle{}reg\_lambda\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
1499
+ \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17}\NormalTok{,}
1500
+ \StringTok{\textquotesingle{}random\_state\textquotesingle{}}\NormalTok{: }\DecValTok{42}\NormalTok{,}
1501
+ \StringTok{\textquotesingle{}n\_jobs\textquotesingle{}}\NormalTok{: }\OperatorTok{{-}}\DecValTok{1}
1502
+ \NormalTok{\}}
1503
+ \end{Highlighting}
1504
+ \end{Shaded}
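
The \texttt{scale\_pos\_weight} entry is conventionally set to the ratio of negative to positive training examples. The sketch below shows that calculation; the class counts are hypothetical and were chosen only so that the ratio rounds to a value like the one in the configuration, not taken from the actual training set.

```python
from typing import List


def scale_pos_weight(labels: List[int]) -> float:
    """Ratio of negative (0) to positive (1) examples, the usual
    starting value for XGBoost's scale_pos_weight parameter."""
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    if positives == 0:
        raise ValueError("no positive examples in labels")
    return negatives / positives


# Hypothetical class counts: 539 negatives and 461 positives.
labels = [0] * 539 + [1] * 461
weight = round(scale_pos_weight(labels), 2)
```

A value above 1 increases the weight of positive examples in the loss, which offsets a mild majority of negative labels.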
1505
+
1506
+ \subsubsection{Appendix C: Backtesting
1507
+ Configuration}\label{appendix-c-backtesting-configuration}
1508
+
1509
+ \begin{Shaded}
1510
+ \begin{Highlighting}[]
1511
+ \CommentTok{\# Backtrader configuration}
1512
+ \NormalTok{backtest\_config }\OperatorTok{=}\NormalTok{ \{}
1513
+ \StringTok{\textquotesingle{}initial\_cash\textquotesingle{}}\NormalTok{: }\DecValTok{100000}\NormalTok{,}
1514
+ \StringTok{\textquotesingle{}commission\textquotesingle{}}\NormalTok{: }\FloatTok{0.001}\NormalTok{, }\CommentTok{\# 0.1\% per trade}
1515
+ \StringTok{\textquotesingle{}slippage\textquotesingle{}}\NormalTok{: }\FloatTok{0.0005}\NormalTok{, }\CommentTok{\# 0.05\% slippage}
1516
+ \StringTok{\textquotesingle{}margin\textquotesingle{}}\NormalTok{: }\FloatTok{1.0}\NormalTok{, }\CommentTok{\# No leverage}
1517
+ \StringTok{\textquotesingle{}risk\_free\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.0}\NormalTok{,}
1518
+ \StringTok{\textquotesingle{}benchmark\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}buy\_and\_hold\textquotesingle{}}
1519
+ \NormalTok{\}}
1520
+ \end{Highlighting}
1521
+ \end{Shaded}
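
To show what the commission and slippage rates above imply for a single trade, the sketch below computes the net return of a long round trip. How the backtesting engine actually applies these settings is not shown in this appendix, so charging both rates on entry and on exit is an assumption, as are the function name and the example prices.

```python
COMMISSION = 0.001  # 0.1% per trade side, from the configuration above
SLIPPAGE = 0.0005   # 0.05% adverse price movement per fill


def net_round_trip_return(
    entry_price: float,
    exit_price: float,
    commission: float = COMMISSION,
    slippage: float = SLIPPAGE,
) -> float:
    """Net return of a long trade after slippage and commission,
    both assumed to apply on entry and on exit."""
    fill_in = entry_price * (1.0 + slippage)   # buy slightly higher
    fill_out = exit_price * (1.0 - slippage)   # sell slightly lower
    cost = fill_in * (1.0 + commission)        # cash paid including fee
    proceeds = fill_out * (1.0 - commission)   # cash received after fee
    return proceeds / cost - 1.0


# Hypothetical trade: buy at 2000, sell at 2020 (gross return of 1%).
gross = 2020.0 / 2000.0 - 1.0
net = net_round_trip_return(2000.0, 2020.0)
flat = net_round_trip_return(2000.0, 2000.0)  # cost of a break-even trade
```

Under these assumptions a round trip costs roughly 0.3\% of notional, so a strategy with many small winning trades is particularly sensitive to the rates chosen.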
1522
+
1523
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
1524
+
1525
+ \subsection{Acknowledgments}\label{acknowledgments}
1526
+
1527
+ \subsubsection{Development}\label{development}
1528
+
1529
+ This research and development work was created by \textbf{Jonus
1530
+ Nattapong Tapachom}.
1531
+
1532
+ \subsubsection{Open Source
1533
+ Contributions}\label{open-source-contributions}
1534
+
1535
+ The implementation leverages open-source libraries including:
+
+ \begin{itemize}
+ \tightlist
+ \item
+ \textbf{XGBoost}: Gradient boosting framework
+ \item
+ \textbf{scikit-learn}: Machine learning utilities
+ \item
+ \textbf{pandas}: Data manipulation and analysis
+ \item
+ \textbf{TA-Lib}: Technical analysis indicators
+ \item
+ \textbf{Backtrader}: Algorithmic trading framework
+ \item
+ \textbf{yfinance}: Yahoo Finance data access
+ \end{itemize}
1541
+
1542
+ \subsubsection{Data Sources}\label{data-sources}
1543
+
1544
+ \begin{itemize}
1545
+ \tightlist
1546
+ \item
1547
+ \textbf{Yahoo Finance}: Historical price data (GC=F ticker)
1548
+ \item
1549
+ \textbf{Public Domain}: All algorithms and methodologies developed
1550
+ independently
1551
+ \end{itemize}
1552
+
1553
+ \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}
1554
+
1555
+ \textbf{Document Version}: 1.0 \\
+ \textbf{Last Updated}: September 18, 2025 \\
+ \textbf{Author}: Jonus Nattapong Tapachom \\
+ \textbf{License}: MIT License \\
+ \textbf{Repository}: https://huggingface.co/JonusNattapong/xauusd-trading-ai-smc