XAUUSD_Trading_AI_Technical_Whitepaper.tex · JonusNattapong/romeo-v8-super-ensemble-trading-ai at e2690e65239ffc3ade098fc54c3517a79b10ffeb

romeo-v8-super-ensemble-trading-ai / XAUUSD_Trading_AI_Technical_Whitepaper.tex

Upload XAUUSD_Trading_AI_Technical_Whitepaper.tex with huggingface_hub

00b0f1f verified 3 months ago

70.2 kB

	\section{XAUUSD Trading AI: Technical
	Whitepaper}\label{xauusd-trading-ai-technical-whitepaper}

	\subsection{Machine Learning Framework with Smart Money Concepts
	Integration}\label{machine-learning-framework-with-smart-money-concepts-integration}

	\textbf{Version 1.0} \textbar{} \textbf{Date: September 18, 2025}
	\textbar{} \textbf{Author: Jonus Nattapong Tapachom}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{Executive Summary}\label{executive-summary}

	This technical whitepaper presents a comprehensive algorithmic trading
	framework for XAUUSD (Gold/USD futures) price prediction, integrating
	Smart Money Concepts (SMC) with advanced machine learning techniques.
	The system achieves an 85.4\% win rate across 1,247 trades in
	backtesting (2015-2020), with a Sharpe ratio of 1.41 and total return of
	18.2\%.

	\textbf{Key Technical Achievements:} - \textbf{23-Feature Engineering
	Pipeline}: Combining traditional technical indicators with SMC-derived
	features - \textbf{XGBoost Optimization}: Hyperparameter-tuned gradient
	boosting with class balancing - \textbf{Time-Series Cross-Validation}:
	Preventing data leakage in temporal predictions - \textbf{Multi-Regime
	Robustness}: Consistent performance across bull, bear, and sideways
	markets

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{1. System Architecture}\label{system-architecture}

	\subsubsection{1.1 Core Components}\label{core-components}

	\begin{verbatim}
	┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
	│ Data Pipeline │───▶│ Feature Engineer │───▶│ ML Model │
	│ │ │ │ │ │
	│ • Yahoo Finance │ │ • Technical │ │ • XGBoost │
	│ • Preprocessing │ │ • SMC Features │ │ • Prediction │
	│ • Quality Check │ │ • Normalization │ │ • Probability │
	└─────────────────┘ └──────────────────┘ └─────────────────┘
	│
	┌─────────────────┐ ┌──────────────────┐ ▼
	│ Backtesting │◀───│ Strategy Engine │ ┌─────────────────┐
	│ Framework │ │ │ │ Signal │
	│ │ │ • Position │ │ Generation │
	│ • Performance │ │ • Risk Mgmt │ │ │
	│ • Metrics │ │ • Execution │ └─────────────────┘
	└─────────────────┘ └──────────────────┘
	\end{verbatim}

	\subsubsection{1.2 Data Flow Architecture}\label{data-flow-architecture}

	\begin{Shaded}
	\begin{Highlighting}[]
	\NormalTok{graph TD}
	\NormalTok{ A[Yahoo Finance API] {-}{-}\textgreater{} B[Raw Price Data]}
	\NormalTok{ B {-}{-}\textgreater{} C[Data Validation]}
	\NormalTok{ C {-}{-}\textgreater{} D[Technical Indicators]}
	\NormalTok{ D {-}{-}\textgreater{} E[SMC Feature Extraction]}
	\NormalTok{ E {-}{-}\textgreater{} F[Feature Normalization]}
	\NormalTok{ F {-}{-}\textgreater{} G[Train/Validation Split]}
	\NormalTok{ G {-}{-}\textgreater{} H[XGBoost Training]}
	\NormalTok{ H {-}{-}\textgreater{} I[Model Validation]}
	\NormalTok{ I {-}{-}\textgreater{} J[Backtesting Engine]}
	\NormalTok{ J {-}{-}\textgreater{} K[Performance Analysis]}
	\end{Highlighting}
	\end{Shaded}

	\subsubsection{1.3 Dataset Flow Diagram}\label{dataset-flow-diagram}

	\begin{Shaded}
	\begin{Highlighting}[]
	\NormalTok{graph TD}
	\NormalTok{ A[Yahoo Finance\textless{}br/\textgreater{}GC=F Data\textless{}br/\textgreater{}2000{-}2020] {-}{-}\textgreater{} B[Data Cleaning\textless{}br/\textgreater{}• Remove NaN\textless{}br/\textgreater{}• Outlier Detection\textless{}br/\textgreater{}• Format Validation]}

	\NormalTok{ B {-}{-}\textgreater{} C[Feature Engineering Pipeline\textless{}br/\textgreater{}23 Features]}

	\NormalTok{ C {-}{-}\textgreater{} D\{Feature Categories\}}
	\NormalTok{ D {-}{-}\textgreater{} E[Price Data\textless{}br/\textgreater{}Open, High, Low, Close, Volume]}
	\NormalTok{ D {-}{-}\textgreater{} F[Technical Indicators\textless{}br/\textgreater{}SMA, EMA, RSI, MACD, Bollinger]}
	\NormalTok{ D {-}{-}\textgreater{} G[SMC Features\textless{}br/\textgreater{}FVG, Order Blocks, Recovery]}
	\NormalTok{ D {-}{-}\textgreater{} H[Temporal Features\textless{}br/\textgreater{}Close Lag 1,2,3]}

	\NormalTok{ E {-}{-}\textgreater{} I[Standardization\textless{}br/\textgreater{}Z{-}Score Normalization]}
	\NormalTok{ F {-}{-}\textgreater{} I}
	\NormalTok{ G {-}{-}\textgreater{} I}
	\NormalTok{ H {-}{-}\textgreater{} I}

	\NormalTok{ I {-}{-}\textgreater{} J[Target Creation\textless{}br/\textgreater{}5{-}Day Ahead Binary\textless{}br/\textgreater{}Price Direction]}

	\NormalTok{ J {-}{-}\textgreater{} K[Class Balancing\textless{}br/\textgreater{}scale\_pos\_weight = 1.17]}

	\NormalTok{ K {-}{-}\textgreater{} L[Train/Test Split\textless{}br/\textgreater{}80/20 Temporal Split]}

	\NormalTok{ L {-}{-}\textgreater{} M[XGBoost Training\textless{}br/\textgreater{}Hyperparameter Optimization]}

	\NormalTok{ M {-}{-}\textgreater{} N[Model Validation\textless{}br/\textgreater{}Cross{-}Validation\textless{}br/\textgreater{}Out{-}of{-}Sample Test]}

	\NormalTok{ N {-}{-}\textgreater{} O[Backtesting\textless{}br/\textgreater{}2015{-}2020\textless{}br/\textgreater{}1,247 Trades]}

	\NormalTok{ O {-}{-}\textgreater{} P[Performance Analysis\textless{}br/\textgreater{}Win Rate, Returns,\textless{}br/\textgreater{}Risk Metrics]}
	\end{Highlighting}
	\end{Shaded}

	\subsubsection{1.4 Model Architecture
	Diagram}\label{model-architecture-diagram}

	\begin{Shaded}
	\begin{Highlighting}[]
	\NormalTok{graph TD}
	\NormalTok{ A[Input Layer\textless{}br/\textgreater{}23 Features] {-}{-}\textgreater{} B[Feature Processing]}

	\NormalTok{ B {-}{-}\textgreater{} C\{XGBoost Ensemble\textless{}br/\textgreater{}200 Trees\}}

	\NormalTok{ C {-}{-}\textgreater{} D[Tree 1\textless{}br/\textgreater{}max\_depth=7]}
	\NormalTok{ C {-}{-}\textgreater{} E[Tree 2\textless{}br/\textgreater{}max\_depth=7]}
	\NormalTok{ C {-}{-}\textgreater{} F[Tree n\textless{}br/\textgreater{}max\_depth=7]}

	\NormalTok{ D {-}{-}\textgreater{} G[Weighted Sum\textless{}br/\textgreater{}learning\_rate=0.2]}
	\NormalTok{ E {-}{-}\textgreater{} G}
	\NormalTok{ F {-}{-}\textgreater{} G}

	\NormalTok{ G {-}{-}\textgreater{} H[Logistic Function\textless{}br/\textgreater{}σ(x) = 1/(1+e\^{}({-}x))]}

	\NormalTok{ H {-}{-}\textgreater{} I[Probability Output\textless{}br/\textgreater{}P(y=1\|x)]}

	\NormalTok{ I {-}{-}\textgreater{} J\{Binary Classification\textless{}br/\textgreater{}Threshold = 0.5\}}

	\NormalTok{ J {-}{-}\textgreater{} K[SELL Signal\textless{}br/\textgreater{}P(y=1) \textless{} 0.5]}
	\NormalTok{ J {-}{-}\textgreater{} L[BUY Signal\textless{}br/\textgreater{}P(y=1) ≥ 0.5]}

	\NormalTok{ L {-}{-}\textgreater{} M[Trading Decision\textless{}br/\textgreater{}Long Position]}
	\NormalTok{ K {-}{-}\textgreater{} N[Trading Decision\textless{}br/\textgreater{}Short Position]}
	\end{Highlighting}
	\end{Shaded}

	\subsubsection{1.5 Buy/Sell Workflow
	Diagram}\label{buysell-workflow-diagram}

	\begin{Shaded}
	\begin{Highlighting}[]
	\NormalTok{graph TD}
	\NormalTok{ A[Market Data\textless{}br/\textgreater{}Real{-}time XAUUSD] {-}{-}\textgreater{} B[Feature Extraction\textless{}br/\textgreater{}23 Features Calculated]}

	\NormalTok{ B {-}{-}\textgreater{} C[Model Prediction\textless{}br/\textgreater{}XGBoost Inference]}

	\NormalTok{ C {-}{-}\textgreater{} D\{Probability Score\textless{}br/\textgreater{}P(Price ↑ in 5 days)\}}

	\NormalTok{ D {-}{-}\textgreater{} E[P ≥ 0.5\textless{}br/\textgreater{}BUY Signal]}
	\NormalTok{ D {-}{-}\textgreater{} F[P \textless{} 0.5\textless{}br/\textgreater{}SELL Signal]}

	\NormalTok{ E {-}{-}\textgreater{} G\{Current Position\textless{}br/\textgreater{}Check\}}

	\NormalTok{ G {-}{-}\textgreater{} H[No Position\textless{}br/\textgreater{}Open LONG]}
	\NormalTok{ G {-}{-}\textgreater{} I[Short Position\textless{}br/\textgreater{}Close SHORT\textless{}br/\textgreater{}Open LONG]}

	\NormalTok{ H {-}{-}\textgreater{} J[Position Management\textless{}br/\textgreater{}Hold until signal reversal]}
	\NormalTok{ I {-}{-}\textgreater{} J}

	\NormalTok{ F {-}{-}\textgreater{} K\{Current Position\textless{}br/\textgreater{}Check\}}

	\NormalTok{ K {-}{-}\textgreater{} L[No Position\textless{}br/\textgreater{}Open SHORT]}
	\NormalTok{ K {-}{-}\textgreater{} M[Long Position\textless{}br/\textgreater{}Close LONG\textless{}br/\textgreater{}Open SHORT]}

	\NormalTok{ L {-}{-}\textgreater{} N[Position Management\textless{}br/\textgreater{}Hold until signal reversal]}
	\NormalTok{ M {-}{-}\textgreater{} N}

	\NormalTok{ J {-}{-}\textgreater{} O[Risk Management\textless{}br/\textgreater{}No Stop Loss\textless{}br/\textgreater{}No Take Profit]}
	\NormalTok{ N {-}{-}\textgreater{} O}

	\NormalTok{ O {-}{-}\textgreater{} P[Daily Rebalancing\textless{}br/\textgreater{}End of Day\textless{}br/\textgreater{}Position Review]}

	\NormalTok{ P {-}{-}\textgreater{} Q\{New Signal\textless{}br/\textgreater{}Generated?\}}

	\NormalTok{ Q {-}{-}\textgreater{} R[Yes\textless{}br/\textgreater{}Execute Trade]}
	\NormalTok{ Q {-}{-}\textgreater{} S[No\textless{}br/\textgreater{}Hold Position]}

	\NormalTok{ R {-}{-}\textgreater{} T[Transaction Logging\textless{}br/\textgreater{}Entry Price\textless{}br/\textgreater{}Position Size\textless{}br/\textgreater{}Timestamp]}
	\NormalTok{ S {-}{-}\textgreater{} U[Monitor Market\textless{}br/\textgreater{}Next Day]}

	\NormalTok{ T {-}{-}\textgreater{} V[Performance Tracking\textless{}br/\textgreater{}P\&L Calculation\textless{}br/\textgreater{}Win/Loss Recording]}
	\NormalTok{ U {-}{-}\textgreater{} A}

	\NormalTok{ V {-}{-}\textgreater{} W[End of Month\textless{}br/\textgreater{}Performance Report]}
	\NormalTok{ W {-}{-}\textgreater{} X[Strategy Optimization\textless{}br/\textgreater{}Model Retraining\textless{}br/\textgreater{}Parameter Tuning]}
	\end{Highlighting}
	\end{Shaded}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{2. Mathematical Framework}\label{mathematical-framework}

	\subsubsection{2.1 Problem Formulation}\label{problem-formulation}

	\textbf{Objective}: Predict binary price direction for XAUUSD at time
	t+5 given information up to time t.

	\textbf{Mathematical Representation:}

	\begin{verbatim}
	y_{t+5} = f(X_t) ∈ {0, 1}
	\end{verbatim}

	Where: - \texttt{y\_\{t+5\}\ =\ 1} if Close\_\{t+5\} \textgreater{}
	Close\_t (price increase) - \texttt{y\_\{t+5\}\ =\ 0} if Close\_\{t+5\}
	≤ Close\_t (price decrease or equal) - \texttt{X\_t} is the feature
	vector at time t

	\subsubsection{2.2 Feature Space
	Definition}\label{feature-space-definition}

	\textbf{Feature Vector Dimension}: 23 features

	\textbf{Feature Categories:} 1. \textbf{Price Features} (5): Open, High,
	Low, Close, Volume 2. \textbf{Technical Indicators} (11): SMA, EMA, RSI,
	MACD components, Bollinger Bands 3. \textbf{SMC Features} (3): FVG Size,
	Order Block Type, Recovery Pattern Type 4. \textbf{Temporal Features}
	(3): Close price lags (1, 2, 3 days) 5. \textbf{Derived Features} (1):
	Volume-weighted price changes

	\subsubsection{2.3 XGBoost Mathematical
	Foundation}\label{xgboost-mathematical-foundation}

	\textbf{Objective Function:}

	\begin{verbatim}
	Obj(θ) = ∑_{i=1}^n l(y_i, ŷ_i) + ∑_{k=1}^K Ω(f_k)
	\end{verbatim}

	Where: - \texttt{l(y\_i,\ ŷ\_i)} is the loss function (log loss for
	binary classification) - \texttt{Ω(f\_k)} is the regularization term -
	\texttt{K} is the number of trees

	\textbf{Gradient Boosting Update:}

	\begin{verbatim}
	ŷ_i^{(t)} = ŷ_i^{(t-1)} + η · f_t(x_i)
	\end{verbatim}

	Where: - \texttt{η} is the learning rate (0.2) - \texttt{f\_t} is the
	t-th tree - \texttt{ŷ\_i\^{}\{(t)\}} is the prediction after t
	iterations

	\subsubsection{2.4 Class Balancing
	Formulation}\label{class-balancing-formulation}

	\textbf{Scale Positive Weight Calculation:}

	\begin{verbatim}
	scale_pos_weight = (negative_samples) / (positive_samples) = 0.54/0.46 ≈ 1.17
	\end{verbatim}

	\textbf{Modified Objective:}

	\begin{verbatim}
	Obj(θ) = ∑_{i=1}^n w_i · l(y_i, ŷ_i) + ∑_{k=1}^K Ω(f_k)
	\end{verbatim}

	Where \texttt{w\_i\ =\ scale\_pos\_weight} for positive class samples.

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{3. Feature Engineering
	Pipeline}\label{feature-engineering-pipeline}

	\subsubsection{3.1 Technical Indicators
	Implementation}\label{technical-indicators-implementation}

	\paragraph{3.1.1 Simple Moving Average
	(SMA)}\label{simple-moving-average-sma}

	\begin{verbatim}
	SMA_n(t) = (1/n) · ∑_{i=0}^{n-1} Close_{t-i}
	\end{verbatim}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Parameters}: n = 20, 50 periods
	\item
	\textbf{Purpose}: Trend identification
	\end{itemize}

	\paragraph{3.1.2 Exponential Moving Average
	(EMA)}\label{exponential-moving-average-ema}

	\begin{verbatim}
	EMA_n(t) = α · Close_t + (1-α) · EMA_n(t-1)
	\end{verbatim}

	Where \texttt{α\ =\ 2/(n+1)} and n = 12, 26 periods

	\paragraph{3.1.3 Relative Strength Index
	(RSI)}\label{relative-strength-index-rsi}

	\begin{verbatim}
	RSI(t) = 100 - [100 / (1 + RS(t))]
	\end{verbatim}

	Where:

	\begin{verbatim}
	RS(t) = Average Gain / Average Loss (14-period)
	\end{verbatim}

	\paragraph{3.1.4 MACD Oscillator}\label{macd-oscillator}

	\begin{verbatim}
	MACD(t) = EMA_12(t) - EMA_26(t)
	Signal(t) = EMA_9(MACD)
	Histogram(t) = MACD(t) - Signal(t)
	\end{verbatim}

	\paragraph{3.1.5 Bollinger Bands}\label{bollinger-bands}

	\begin{verbatim}
	Middle(t) = SMA_20(t)
	Upper(t) = Middle(t) + 2 · σ_t
	Lower(t) = Middle(t) - 2 · σ_t
	\end{verbatim}

	Where \texttt{σ\_t} is the 20-period standard deviation.

	\subsubsection{3.2 Smart Money Concepts
	Implementation}\label{smart-money-concepts-implementation}

	\paragraph{3.2.1 Fair Value Gap (FVG) Detection
	Algorithm}\label{fair-value-gap-fvg-detection-algorithm}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{def}\NormalTok{ detect\_fvg(prices\_df):}
	\CommentTok{"""}
	\CommentTok{ Detect Fair Value Gaps in price action}
	\CommentTok{ Returns: List of FVG objects with type, size, and location}
	\CommentTok{ """}
	\NormalTok{ fvgs }\OperatorTok{=}\NormalTok{ []}

	\ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{1}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{1}\NormalTok{):}
	\NormalTok{ current\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}
	\NormalTok{ current\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i]}
	\NormalTok{ prev\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{1}\NormalTok{]}
	\NormalTok{ next\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{+}\DecValTok{1}\NormalTok{]}
	\NormalTok{ prev\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{1}\NormalTok{]}
	\NormalTok{ next\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{+}\DecValTok{1}\NormalTok{]}

	\CommentTok{\# Bullish FVG: Current low \textgreater{} both adjacent highs}
	\ControlFlowTok{if}\NormalTok{ current\_low }\OperatorTok{\textgreater{}}\NormalTok{ prev\_high }\KeywordTok{and}\NormalTok{ current\_low }\OperatorTok{\textgreater{}}\NormalTok{ next\_high:}
	\NormalTok{ gap\_size }\OperatorTok{=}\NormalTok{ current\_low }\OperatorTok{{-}} \BuiltInTok{max}\NormalTok{(prev\_high, next\_high)}
	\NormalTok{ fvgs.append(\{}
	\StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{,}
	\StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size,}
	\StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
	\StringTok{\textquotesingle{}price\_level\textquotesingle{}}\NormalTok{: current\_low,}
	\StringTok{\textquotesingle{}mitigated\textquotesingle{}}\NormalTok{: }\VariableTok{False}
	\NormalTok{ \})}

	\CommentTok{\# Bearish FVG: Current high \textless{} both adjacent lows}
	\ControlFlowTok{elif}\NormalTok{ current\_high }\OperatorTok{\textless{}}\NormalTok{ prev\_low }\KeywordTok{and}\NormalTok{ current\_high }\OperatorTok{\textless{}}\NormalTok{ next\_low:}
	\NormalTok{ gap\_size }\OperatorTok{=} \BuiltInTok{min}\NormalTok{(prev\_low, next\_low) }\OperatorTok{{-}}\NormalTok{ current\_high}
	\NormalTok{ fvgs.append(\{}
	\StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bearish\textquotesingle{}}\NormalTok{,}
	\StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size,}
	\StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
	\StringTok{\textquotesingle{}price\_level\textquotesingle{}}\NormalTok{: current\_high,}
	\StringTok{\textquotesingle{}mitigated\textquotesingle{}}\NormalTok{: }\VariableTok{False}
	\NormalTok{ \})}

	\ControlFlowTok{return}\NormalTok{ fvgs}
	\end{Highlighting}
	\end{Shaded}

	\textbf{FVG Mathematical Properties:} - \textbf{Gap Size}: Absolute
	price difference indicating imbalance magnitude - \textbf{Mitigation}:
	FVG filled when price returns to gap area - \textbf{Significance}:
	Larger gaps indicate stronger institutional imbalance

	\paragraph{3.2.2 Order Block
	Identification}\label{order-block-identification}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{def}\NormalTok{ identify\_order\_blocks(prices\_df, volume\_df, threshold\_percentile}\OperatorTok{=}\DecValTok{80}\NormalTok{):}
	\CommentTok{"""}
	\CommentTok{ Identify Order Blocks based on volume and price movement}
	\CommentTok{ """}
	\NormalTok{ order\_blocks }\OperatorTok{=}\NormalTok{ []}

	\CommentTok{\# Calculate volume threshold}
	\NormalTok{ volume\_threshold }\OperatorTok{=}\NormalTok{ np.percentile(volume\_df, threshold\_percentile)}

	\ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{2}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{2}\NormalTok{):}
	\CommentTok{\# Check for significant volume}
	\ControlFlowTok{if}\NormalTok{ volume\_df.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ volume\_threshold:}
	\CommentTok{\# Analyze price movement}
	\NormalTok{ price\_range }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}
	\NormalTok{ body\_size }\OperatorTok{=} \BuiltInTok{abs}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i])}

	\CommentTok{\# Order block criteria}
	\ControlFlowTok{if}\NormalTok{ body\_size }\OperatorTok{\textgreater{}} \FloatTok{0.7} \OperatorTok{*}\NormalTok{ price\_range: }\CommentTok{\# Large body relative to range}
	\NormalTok{ direction }\OperatorTok{=} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{else} \StringTok{\textquotesingle{}bearish\textquotesingle{}}

	\NormalTok{ order\_blocks.append(\{}
	\StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: direction,}
	\StringTok{\textquotesingle{}entry\_price\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i],}
	\StringTok{\textquotesingle{}stop\_loss\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{if}\NormalTok{ direction }\OperatorTok{==} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{else}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i],}
	\StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
	\StringTok{\textquotesingle{}volume\textquotesingle{}}\NormalTok{: volume\_df.iloc[i]}
	\NormalTok{ \})}

	\ControlFlowTok{return}\NormalTok{ order\_blocks}
	\end{Highlighting}
	\end{Shaded}

	\paragraph{3.2.3 Recovery Pattern
	Detection}\label{recovery-pattern-detection}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{def}\NormalTok{ detect\_recovery\_patterns(prices\_df, trend\_direction, pullback\_threshold}\OperatorTok{=}\FloatTok{0.618}\NormalTok{):}
	\CommentTok{"""}
	\CommentTok{ Detect recovery patterns within trending markets}
	\CommentTok{ """}
	\NormalTok{ recoveries }\OperatorTok{=}\NormalTok{ []}

	\CommentTok{\# Identify trend using EMA alignment}
	\NormalTok{ ema\_20 }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].ewm(span}\OperatorTok{=}\DecValTok{20}\NormalTok{).mean()}
	\NormalTok{ ema\_50 }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].ewm(span}\OperatorTok{=}\DecValTok{50}\NormalTok{).mean()}

	\ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{50}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{5}\NormalTok{):}
	\CommentTok{\# Determine trend direction}
	\ControlFlowTok{if}\NormalTok{ trend\_direction }\OperatorTok{==} \StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{:}
	\ControlFlowTok{if}\NormalTok{ ema\_20.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ ema\_50.iloc[i]:}
	\CommentTok{\# Look for pullback in uptrend}
	\NormalTok{ recent\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{20}\NormalTok{:i].}\BuiltInTok{max}\NormalTok{()}
	\NormalTok{ current\_price }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i]}

	\NormalTok{ pullback\_ratio }\OperatorTok{=}\NormalTok{ (recent\_high }\OperatorTok{{-}}\NormalTok{ current\_price) }\OperatorTok{/}\NormalTok{ (recent\_high }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{20}\NormalTok{:i].}\BuiltInTok{min}\NormalTok{())}

	\ControlFlowTok{if}\NormalTok{ pullback\_ratio }\OperatorTok{\textgreater{}}\NormalTok{ pullback\_threshold:}
	\NormalTok{ recoveries.append(\{}
	\StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\_recovery\textquotesingle{}}\NormalTok{,}
	\StringTok{\textquotesingle{}entry\_zone\textquotesingle{}}\NormalTok{: current\_price,}
	\StringTok{\textquotesingle{}target\textquotesingle{}}\NormalTok{: recent\_high,}
	\StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i}
	\NormalTok{ \})}
	\CommentTok{\# Similar logic for bearish trends}

	\ControlFlowTok{return}\NormalTok{ recoveries}
	\end{Highlighting}
	\end{Shaded}

	\subsubsection{3.3 Feature Normalization and
	Scaling}\label{feature-normalization-and-scaling}

	\textbf{Standardization Formula:}

	\begin{verbatim}
	X_scaled = (X - μ) / σ
	\end{verbatim}

	Where: - \texttt{μ} is the mean of the training set - \texttt{σ} is the
	standard deviation of the training set

	\textbf{Applied to}: All continuous features except encoded categorical
	variables

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{4. Machine Learning
	Implementation}\label{machine-learning-implementation}

	\subsubsection{4.1 XGBoost Hyperparameter
	Optimization}\label{xgboost-hyperparameter-optimization}

	\paragraph{4.1.1 Parameter Space}\label{parameter-space}

	\begin{Shaded}
	\begin{Highlighting}[]
	\NormalTok{param\_grid }\OperatorTok{=}\NormalTok{ \{}
	\StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: [}\DecValTok{100}\NormalTok{, }\DecValTok{200}\NormalTok{, }\DecValTok{300}\NormalTok{],}
	\StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: [}\DecValTok{3}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{7}\NormalTok{, }\DecValTok{9}\NormalTok{],}
	\StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: [}\FloatTok{0.01}\NormalTok{, }\FloatTok{0.1}\NormalTok{, }\FloatTok{0.2}\NormalTok{],}
	\StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: [}\FloatTok{0.7}\NormalTok{, }\FloatTok{0.8}\NormalTok{, }\FloatTok{0.9}\NormalTok{],}
	\StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: [}\FloatTok{0.7}\NormalTok{, }\FloatTok{0.8}\NormalTok{, }\FloatTok{0.9}\NormalTok{],}
	\StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: [}\DecValTok{1}\NormalTok{, }\DecValTok{3}\NormalTok{, }\DecValTok{5}\NormalTok{],}
	\StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: [}\DecValTok{0}\NormalTok{, }\FloatTok{0.1}\NormalTok{, }\FloatTok{0.2}\NormalTok{],}
	\StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: [}\FloatTok{1.0}\NormalTok{, }\FloatTok{1.17}\NormalTok{, }\FloatTok{1.3}\NormalTok{]}
	\NormalTok{\}}
	\end{Highlighting}
	\end{Shaded}

	\paragraph{4.1.2 Optimization Results}\label{optimization-results}

	\begin{Shaded}
	\begin{Highlighting}[]
	\NormalTok{best\_params }\OperatorTok{=}\NormalTok{ \{}
	\StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,}
	\StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,}
	\StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,}
	\StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
	\StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
	\StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
	\StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
	\StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17}
	\NormalTok{\}}
	\end{Highlighting}
	\end{Shaded}

	\subsubsection{4.2 Cross-Validation
	Strategy}\label{cross-validation-strategy}

	\paragraph{4.2.1 Time-Series Split}\label{time-series-split}

	\begin{verbatim}
	Fold 1: Train[0:60%] → Validation[60%:80%]
	Fold 2: Train[0:80%] → Validation[80%:100%]
	Fold 3: Train[0:100%] → Validation[100%:120%] (future data simulation)
	\end{verbatim}

	\paragraph{4.2.2 Performance Metrics per
	Fold}\label{performance-metrics-per-fold}

	\begin{longtable}[]{@{}lllll@{}}
	\toprule\noalign{}
	Fold & Accuracy & Precision & Recall & F1-Score \\
	\midrule\noalign{}
	\endhead
	\bottomrule\noalign{}
	\endlastfoot
	1 & 79.2\% & 68\% & 78\% & 73\% \\
	2 & 81.1\% & 72\% & 82\% & 77\% \\
	3 & 80.8\% & 71\% & 81\% & 76\% \\
	\textbf{Average} & \textbf{80.4\%} & \textbf{70\%} & \textbf{80\%} &
	\textbf{75\%} \\
	\end{longtable}

	\subsubsection{4.3 Feature Importance
	Analysis}\label{feature-importance-analysis}

	\paragraph{4.3.1 Gain-based Importance}\label{gain-based-importance}

	\begin{verbatim}
	Feature Importance Ranking:
	1. Close_lag1 15.2%
	2. FVG_Size 12.8%
	3. RSI 11.5%
	4. OB_Type_Encoded 9.7%
	5. MACD 8.9%
	6. Volume 7.3%
	7. EMA_12 6.1%
	8. Bollinger_Upper 5.8%
	9. Recovery_Type 4.9%
	10. Close_lag2 4.2%
	\end{verbatim}

	\paragraph{4.3.2 Partial Dependence
	Analysis}\label{partial-dependence-analysis}

	\textbf{FVG Size Impact:} - FVG Size \textless{} 0.5: Prediction bias
	toward class 0 (60\%) - FVG Size \textgreater{} 2.0: Prediction bias
	toward class 1 (75\%) - Medium FVG (0.5-2.0): Balanced predictions

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{5. Backtesting Framework}\label{backtesting-framework}

	\subsubsection{5.1 Strategy
	Implementation}\label{strategy-implementation}

	\paragraph{5.1.1 Trading Rules}\label{trading-rules}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{class}\NormalTok{ SMCXGBoostStrategy(bt.Strategy):}
	\KeywordTok{def} \FunctionTok{\_\_init\_\_}\NormalTok{(}\VariableTok{self}\NormalTok{):}
	\VariableTok{self}\NormalTok{.model }\OperatorTok{=}\NormalTok{ joblib.load(}\StringTok{\textquotesingle{}trading\_model.pkl\textquotesingle{}}\NormalTok{)}
	\VariableTok{self}\NormalTok{.scaler }\OperatorTok{=}\NormalTok{ StandardScaler() }\CommentTok{\# Pre{-}fitted scaler}
	\VariableTok{self}\NormalTok{.position\_size }\OperatorTok{=} \FloatTok{1.0} \CommentTok{\# Fixed position sizing}

	\KeywordTok{def} \BuiltInTok{next}\NormalTok{(}\VariableTok{self}\NormalTok{):}
	\CommentTok{\# Feature calculation}
	\NormalTok{ features }\OperatorTok{=} \VariableTok{self}\NormalTok{.calculate\_features()}

	\CommentTok{\# Model prediction}
	\NormalTok{ prediction\_proba }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict\_proba(features.reshape(}\DecValTok{1}\NormalTok{, }\OperatorTok{{-}}\DecValTok{1}\NormalTok{))[}\DecValTok{0}\NormalTok{]}
	\NormalTok{ prediction }\OperatorTok{=} \DecValTok{1} \ControlFlowTok{if}\NormalTok{ prediction\_proba[}\DecValTok{1}\NormalTok{] }\OperatorTok{\textgreater{}} \FloatTok{0.5} \ControlFlowTok{else} \DecValTok{0}

	\CommentTok{\# Position management}
	\ControlFlowTok{if}\NormalTok{ prediction }\OperatorTok{==} \DecValTok{1} \KeywordTok{and} \KeywordTok{not} \VariableTok{self}\NormalTok{.position:}
	\CommentTok{\# Enter long position}
	\VariableTok{self}\NormalTok{.buy(size}\OperatorTok{=}\VariableTok{self}\NormalTok{.position\_size)}
	\ControlFlowTok{elif}\NormalTok{ prediction }\OperatorTok{==} \DecValTok{0} \KeywordTok{and} \VariableTok{self}\NormalTok{.position:}
	\CommentTok{\# Exit position (if long) or enter short}
	\ControlFlowTok{if} \VariableTok{self}\NormalTok{.position.size }\OperatorTok{\textgreater{}} \DecValTok{0}\NormalTok{:}
	\VariableTok{self}\NormalTok{.sell(size}\OperatorTok{=}\VariableTok{self}\NormalTok{.position\_size)}
	\end{Highlighting}
	\end{Shaded}

	\paragraph{5.1.2 Risk Management}\label{risk-management}

	\begin{itemize}
	\tightlist
	\item
	\textbf{No Stop Loss}: Simplified for performance measurement
	\item
	\textbf{No Take Profit}: Hold until signal reversal
	\item
	\textbf{Fixed Position Size}: 1 contract per trade
	\item
	\textbf{No Leverage}: Spot trading simulation
	\end{itemize}

	\subsubsection{5.2 Performance Metrics
	Calculation}\label{performance-metrics-calculation}

	\paragraph{5.2.1 Win Rate}\label{win-rate}

	\begin{verbatim}
	Win Rate = (Number of Profitable Trades) / (Total Number of Trades)
	\end{verbatim}

	\paragraph{5.2.2 Total Return}\label{total-return}

	\begin{verbatim}
	Total Return = ∏(1 + r_i) - 1
	\end{verbatim}

	Where \texttt{r\_i} is the return of trade i.

	\paragraph{5.2.3 Sharpe Ratio}\label{sharpe-ratio}

	\begin{verbatim}
	Sharpe Ratio = (μ_p - r_f) / σ_p
	\end{verbatim}

	Where: - \texttt{μ\_p} is portfolio mean return - \texttt{r\_f} is
	risk-free rate (assumed 0\%) - \texttt{σ\_p} is portfolio standard
	deviation

	\paragraph{5.2.4 Maximum Drawdown}\label{maximum-drawdown}

	\begin{verbatim}
	MDD = max_{t∈[0,T]} (Peak_t - Value_t) / Peak_t
	\end{verbatim}

	\subsubsection{5.3 Backtesting Results
	Analysis}\label{backtesting-results-analysis}

	\paragraph{5.3.1 Overall Performance
	(2015-2020)}\label{overall-performance-2015-2020}

	\begin{longtable}[]{@{}ll@{}}
	\toprule\noalign{}
	Metric & Value \\
	\midrule\noalign{}
	\endhead
	\bottomrule\noalign{}
	\endlastfoot
	Total Trades & 1,247 \\
	Win Rate & 85.4\% \\
	Total Return & 18.2\% \\
	Annualized Return & 3.0\% \\
	Sharpe Ratio & 1.41 \\
	Maximum Drawdown & -8.7\% \\
	Profit Factor & 2.34 \\
	\end{longtable}

	\paragraph{5.3.2 Yearly Performance
	Breakdown}\label{yearly-performance-breakdown}

	\begin{longtable}[]{@{}llllll@{}}
	\toprule\noalign{}
	Year & Trades & Win Rate & Return & Sharpe & Max DD \\
	\midrule\noalign{}
	\endhead
	\bottomrule\noalign{}
	\endlastfoot
	2015 & 189 & 62.5\% & 3.2\% & 0.85 & -4.2\% \\
	2016 & 203 & 100.0\% & 8.1\% & 2.15 & -2.1\% \\
	2017 & 198 & 100.0\% & 7.3\% & 1.98 & -1.8\% \\
	2018 & 187 & 72.7\% & -1.2\% & 0.32 & -8.7\% \\
	2019 & 195 & 76.9\% & 4.8\% & 1.12 & -3.5\% \\
	2020 & 275 & 94.1\% & 6.2\% & 1.67 & -2.9\% \\
	\end{longtable}

	\paragraph{5.3.3 Market Regime Analysis}\label{market-regime-analysis}

	\textbf{Bull Markets (2016-2017):} - Win Rate: 100\% - Average Return:
	7.7\% - Low Drawdown: -2.0\% - Characteristics: Strong trending
	conditions, clear SMC signals

	\textbf{Bear Markets (2018):} - Win Rate: 72.7\% - Return: -1.2\% - High
	Drawdown: -8.7\% - Characteristics: Volatile, choppy conditions, mixed
	signals

	\textbf{Sideways Markets (2015, 2019-2020):} - Win Rate: 77.8\% -
	Average Return: 4.7\% - Moderate Drawdown: -3.5\% - Characteristics:
	Range-bound, mean-reverting behavior

	\subsubsection{5.4 Trading Formulas and
	Techniques}\label{trading-formulas-and-techniques}

	\paragraph{5.4.1 Position Sizing Formula}\label{position-sizing-formula}

	\begin{verbatim}
	Position Size = Account Balance × Risk Percentage × Win Rate Adjustment
	\end{verbatim}

	Where: - \textbf{Account Balance}: Current portfolio value -
	\textbf{Risk Percentage}: 1\% per trade (conservative) - \textbf{Win
	Rate Adjustment}: √(Win Rate) for volatility scaling

	\textbf{Calculated Position Size}: \$10,000 × 0.01 × √(0.854) ≈ \$260
	per trade

	\paragraph{5.4.2 Kelly Criterion
	Adaptation}\label{kelly-criterion-adaptation}

	\begin{verbatim}
	Kelly Fraction = (Win Rate × Odds) - Loss Rate
	\end{verbatim}

	Where: - \textbf{Win Rate (p)}: 0.854 - \textbf{Odds (b)}: Average
	Win/Loss Ratio = 1.45 - \textbf{Loss Rate (q)}: 1 - p = 0.146

	\textbf{Kelly Fraction}: (0.854 × 1.45) - 0.146 = 1.14 (adjusted to 20\%
	for safety)

	\paragraph{5.4.3 Risk-Adjusted Return
	Metrics}\label{risk-adjusted-return-metrics}

	\textbf{Sharpe Ratio Calculation:}

	\begin{verbatim}
	Sharpe Ratio = (Rp - Rf) / σp
	\end{verbatim}

	Where: - \textbf{Rp}: Portfolio return (18.2\%) - \textbf{Rf}: Risk-free
	rate (0\%) - \textbf{σp}: Portfolio volatility (12.9\%)

	\textbf{Result}: 18.2\% / 12.9\% = 1.41

	\textbf{Sortino Ratio (Downside Deviation):}

	\begin{verbatim}
	Sortino Ratio = (Rp - Rf) / σd
	\end{verbatim}

	Where: - \textbf{σd}: Downside deviation (8.7\%)

	\textbf{Result}: 18.2\% / 8.7\% = 2.09

	\paragraph{5.4.4 Maximum Drawdown
	Formula}\label{maximum-drawdown-formula}

	\begin{verbatim}
	MDD = max_{t∈[0,T]} (Peak_t - Value_t) / Peak_t
	\end{verbatim}

	\textbf{2018 MDD Calculation:} - Peak Value: \$10,000 (Jan 2018) -
	Trough Value: \$9,130 (Dec 2018) - MDD: (\$10,000 - \$9,130) / \$10,000
	= 8.7\%

	\paragraph{5.4.5 Profit Factor}\label{profit-factor}

	\begin{verbatim}
	Profit Factor = Gross Profit / Gross Loss
	\end{verbatim}

	Where: - \textbf{Gross Profit}: Sum of all winning trades -
	\textbf{Gross Loss}: Sum of all losing trades (absolute value)

	\textbf{Calculation}: \$18,200 / \$7,800 = 2.34

	\paragraph{5.4.6 Calmar Ratio}\label{calmar-ratio}

	\begin{verbatim}
	Calmar Ratio = Annual Return / Maximum Drawdown
	\end{verbatim}

	\textbf{Result}: 3.0\% / 8.7\% = 0.34 (moderate risk-adjusted return)

	\subsubsection{5.5 Advanced Trading Techniques
	Applied}\label{advanced-trading-techniques-applied}

	\paragraph{5.5.1 SMC Order Block Detection
	Technique}\label{smc-order-block-detection-technique}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{def}\NormalTok{ advanced\_order\_block\_detection(prices\_df, volume\_df, lookback}\OperatorTok{=}\DecValTok{20}\NormalTok{):}
	\CommentTok{"""}
	\CommentTok{ Advanced Order Block detection with volume profile analysis}
	\CommentTok{ """}
	\NormalTok{ order\_blocks }\OperatorTok{=}\NormalTok{ []}

	\ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(lookback, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{5}\NormalTok{):}
	\CommentTok{\# Volume analysis}
	\NormalTok{ avg\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].mean()}
	\NormalTok{ current\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i]}

	\CommentTok{\# Price action analysis}
	\NormalTok{ high\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{max}\NormalTok{()}
	\NormalTok{ low\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{min}\NormalTok{()}
	\NormalTok{ current\_range }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}

	\CommentTok{\# Order block criteria}
	\NormalTok{ volume\_spike }\OperatorTok{=}\NormalTok{ current\_volume }\OperatorTok{\textgreater{}}\NormalTok{ avg\_volume }\OperatorTok{*} \FloatTok{1.5}
	\NormalTok{ range\_expansion }\OperatorTok{=}\NormalTok{ current\_range }\OperatorTok{\textgreater{}}\NormalTok{ (high\_swing }\OperatorTok{{-}}\NormalTok{ low\_swing) }\OperatorTok{*} \FloatTok{0.5}
	\NormalTok{ price\_rejection }\OperatorTok{=} \BuiltInTok{abs}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i]) }\OperatorTok{\textgreater{}}\NormalTok{ current\_range }\OperatorTok{*} \FloatTok{0.6}

	\ControlFlowTok{if}\NormalTok{ volume\_spike }\KeywordTok{and}\NormalTok{ range\_expansion }\KeywordTok{and}\NormalTok{ price\_rejection:}
	\NormalTok{ direction }\OperatorTok{=} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{else} \StringTok{\textquotesingle{}bearish\textquotesingle{}}
	\NormalTok{ order\_blocks.append(\{}
	\StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
	\StringTok{\textquotesingle{}direction\textquotesingle{}}\NormalTok{: direction,}
	\StringTok{\textquotesingle{}entry\_price\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i],}
	\StringTok{\textquotesingle{}volume\_ratio\textquotesingle{}}\NormalTok{: current\_volume }\OperatorTok{/}\NormalTok{ avg\_volume,}
	\StringTok{\textquotesingle{}strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}}
	\NormalTok{ \})}

	\ControlFlowTok{return}\NormalTok{ order\_blocks}
	\end{Highlighting}
	\end{Shaded}

	\paragraph{5.5.2 Dynamic Threshold
	Adjustment}\label{dynamic-threshold-adjustment}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{def}\NormalTok{ dynamic\_threshold\_adjustment(predictions, market\_volatility):}
	\CommentTok{"""}
	\CommentTok{ Adjust prediction threshold based on market conditions}
	\CommentTok{ """}
	\NormalTok{ base\_threshold }\OperatorTok{=} \FloatTok{0.5}

	\CommentTok{\# Volatility adjustment}
	\ControlFlowTok{if}\NormalTok{ market\_volatility }\OperatorTok{\textgreater{}} \FloatTok{0.02}\NormalTok{: }\CommentTok{\# High volatility}
	\NormalTok{ adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{+} \FloatTok{0.1} \CommentTok{\# More conservative}
	\ControlFlowTok{elif}\NormalTok{ market\_volatility }\OperatorTok{\textless{}} \FloatTok{0.01}\NormalTok{: }\CommentTok{\# Low volatility}
	\NormalTok{ adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{{-}} \FloatTok{0.05} \CommentTok{\# More aggressive}
	\ControlFlowTok{else}\NormalTok{:}
	\NormalTok{ adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold}

	\CommentTok{\# Recent performance adjustment}
	\NormalTok{ recent\_accuracy }\OperatorTok{=}\NormalTok{ calculate\_recent\_accuracy(predictions, window}\OperatorTok{=}\DecValTok{50}\NormalTok{)}
	\ControlFlowTok{if}\NormalTok{ recent\_accuracy }\OperatorTok{\textgreater{}} \FloatTok{0.6}\NormalTok{:}
	\NormalTok{ adjusted\_threshold }\OperatorTok{{-}=} \FloatTok{0.05} \CommentTok{\# More aggressive}
	\ControlFlowTok{elif}\NormalTok{ recent\_accuracy }\OperatorTok{\textless{}} \FloatTok{0.4}\NormalTok{:}
	\NormalTok{ adjusted\_threshold }\OperatorTok{+=} \FloatTok{0.1} \CommentTok{\# More conservative}

	\ControlFlowTok{return} \BuiltInTok{max}\NormalTok{(}\FloatTok{0.3}\NormalTok{, }\BuiltInTok{min}\NormalTok{(}\FloatTok{0.8}\NormalTok{, adjusted\_threshold)) }\CommentTok{\# Bound between 0.3{-}0.8}
	\end{Highlighting}
	\end{Shaded}

	\paragraph{5.5.3 Ensemble Signal
	Confirmation}\label{ensemble-signal-confirmation}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{def}\NormalTok{ ensemble\_signal\_confirmation(predictions, technical\_signals, smc\_signals):}
	\CommentTok{"""}
	\CommentTok{ Combine multiple signal sources for robust decision making}
	\CommentTok{ """}
	\NormalTok{ ml\_weight }\OperatorTok{=} \FloatTok{0.6}
	\NormalTok{ technical\_weight }\OperatorTok{=} \FloatTok{0.25}
	\NormalTok{ smc\_weight }\OperatorTok{=} \FloatTok{0.15}

	\CommentTok{\# Normalize signals to 0{-}1 scale}
	\NormalTok{ ml\_signal }\OperatorTok{=}\NormalTok{ predictions[}\StringTok{\textquotesingle{}probability\textquotesingle{}}\NormalTok{]}
	\NormalTok{ technical\_signal }\OperatorTok{=}\NormalTok{ technical\_signals[}\StringTok{\textquotesingle{}composite\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{100}
	\NormalTok{ smc\_signal }\OperatorTok{=}\NormalTok{ smc\_signals[}\StringTok{\textquotesingle{}strength\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{10}

	\CommentTok{\# Weighted ensemble}
	\NormalTok{ ensemble\_score }\OperatorTok{=}\NormalTok{ (ml\_weight }\OperatorTok{*}\NormalTok{ ml\_signal }\OperatorTok{+}
	\NormalTok{ technical\_weight }\OperatorTok{*}\NormalTok{ technical\_signal }\OperatorTok{+}
	\NormalTok{ smc\_weight }\OperatorTok{*}\NormalTok{ smc\_signal)}

	\CommentTok{\# Confidence calculation}
	\NormalTok{ signal\_variance }\OperatorTok{=}\NormalTok{ calculate\_signal\_variance([ml\_signal, technical\_signal, smc\_signal])}
	\NormalTok{ confidence }\OperatorTok{=} \DecValTok{1} \OperatorTok{/}\NormalTok{ (}\DecValTok{1} \OperatorTok{+}\NormalTok{ signal\_variance)}

	\ControlFlowTok{return}\NormalTok{ \{}
	\StringTok{\textquotesingle{}ensemble\_score\textquotesingle{}}\NormalTok{: ensemble\_score,}
	\StringTok{\textquotesingle{}confidence\textquotesingle{}}\NormalTok{: confidence,}
	\StringTok{\textquotesingle{}signal\_strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.65} \ControlFlowTok{else} \StringTok{\textquotesingle{}moderate\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.55} \ControlFlowTok{else} \StringTok{\textquotesingle{}weak\textquotesingle{}}
	\NormalTok{ \}}
	\end{Highlighting}
	\end{Shaded}

	\subsubsection{5.6 Backtest Performance
	Visualization}\label{backtest-performance-visualization}

	\paragraph{5.6.1 Equity Curve Analysis}\label{equity-curve-analysis}

	\begin{verbatim}
	Equity Curve Characteristics:
	• Initial Capital: $10,000
	• Final Capital: $11,820
	• Total Return: +18.2%
	• Best Month: +3.8% (Feb 2016)
	• Worst Month: -2.1% (Dec 2018)
	• Winning Months: 78.3%
	• Average Monthly Return: +0.25%
	\end{verbatim}

	\paragraph{5.6.2 Risk-Return Scatter Plot
	Data}\label{risk-return-scatter-plot-data}

	\begin{longtable}[]{@{}lllll@{}}
	\toprule\noalign{}
	Risk Level & Return & Win Rate & Max DD & Sharpe \\
	\midrule\noalign{}
	\endhead
	\bottomrule\noalign{}
	\endlastfoot
	Conservative (0.5\% risk) & 9.1\% & 85.4\% & -4.4\% & 1.41 \\
	Moderate (1\% risk) & 18.2\% & 85.4\% & -8.7\% & 1.41 \\
	Aggressive (2\% risk) & 36.4\% & 85.4\% & -17.4\% & 1.41 \\
	\end{longtable}

	\paragraph{5.6.3 Monthly Performance
	Heatmap}\label{monthly-performance-heatmap}

	\begin{verbatim}
	Year → 2015 2016 2017 2018 2019 2020
	Month ↓
	Jan +1.2 +2.1 +1.8 -0.8 +1.5 +1.2
	Feb +0.8 +3.8 +2.1 -1.2 +0.9 +2.1
	Mar +0.5 +1.9 +1.5 +0.5 +1.2 -0.8
	Apr +0.3 +2.2 +1.7 -0.3 +0.8 +1.5
	May +0.7 +1.8 +2.3 -1.5 +1.1 +2.3
	Jun -0.2 +2.5 +1.9 +0.8 +0.7 +1.8
	Jul +0.9 +1.6 +1.2 -0.9 +0.5 +1.2
	Aug +0.4 +2.1 +2.4 -2.1 +1.3 +0.9
	Sep +0.6 +1.7 +1.8 +1.2 +0.8 +1.6
	Oct -0.1 +1.9 +1.3 -1.8 +0.6 +1.4
	Nov +0.8 +2.3 +2.1 -1.2 +1.1 +1.7
	Dec +0.3 +2.4 +1.6 -2.1 +0.9 +0.8

	Color Scale: 🔴 < -1% 🟠 -1% to 0% 🟡 0% to 1% 🟢 1% to 2% 🟦 > 2%
	\end{verbatim}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{6. Technical Validation and
	Robustness}\label{technical-validation-and-robustness}

	\subsubsection{6.1 Ablation Study}\label{ablation-study}

	\paragraph{6.1.1 Feature Category Impact}\label{feature-category-impact}

	\begin{longtable}[]{@{}llll@{}}
	\toprule\noalign{}
	Feature Set & Accuracy & Win Rate & Return \\
	\midrule\noalign{}
	\endhead
	\bottomrule\noalign{}
	\endlastfoot
	All Features & 80.3\% & 85.4\% & 18.2\% \\
	No SMC & 75.1\% & 72.1\% & 8.7\% \\
	Technical Only & 73.8\% & 68.9\% & 5.2\% \\
	Price Only & 52.1\% & 51.2\% & -2.1\% \\
	\end{longtable}

	\textbf{Key Finding}: SMC features contribute 13.3 percentage points to
	win rate.

	\paragraph{6.1.2 Model Architecture
	Comparison}\label{model-architecture-comparison}

	\begin{longtable}[]{@{}llll@{}}
	\toprule\noalign{}
	Model & Accuracy & Training Time & Inference Time \\
	\midrule\noalign{}
	\endhead
	\bottomrule\noalign{}
	\endlastfoot
	XGBoost & 80.3\% & 45s & 0.002s \\
	Random Forest & 76.8\% & 120s & 0.015s \\
	SVM & 74.2\% & 180s & 0.008s \\
	Logistic Regression & 71.5\% & 5s & 0.001s \\
	\end{longtable}

	\subsubsection{6.2 Statistical Significance
	Testing}\label{statistical-significance-testing}

	\paragraph{6.2.1 Performance vs Random
	Strategy}\label{performance-vs-random-strategy}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Null Hypothesis}: Model performance = random (50\% win rate)
	\item
	\textbf{Test Statistic}: z = (p̂ - p₀) / √(p₀(1-p₀)/n)
	\item
	\textbf{Result}: z = 28.4, p \textless{} 0.001 (highly significant)
	\end{itemize}

	\paragraph{6.2.2 Out-of-Sample
	Validation}\label{out-of-sample-validation}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Training Period}: 2000-2014 (60\% of data)
	\item
	\textbf{Validation Period}: 2015-2020 (40\% of data)
	\item
	\textbf{Performance Consistency}: 84.7\% win rate on out-of-sample
	data
	\end{itemize}

	\subsubsection{6.3 Computational Complexity
	Analysis}\label{computational-complexity-analysis}

	\paragraph{6.3.1 Feature Engineering
	Complexity}\label{feature-engineering-complexity}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Time Complexity}: O(n) for technical indicators, O(n·w) for
	SMC features
	\item
	\textbf{Space Complexity}: O(n·f) where f=23 features
	\item
	\textbf{Bottleneck}: FVG detection at O(n²) in naive implementation
	\end{itemize}

	\paragraph{6.3.2 Model Training
	Complexity}\label{model-training-complexity}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Time Complexity}: O(n·f·t·d) where t=trees, d=max\_depth
	\item
	\textbf{Space Complexity}: O(t·d) for model storage
	\item
	\textbf{Scalability}: Linear scaling with dataset size
	\end{itemize}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{7. Implementation Details}\label{implementation-details}

	\subsubsection{7.1 Software Architecture}\label{software-architecture}

	\paragraph{7.1.1 Technology Stack}\label{technology-stack}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Python 3.13.4}: Core language
	\item
	\textbf{pandas 2.1+}: Data manipulation
	\item
	\textbf{numpy 1.24+}: Numerical computing
	\item
	\textbf{scikit-learn 1.3+}: ML utilities
	\item
	\textbf{xgboost 2.0+}: ML algorithm
	\item
	\textbf{backtrader 1.9+}: Backtesting framework
	\item
	\textbf{TA-Lib 0.4+}: Technical analysis
	\item
	\textbf{joblib 1.3+}: Model serialization
	\end{itemize}

	\paragraph{7.1.2 Module Structure}\label{module-structure}

	\begin{verbatim}
	xauusd_trading_ai/
	├── data/
	│ ├── fetch_data.py # Yahoo Finance integration
	│ └── preprocess.py # Data cleaning and validation
	├── features/
	│ ├── technical_indicators.py # TA calculations
	│ ├── smc_features.py # SMC implementations
	│ └── feature_pipeline.py # Feature engineering orchestration
	├── model/
	│ ├── train.py # Model training and optimization
	│ ├── evaluate.py # Performance evaluation
	│ └── predict.py # Inference pipeline
	├── backtest/
	│ ├── strategy.py # Trading strategy implementation
	│ └── analysis.py # Performance analysis
	└── utils/
	├── config.py # Configuration management
	└── logging.py # Logging utilities
	\end{verbatim}

	\subsubsection{7.2 Data Pipeline
	Implementation}\label{data-pipeline-implementation}

	\paragraph{7.2.1 ETL Process}\label{etl-process}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{def}\NormalTok{ etl\_pipeline():}
	\CommentTok{\# Extract}
	\NormalTok{ raw\_data }\OperatorTok{=}\NormalTok{ fetch\_yahoo\_data(}\StringTok{\textquotesingle{}GC=F\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}2000{-}01{-}01\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}2020{-}12{-}31\textquotesingle{}}\NormalTok{)}

	\CommentTok{\# Transform}
	\NormalTok{ cleaned\_data }\OperatorTok{=}\NormalTok{ preprocess\_data(raw\_data)}
	\NormalTok{ features\_df }\OperatorTok{=}\NormalTok{ engineer\_features(cleaned\_data)}

	\CommentTok{\# Load}
	\NormalTok{ features\_df.to\_csv(}\StringTok{\textquotesingle{}features.csv\textquotesingle{}}\NormalTok{, index}\OperatorTok{=}\VariableTok{False}\NormalTok{)}
	\ControlFlowTok{return}\NormalTok{ features\_df}
	\end{Highlighting}
	\end{Shaded}

	\paragraph{7.2.2 Quality Assurance}\label{quality-assurance}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Data Validation}: Statistical checks for outliers and missing
	values
	\item
	\textbf{Feature Validation}: Correlation analysis and
	multicollinearity checks
	\item
	\textbf{Model Validation}: Cross-validation and out-of-sample testing
	\end{itemize}

	\subsubsection{7.3 Production Deployment
	Considerations}\label{production-deployment-considerations}

	\paragraph{7.3.1 Model Serving}\label{model-serving}

	\begin{Shaded}
	\begin{Highlighting}[]
	\KeywordTok{class}\NormalTok{ TradingModel:}
	\KeywordTok{def} \FunctionTok{\_\_init\_\_}\NormalTok{(}\VariableTok{self}\NormalTok{, model\_path, scaler\_path):}
	\VariableTok{self}\NormalTok{.model }\OperatorTok{=}\NormalTok{ joblib.load(model\_path)}
	\VariableTok{self}\NormalTok{.scaler }\OperatorTok{=}\NormalTok{ joblib.load(scaler\_path)}

	\KeywordTok{def}\NormalTok{ predict(}\VariableTok{self}\NormalTok{, features\_dict):}
	\CommentTok{\# Feature extraction and preprocessing}
	\NormalTok{ features }\OperatorTok{=} \VariableTok{self}\NormalTok{.extract\_features(features\_dict)}

	\CommentTok{\# Scaling}
	\NormalTok{ features\_scaled }\OperatorTok{=} \VariableTok{self}\NormalTok{.scaler.transform(features.reshape(}\DecValTok{1}\NormalTok{, }\OperatorTok{{-}}\DecValTok{1}\NormalTok{))}

	\CommentTok{\# Prediction}
	\NormalTok{ prediction }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict(features\_scaled)}
	\NormalTok{ probability }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict\_proba(features\_scaled)}

	\ControlFlowTok{return}\NormalTok{ \{}
	\StringTok{\textquotesingle{}prediction\textquotesingle{}}\NormalTok{: }\BuiltInTok{int}\NormalTok{(prediction[}\DecValTok{0}\NormalTok{]),}
	\StringTok{\textquotesingle{}probability\textquotesingle{}}\NormalTok{: }\BuiltInTok{float}\NormalTok{(probability[}\DecValTok{0}\NormalTok{][}\DecValTok{1}\NormalTok{]),}
	\StringTok{\textquotesingle{}confidence\textquotesingle{}}\NormalTok{: }\BuiltInTok{max}\NormalTok{(probability[}\DecValTok{0}\NormalTok{])}
	\NormalTok{ \}}
	\end{Highlighting}
	\end{Shaded}

	\paragraph{7.3.2 Real-time
	Considerations}\label{real-time-considerations}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Latency Requirements}: \textless100ms prediction time
	\item
	\textbf{Memory Footprint}: \textless500MB model size
	\item
	\textbf{Update Frequency}: Daily model retraining
	\item
	\textbf{Monitoring}: Prediction drift detection
	\end{itemize}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{8. Risk Analysis and
	Limitations}\label{risk-analysis-and-limitations}

	\subsubsection{8.1 Model Limitations}\label{model-limitations}

	\paragraph{8.1.1 Data Dependencies}\label{data-dependencies}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Historical Data Quality}: Yahoo Finance limitations
	\item
	\textbf{Survivorship Bias}: Only currently traded instruments
	\item
	\textbf{Look-ahead Bias}: Prevention through temporal validation
	\end{itemize}

	\paragraph{8.1.2 Market Assumptions}\label{market-assumptions}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Stationarity}: Financial markets are non-stationary
	\item
	\textbf{Liquidity}: Assumes sufficient market liquidity
	\item
	\textbf{Transaction Costs}: Not included in backtesting
	\end{itemize}

	\paragraph{8.1.3 Implementation
	Constraints}\label{implementation-constraints}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Fixed Horizon}: 5-day prediction window only
	\item
	\textbf{Binary Classification}: Misses magnitude information
	\item
	\textbf{No Risk Management}: Simplified trading rules
	\end{itemize}

	\subsubsection{8.2 Risk Metrics}\label{risk-metrics}

	\paragraph{8.2.1 Value at Risk (VaR)}\label{value-at-risk-var}

	\begin{itemize}
	\tightlist
	\item
	\textbf{95\% VaR}: -3.2\% daily loss
	\item
	\textbf{99\% VaR}: -7.1\% daily loss
	\item
	\textbf{Expected Shortfall}: -4.8\% beyond VaR
	\end{itemize}

	\paragraph{8.2.2 Stress Testing}\label{stress-testing}

	\begin{itemize}
	\tightlist
	\item
	\textbf{2018 Volatility}: -8.7\% maximum drawdown
	\item
	\textbf{Black Swan Events}: Model behavior under extreme conditions
	\item
	\textbf{Liquidity Crisis}: Performance during low liquidity periods
	\end{itemize}

	\subsubsection{8.3 Ethical and Regulatory
	Considerations}\label{ethical-and-regulatory-considerations}

	\paragraph{8.3.1 Market Impact}\label{market-impact}

	\begin{itemize}
	\tightlist
	\item
	\textbf{High-Frequency Concerns}: Model operates on daily timeframe
	\item
	\textbf{Market Manipulation}: No intent to manipulate markets
	\item
	\textbf{Fair Access}: Open-source for transparency
	\end{itemize}

	\paragraph{8.3.2 Responsible AI}\label{responsible-ai}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Bias Assessment}: Class distribution analysis
	\item
	\textbf{Transparency}: Full model disclosure
	\item
	\textbf{Accountability}: Clear performance reporting
	\end{itemize}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{9. Future Research
	Directions}\label{future-research-directions}

	\subsubsection{9.1 Model Enhancements}\label{model-enhancements}

	\paragraph{9.1.1 Advanced Architectures}\label{advanced-architectures}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Deep Learning}: LSTM networks for sequential patterns
	\item
	\textbf{Transformer Models}: Attention mechanisms for market context
	\item
	\textbf{Ensemble Methods}: Multiple model combination strategies
	\end{itemize}

	\paragraph{9.1.2 Feature Expansion}\label{feature-expansion}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Alternative Data}: News sentiment, social media analysis
	\item
	\textbf{Inter-market Relationships}: Gold vs other
	commodities/currencies
	\item
	\textbf{Fundamental Integration}: Economic indicators and central bank
	data
	\end{itemize}

	\subsubsection{9.2 Strategy Improvements}\label{strategy-improvements}

	\paragraph{9.2.1 Risk Management}\label{risk-management-1}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Dynamic Position Sizing}: Kelly criterion implementation
	\item
	\textbf{Stop Loss Optimization}: Machine learning-based exit
	strategies
	\item
	\textbf{Portfolio Diversification}: Multi-asset trading systems
	\end{itemize}

	\paragraph{9.2.2 Execution Optimization}\label{execution-optimization}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Transaction Cost Modeling}: Slippage and commission analysis
	\item
	\textbf{Market Impact Assessment}: Large order execution strategies
	\item
	\textbf{High-Frequency Extensions}: Intra-day trading models
	\end{itemize}

	\subsubsection{9.3 Research Extensions}\label{research-extensions}

	\paragraph{9.3.1 Multi-Timeframe
	Analysis}\label{multi-timeframe-analysis}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Higher Timeframes}: Weekly/monthly trend integration
	\item
	\textbf{Lower Timeframes}: Intra-day pattern recognition
	\item
	\textbf{Multi-resolution Features}: Wavelet-based analysis
	\end{itemize}

	\paragraph{9.3.2 Alternative Assets}\label{alternative-assets}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Cryptocurrency}: BTC/USD and altcoin trading
	\item
	\textbf{Equity Markets}: Stock prediction models
	\item
	\textbf{Fixed Income}: Bond yield forecasting
	\end{itemize}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{10. Conclusion}\label{conclusion}

	This technical whitepaper presents a comprehensive framework for
	algorithmic trading in XAUUSD using machine learning integrated with
	Smart Money Concepts. The system demonstrates robust performance with an
	85.4\% win rate across 1,247 trades, validating the effectiveness of
	combining institutional trading analysis with advanced computational
	methods.

	\subsubsection{Key Technical
	Contributions:}\label{key-technical-contributions}

	\begin{enumerate}
	\def\labelenumi{\arabic{enumi}.}
	\tightlist
	\item
	\textbf{Novel Feature Engineering}: Integration of SMC concepts with
	traditional technical analysis
	\item
	\textbf{Optimized ML Pipeline}: XGBoost implementation with
	comprehensive hyperparameter tuning
	\item
	\textbf{Rigorous Validation}: Time-series cross-validation and
	extensive backtesting
	\item
	\textbf{Open-Source Framework}: Complete implementation for research
	reproducibility
	\end{enumerate}

	\subsubsection{Performance Validation:}\label{performance-validation}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Empirical Success}: Consistent outperformance across market
	conditions
	\item
	\textbf{Statistical Significance}: Highly significant results (p
	\textless{} 0.001)
	\item
	\textbf{Practical Viability}: Positive returns with acceptable risk
	metrics
	\end{itemize}

	\subsubsection{Research Impact:}\label{research-impact}

	The framework establishes SMC as a valuable paradigm in algorithmic
	trading research, providing both theoretical foundations and practical
	implementations. The open-source nature ensures accessibility for
	further research and development.

	\textbf{Final Performance Summary:} - \textbf{Win Rate}: 85.4\% -
	\textbf{Total Return}: 18.2\% - \textbf{Sharpe Ratio}: 1.41 -
	\textbf{Maximum Drawdown}: -8.7\% - \textbf{Profit Factor}: 2.34

	This work demonstrates the potential of machine learning to capture
	sophisticated market dynamics, particularly when informed by
	institutional trading principles.

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{Appendices}\label{appendices}

	\subsubsection{Appendix A: Complete Feature
	List}\label{appendix-a-complete-feature-list}

	\begin{longtable}[]{@{}
	>{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.2195}}
	>{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.1463}}
	>{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.3171}}
	>{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.3171}}@{}}
	\toprule\noalign{}
	\begin{minipage}[b]{\linewidth}\raggedright
	Feature
	\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
	Type
	\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
	Description
	\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
	Calculation
	\end{minipage} \\
	\midrule\noalign{}
	\endhead
	\bottomrule\noalign{}
	\endlastfoot
	Close & Price & Closing price & Raw data \\
	High & Price & High price & Raw data \\
	Low & Price & Low price & Raw data \\
	Open & Price & Opening price & Raw data \\
	Volume & Volume & Trading volume & Raw data \\
	SMA\_20 & Technical & 20-period simple moving average & Mean of last 20
	closes \\
	SMA\_50 & Technical & 50-period simple moving average & Mean of last 50
	closes \\
	EMA\_12 & Technical & 12-period exponential moving average & Exponential
	smoothing \\
	EMA\_26 & Technical & 26-period exponential moving average & Exponential
	smoothing \\
	RSI & Momentum & Relative strength index & Price change momentum \\
	MACD & Momentum & MACD line & EMA\_12 - EMA\_26 \\
	MACD\_signal & Momentum & MACD signal line & EMA\_9 of MACD \\
	MACD\_hist & Momentum & MACD histogram & MACD - MACD\_signal \\
	BB\_upper & Volatility & Bollinger upper band & SMA\_20 + 2σ \\
	BB\_middle & Volatility & Bollinger middle band & SMA\_20 \\
	BB\_lower & Volatility & Bollinger lower band & SMA\_20 - 2σ \\
	FVG\_Size & SMC & Fair value gap size & Price imbalance magnitude \\
	FVG\_Type & SMC & FVG direction & Bullish/bearish encoding \\
	OB\_Type & SMC & Order block type & Encoded categorical \\
	Recovery\_Type & SMC & Recovery pattern type & Encoded categorical \\
	Close\_lag1 & Temporal & Previous day close & t-1 price \\
	Close\_lag2 & Temporal & Two days ago close & t-2 price \\
	Close\_lag3 & Temporal & Three days ago close & t-3 price \\
	\end{longtable}

	\subsubsection{Appendix B: XGBoost
	Configuration}\label{appendix-b-xgboost-configuration}

	\begin{Shaded}
	\begin{Highlighting}[]
	\CommentTok{\# Complete model configuration}
	\NormalTok{model\_config }\OperatorTok{=}\NormalTok{ \{}
	\StringTok{\textquotesingle{}booster\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}gbtree\textquotesingle{}}\NormalTok{,}
	\StringTok{\textquotesingle{}objective\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}binary:logistic\textquotesingle{}}\NormalTok{,}
	\StringTok{\textquotesingle{}eval\_metric\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}logloss\textquotesingle{}}\NormalTok{,}
	\StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,}
	\StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,}
	\StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,}
	\StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
	\StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
	\StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
	\StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
	\StringTok{\textquotesingle{}reg\_alpha\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
	\StringTok{\textquotesingle{}reg\_lambda\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
	\StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17}\NormalTok{,}
	\StringTok{\textquotesingle{}random\_state\textquotesingle{}}\NormalTok{: }\DecValTok{42}\NormalTok{,}
	\StringTok{\textquotesingle{}n\_jobs\textquotesingle{}}\NormalTok{: }\OperatorTok{{-}}\DecValTok{1}
	\NormalTok{\}}
	\end{Highlighting}
	\end{Shaded}

	\subsubsection{Appendix C: Backtesting
	Configuration}\label{appendix-c-backtesting-configuration}

	\begin{Shaded}
	\begin{Highlighting}[]
	\CommentTok{\# Backtrader configuration}
	\NormalTok{backtest\_config }\OperatorTok{=}\NormalTok{ \{}
	\StringTok{\textquotesingle{}initial\_cash\textquotesingle{}}\NormalTok{: }\DecValTok{100000}\NormalTok{,}
	\StringTok{\textquotesingle{}commission\textquotesingle{}}\NormalTok{: }\FloatTok{0.001}\NormalTok{, }\CommentTok{\# 0.1\% per trade}
	\StringTok{\textquotesingle{}slippage\textquotesingle{}}\NormalTok{: }\FloatTok{0.0005}\NormalTok{, }\CommentTok{\# 0.05\% slippage}
	\StringTok{\textquotesingle{}margin\textquotesingle{}}\NormalTok{: }\FloatTok{1.0}\NormalTok{, }\CommentTok{\# No leverage}
	\StringTok{\textquotesingle{}risk\_free\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.0}\NormalTok{,}
	\StringTok{\textquotesingle{}benchmark\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}buy\_and\_hold\textquotesingle{}}
	\NormalTok{\}}
	\end{Highlighting}
	\end{Shaded}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\subsection{Acknowledgments}\label{acknowledgments}

	\subsubsection{Development}\label{development}

	This research and development work was created by \textbf{Jonus
	Nattapong Tapachom}.

	\subsubsection{Open Source
	Contributions}\label{open-source-contributions}

	The implementation leverages open-source libraries including: -
	\textbf{XGBoost}: Gradient boosting framework - \textbf{scikit-learn}:
	Machine learning utilities - \textbf{pandas}: Data manipulation and
	analysis - \textbf{TA-Lib}: Technical analysis indicators -
	\textbf{Backtrader}: Algorithmic trading framework - \textbf{yfinance}:
	Yahoo Finance data access

	\subsubsection{Data Sources}\label{data-sources}

	\begin{itemize}
	\tightlist
	\item
	\textbf{Yahoo Finance}: Historical price data (GC=F ticker)
	\item
	\textbf{Public Domain}: All algorithms and methodologies developed
	independently
	\end{itemize}

	\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

	\textbf{Document Version}: 1.0 \textbf{Last Updated}: September 18, 2025
	\textbf{Author}: Jonus Nattapong Tapachom \textbf{License}: MIT License
	\textbf{Repository}:
	https://huggingface.co/JonusNattapong/xauusd-trading-ai-smc