\section{XAUUSD Trading AI: Technical Whitepaper}\label{xauusd-trading-ai-technical-whitepaper}

\subsection{Machine Learning Framework with Smart Money Concepts Integration}\label{machine-learning-framework-with-smart-money-concepts-integration}

\textbf{Version 1.0} \textbar{} \textbf{Date: September 18, 2025} \textbar{} \textbf{Author: Jonus Nattapong Tapachom}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{Executive Summary}\label{executive-summary}

This technical whitepaper presents an algorithmic trading framework for predicting the price direction of XAUUSD (gold against the US dollar, proxied by GC=F gold futures data). It integrates Smart Money Concepts (SMC) with gradient-boosted machine learning. In backtesting over 2015-2020, the system records an 85.4\% win rate across 1,247 trades, with a Sharpe ratio of 1.41 and a total return of 18.2\%.

\textbf{Key Technical Achievements:}

\begin{itemize}
\tightlist
\item \textbf{23-Feature Engineering Pipeline}: Combines traditional technical indicators with SMC-derived features
\item \textbf{XGBoost Optimization}: Hyperparameter-tuned gradient boosting with class balancing
\item \textbf{Time-Series Cross-Validation}: Guards against data leakage in temporal predictions
\item \textbf{Multi-Regime Robustness}: Performance reported separately for bull, bear, and sideways markets
\end{itemize}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{1. 
System Architecture}\label{system-architecture} \subsubsection{1.1 Core Components}\label{core-components} \begin{verbatim} ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │ Data Pipeline │───▶│ Feature Engineer │───▶│ ML Model │ │ │ │ │ │ │ │ • Yahoo Finance │ │ • Technical │ │ • XGBoost │ │ • Preprocessing │ │ • SMC Features │ │ • Prediction │ │ • Quality Check │ │ • Normalization │ │ • Probability │ └─────────────────┘ └──────────────────┘ └─────────────────┘ │ ┌─────────────────┐ ┌──────────────────┐ ▼ │ Backtesting │◀───│ Strategy Engine │ ┌─────────────────┐ │ Framework │ │ │ │ Signal │ │ │ │ • Position │ │ Generation │ │ • Performance │ │ • Risk Mgmt │ │ │ │ • Metrics │ │ • Execution │ └─────────────────┘ └─────────────────┘ └──────────────────┘ \end{verbatim} \subsubsection{1.2 Data Flow Architecture}\label{data-flow-architecture} \begin{Shaded} \begin{Highlighting}[] \NormalTok{graph TD} \NormalTok{ A[Yahoo Finance API] {-}{-}\textgreater{} B[Raw Price Data]} \NormalTok{ B {-}{-}\textgreater{} C[Data Validation]} \NormalTok{ C {-}{-}\textgreater{} D[Technical Indicators]} \NormalTok{ D {-}{-}\textgreater{} E[SMC Feature Extraction]} \NormalTok{ E {-}{-}\textgreater{} F[Feature Normalization]} \NormalTok{ F {-}{-}\textgreater{} G[Train/Validation Split]} \NormalTok{ G {-}{-}\textgreater{} H[XGBoost Training]} \NormalTok{ H {-}{-}\textgreater{} I[Model Validation]} \NormalTok{ I {-}{-}\textgreater{} J[Backtesting Engine]} \NormalTok{ J {-}{-}\textgreater{} K[Performance Analysis]} \end{Highlighting} \end{Shaded} \subsubsection{1.3 Dataset Flow Diagram}\label{dataset-flow-diagram} \begin{Shaded} \begin{Highlighting}[] \NormalTok{graph TD} \NormalTok{ A[Yahoo Finance\textless{}br/\textgreater{}GC=F Data\textless{}br/\textgreater{}2000{-}2020] {-}{-}\textgreater{} B[Data Cleaning\textless{}br/\textgreater{}• Remove NaN\textless{}br/\textgreater{}• Outlier Detection\textless{}br/\textgreater{}• Format Validation]} \NormalTok{ B {-}{-}\textgreater{} 
C[Feature Engineering Pipeline\textless{}br/\textgreater{}23 Features]} \NormalTok{ C {-}{-}\textgreater{} D\{Feature Categories\}} \NormalTok{ D {-}{-}\textgreater{} E[Price Data\textless{}br/\textgreater{}Open, High, Low, Close, Volume]} \NormalTok{ D {-}{-}\textgreater{} F[Technical Indicators\textless{}br/\textgreater{}SMA, EMA, RSI, MACD, Bollinger]} \NormalTok{ D {-}{-}\textgreater{} G[SMC Features\textless{}br/\textgreater{}FVG, Order Blocks, Recovery]} \NormalTok{ D {-}{-}\textgreater{} H[Temporal Features\textless{}br/\textgreater{}Close Lag 1,2,3]} \NormalTok{ E {-}{-}\textgreater{} I[Standardization\textless{}br/\textgreater{}Z{-}Score Normalization]} \NormalTok{ F {-}{-}\textgreater{} I} \NormalTok{ G {-}{-}\textgreater{} I} \NormalTok{ H {-}{-}\textgreater{} I} \NormalTok{ I {-}{-}\textgreater{} J[Target Creation\textless{}br/\textgreater{}5{-}Day Ahead Binary\textless{}br/\textgreater{}Price Direction]} \NormalTok{ J {-}{-}\textgreater{} K[Class Balancing\textless{}br/\textgreater{}scale\_pos\_weight = 1.17]} \NormalTok{ K {-}{-}\textgreater{} L[Train/Test Split\textless{}br/\textgreater{}80/20 Temporal Split]} \NormalTok{ L {-}{-}\textgreater{} M[XGBoost Training\textless{}br/\textgreater{}Hyperparameter Optimization]} \NormalTok{ M {-}{-}\textgreater{} N[Model Validation\textless{}br/\textgreater{}Cross{-}Validation\textless{}br/\textgreater{}Out{-}of{-}Sample Test]} \NormalTok{ N {-}{-}\textgreater{} O[Backtesting\textless{}br/\textgreater{}2015{-}2020\textless{}br/\textgreater{}1,247 Trades]} \NormalTok{ O {-}{-}\textgreater{} P[Performance Analysis\textless{}br/\textgreater{}Win Rate, Returns,\textless{}br/\textgreater{}Risk Metrics]} \end{Highlighting} \end{Shaded} \subsubsection{1.4 Model Architecture Diagram}\label{model-architecture-diagram} \begin{Shaded} \begin{Highlighting}[] \NormalTok{graph TD} \NormalTok{ A[Input Layer\textless{}br/\textgreater{}23 Features] {-}{-}\textgreater{} B[Feature Processing]} \NormalTok{ B {-}{-}\textgreater{} 
C\{XGBoost Ensemble\textless{}br/\textgreater{}200 Trees\}} \NormalTok{ C {-}{-}\textgreater{} D[Tree 1\textless{}br/\textgreater{}max\_depth=7]} \NormalTok{ C {-}{-}\textgreater{} E[Tree 2\textless{}br/\textgreater{}max\_depth=7]} \NormalTok{ C {-}{-}\textgreater{} F[Tree n\textless{}br/\textgreater{}max\_depth=7]} \NormalTok{ D {-}{-}\textgreater{} G[Weighted Sum\textless{}br/\textgreater{}learning\_rate=0.2]} \NormalTok{ E {-}{-}\textgreater{} G} \NormalTok{ F {-}{-}\textgreater{} G} \NormalTok{ G {-}{-}\textgreater{} H[Logistic Function\textless{}br/\textgreater{}σ(x) = 1/(1+e\^{}({-}x))]} \NormalTok{ H {-}{-}\textgreater{} I[Probability Output\textless{}br/\textgreater{}P(y=1|x)]} \NormalTok{ I {-}{-}\textgreater{} J\{Binary Classification\textless{}br/\textgreater{}Threshold = 0.5\}} \NormalTok{ J {-}{-}\textgreater{} K[SELL Signal\textless{}br/\textgreater{}P(y=1) \textless{} 0.5]} \NormalTok{ J {-}{-}\textgreater{} L[BUY Signal\textless{}br/\textgreater{}P(y=1) ≥ 0.5]} \NormalTok{ L {-}{-}\textgreater{} M[Trading Decision\textless{}br/\textgreater{}Long Position]} \NormalTok{ K {-}{-}\textgreater{} N[Trading Decision\textless{}br/\textgreater{}Short Position]} \end{Highlighting} \end{Shaded} \subsubsection{1.5 Buy/Sell Workflow Diagram}\label{buysell-workflow-diagram} \begin{Shaded} \begin{Highlighting}[] \NormalTok{graph TD} \NormalTok{ A[Market Data\textless{}br/\textgreater{}Real{-}time XAUUSD] {-}{-}\textgreater{} B[Feature Extraction\textless{}br/\textgreater{}23 Features Calculated]} \NormalTok{ B {-}{-}\textgreater{} C[Model Prediction\textless{}br/\textgreater{}XGBoost Inference]} \NormalTok{ C {-}{-}\textgreater{} D\{Probability Score\textless{}br/\textgreater{}P(Price ↑ in 5 days)\}} \NormalTok{ D {-}{-}\textgreater{} E[P ≥ 0.5\textless{}br/\textgreater{}BUY Signal]} \NormalTok{ D {-}{-}\textgreater{} F[P \textless{} 0.5\textless{}br/\textgreater{}SELL Signal]} \NormalTok{ E {-}{-}\textgreater{} G\{Current 
Position\textless{}br/\textgreater{}Check\}} \NormalTok{ G {-}{-}\textgreater{} H[No Position\textless{}br/\textgreater{}Open LONG]} \NormalTok{ G {-}{-}\textgreater{} I[Short Position\textless{}br/\textgreater{}Close SHORT\textless{}br/\textgreater{}Open LONG]} \NormalTok{ H {-}{-}\textgreater{} J[Position Management\textless{}br/\textgreater{}Hold until signal reversal]} \NormalTok{ I {-}{-}\textgreater{} J} \NormalTok{ F {-}{-}\textgreater{} K\{Current Position\textless{}br/\textgreater{}Check\}} \NormalTok{ K {-}{-}\textgreater{} L[No Position\textless{}br/\textgreater{}Open SHORT]} \NormalTok{ K {-}{-}\textgreater{} M[Long Position\textless{}br/\textgreater{}Close LONG\textless{}br/\textgreater{}Open SHORT]} \NormalTok{ L {-}{-}\textgreater{} N[Position Management\textless{}br/\textgreater{}Hold until signal reversal]} \NormalTok{ M {-}{-}\textgreater{} N} \NormalTok{ J {-}{-}\textgreater{} O[Risk Management\textless{}br/\textgreater{}No Stop Loss\textless{}br/\textgreater{}No Take Profit]} \NormalTok{ N {-}{-}\textgreater{} O} \NormalTok{ O {-}{-}\textgreater{} P[Daily Rebalancing\textless{}br/\textgreater{}End of Day\textless{}br/\textgreater{}Position Review]} \NormalTok{ P {-}{-}\textgreater{} Q\{New Signal\textless{}br/\textgreater{}Generated?\}} \NormalTok{ Q {-}{-}\textgreater{} R[Yes\textless{}br/\textgreater{}Execute Trade]} \NormalTok{ Q {-}{-}\textgreater{} S[No\textless{}br/\textgreater{}Hold Position]} \NormalTok{ R {-}{-}\textgreater{} T[Transaction Logging\textless{}br/\textgreater{}Entry Price\textless{}br/\textgreater{}Position Size\textless{}br/\textgreater{}Timestamp]} \NormalTok{ S {-}{-}\textgreater{} U[Monitor Market\textless{}br/\textgreater{}Next Day]} \NormalTok{ T {-}{-}\textgreater{} V[Performance Tracking\textless{}br/\textgreater{}P\&L Calculation\textless{}br/\textgreater{}Win/Loss Recording]} \NormalTok{ U {-}{-}\textgreater{} A} \NormalTok{ V {-}{-}\textgreater{} W[End of Month\textless{}br/\textgreater{}Performance Report]} 
\NormalTok{ W {-}{-}\textgreater{} X[Strategy Optimization\textless{}br/\textgreater{}Model Retraining\textless{}br/\textgreater{}Parameter Tuning]} \end{Highlighting} \end{Shaded}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{2. Mathematical Framework}\label{mathematical-framework}

\subsubsection{2.1 Problem Formulation}\label{problem-formulation}

\textbf{Objective}: Predict the binary price direction of XAUUSD at time t+5 given information available up to time t.

\textbf{Mathematical Representation:}

\begin{verbatim}
y_{t+5} = f(X_t) ∈ {0, 1}
\end{verbatim}

Where:

\begin{itemize}
\tightlist
\item \texttt{y\_\{t+5\}\ =\ 1} if Close\_\{t+5\} \textgreater{} Close\_t (price increase)
\item \texttt{y\_\{t+5\}\ =\ 0} if Close\_\{t+5\} ≤ Close\_t (price decrease or no change)
\item \texttt{X\_t} is the feature vector at time t
\end{itemize}

\subsubsection{2.2 Feature Space Definition}\label{feature-space-definition}

\textbf{Feature Vector Dimension}: 23 features

\textbf{Feature Categories:}

\begin{enumerate}
\tightlist
\item \textbf{Price Features} (5): Open, High, Low, Close, Volume
\item \textbf{Technical Indicators} (11): SMA, EMA, RSI, MACD components, Bollinger Bands
\item \textbf{SMC Features} (3): FVG Size, Order Block Type, Recovery Pattern Type
\item \textbf{Temporal Features} (3): Close price lags (1, 2, 3 days)
\item \textbf{Derived Features} (1): Volume-weighted price changes
\end{enumerate}

\subsubsection{2.3 XGBoost Mathematical Foundation}\label{xgboost-mathematical-foundation}

\textbf{Objective Function:}

\begin{verbatim}
Obj(θ) = ∑_{i=1}^n l(y_i, ŷ_i) + ∑_{k=1}^K Ω(f_k)
\end{verbatim}

Where:

\begin{itemize}
\tightlist
\item \texttt{l(y\_i,\ ŷ\_i)} is the loss function (log loss for binary classification)
\item \texttt{Ω(f\_k)} is the regularization term for tree k
\item \texttt{K} is the number of trees
\end{itemize}

\textbf{Gradient Boosting Update:}

\begin{verbatim}
ŷ_i^{(t)} = ŷ_i^{(t-1)} + η · f_t(x_i)
\end{verbatim}

Where:

\begin{itemize}
\tightlist
\item \texttt{η} is the learning rate (0.2)
\item \texttt{f\_t} is the t-th tree
\item \texttt{ŷ\_i\^{}\{(t)\}} is the prediction after t boosting iterations
\end{itemize}

\subsubsection{2.4 Class Balancing Formulation}\label{class-balancing-formulation}

\textbf{Scale Positive Weight Calculation:}

\begin{verbatim}
scale_pos_weight = (negative_samples) / (positive_samples) = 0.54/0.46 ≈ 1.17
\end{verbatim}

\textbf{Modified Objective:}

\begin{verbatim}
Obj(θ) = ∑_{i=1}^n w_i · l(y_i, ŷ_i) + ∑_{k=1}^K Ω(f_k)
\end{verbatim}

Where \texttt{w\_i\ =\ scale\_pos\_weight} for positive-class samples and \texttt{w\_i\ =\ 1} for negative-class samples.

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{3. 
Feature Engineering Pipeline}\label{feature-engineering-pipeline} \subsubsection{3.1 Technical Indicators Implementation}\label{technical-indicators-implementation} \paragraph{3.1.1 Simple Moving Average (SMA)}\label{simple-moving-average-sma} \begin{verbatim} SMA_n(t) = (1/n) · ∑_{i=0}^{n-1} Close_{t-i} \end{verbatim} \begin{itemize} \tightlist \item \textbf{Parameters}: n = 20, 50 periods \item \textbf{Purpose}: Trend identification \end{itemize} \paragraph{3.1.2 Exponential Moving Average (EMA)}\label{exponential-moving-average-ema} \begin{verbatim} EMA_n(t) = α · Close_t + (1-α) · EMA_n(t-1) \end{verbatim} Where \texttt{α\ =\ 2/(n+1)} and n = 12, 26 periods \paragraph{3.1.3 Relative Strength Index (RSI)}\label{relative-strength-index-rsi} \begin{verbatim} RSI(t) = 100 - [100 / (1 + RS(t))] \end{verbatim} Where: \begin{verbatim} RS(t) = Average Gain / Average Loss (14-period) \end{verbatim} \paragraph{3.1.4 MACD Oscillator}\label{macd-oscillator} \begin{verbatim} MACD(t) = EMA_12(t) - EMA_26(t) Signal(t) = EMA_9(MACD) Histogram(t) = MACD(t) - Signal(t) \end{verbatim} \paragraph{3.1.5 Bollinger Bands}\label{bollinger-bands} \begin{verbatim} Middle(t) = SMA_20(t) Upper(t) = Middle(t) + 2 · σ_t Lower(t) = Middle(t) - 2 · σ_t \end{verbatim} Where \texttt{σ\_t} is the 20-period standard deviation. 
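The indicator definitions in Section 3.1 can be sketched with pandas as follows. This is a minimal illustration under stated assumptions, not the framework's actual implementation: the function name \texttt{compute\_indicators} and the output column names are hypothetical, the RSI uses simple rolling averages of gains and losses (Wilder's smoothing is a common alternative), and the Bollinger Bands use the sample standard deviation, since the text does not specify either choice.

```python
import numpy as np
import pandas as pd


def compute_indicators(close: pd.Series) -> pd.DataFrame:
    """Compute the Section 3.1 indicators from a series of closing prices.

    Hypothetical helper for illustration; column names are assumptions.
    """
    out = pd.DataFrame(index=close.index)

    # Simple moving averages, n = 20 and n = 50
    out["SMA_20"] = close.rolling(20).mean()
    out["SMA_50"] = close.rolling(50).mean()

    # Exponential moving averages with alpha = 2 / (n + 1).
    # adjust=False gives the recursive form
    # EMA(t) = alpha * Close(t) + (1 - alpha) * EMA(t-1).
    out["EMA_12"] = close.ewm(span=12, adjust=False).mean()
    out["EMA_26"] = close.ewm(span=26, adjust=False).mean()

    # RSI over 14 periods (assumption: simple averages of gains and losses)
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(14).mean()
    avg_loss = delta.clip(upper=0).abs().rolling(14).mean()
    rs = avg_gain / avg_loss  # inf when there were no losses -> RSI of 100
    out["RSI"] = 100 - 100 / (1 + rs)

    # MACD line, signal line, and histogram
    out["MACD"] = out["EMA_12"] - out["EMA_26"]
    out["MACD_Signal"] = out["MACD"].ewm(span=9, adjust=False).mean()
    out["MACD_Hist"] = out["MACD"] - out["MACD_Signal"]

    # Bollinger Bands: 20-period SMA plus/minus two standard deviations
    # (assumption: pandas' default sample standard deviation, ddof=1)
    sigma = close.rolling(20).std()
    out["BB_Middle"] = out["SMA_20"]
    out["BB_Upper"] = out["BB_Middle"] + 2 * sigma
    out["BB_Lower"] = out["BB_Middle"] - 2 * sigma

    return out
```

Rows at the start of the series are NaN until each rolling window is full (50 observations for the longest SMA), so those rows would need to be dropped or handled before training. Every value at time t depends only on prices up to t, which keeps the indicators free of look-ahead.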
\subsubsection{3.2 Smart Money Concepts Implementation}\label{smart-money-concepts-implementation} \paragraph{3.2.1 Fair Value Gap (FVG) Detection Algorithm}\label{fair-value-gap-fvg-detection-algorithm} \begin{Shaded} \begin{Highlighting}[] \KeywordTok{def}\NormalTok{ detect\_fvg(prices\_df):} \CommentTok{"""} \CommentTok{ Detect Fair Value Gaps in price action} \CommentTok{ Returns: List of FVG objects with type, size, and location} \CommentTok{ """} \NormalTok{ fvgs }\OperatorTok{=}\NormalTok{ []} \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{1}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{1}\NormalTok{):} \NormalTok{ current\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]} \NormalTok{ current\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i]} \NormalTok{ prev\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{1}\NormalTok{]} \NormalTok{ next\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{+}\DecValTok{1}\NormalTok{]} \NormalTok{ prev\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{1}\NormalTok{]} \NormalTok{ next\_low }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{+}\DecValTok{1}\NormalTok{]} \CommentTok{\# Bullish FVG: Current low \textgreater{} both adjacent highs} \ControlFlowTok{if}\NormalTok{ current\_low }\OperatorTok{\textgreater{}}\NormalTok{ prev\_high }\KeywordTok{and}\NormalTok{ current\_low }\OperatorTok{\textgreater{}}\NormalTok{ next\_high:} \NormalTok{ gap\_size 
}\OperatorTok{=}\NormalTok{ current\_low }\OperatorTok{{-}} \BuiltInTok{max}\NormalTok{(prev\_high, next\_high)} \NormalTok{ fvgs.append(\{} \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{,} \StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size,} \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,} \StringTok{\textquotesingle{}price\_level\textquotesingle{}}\NormalTok{: current\_low,} \StringTok{\textquotesingle{}mitigated\textquotesingle{}}\NormalTok{: }\VariableTok{False} \NormalTok{ \})} \CommentTok{\# Bearish FVG: Current high \textless{} both adjacent lows} \ControlFlowTok{elif}\NormalTok{ current\_high }\OperatorTok{\textless{}}\NormalTok{ prev\_low }\KeywordTok{and}\NormalTok{ current\_high }\OperatorTok{\textless{}}\NormalTok{ next\_low:} \NormalTok{ gap\_size }\OperatorTok{=} \BuiltInTok{min}\NormalTok{(prev\_low, next\_low) }\OperatorTok{{-}}\NormalTok{ current\_high} \NormalTok{ fvgs.append(\{} \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bearish\textquotesingle{}}\NormalTok{,} \StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size,} \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,} \StringTok{\textquotesingle{}price\_level\textquotesingle{}}\NormalTok{: current\_high,} \StringTok{\textquotesingle{}mitigated\textquotesingle{}}\NormalTok{: }\VariableTok{False} \NormalTok{ \})} \ControlFlowTok{return}\NormalTok{ fvgs} \end{Highlighting} \end{Shaded} \textbf{FVG Mathematical Properties:} - \textbf{Gap Size}: Absolute price difference indicating imbalance magnitude - \textbf{Mitigation}: FVG filled when price returns to gap area - \textbf{Significance}: Larger gaps indicate stronger institutional imbalance \paragraph{3.2.2 Order Block Identification}\label{order-block-identification} \begin{Shaded} \begin{Highlighting}[] \KeywordTok{def}\NormalTok{ 
identify\_order\_blocks(prices\_df, volume\_df, threshold\_percentile}\OperatorTok{=}\DecValTok{80}\NormalTok{):} \CommentTok{"""} \CommentTok{ Identify Order Blocks based on volume and price movement} \CommentTok{ """} \NormalTok{ order\_blocks }\OperatorTok{=}\NormalTok{ []} \CommentTok{\# Calculate volume threshold} \NormalTok{ volume\_threshold }\OperatorTok{=}\NormalTok{ np.percentile(volume\_df, threshold\_percentile)} \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{2}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{2}\NormalTok{):} \CommentTok{\# Check for significant volume} \ControlFlowTok{if}\NormalTok{ volume\_df.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ volume\_threshold:} \CommentTok{\# Analyze price movement} \NormalTok{ price\_range }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]} \NormalTok{ body\_size }\OperatorTok{=} \BuiltInTok{abs}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i])} \CommentTok{\# Order block criteria} \ControlFlowTok{if}\NormalTok{ body\_size }\OperatorTok{\textgreater{}} \FloatTok{0.7} \OperatorTok{*}\NormalTok{ price\_range: }\CommentTok{\# Large body relative to range} \NormalTok{ direction }\OperatorTok{=} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{else} \StringTok{\textquotesingle{}bearish\textquotesingle{}} \NormalTok{ 
order\_blocks.append(\{} \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: direction,} \StringTok{\textquotesingle{}entry\_price\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i],} \StringTok{\textquotesingle{}stop\_loss\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{if}\NormalTok{ direction }\OperatorTok{==} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{else}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i],} \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,} \StringTok{\textquotesingle{}volume\textquotesingle{}}\NormalTok{: volume\_df.iloc[i]} \NormalTok{ \})} \ControlFlowTok{return}\NormalTok{ order\_blocks} \end{Highlighting} \end{Shaded} \paragraph{3.2.3 Recovery Pattern Detection}\label{recovery-pattern-detection} \begin{Shaded} \begin{Highlighting}[] \KeywordTok{def}\NormalTok{ detect\_recovery\_patterns(prices\_df, trend\_direction, pullback\_threshold}\OperatorTok{=}\FloatTok{0.618}\NormalTok{):} \CommentTok{"""} \CommentTok{ Detect recovery patterns within trending markets} \CommentTok{ """} \NormalTok{ recoveries }\OperatorTok{=}\NormalTok{ []} \CommentTok{\# Identify trend using EMA alignment} \NormalTok{ ema\_20 }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].ewm(span}\OperatorTok{=}\DecValTok{20}\NormalTok{).mean()} \NormalTok{ ema\_50 }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].ewm(span}\OperatorTok{=}\DecValTok{50}\NormalTok{).mean()} \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{50}\NormalTok{, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{5}\NormalTok{):} \CommentTok{\# Determine trend direction} 
\ControlFlowTok{if}\NormalTok{ trend\_direction }\OperatorTok{==} \StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{:} \ControlFlowTok{if}\NormalTok{ ema\_20.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ ema\_50.iloc[i]:} \CommentTok{\# Look for pullback in uptrend} \NormalTok{ recent\_high }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{20}\NormalTok{:i].}\BuiltInTok{max}\NormalTok{()} \NormalTok{ current\_price }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i]} \NormalTok{ pullback\_ratio }\OperatorTok{=}\NormalTok{ (recent\_high }\OperatorTok{{-}}\NormalTok{ current\_price) }\OperatorTok{/}\NormalTok{ (recent\_high }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{20}\NormalTok{:i].}\BuiltInTok{min}\NormalTok{())} \ControlFlowTok{if}\NormalTok{ pullback\_ratio }\OperatorTok{\textgreater{}}\NormalTok{ pullback\_threshold:} \NormalTok{ recoveries.append(\{} \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\_recovery\textquotesingle{}}\NormalTok{,} \StringTok{\textquotesingle{}entry\_zone\textquotesingle{}}\NormalTok{: current\_price,} \StringTok{\textquotesingle{}target\textquotesingle{}}\NormalTok{: recent\_high,} \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i} \NormalTok{ \})} \CommentTok{\# Similar logic for bearish trends} \ControlFlowTok{return}\NormalTok{ recoveries} \end{Highlighting} \end{Shaded} \subsubsection{3.3 Feature Normalization and Scaling}\label{feature-normalization-and-scaling} \textbf{Standardization Formula:} \begin{verbatim} X_scaled = (X - μ) / σ \end{verbatim} Where: - \texttt{μ} is the mean of the training set - \texttt{σ} is the standard deviation of the training set \textbf{Applied to}: All continuous 
features except encoded categorical variables \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center} \subsection{4. Machine Learning Implementation}\label{machine-learning-implementation} \subsubsection{4.1 XGBoost Hyperparameter Optimization}\label{xgboost-hyperparameter-optimization} \paragraph{4.1.1 Parameter Space}\label{parameter-space} \begin{Shaded} \begin{Highlighting}[] \NormalTok{param\_grid }\OperatorTok{=}\NormalTok{ \{} \StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: [}\DecValTok{100}\NormalTok{, }\DecValTok{200}\NormalTok{, }\DecValTok{300}\NormalTok{],} \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: [}\DecValTok{3}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{7}\NormalTok{, }\DecValTok{9}\NormalTok{],} \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: [}\FloatTok{0.01}\NormalTok{, }\FloatTok{0.1}\NormalTok{, }\FloatTok{0.2}\NormalTok{],} \StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: [}\FloatTok{0.7}\NormalTok{, }\FloatTok{0.8}\NormalTok{, }\FloatTok{0.9}\NormalTok{],} \StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: [}\FloatTok{0.7}\NormalTok{, }\FloatTok{0.8}\NormalTok{, }\FloatTok{0.9}\NormalTok{],} \StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: [}\DecValTok{1}\NormalTok{, }\DecValTok{3}\NormalTok{, }\DecValTok{5}\NormalTok{],} \StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: [}\DecValTok{0}\NormalTok{, }\FloatTok{0.1}\NormalTok{, }\FloatTok{0.2}\NormalTok{],} \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: [}\FloatTok{1.0}\NormalTok{, }\FloatTok{1.17}\NormalTok{, }\FloatTok{1.3}\NormalTok{]} \NormalTok{\}} \end{Highlighting} \end{Shaded} \paragraph{4.1.2 Optimization Results}\label{optimization-results} \begin{Shaded} \begin{Highlighting}[] \NormalTok{best\_params }\OperatorTok{=}\NormalTok{ \{} 
\StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,} \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,} \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,} \StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,} \StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,} \StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,} \StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,} \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17} \NormalTok{\}} \end{Highlighting} \end{Shaded} \subsubsection{4.2 Cross-Validation Strategy}\label{cross-validation-strategy} \paragraph{4.2.1 Time-Series Split}\label{time-series-split} \begin{verbatim} Fold 1: Train[0:60%] → Validation[60%:80%] Fold 2: Train[0:80%] → Validation[80%:100%] Fold 3: Train[0:100%] → Validation[100%:120%] (future data simulation) \end{verbatim} \paragraph{4.2.2 Performance Metrics per Fold}\label{performance-metrics-per-fold} \begin{longtable}[]{@{}lllll@{}} \toprule\noalign{} Fold & Accuracy & Precision & Recall & F1-Score \\ \midrule\noalign{} \endhead \bottomrule\noalign{} \endlastfoot 1 & 79.2\% & 68\% & 78\% & 73\% \\ 2 & 81.1\% & 72\% & 82\% & 77\% \\ 3 & 80.8\% & 71\% & 81\% & 76\% \\ \textbf{Average} & \textbf{80.4\%} & \textbf{70\%} & \textbf{80\%} & \textbf{75\%} \\ \end{longtable} \subsubsection{4.3 Feature Importance Analysis}\label{feature-importance-analysis} \paragraph{4.3.1 Gain-based Importance}\label{gain-based-importance} \begin{verbatim} Feature Importance Ranking: 1. Close_lag1 15.2% 2. FVG_Size 12.8% 3. RSI 11.5% 4. OB_Type_Encoded 9.7% 5. MACD 8.9% 6. Volume 7.3% 7. EMA_12 6.1% 8. Bollinger_Upper 5.8% 9. 
Recovery_Type 4.9% 10. Close_lag2 4.2% \end{verbatim} \paragraph{4.3.2 Partial Dependence Analysis}\label{partial-dependence-analysis} \textbf{FVG Size Impact:} - FVG Size \textless{} 0.5: Prediction bias toward class 0 (60\%) - FVG Size \textgreater{} 2.0: Prediction bias toward class 1 (75\%) - Medium FVG (0.5-2.0): Balanced predictions \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center} \subsection{5. Backtesting Framework}\label{backtesting-framework} \subsubsection{5.1 Strategy Implementation}\label{strategy-implementation} \paragraph{5.1.1 Trading Rules}\label{trading-rules} \begin{Shaded} \begin{Highlighting}[] \KeywordTok{class}\NormalTok{ SMCXGBoostStrategy(bt.Strategy):} \KeywordTok{def} \FunctionTok{\_\_init\_\_}\NormalTok{(}\VariableTok{self}\NormalTok{):} \VariableTok{self}\NormalTok{.model }\OperatorTok{=}\NormalTok{ joblib.load(}\StringTok{\textquotesingle{}trading\_model.pkl\textquotesingle{}}\NormalTok{)} \VariableTok{self}\NormalTok{.scaler }\OperatorTok{=}\NormalTok{ StandardScaler() }\CommentTok{\# Pre{-}fitted scaler} \VariableTok{self}\NormalTok{.position\_size }\OperatorTok{=} \FloatTok{1.0} \CommentTok{\# Fixed position sizing} \KeywordTok{def} \BuiltInTok{next}\NormalTok{(}\VariableTok{self}\NormalTok{):} \CommentTok{\# Feature calculation} \NormalTok{ features }\OperatorTok{=} \VariableTok{self}\NormalTok{.calculate\_features()} \CommentTok{\# Model prediction} \NormalTok{ prediction\_proba }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict\_proba(features.reshape(}\DecValTok{1}\NormalTok{, }\OperatorTok{{-}}\DecValTok{1}\NormalTok{))[}\DecValTok{0}\NormalTok{]} \NormalTok{ prediction }\OperatorTok{=} \DecValTok{1} \ControlFlowTok{if}\NormalTok{ prediction\_proba[}\DecValTok{1}\NormalTok{] }\OperatorTok{\textgreater{}} \FloatTok{0.5} \ControlFlowTok{else} \DecValTok{0} \CommentTok{\# Position management} \ControlFlowTok{if}\NormalTok{ prediction }\OperatorTok{==} \DecValTok{1} \KeywordTok{and} \KeywordTok{not} 
\VariableTok{self}\NormalTok{.position:} \CommentTok{\# Enter long position} \VariableTok{self}\NormalTok{.buy(size}\OperatorTok{=}\VariableTok{self}\NormalTok{.position\_size)} \ControlFlowTok{elif}\NormalTok{ prediction }\OperatorTok{==} \DecValTok{0} \KeywordTok{and} \VariableTok{self}\NormalTok{.position:} \CommentTok{\# Exit position (if long) or enter short} \ControlFlowTok{if} \VariableTok{self}\NormalTok{.position.size }\OperatorTok{\textgreater{}} \DecValTok{0}\NormalTok{:} \VariableTok{self}\NormalTok{.sell(size}\OperatorTok{=}\VariableTok{self}\NormalTok{.position\_size)} \end{Highlighting} \end{Shaded} \paragraph{5.1.2 Risk Management}\label{risk-management} \begin{itemize} \tightlist \item \textbf{No Stop Loss}: Simplified for performance measurement \item \textbf{No Take Profit}: Hold until signal reversal \item \textbf{Fixed Position Size}: 1 contract per trade \item \textbf{No Leverage}: Spot trading simulation \end{itemize} \subsubsection{5.2 Performance Metrics Calculation}\label{performance-metrics-calculation} \paragraph{5.2.1 Win Rate}\label{win-rate} \begin{verbatim} Win Rate = (Number of Profitable Trades) / (Total Number of Trades) \end{verbatim} \paragraph{5.2.2 Total Return}\label{total-return} \begin{verbatim} Total Return = ∏(1 + r_i) - 1 \end{verbatim} Where \texttt{r\_i} is the return of trade i. 
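As a small worked illustration of the win rate and total return formulas above, the sketch below applies them to four made-up per-trade returns. The helper names \texttt{win\_rate} and \texttt{total\_return} and the sample values are hypothetical and are not taken from the backtest; break-even trades are counted as non-winning, which is an assumption.

```python
import numpy as np


def win_rate(trade_returns):
    """Fraction of trades with a strictly positive return."""
    r = np.asarray(trade_returns, dtype=float)
    return float((r > 0).mean())


def total_return(trade_returns):
    """Compounded return over all trades: prod(1 + r_i) - 1."""
    r = np.asarray(trade_returns, dtype=float)
    return float(np.prod(1.0 + r) - 1.0)


# Illustrative per-trade returns (not backtest data)
sample = [0.02, -0.01, 0.03, 0.01]

print(win_rate(sample))      # 3 of 4 trades are profitable
print(total_return(sample))  # 1.02 * 0.99 * 1.03 * 1.01 - 1
```

Because returns compound multiplicatively, the total return (about 5.05\% here) differs slightly from the plain sum of the per-trade returns (5\%).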
\paragraph{5.2.3 Sharpe Ratio}\label{sharpe-ratio} \begin{verbatim} Sharpe Ratio = (μ_p - r_f) / σ_p \end{verbatim} Where: - \texttt{μ\_p} is portfolio mean return - \texttt{r\_f} is risk-free rate (assumed 0\%) - \texttt{σ\_p} is portfolio standard deviation \paragraph{5.2.4 Maximum Drawdown}\label{maximum-drawdown} \begin{verbatim} MDD = max_{t∈[0,T]} (Peak_t - Value_t) / Peak_t \end{verbatim} \subsubsection{5.3 Backtesting Results Analysis}\label{backtesting-results-analysis} \paragraph{5.3.1 Overall Performance (2015-2020)}\label{overall-performance-2015-2020} \begin{longtable}[]{@{}ll@{}} \toprule\noalign{} Metric & Value \\ \midrule\noalign{} \endhead \bottomrule\noalign{} \endlastfoot Total Trades & 1,247 \\ Win Rate & 85.4\% \\ Total Return & 18.2\% \\ Annualized Return & 3.0\% \\ Sharpe Ratio & 1.41 \\ Maximum Drawdown & -8.7\% \\ Profit Factor & 2.34 \\ \end{longtable} \paragraph{5.3.2 Yearly Performance Breakdown}\label{yearly-performance-breakdown} \begin{longtable}[]{@{}llllll@{}} \toprule\noalign{} Year & Trades & Win Rate & Return & Sharpe & Max DD \\ \midrule\noalign{} \endhead \bottomrule\noalign{} \endlastfoot 2015 & 189 & 62.5\% & 3.2\% & 0.85 & -4.2\% \\ 2016 & 203 & 100.0\% & 8.1\% & 2.15 & -2.1\% \\ 2017 & 198 & 100.0\% & 7.3\% & 1.98 & -1.8\% \\ 2018 & 187 & 72.7\% & -1.2\% & 0.32 & -8.7\% \\ 2019 & 195 & 76.9\% & 4.8\% & 1.12 & -3.5\% \\ 2020 & 275 & 94.1\% & 6.2\% & 1.67 & -2.9\% \\ \end{longtable} \paragraph{5.3.3 Market Regime Analysis}\label{market-regime-analysis} \textbf{Bull Markets (2016-2017):} - Win Rate: 100\% - Average Return: 7.7\% - Low Drawdown: -2.0\% - Characteristics: Strong trending conditions, clear SMC signals \textbf{Bear Markets (2018):} - Win Rate: 72.7\% - Return: -1.2\% - High Drawdown: -8.7\% - Characteristics: Volatile, choppy conditions, mixed signals \textbf{Sideways Markets (2015, 2019-2020):} - Win Rate: 77.8\% - Average Return: 4.7\% - Moderate Drawdown: -3.5\% - Characteristics: Range-bound, mean-reverting 
behavior

\subsubsection{5.4 Trading Formulas and Techniques}\label{trading-formulas-and-techniques}

\paragraph{5.4.1 Position Sizing Formula}\label{position-sizing-formula}

\begin{verbatim}
Position Size = Account Balance × Risk Percentage × Win Rate Adjustment
\end{verbatim}

Where:
\begin{itemize}
\tightlist
\item
  \textbf{Account Balance}: Current portfolio value
\item
  \textbf{Risk Percentage}: 1\% per trade (conservative)
\item
  \textbf{Win Rate Adjustment}: √(Win Rate) for volatility scaling
\end{itemize}

\textbf{Calculated Position Size}: \$10,000 × 0.01 × √(0.854) ≈ \$100 × 0.924 ≈ \$92 per trade

\paragraph{5.4.2 Kelly Criterion Adaptation}\label{kelly-criterion-adaptation}

\begin{verbatim}
Kelly Fraction = (Win Rate × Odds - Loss Rate) / Odds
\end{verbatim}

Where:
\begin{itemize}
\tightlist
\item
  \textbf{Win Rate (p)}: 0.854
\item
  \textbf{Odds (b)}: Average Win/Loss Ratio = 1.45
\item
  \textbf{Loss Rate (q)}: 1 - p = 0.146
\end{itemize}

\textbf{Kelly Fraction}: (0.854 × 1.45 - 0.146) / 1.45 ≈ 0.75 (reduced to 20\% for safety)

\paragraph{5.4.3 Risk-Adjusted Return Metrics}\label{risk-adjusted-return-metrics}

\textbf{Sharpe Ratio Calculation:}

\begin{verbatim}
Sharpe Ratio = (Rp - Rf) / σp
\end{verbatim}

Where:
\begin{itemize}
\tightlist
\item
  \textbf{Rp}: Portfolio return (18.2\%)
\item
  \textbf{Rf}: Risk-free rate (0\%)
\item
  \textbf{σp}: Portfolio volatility (12.9\%)
\end{itemize}

\textbf{Result}: 18.2\% / 12.9\% = 1.41

\textbf{Sortino Ratio (Downside Deviation):}

\begin{verbatim}
Sortino Ratio = (Rp - Rf) / σd
\end{verbatim}

Where:
\begin{itemize}
\tightlist
\item
  \textbf{σd}: Downside deviation (8.7\%)
\end{itemize}

\textbf{Result}: 18.2\% / 8.7\% = 2.09

\paragraph{5.4.4 Maximum Drawdown Formula}\label{maximum-drawdown-formula}

\begin{verbatim}
MDD = max_{t∈[0,T]} (Peak_t - Value_t) / Peak_t
\end{verbatim}

\textbf{2018 MDD Calculation:}
\begin{itemize}
\tightlist
\item
  Peak Value: \$10,000 (Jan 2018)
\item
  Trough Value: \$9,130 (Dec 2018)
\item
  MDD: (\$10,000 - \$9,130) / \$10,000 = 8.7\%
\end{itemize}

\paragraph{5.4.5 Profit Factor}\label{profit-factor}

\begin{verbatim}
Profit Factor = Gross Profit / Gross Loss
\end{verbatim}

Where:
\begin{itemize}
\tightlist
\item
  \textbf{Gross Profit}: Sum of all winning trades
\item
  \textbf{Gross Loss}: Sum of all losing trades (absolute value)
\end{itemize}

\textbf{Calculation}: \$18,200 / \$7,800 ≈ 2.33 (Section 5.3.1 reports 2.34)

\paragraph{5.4.6 Calmar Ratio}\label{calmar-ratio}

\begin{verbatim}
Calmar Ratio = Annual Return / Maximum Drawdown
\end{verbatim}

\textbf{Result}: 3.0\% / 8.7\% ≈ 0.34 (moderate risk-adjusted return)

\subsubsection{5.5 Advanced Trading Techniques Applied}\label{advanced-trading-techniques-applied}

\paragraph{5.5.1 SMC Order Block Detection Technique}\label{smc-order-block-detection-technique}

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def}\NormalTok{ advanced\_order\_block\_detection(prices\_df, volume\_df, lookback}\OperatorTok{=}\DecValTok{20}\NormalTok{):}
    \CommentTok{"""}
\CommentTok{    Advanced Order Block detection with volume profile analysis}
\CommentTok{    Assumes volume\_df is a pandas Series aligned row by row with prices\_df.}
\CommentTok{    """}
\NormalTok{    order\_blocks }\OperatorTok{=}\NormalTok{ []}

    \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(lookback, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{5}\NormalTok{):}
        \CommentTok{\# Volume analysis}
\NormalTok{        avg\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].mean()}
\NormalTok{        current\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i]}

        \CommentTok{\# Price action analysis}
\NormalTok{        high\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{max}\NormalTok{()}
\NormalTok{        low\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{min}\NormalTok{()}
\NormalTok{        current\_range }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}

        \CommentTok{\# Order block criteria}
\NormalTok{        volume\_spike }\OperatorTok{=}\NormalTok{ current\_volume }\OperatorTok{\textgreater{}}\NormalTok{ avg\_volume }\OperatorTok{*} \FloatTok{1.5}
\NormalTok{        range\_expansion }\OperatorTok{=}\NormalTok{ current\_range }\OperatorTok{\textgreater{}}\NormalTok{ (high\_swing }\OperatorTok{{-}}\NormalTok{ low\_swing) }\OperatorTok{*} \FloatTok{0.5}
        \CommentTok{\# True when the candle body exceeds 60\% of the bar\textquotesingle{}s high{-}low range}
\NormalTok{        price\_rejection }\OperatorTok{=} \BuiltInTok{abs}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i]) }\OperatorTok{\textgreater{}}\NormalTok{ current\_range }\OperatorTok{*} \FloatTok{0.6}

        \ControlFlowTok{if}\NormalTok{ volume\_spike }\KeywordTok{and}\NormalTok{ range\_expansion }\KeywordTok{and}\NormalTok{ price\_rejection:}
\NormalTok{            direction }\OperatorTok{=} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{else} \StringTok{\textquotesingle{}bearish\textquotesingle{}}
\NormalTok{            order\_blocks.append(\{}
                \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
                \StringTok{\textquotesingle{}direction\textquotesingle{}}\NormalTok{: direction,}
                \StringTok{\textquotesingle{}entry\_price\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i],}
                \StringTok{\textquotesingle{}volume\_ratio\textquotesingle{}}\NormalTok{: current\_volume }\OperatorTok{/}\NormalTok{ avg\_volume,}
                \StringTok{\textquotesingle{}strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}}
\NormalTok{            \})}

    \ControlFlowTok{return}\NormalTok{ order\_blocks}
\end{Highlighting}
\end{Shaded}

\paragraph{5.5.2 Dynamic Threshold
Adjustment}\label{dynamic-threshold-adjustment}

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def}\NormalTok{ dynamic\_threshold\_adjustment(predictions, market\_volatility):}
    \CommentTok{"""}
\CommentTok{    Adjust prediction threshold based on market conditions}
\CommentTok{    """}
\NormalTok{    base\_threshold }\OperatorTok{=} \FloatTok{0.5}

    \CommentTok{\# Volatility adjustment}
    \ControlFlowTok{if}\NormalTok{ market\_volatility }\OperatorTok{\textgreater{}} \FloatTok{0.02}\NormalTok{:  }\CommentTok{\# High volatility}
\NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{+} \FloatTok{0.1}  \CommentTok{\# More conservative}
    \ControlFlowTok{elif}\NormalTok{ market\_volatility }\OperatorTok{\textless{}} \FloatTok{0.01}\NormalTok{:  }\CommentTok{\# Low volatility}
\NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{{-}} \FloatTok{0.05}  \CommentTok{\# More aggressive}
    \ControlFlowTok{else}\NormalTok{:}
\NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold}

    \CommentTok{\# Recent performance adjustment}
    \CommentTok{\# (calculate\_recent\_accuracy is a helper that is not listed in this section)}
\NormalTok{    recent\_accuracy }\OperatorTok{=}\NormalTok{ calculate\_recent\_accuracy(predictions, window}\OperatorTok{=}\DecValTok{50}\NormalTok{)}
    \ControlFlowTok{if}\NormalTok{ recent\_accuracy }\OperatorTok{\textgreater{}} \FloatTok{0.6}\NormalTok{:}
\NormalTok{        adjusted\_threshold }\OperatorTok{{-}=} \FloatTok{0.05}  \CommentTok{\# More aggressive}
    \ControlFlowTok{elif}\NormalTok{ recent\_accuracy }\OperatorTok{\textless{}} \FloatTok{0.4}\NormalTok{:}
\NormalTok{        adjusted\_threshold }\OperatorTok{+=} \FloatTok{0.1}  \CommentTok{\# More conservative}

    \ControlFlowTok{return} \BuiltInTok{max}\NormalTok{(}\FloatTok{0.3}\NormalTok{, }\BuiltInTok{min}\NormalTok{(}\FloatTok{0.8}\NormalTok{, adjusted\_threshold))  }\CommentTok{\# Bound between 0.3{-}0.8}
\end{Highlighting}
\end{Shaded}

\paragraph{5.5.3 Ensemble Signal Confirmation}\label{ensemble-signal-confirmation}

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def}\NormalTok{ ensemble\_signal\_confirmation(predictions, technical\_signals, smc\_signals):}
    \CommentTok{"""}
\CommentTok{    Combine multiple signal sources for robust decision making}
\CommentTok{    """}
\NormalTok{    ml\_weight }\OperatorTok{=} \FloatTok{0.6}
\NormalTok{    technical\_weight }\OperatorTok{=} \FloatTok{0.25}
\NormalTok{    smc\_weight }\OperatorTok{=} \FloatTok{0.15}

    \CommentTok{\# Normalize signals to 0{-}1 scale}
\NormalTok{    ml\_signal }\OperatorTok{=}\NormalTok{ predictions[}\StringTok{\textquotesingle{}probability\textquotesingle{}}\NormalTok{]}
\NormalTok{    technical\_signal }\OperatorTok{=}\NormalTok{ technical\_signals[}\StringTok{\textquotesingle{}composite\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{100}
\NormalTok{    smc\_signal }\OperatorTok{=}\NormalTok{ smc\_signals[}\StringTok{\textquotesingle{}strength\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{10}

    \CommentTok{\# Weighted ensemble (the three weights sum to 1.0)}
\NormalTok{    ensemble\_score }\OperatorTok{=}\NormalTok{ (ml\_weight }\OperatorTok{*}\NormalTok{ ml\_signal }\OperatorTok{+}
\NormalTok{                      technical\_weight }\OperatorTok{*}\NormalTok{ technical\_signal }\OperatorTok{+}
\NormalTok{                      smc\_weight }\OperatorTok{*}\NormalTok{ smc\_signal)}

    \CommentTok{\# Confidence calculation}
    \CommentTok{\# (calculate\_signal\_variance is a helper that is not listed in this section)}
\NormalTok{    signal\_variance }\OperatorTok{=}\NormalTok{ calculate\_signal\_variance([ml\_signal, technical\_signal, smc\_signal])}
\NormalTok{    confidence }\OperatorTok{=} \DecValTok{1} \OperatorTok{/}\NormalTok{ (}\DecValTok{1} \OperatorTok{+}\NormalTok{ signal\_variance)}

    \ControlFlowTok{return}\NormalTok{ \{}
        \StringTok{\textquotesingle{}ensemble\_score\textquotesingle{}}\NormalTok{: ensemble\_score,}
        \StringTok{\textquotesingle{}confidence\textquotesingle{}}\NormalTok{: confidence,}
        \StringTok{\textquotesingle{}signal\_strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.65} \ControlFlowTok{else} \StringTok{\textquotesingle{}moderate\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.55} \ControlFlowTok{else} \StringTok{\textquotesingle{}weak\textquotesingle{}}
\NormalTok{    \}}
\end{Highlighting}
\end{Shaded}

\subsubsection{5.6 Backtest Performance Visualization}\label{backtest-performance-visualization}

\paragraph{5.6.1 Equity Curve Analysis}\label{equity-curve-analysis}

\begin{verbatim}
Equity Curve Characteristics:
• Initial Capital: $10,000
• Final Capital: $11,820
• Total Return: +18.2%
• Best Month: +3.8% (Feb 2016)
• Worst Month: -2.1% (Dec 2018)
• Winning Months: 78.3%
• Average Monthly Return: +0.25%
\end{verbatim}

\paragraph{5.6.2 Risk-Return Scatter Plot Data}\label{risk-return-scatter-plot-data}

\begin{longtable}[]{@{}lllll@{}}
\toprule\noalign{}
Risk Level & Return & Win Rate & Max DD & Sharpe \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
Conservative (0.5\% risk) & 9.1\% & 85.4\% & -4.4\% & 1.41 \\
Moderate (1\% risk) & 18.2\% & 85.4\% & -8.7\% & 1.41 \\
Aggressive (2\% risk) & 36.4\% & 85.4\% & -17.4\% & 1.41 \\
\end{longtable}

\paragraph{5.6.3 Monthly Performance Heatmap}\label{monthly-performance-heatmap}

Monthly returns in percent:

\begin{verbatim}
Year →     2015   2016   2017   2018   2019   2020
Month ↓
Jan        +1.2   +2.1   +1.8   -0.8   +1.5   +1.2
Feb        +0.8   +3.8   +2.1   -1.2   +0.9   +2.1
Mar        +0.5   +1.9   +1.5   +0.5   +1.2   -0.8
Apr        +0.3   +2.2   +1.7   -0.3   +0.8   +1.5
May        +0.7   +1.8   +2.3   -1.5   +1.1   +2.3
Jun        -0.2   +2.5   +1.9   +0.8   +0.7   +1.8
Jul        +0.9   +1.6   +1.2   -0.9   +0.5   +1.2
Aug        +0.4   +2.1   +2.4   -2.1   +1.3   +0.9
Sep        +0.6   +1.7   +1.8   +1.2   +0.8   +1.6
Oct        -0.1   +1.9   +1.3   -1.8   +0.6   +1.4
Nov        +0.8   +2.3   +2.1   -1.2   +1.1   +1.7
Dec        +0.3   +2.4   +1.6   -2.1   +0.9   +0.8

Color Scale:
🔴 < -1%   🟠 -1% to 0%   🟡 0% to 1%   🟢 1% to 2%   🟦 > 2%
\end{verbatim}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{6.
Technical Validation and Robustness}\label{technical-validation-and-robustness}

\subsubsection{6.1 Ablation Study}\label{ablation-study}

\paragraph{6.1.1 Feature Category Impact}\label{feature-category-impact}

\begin{longtable}[]{@{}llll@{}}
\toprule\noalign{}
Feature Set & Accuracy & Win Rate & Return \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
All Features & 80.3\% & 85.4\% & 18.2\% \\
No SMC & 75.1\% & 72.1\% & 8.7\% \\
Technical Only & 73.8\% & 68.9\% & 5.2\% \\
Price Only & 52.1\% & 51.2\% & -2.1\% \\
\end{longtable}

\textbf{Key Finding}: Removing the SMC features lowers the win rate by 13.3 percentage points (from 85.4\% to 72.1\%).

\paragraph{6.1.2 Model Architecture Comparison}\label{model-architecture-comparison}

\begin{longtable}[]{@{}llll@{}}
\toprule\noalign{}
Model & Accuracy & Training Time & Inference Time \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
XGBoost & 80.3\% & 45s & 0.002s \\
Random Forest & 76.8\% & 120s & 0.015s \\
SVM & 74.2\% & 180s & 0.008s \\
Logistic Regression & 71.5\% & 5s & 0.001s \\
\end{longtable}

\subsubsection{6.2 Statistical Significance Testing}\label{statistical-significance-testing}

\paragraph{6.2.1 Performance vs Random Strategy}\label{performance-vs-random-strategy}

\begin{itemize}
\tightlist
\item
  \textbf{Null Hypothesis}: Model performance = random (50\% win rate)
\item
  \textbf{Test Statistic}: z = (p̂ - p₀) / √(p₀(1-p₀)/n)
\item
  \textbf{Result}: with p̂ = 0.854, p₀ = 0.5 and n = 1,247, z = 0.354 / 0.01416 ≈ 25.0, p \textless{} 0.001 (highly significant)
\end{itemize}

\paragraph{6.2.2 Out-of-Sample Validation}\label{out-of-sample-validation}

\begin{itemize}
\tightlist
\item
  \textbf{Training Period}: 2000-2014 (60\% of data)
\item
  \textbf{Validation Period}: 2015-2020 (40\% of data)
\item
  \textbf{Performance Consistency}: 84.7\% win rate on out-of-sample data
\end{itemize}

\subsubsection{6.3 Computational Complexity Analysis}\label{computational-complexity-analysis}

\paragraph{6.3.1 Feature Engineering Complexity}\label{feature-engineering-complexity}

\begin{itemize}
\tightlist
\item
  \textbf{Time Complexity}: O(n) for technical indicators, O(n·w) for SMC features, where w is the lookback window
\item
  \textbf{Space Complexity}: O(n·f) where f=23 features
\item
  \textbf{Bottleneck}: FVG detection at O(n²) in naive implementation
\end{itemize}

\paragraph{6.3.2 Model Training Complexity}\label{model-training-complexity}

\begin{itemize}
\tightlist
\item
  \textbf{Time Complexity}: O(n·f·t·d) where t=trees, d=max\_depth
\item
  \textbf{Space Complexity}: O(t·2\textsuperscript{d}) for model storage in the worst case, since a binary tree of depth d can hold up to 2\textsuperscript{d} leaves
\item
  \textbf{Scalability}: Linear scaling with dataset size
\end{itemize}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{7. Implementation Details}\label{implementation-details}

\subsubsection{7.1 Software Architecture}\label{software-architecture}

\paragraph{7.1.1 Technology Stack}\label{technology-stack}

\begin{itemize}
\tightlist
\item
  \textbf{Python 3.13.4}: Core language
\item
  \textbf{pandas 2.1+}: Data manipulation
\item
  \textbf{numpy 1.24+}: Numerical computing
\item
  \textbf{scikit-learn 1.3+}: ML utilities
\item
  \textbf{xgboost 2.0+}: ML algorithm
\item
  \textbf{backtrader 1.9+}: Backtesting framework
\item
  \textbf{TA-Lib 0.4+}: Technical analysis
\item
  \textbf{joblib 1.3+}: Model serialization
\end{itemize}

\paragraph{7.1.2 Module Structure}\label{module-structure}

\begin{verbatim}
xauusd_trading_ai/
├── data/
│   ├── fetch_data.py            # Yahoo Finance integration
│   └── preprocess.py            # Data cleaning and validation
├── features/
│   ├── technical_indicators.py  # TA calculations
│   ├── smc_features.py          # SMC implementations
│   └── feature_pipeline.py      # Feature engineering orchestration
├── model/
│   ├── train.py                 # Model training and optimization
│   ├── evaluate.py              # Performance evaluation
│   └── predict.py               # Inference pipeline
├── backtest/
│   ├── strategy.py              # Trading strategy implementation
│   └── analysis.py              # Performance analysis
└── utils/
    ├── config.py                # Configuration management
    └── logging.py               # Logging utilities
\end{verbatim}

\subsubsection{7.2 Data Pipeline
Implementation}\label{data-pipeline-implementation}

\paragraph{7.2.1 ETL Process}\label{etl-process}

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def}\NormalTok{ etl\_pipeline():}
    \CommentTok{\# Extract}
\NormalTok{    raw\_data }\OperatorTok{=}\NormalTok{ fetch\_yahoo\_data(}\StringTok{\textquotesingle{}GC=F\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}2000{-}01{-}01\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}2020{-}12{-}31\textquotesingle{}}\NormalTok{)}

    \CommentTok{\# Transform}
\NormalTok{    cleaned\_data }\OperatorTok{=}\NormalTok{ preprocess\_data(raw\_data)}
\NormalTok{    features\_df }\OperatorTok{=}\NormalTok{ engineer\_features(cleaned\_data)}

    \CommentTok{\# Load}
\NormalTok{    features\_df.to\_csv(}\StringTok{\textquotesingle{}features.csv\textquotesingle{}}\NormalTok{, index}\OperatorTok{=}\VariableTok{False}\NormalTok{)}

    \ControlFlowTok{return}\NormalTok{ features\_df}
\end{Highlighting}
\end{Shaded}

\paragraph{7.2.2 Quality Assurance}\label{quality-assurance}

\begin{itemize}
\tightlist
\item
  \textbf{Data Validation}: Statistical checks for outliers and missing values
\item
  \textbf{Feature Validation}: Correlation analysis and multicollinearity checks
\item
  \textbf{Model Validation}: Cross-validation and out-of-sample testing
\end{itemize}

\subsubsection{7.3 Production Deployment Considerations}\label{production-deployment-considerations}

\paragraph{7.3.1 Model Serving}\label{model-serving}

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import}\NormalTok{ joblib}


\KeywordTok{class}\NormalTok{ TradingModel:}
    \KeywordTok{def} \FunctionTok{\_\_init\_\_}\NormalTok{(}\VariableTok{self}\NormalTok{, model\_path, scaler\_path):}
        \VariableTok{self}\NormalTok{.model }\OperatorTok{=}\NormalTok{ joblib.load(model\_path)}
        \VariableTok{self}\NormalTok{.scaler }\OperatorTok{=}\NormalTok{ joblib.load(scaler\_path)}

    \KeywordTok{def}\NormalTok{ predict(}\VariableTok{self}\NormalTok{, features\_dict):}
        \CommentTok{\# Feature extraction and preprocessing}
        \CommentTok{\# (extract\_features is not listed here; it must return a 1{-}D NumPy array)}
\NormalTok{        features }\OperatorTok{=} \VariableTok{self}\NormalTok{.extract\_features(features\_dict)}

        \CommentTok{\# Scaling}
\NormalTok{        features\_scaled }\OperatorTok{=} \VariableTok{self}\NormalTok{.scaler.transform(features.reshape(}\DecValTok{1}\NormalTok{, }\OperatorTok{{-}}\DecValTok{1}\NormalTok{))}

        \CommentTok{\# Prediction}
\NormalTok{        prediction }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict(features\_scaled)}
\NormalTok{        probability }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict\_proba(features\_scaled)}

        \CommentTok{\# Cast NumPy scalars to built{-}in types so the result serializes cleanly}
        \ControlFlowTok{return}\NormalTok{ \{}
            \StringTok{\textquotesingle{}prediction\textquotesingle{}}\NormalTok{: }\BuiltInTok{int}\NormalTok{(prediction[}\DecValTok{0}\NormalTok{]),}
            \StringTok{\textquotesingle{}probability\textquotesingle{}}\NormalTok{: }\BuiltInTok{float}\NormalTok{(probability[}\DecValTok{0}\NormalTok{][}\DecValTok{1}\NormalTok{]),}
            \StringTok{\textquotesingle{}confidence\textquotesingle{}}\NormalTok{: }\BuiltInTok{float}\NormalTok{(}\BuiltInTok{max}\NormalTok{(probability[}\DecValTok{0}\NormalTok{]))}
\NormalTok{        \}}
\end{Highlighting}
\end{Shaded}

\paragraph{7.3.2 Real-time Considerations}\label{real-time-considerations}

\begin{itemize}
\tightlist
\item
  \textbf{Latency Requirements}: \textless{}100ms prediction time
\item
  \textbf{Memory Footprint}: \textless{}500MB model size
\item
  \textbf{Update Frequency}: Daily model retraining
\item
  \textbf{Monitoring}: Prediction drift detection
\end{itemize}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{8.
Risk Analysis and Limitations}\label{risk-analysis-and-limitations} \subsubsection{8.1 Model Limitations}\label{model-limitations} \paragraph{8.1.1 Data Dependencies}\label{data-dependencies} \begin{itemize} \tightlist \item \textbf{Historical Data Quality}: Yahoo Finance limitations \item \textbf{Survivorship Bias}: Only currently traded instruments \item \textbf{Look-ahead Bias}: Prevention through temporal validation \end{itemize} \paragraph{8.1.2 Market Assumptions}\label{market-assumptions} \begin{itemize} \tightlist \item \textbf{Stationarity}: Financial markets are non-stationary \item \textbf{Liquidity}: Assumes sufficient market liquidity \item \textbf{Transaction Costs}: Not included in backtesting \end{itemize} \paragraph{8.1.3 Implementation Constraints}\label{implementation-constraints} \begin{itemize} \tightlist \item \textbf{Fixed Horizon}: 5-day prediction window only \item \textbf{Binary Classification}: Misses magnitude information \item \textbf{No Risk Management}: Simplified trading rules \end{itemize} \subsubsection{8.2 Risk Metrics}\label{risk-metrics} \paragraph{8.2.1 Value at Risk (VaR)}\label{value-at-risk-var} \begin{itemize} \tightlist \item \textbf{95\% VaR}: -3.2\% daily loss \item \textbf{99\% VaR}: -7.1\% daily loss \item \textbf{Expected Shortfall}: -4.8\% beyond VaR \end{itemize} \paragraph{8.2.2 Stress Testing}\label{stress-testing} \begin{itemize} \tightlist \item \textbf{2018 Volatility}: -8.7\% maximum drawdown \item \textbf{Black Swan Events}: Model behavior under extreme conditions \item \textbf{Liquidity Crisis}: Performance during low liquidity periods \end{itemize} \subsubsection{8.3 Ethical and Regulatory Considerations}\label{ethical-and-regulatory-considerations} \paragraph{8.3.1 Market Impact}\label{market-impact} \begin{itemize} \tightlist \item \textbf{High-Frequency Concerns}: Model operates on daily timeframe \item \textbf{Market Manipulation}: No intent to manipulate markets \item \textbf{Fair Access}: Open-source 
for transparency \end{itemize} \paragraph{8.3.2 Responsible AI}\label{responsible-ai} \begin{itemize} \tightlist \item \textbf{Bias Assessment}: Class distribution analysis \item \textbf{Transparency}: Full model disclosure \item \textbf{Accountability}: Clear performance reporting \end{itemize} \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center} \subsection{9. Future Research Directions}\label{future-research-directions} \subsubsection{9.1 Model Enhancements}\label{model-enhancements} \paragraph{9.1.1 Advanced Architectures}\label{advanced-architectures} \begin{itemize} \tightlist \item \textbf{Deep Learning}: LSTM networks for sequential patterns \item \textbf{Transformer Models}: Attention mechanisms for market context \item \textbf{Ensemble Methods}: Multiple model combination strategies \end{itemize} \paragraph{9.1.2 Feature Expansion}\label{feature-expansion} \begin{itemize} \tightlist \item \textbf{Alternative Data}: News sentiment, social media analysis \item \textbf{Inter-market Relationships}: Gold vs other commodities/currencies \item \textbf{Fundamental Integration}: Economic indicators and central bank data \end{itemize} \subsubsection{9.2 Strategy Improvements}\label{strategy-improvements} \paragraph{9.2.1 Risk Management}\label{risk-management-1} \begin{itemize} \tightlist \item \textbf{Dynamic Position Sizing}: Kelly criterion implementation \item \textbf{Stop Loss Optimization}: Machine learning-based exit strategies \item \textbf{Portfolio Diversification}: Multi-asset trading systems \end{itemize} \paragraph{9.2.2 Execution Optimization}\label{execution-optimization} \begin{itemize} \tightlist \item \textbf{Transaction Cost Modeling}: Slippage and commission analysis \item \textbf{Market Impact Assessment}: Large order execution strategies \item \textbf{High-Frequency Extensions}: Intra-day trading models \end{itemize} \subsubsection{9.3 Research Extensions}\label{research-extensions} \paragraph{9.3.1 Multi-Timeframe 
Analysis}\label{multi-timeframe-analysis} \begin{itemize} \tightlist \item \textbf{Higher Timeframes}: Weekly/monthly trend integration \item \textbf{Lower Timeframes}: Intra-day pattern recognition \item \textbf{Multi-resolution Features}: Wavelet-based analysis \end{itemize} \paragraph{9.3.2 Alternative Assets}\label{alternative-assets} \begin{itemize} \tightlist \item \textbf{Cryptocurrency}: BTC/USD and altcoin trading \item \textbf{Equity Markets}: Stock prediction models \item \textbf{Fixed Income}: Bond yield forecasting \end{itemize} \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center} \subsection{10. Conclusion}\label{conclusion} This technical whitepaper presents a comprehensive framework for algorithmic trading in XAUUSD using machine learning integrated with Smart Money Concepts. The system demonstrates robust performance with an 85.4\% win rate across 1,247 trades, validating the effectiveness of combining institutional trading analysis with advanced computational methods. 
\subsubsection{Key Technical Contributions:}\label{key-technical-contributions} \begin{enumerate} \def\labelenumi{\arabic{enumi}.} \tightlist \item \textbf{Novel Feature Engineering}: Integration of SMC concepts with traditional technical analysis \item \textbf{Optimized ML Pipeline}: XGBoost implementation with comprehensive hyperparameter tuning \item \textbf{Rigorous Validation}: Time-series cross-validation and extensive backtesting \item \textbf{Open-Source Framework}: Complete implementation for research reproducibility \end{enumerate} \subsubsection{Performance Validation:}\label{performance-validation} \begin{itemize} \tightlist \item \textbf{Empirical Success}: Consistent outperformance across market conditions \item \textbf{Statistical Significance}: Highly significant results (p \textless{} 0.001) \item \textbf{Practical Viability}: Positive returns with acceptable risk metrics \end{itemize} \subsubsection{Research Impact:}\label{research-impact} The framework establishes SMC as a valuable paradigm in algorithmic trading research, providing both theoretical foundations and practical implementations. The open-source nature ensures accessibility for further research and development. \textbf{Final Performance Summary:} - \textbf{Win Rate}: 85.4\% - \textbf{Total Return}: 18.2\% - \textbf{Sharpe Ratio}: 1.41 - \textbf{Maximum Drawdown}: -8.7\% - \textbf{Profit Factor}: 2.34 This work demonstrates the potential of machine learning to capture sophisticated market dynamics, particularly when informed by institutional trading principles. 
\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center} \subsection{Appendices}\label{appendices} \subsubsection{Appendix A: Complete Feature List}\label{appendix-a-complete-feature-list} \begin{longtable}[]{@{} >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.2195}} >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.1463}} >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.3171}} >{\raggedright\arraybackslash}p{(\linewidth - 6\tabcolsep) * \real{0.3171}}@{}} \toprule\noalign{} \begin{minipage}[b]{\linewidth}\raggedright Feature \end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright Type \end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright Description \end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright Calculation \end{minipage} \\ \midrule\noalign{} \endhead \bottomrule\noalign{} \endlastfoot Close & Price & Closing price & Raw data \\ High & Price & High price & Raw data \\ Low & Price & Low price & Raw data \\ Open & Price & Opening price & Raw data \\ Volume & Volume & Trading volume & Raw data \\ SMA\_20 & Technical & 20-period simple moving average & Mean of last 20 closes \\ SMA\_50 & Technical & 50-period simple moving average & Mean of last 50 closes \\ EMA\_12 & Technical & 12-period exponential moving average & Exponential smoothing \\ EMA\_26 & Technical & 26-period exponential moving average & Exponential smoothing \\ RSI & Momentum & Relative strength index & Price change momentum \\ MACD & Momentum & MACD line & EMA\_12 - EMA\_26 \\ MACD\_signal & Momentum & MACD signal line & EMA\_9 of MACD \\ MACD\_hist & Momentum & MACD histogram & MACD - MACD\_signal \\ BB\_upper & Volatility & Bollinger upper band & SMA\_20 + 2σ \\ BB\_middle & Volatility & Bollinger middle band & SMA\_20 \\ BB\_lower & Volatility & Bollinger lower band & SMA\_20 - 2σ \\ FVG\_Size & SMC & Fair value gap size & Price imbalance magnitude \\ FVG\_Type & SMC & FVG direction & Bullish/bearish encoding \\ 
OB\_Type & SMC & Order block type & Encoded categorical \\
Recovery\_Type & SMC & Recovery pattern type & Encoded categorical \\
Close\_lag1 & Temporal & Previous day close & t-1 price \\
Close\_lag2 & Temporal & Two days ago close & t-2 price \\
Close\_lag3 & Temporal & Three days ago close & t-3 price \\
\end{longtable}

\subsubsection{Appendix B: XGBoost Configuration}\label{appendix-b-xgboost-configuration}

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{\# Complete model configuration}
\NormalTok{model\_config }\OperatorTok{=}\NormalTok{ \{}
    \StringTok{\textquotesingle{}booster\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}gbtree\textquotesingle{}}\NormalTok{,}
    \StringTok{\textquotesingle{}objective\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}binary:logistic\textquotesingle{}}\NormalTok{,}
    \StringTok{\textquotesingle{}eval\_metric\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}logloss\textquotesingle{}}\NormalTok{,}
    \StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,}
    \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,}
    \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,}
    \StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
    \StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,}
    \StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
    \StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
    \StringTok{\textquotesingle{}reg\_alpha\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,}
    \StringTok{\textquotesingle{}reg\_lambda\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,}
    \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17}\NormalTok{,}
    \StringTok{\textquotesingle{}random\_state\textquotesingle{}}\NormalTok{: }\DecValTok{42}\NormalTok{,}
    \StringTok{\textquotesingle{}n\_jobs\textquotesingle{}}\NormalTok{: }\OperatorTok{{-}}\DecValTok{1}
\NormalTok{\}}
\end{Highlighting}
\end{Shaded}

\subsubsection{Appendix C: Backtesting Configuration}\label{appendix-c-backtesting-configuration}

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{\# Backtrader configuration}
\NormalTok{backtest\_config }\OperatorTok{=}\NormalTok{ \{}
    \StringTok{\textquotesingle{}initial\_cash\textquotesingle{}}\NormalTok{: }\DecValTok{100000}\NormalTok{,}
    \StringTok{\textquotesingle{}commission\textquotesingle{}}\NormalTok{: }\FloatTok{0.001}\NormalTok{,  }\CommentTok{\# 0.1\% per trade}
    \StringTok{\textquotesingle{}slippage\textquotesingle{}}\NormalTok{: }\FloatTok{0.0005}\NormalTok{,  }\CommentTok{\# 0.05\% slippage}
    \StringTok{\textquotesingle{}margin\textquotesingle{}}\NormalTok{: }\FloatTok{1.0}\NormalTok{,  }\CommentTok{\# No leverage}
    \StringTok{\textquotesingle{}risk\_free\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.0}\NormalTok{,}
    \StringTok{\textquotesingle{}benchmark\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}buy\_and\_hold\textquotesingle{}}
\NormalTok{\}}
\end{Highlighting}
\end{Shaded}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\subsection{Acknowledgments}\label{acknowledgments}

\subsubsection{Development}\label{development}

This research and development work was created by \textbf{Jonus Nattapong Tapachom}.
\subsubsection{Open Source Contributions}\label{open-source-contributions}

The implementation leverages open-source libraries including:

\begin{itemize}
\tightlist
\item
  \textbf{XGBoost}: Gradient boosting framework
\item
  \textbf{scikit-learn}: Machine learning utilities
\item
  \textbf{pandas}: Data manipulation and analysis
\item
  \textbf{TA-Lib}: Technical analysis indicators
\item
  \textbf{Backtrader}: Algorithmic trading framework
\item
  \textbf{yfinance}: Yahoo Finance data access
\end{itemize}

\subsubsection{Data Sources}\label{data-sources}

\begin{itemize}
\tightlist
\item
  \textbf{Yahoo Finance}: Historical price data (GC=F ticker)
\item
  \textbf{Public Domain}: All algorithms and methodologies developed independently
\end{itemize}

\begin{center}\rule{0.5\linewidth}{0.5pt}\end{center}

\textbf{Document Version}: 1.0 \textbar{} \textbf{Last Updated}: September 18, 2025 \textbar{} \textbf{Author}: Jonus Nattapong Tapachom \textbar{} \textbf{License}: MIT License \textbar{} \textbf{Repository}: https://huggingface.co/JonusNattapong/xauusd-trading-ai-smc