| \section{XAUUSD Trading AI: A Machine Learning Approach Using Smart | |
| Money | |
| Concepts}\label{xauusd-trading-ai-a-machine-learning-approach-using-smart-money-concepts} | |
| \textbf{Author: Jonus Nattapong Tapachom}\\ | |
| \textbf{Date: September 18, 2025} | |
| \subsection{Abstract}\label{abstract} | |
| This paper presents a comprehensive machine learning framework for | |
| predicting XAUUSD (Gold vs US Dollar) price movements using Smart Money | |
| Concepts (SMC) strategy elements. The proposed system achieves an 85.4\% | |
| win rate in backtesting across six years of historical data (2015-2020), | |
| demonstrating the effectiveness of combining technical analysis with | |
| advanced machine learning techniques. | |
| The model utilizes XGBoost classification to predict 5-day ahead price | |
| direction, incorporating 23 features including traditional technical | |
| indicators (SMA, EMA, RSI, MACD, Bollinger Bands) and SMC-specific | |
| features (Fair Value Gaps, Order Blocks, Recovery patterns). The system | |
| addresses class imbalance through strategic weighting and achieves | |
| robust performance across different market conditions. | |
| \textbf{Keywords}: Algorithmic Trading, Machine Learning, Smart Money | |
| Concepts, XAUUSD, XGBoost, Technical Analysis | |
| \subsection{1. Introduction}\label{introduction} | |
| \subsubsection{1.1 Background}\label{background} | |
| Algorithmic trading has revolutionized financial markets, enabling | |
| systematic execution of trading strategies with speed and precision | |
| previously unattainable by human traders. The foreign exchange (FX) | |
| market, particularly currency pairs involving commodities like gold | |
| (XAUUSD), presents unique challenges due to its 24/5 operation and | |
| sensitivity to global economic events. | |
| Smart Money Concepts (SMC) represent a relatively new paradigm in | |
| technical analysis, focusing on identifying institutional trading | |
| patterns rather than retail-driven price action. SMC principles | |
| emphasize understanding market structure, liquidity concepts, and | |
| institutional order flow. | |
| \subsubsection{1.2 Problem Statement}\label{problem-statement} | |
| Traditional technical analysis indicators often fail to capture the | |
| sophisticated strategies employed by institutional traders. This | |
| research addresses the gap by developing a machine learning model that | |
| incorporates SMC principles alongside conventional technical indicators | |
| to predict short-term price movements in XAUUSD. | |
| \subsubsection{1.3 Research Objectives}\label{research-objectives} | |
| \begin{enumerate} | |
| \def\labelenumi{\arabic{enumi}.} | |
| \tightlist | |
| \item | |
| Develop a comprehensive feature set combining SMC and technical | |
| indicators | |
| \item | |
| Implement and optimize an XGBoost-based prediction model | |
| \item | |
| Validate performance through rigorous backtesting | |
| \item | |
| Analyze model robustness across different market conditions | |
| \item | |
| Provide a reproducible framework for algorithmic trading research | |
| \end{enumerate} | |
| \subsubsection{1.4 Contributions}\label{contributions} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Novel integration of SMC concepts with machine learning | |
| \item | |
| Comprehensive feature engineering methodology | |
| \item | |
| Robust backtesting framework with yearly performance analysis | |
| \item | |
| Open-source implementation for research community | |
| \item | |
| Empirical validation of SMC effectiveness in algorithmic trading | |
| \end{itemize} | |
| \subsection{2. Literature Review}\label{literature-review} | |
| \subsubsection{2.1 Algorithmic Trading in FX | |
| Markets}\label{algorithmic-trading-in-fx-markets} | |
| Research in algorithmic trading has evolved from simple rule-based | |
| systems to sophisticated machine learning approaches. Studies by Kearns | |
| and Nevmyvaka (2013) demonstrated that machine learning techniques can | |
| significantly outperform traditional technical analysis in forex | |
| markets. More recent work by Dixon et al.~(2020) shows that deep | |
| learning models can capture complex market dynamics. | |
| \subsubsection{2.2 Smart Money Concepts}\label{smart-money-concepts} | |
| SMC methodology, popularized by ICT (Inner Circle Trader) concepts, | |
| focuses on identifying institutional trading behavior through market | |
| structure analysis. Key SMC elements include: | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| \textbf{Order Blocks}: Areas where significant buying/selling occurred | |
| \item | |
| \textbf{Fair Value Gaps}: Price imbalances between candles | |
| \item | |
| \textbf{Liquidity Concepts}: Understanding where institutional orders | |
| are placed | |
| \item | |
| \textbf{Market Structure}: Recognition of higher-timeframe trends | |
| \end{itemize} | |
| \subsubsection{2.3 Machine Learning in | |
| Trading}\label{machine-learning-in-trading} | |
| XGBoost has emerged as a powerful tool for financial prediction tasks. | |
| Chen and Guestrin (2016) demonstrated its effectiveness in various | |
| domains, including finance. Studies by Kraus and Feuerriegel (2017) show | |
| that gradient boosting methods outperform traditional statistical models | |
| in stock price prediction. | |
| \subsubsection{2.4 Gold Price Prediction}\label{gold-price-prediction} | |
| XAUUSD presents unique characteristics as both a commodity and currency | |
| pair. Research by Baur and Lucey (2010) highlights gold's safe-haven | |
| properties during market stress. Studies by Pierdzioch et al.~(2016) | |
| demonstrate that gold prices are influenced by multiple factors | |
| including interest rates, inflation expectations, and geopolitical | |
| events. | |
| \subsection{3. Methodology}\label{methodology} | |
| \subsubsection{3.1 Data Collection}\label{data-collection} | |
| \paragraph{3.1.1 Data Source}\label{data-source} | |
Historical price data was obtained from Yahoo Finance using the ticker
symbol ``GC=F'' (COMEX gold futures), used here as a proxy for spot
XAUUSD. The dataset spans January 2000 to December 2020, providing
approximately 21 years of daily price data.
| \paragraph{3.1.2 Data Preprocessing}\label{data-preprocessing} | |
Raw data included Open, High, Low, and Close prices and Volume.
Preprocessing steps included:
\begin{itemize}
\tightlist
\item
  Removal of missing values and outliers
\item
  Adjustment for corporate actions (minimal for futures)
\item
  Calculation of returns and volatility measures
\item
  Data quality validation
\end{itemize}
| \subsubsection{3.2 Feature Engineering}\label{feature-engineering} | |
| \paragraph{3.2.1 Technical Indicators}\label{technical-indicators} | |
Traditional technical indicators were calculated using the TA-Lib
library:

\textbf{Trend Indicators:}
\begin{itemize}
\tightlist
\item
  Simple Moving Averages (SMA): 20-day and 50-day periods
\item
  Exponential Moving Averages (EMA): 12-day and 26-day periods
\end{itemize}
\textbf{Momentum Indicators:}
\begin{itemize}
\tightlist
\item
  Relative Strength Index (RSI): 14-day period
\item
  Moving Average Convergence Divergence (MACD): Standard parameters
\end{itemize}
\textbf{Volatility Indicators:}
\begin{itemize}
\tightlist
\item
  Bollinger Bands: 20-day period, 2 standard deviations
\end{itemize}
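The moving-average and band calculations listed above can be sketched in plain Python. This is an illustration only and not the TA-Lib implementation used in the paper: the function names \texttt{sma}, \texttt{ema}, and \texttt{bollinger} are hypothetical, the EMA is seeded with the first price, and the bands use the population standard deviation, so the first values may differ from TA-Lib output.

```python
from statistics import mean, pstdev

def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(values[i + 1 - window:i + 1]))
    return out

def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    Assumption: the series is seeded with the first price.
    """
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for price in values[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out

def bollinger(values, window=20, num_std=2.0):
    """Return (lower, middle, upper) per bar; None until a full window exists."""
    bands = []
    for i in range(len(values)):
        if i + 1 < window:
            bands.append((None, None, None))
            continue
        chunk = values[i + 1 - window:i + 1]
        mid = mean(chunk)
        dev = pstdev(chunk)  # population standard deviation (assumption)
        bands.append((mid - num_std * dev, mid, mid + num_std * dev))
    return bands

closes = [10.0, 11.0, 12.0, 13.0, 14.0]
print(sma(closes, 3))  # [None, None, 11.0, 12.0, 13.0]
```

The leading \texttt{None} entries mark bars without a full window; such rows would normally be dropped before training.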
| \paragraph{3.2.2 SMC Feature | |
| Implementation}\label{smc-feature-implementation} | |
| \textbf{Fair Value Gaps (FVG):} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
\KeywordTok{def}\NormalTok{ calculate\_fvg(df):}
    \CommentTok{\# Positional access (.iloc) keeps the loop valid for a date{-}indexed frame.}
    \CommentTok{\# Bar i+1 is read, so a gap at bar i is only known after bar i+1 closes.}
\NormalTok{    high, low }\OperatorTok{=}\NormalTok{ df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{], df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{]}
\NormalTok{    gaps }\OperatorTok{=}\NormalTok{ []}
    \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{1}\NormalTok{, }\BuiltInTok{len}\NormalTok{(df) }\OperatorTok{{-}} \DecValTok{1}\NormalTok{):}
        \ControlFlowTok{if}\NormalTok{ low.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ high.iloc[i }\OperatorTok{{-}} \DecValTok{1}\NormalTok{] }\KeywordTok{and}\NormalTok{ low.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ high.iloc[i }\OperatorTok{+} \DecValTok{1}\NormalTok{]:}
            \CommentTok{\# Bullish FVG}
\NormalTok{            gap\_size }\OperatorTok{=}\NormalTok{ low.iloc[i] }\OperatorTok{{-}} \BuiltInTok{max}\NormalTok{(high.iloc[i }\OperatorTok{{-}} \DecValTok{1}\NormalTok{], high.iloc[i }\OperatorTok{+} \DecValTok{1}\NormalTok{])}
\NormalTok{            gaps.append(\{}\StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size, }\StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i\})}
        \ControlFlowTok{elif}\NormalTok{ high.iloc[i] }\OperatorTok{\textless{}}\NormalTok{ low.iloc[i }\OperatorTok{{-}} \DecValTok{1}\NormalTok{] }\KeywordTok{and}\NormalTok{ high.iloc[i] }\OperatorTok{\textless{}}\NormalTok{ low.iloc[i }\OperatorTok{+} \DecValTok{1}\NormalTok{]:}
            \CommentTok{\# Bearish FVG}
\NormalTok{            gap\_size }\OperatorTok{=} \BuiltInTok{min}\NormalTok{(low.iloc[i }\OperatorTok{{-}} \DecValTok{1}\NormalTok{], low.iloc[i }\OperatorTok{+} \DecValTok{1}\NormalTok{]) }\OperatorTok{{-}}\NormalTok{ high.iloc[i]}
\NormalTok{            gaps.append(\{}\StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bearish\textquotesingle{}}\NormalTok{, }\StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: gap\_size, }\StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i\})}
    \ControlFlowTok{return}\NormalTok{ gaps}
| \end{Highlighting} | |
| \end{Shaded} | |
| \textbf{Order Blocks:} Order blocks were identified by analyzing | |
| significant price movements and volume spikes, representing areas where | |
| institutional accumulation or distribution occurred. | |
| \textbf{Recovery Patterns:} Implemented as pullbacks within trending | |
| markets, identifying potential continuation patterns. | |
| \paragraph{3.2.3 Lag Features}\label{lag-features} | |
Price lag features were included to capture momentum and mean-reversion
effects:
\begin{itemize}
\tightlist
\item
  Close price lags: 1, 2, and 3 days
\item
  Return lags: 1, 2, and 3 days
\end{itemize}
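As an illustration of the lag features, the sketch below shifts a short price series so that row $t$ holds the value observed $k$ bars earlier. The helper names and the column names \texttt{close\_lag\_k} and \texttt{return\_lag\_k} are hypothetical, and simple (not log) returns are assumed since the text does not say which were used.

```python
def lag(values, k):
    """Shift a series by k bars so position t holds the value from t - k."""
    return ([None] * k + list(values))[:len(values)]

def simple_returns(closes):
    """One-bar simple returns; the first bar has no prior close."""
    return [None] + [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]

closes = [100.0, 102.0, 101.0, 104.0, 103.0]
returns = simple_returns(closes)

# Hypothetical column names; lags of 1, 2, and 3 days as described in the text.
features = {f"close_lag_{k}": lag(closes, k) for k in (1, 2, 3)}
features.update({f"return_lag_{k}": lag(returns, k) for k in (1, 2, 3)})

print(features["close_lag_1"])  # [None, 100.0, 102.0, 101.0, 104.0]
```

Because every feature at row $t$ uses only values from earlier bars, the lags themselves introduce no look-ahead.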
| \subsubsection{3.3 Target Variable | |
| Construction}\label{target-variable-construction} | |
| The prediction target was defined as binary classification for 5-day | |
| ahead price direction: | |
| \begin{verbatim} | |
| Target = 1 if Close[t+5] > Close[t] else 0 | |
| \end{verbatim} | |
| This represents whether the price will be higher or lower in 5 trading | |
| days. | |
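A minimal sketch of this labelling rule follows. The function name \texttt{make\_target} is hypothetical. The final five bars have no future close, so they are marked \texttt{None} here on the assumption that they would be dropped before training.

```python
def make_target(closes, horizon=5):
    """Label bar t with 1 if Close[t + horizon] > Close[t], else 0.

    Bars without a close `horizon` steps ahead are labelled None.
    """
    n = len(closes)
    return [
        (1 if closes[t + horizon] > closes[t] else 0) if t + horizon < n else None
        for t in range(n)
    ]

closes = [100, 101, 99, 102, 103, 104, 98, 105]
print(make_target(closes))  # [1, 0, 1, None, None, None, None, None]
```

Under this rule an unchanged price five days ahead counts as class 0.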
| \subsubsection{3.4 Model Development}\label{model-development} | |
| \paragraph{3.4.1 XGBoost Implementation}\label{xgboost-implementation} | |
| XGBoost was selected for its proven performance in financial prediction | |
| tasks. Key hyperparameters were optimized through grid search: | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
| \NormalTok{model\_params }\OperatorTok{=}\NormalTok{ \{} | |
| \StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,} | |
| \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,} | |
| \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,} | |
| \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17}\NormalTok{, }\CommentTok{\# Class balancing} | |
| \StringTok{\textquotesingle{}objective\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}binary:logistic\textquotesingle{}}\NormalTok{,} | |
| \StringTok{\textquotesingle{}eval\_metric\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}logloss\textquotesingle{}} | |
| \NormalTok{\}} | |
| \end{Highlighting} | |
| \end{Shaded} | |
| \paragraph{3.4.2 Class Balancing}\label{class-balancing} | |
| Given the slight class imbalance (54\% down, 46\% up), | |
| scale\_pos\_weight was calculated as: | |
| \begin{verbatim} | |
| scale_pos_weight = negative_samples / positive_samples = 0.54 / 0.46 ≈ 1.17 | |
| \end{verbatim} | |
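The same ratio can be computed directly from the label vector. The sketch below uses a synthetic list with the 54/46 split quoted above; the function name is hypothetical.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive samples, as used for class weighting."""
    negatives = sum(1 for y in labels if y == 0)
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("no positive samples in labels")
    return negatives / positives

# Synthetic labels matching the reported class split (54% down, 46% up).
labels = [0] * 54 + [1] * 46
print(round(scale_pos_weight(labels), 2))  # 1.17
```

To avoid leaking information from the evaluation period, the ratio would normally be computed on the training labels only.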
| \paragraph{3.4.3 Cross-Validation}\label{cross-validation} | |
| 3-fold time-series cross-validation was implemented to prevent data | |
| leakage while maintaining temporal order. | |
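The exact splitting scheme is not specified in the text. One common choice is an expanding window, in which every fold trains on all data before its test block. The sketch below is an assumption about how such splits could be generated, and the function name is hypothetical.

```python
def expanding_window_splits(n_samples, n_splits=3):
    """Yield (train_indices, test_indices) with training always before testing."""
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any remainder so that no sample is left unused.
        test_end = n_samples if k == n_splits else fold * (k + 1)
        yield list(range(train_end)), list(range(train_end, test_end))

for train_idx, test_idx in expanding_window_splits(10, n_splits=3):
    print(len(train_idx), test_idx)
```

Each test block lies strictly after its training block, which is the property that prevents future observations from leaking into model fitting.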
| \subsubsection{3.5 Backtesting Framework}\label{backtesting-framework} | |
| \paragraph{3.5.1 Strategy Implementation}\label{strategy-implementation} | |
A simple long/short strategy was implemented using Backtrader:
\begin{itemize}
\tightlist
\item
  Long position when prediction = 1 (price expected to rise)
\item
  Short position when prediction = 0 (price expected to fall)
\item
  Fixed position sizing (no risk management implemented)
\end{itemize}
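The rule can be illustrated without Backtrader. The sketch below is not the paper's backtest code. It assumes non-overlapping trades that are opened on a prediction and closed five bars later, with no transaction costs, and the function name is hypothetical.

```python
def strategy_returns(closes, predictions, horizon=5):
    """Per-trade returns for a fixed-size long/short rule.

    A trade opens at bar t and closes at bar t + horizon:
    long if predictions[t] == 1, short if predictions[t] == 0.
    """
    trades = []
    for t in range(0, len(closes) - horizon, horizon):  # non-overlapping trades
        move = closes[t + horizon] / closes[t] - 1.0
        direction = 1 if predictions[t] == 1 else -1
        trades.append(direction * move)
    return trades

# Hypothetical prices: a rise over the first five bars, then a fall.
closes = [100.0, 101.0, 102.0, 103.0, 104.0, 110.0, 109.0, 108.0, 107.0, 106.0, 99.0]
predictions = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print([round(r, 3) for r in strategy_returns(closes, predictions)])  # [0.1, 0.1]
```

Both trades are profitable in this example because the long precedes a rise and the short precedes a fall.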
| \paragraph{3.5.2 Performance Metrics}\label{performance-metrics} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Win Rate: Percentage of profitable trades | |
| \item | |
| Total Return: Cumulative portfolio return | |
| \item | |
| Sharpe Ratio: Risk-adjusted return measure | |
| \item | |
| Maximum Drawdown: Largest peak-to-trough decline | |
| \end{itemize} | |
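The first two metrics can be computed from a list of per-trade returns, as in the sketch below. Compounding with full reinvestment is an assumption, the sample returns are invented for illustration, and the function names are hypothetical.

```python
def win_rate(trade_returns):
    """Fraction of trades with a strictly positive return."""
    wins = sum(1 for r in trade_returns if r > 0)
    return wins / len(trade_returns)

def total_return(trade_returns):
    """Cumulative return from compounding each trade's return."""
    equity = 1.0
    for r in trade_returns:
        equity *= 1.0 + r
    return equity - 1.0

trades = [0.02, -0.01, 0.03, 0.01]  # illustrative values only
print(win_rate(trades))                # 0.75
print(round(total_return(trades), 4))  # 0.0505
```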
| \subsection{4. System Architecture and Data | |
| Flow}\label{system-architecture-and-data-flow} | |
| \subsubsection{4.1 Dataset Flow Diagram}\label{dataset-flow-diagram} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
| \NormalTok{graph TD} | |
| \NormalTok{ A[Yahoo Finance API\textless{}br/\textgreater{}GC=F Ticker] {-}{-}\textgreater{} B[Raw Data Collection\textless{}br/\textgreater{}2000{-}2020]} | |
| \NormalTok{ B {-}{-}\textgreater{} C[Data Preprocessing\textless{}br/\textgreater{}Missing Values, Outliers]} | |
| \NormalTok{ C {-}{-}\textgreater{} D[Feature Engineering\textless{}br/\textgreater{}23 Features]} | |
| \NormalTok{ D {-}{-}\textgreater{} E[Technical Indicators]} | |
| \NormalTok{ D {-}{-}\textgreater{} F[SMC Features]} | |
| \NormalTok{ D {-}{-}\textgreater{} G[Lag Features]} | |
| \NormalTok{ E {-}{-}\textgreater{} H[Target Creation\textless{}br/\textgreater{}5{-}Day Ahead Direction]} | |
| \NormalTok{ F {-}{-}\textgreater{} H} | |
| \NormalTok{ G {-}{-}\textgreater{} H} | |
| \NormalTok{ H {-}{-}\textgreater{} I[Train/Test Split\textless{}br/\textgreater{}80/20 Temporal]} | |
| \NormalTok{ I {-}{-}\textgreater{} J[XGBoost Training\textless{}br/\textgreater{}Hyperparameter Optimization]} | |
| \NormalTok{ J {-}{-}\textgreater{} K[Model Validation\textless{}br/\textgreater{}Cross{-}Validation]} | |
| \NormalTok{ K {-}{-}\textgreater{} L[Backtesting\textless{}br/\textgreater{}2015{-}2020]} | |
| \NormalTok{ L {-}{-}\textgreater{} M[Performance Analysis\textless{}br/\textgreater{}Risk Metrics, Returns]} | |
| \NormalTok{ style A fill:\#e1f5fe} | |
| \NormalTok{ style M fill:\#c8e6c9} | |
| \end{Highlighting} | |
| \end{Shaded} | |
| \subsubsection{4.2 Model Architecture | |
| Diagram}\label{model-architecture-diagram} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
| \NormalTok{graph TD} | |
| \NormalTok{ A[Input Features\textless{}br/\textgreater{}23 Dimensions] {-}{-}\textgreater{} B[Feature Scaling\textless{}br/\textgreater{}StandardScaler]} | |
| \NormalTok{ B {-}{-}\textgreater{} C[XGBoost Ensemble\textless{}br/\textgreater{}200 Trees]} | |
| \NormalTok{ C {-}{-}\textgreater{} D[Tree 1\textless{}br/\textgreater{}Max Depth 7]} | |
| \NormalTok{ C {-}{-}\textgreater{} E[Tree 2\textless{}br/\textgreater{}Max Depth 7]} | |
| \NormalTok{ C {-}{-}\textgreater{} F[Tree N\textless{}br/\textgreater{}Max Depth 7]} | |
| \NormalTok{ D {-}{-}\textgreater{} G[Weighted Voting\textless{}br/\textgreater{}Gradient Boosting]} | |
| \NormalTok{ E {-}{-}\textgreater{} G} | |
| \NormalTok{ F {-}{-}\textgreater{} G} | |
| \NormalTok{ G {-}{-}\textgreater{} H[Probability Output\textless{}br/\textgreater{}0.0 {-} 1.0]} | |
| \NormalTok{ H {-}{-}\textgreater{} I[Decision Threshold\textless{}br/\textgreater{}Dynamic Adjustment]} | |
| \NormalTok{ I {-}{-}\textgreater{} J[Trading Signal\textless{}br/\textgreater{}Buy/Sell/Hold]} | |
| \NormalTok{ J {-}{-}\textgreater{} K[Position Sizing\textless{}br/\textgreater{}Risk Management]} | |
| \NormalTok{ K {-}{-}\textgreater{} L[Order Execution\textless{}br/\textgreater{}Backtrader Framework]} | |
| \NormalTok{ style C fill:\#fff3e0} | |
| \NormalTok{ style J fill:\#c8e6c9} | |
| \end{Highlighting} | |
| \end{Shaded} | |
| \subsubsection{4.3 Buy/Sell Workflow | |
| Diagram}\label{buysell-workflow-diagram} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
| \NormalTok{graph TD} | |
| \NormalTok{ A[Market Data\textless{}br/\textgreater{}Real{-}time] {-}{-}\textgreater{} B[Feature Calculation\textless{}br/\textgreater{}23 Features]} | |
| \NormalTok{ B {-}{-}\textgreater{} C[Model Prediction\textless{}br/\textgreater{}XGBoost Probability]} | |
| \NormalTok{ C {-}{-}\textgreater{} D\{Probability \textgreater{} Threshold?\}} | |
| \NormalTok{ D {-}{-}\textgreater{}|Yes| E[Signal Strength Check]} | |
| \NormalTok{ D {-}{-}\textgreater{}|No| F[Hold Position\textless{}br/\textgreater{}No Action]} | |
| \NormalTok{ E {-}{-}\textgreater{} G\{Strong Signal?\}} | |
| \NormalTok{ G {-}{-}\textgreater{}|Yes| H[Calculate Position Size\textless{}br/\textgreater{}Risk Management]} | |
| \NormalTok{ G {-}{-}\textgreater{}|No| I[Reduce Position Size\textless{}br/\textgreater{}Conservative Approach]} | |
| \NormalTok{ H {-}{-}\textgreater{} J\{Existing Position?\}} | |
| \NormalTok{ I {-}{-}\textgreater{} J} | |
| \NormalTok{ J {-}{-}\textgreater{}|No Position| K[Enter New Trade]} | |
| \NormalTok{ J {-}{-}\textgreater{}|Long Position| L\{Prediction Direction\}} | |
| \NormalTok{ J {-}{-}\textgreater{}|Short Position| M\{Prediction Direction\}} | |
| \NormalTok{ L {-}{-}\textgreater{}|Bullish| N[Hold Long]} | |
| \NormalTok{ L {-}{-}\textgreater{}|Bearish| O[Close Long\textless{}br/\textgreater{}Enter Short]} | |
| \NormalTok{ M {-}{-}\textgreater{}|Bearish| P[Hold Short]} | |
| \NormalTok{ M {-}{-}\textgreater{}|Bullish| Q[Close Short\textless{}br/\textgreater{}Enter Long]} | |
| \NormalTok{ K {-}{-}\textgreater{} R[Order Execution\textless{}br/\textgreater{}Market Order]} | |
| \NormalTok{ O {-}{-}\textgreater{} R} | |
| \NormalTok{ Q {-}{-}\textgreater{} R} | |
| \NormalTok{ R {-}{-}\textgreater{} S[Position Monitoring\textless{}br/\textgreater{}Stop Loss Check]} | |
| \NormalTok{ S {-}{-}\textgreater{} T\{Stop Loss Hit?\}} | |
| \NormalTok{ T {-}{-}\textgreater{}|Yes| U[Emergency Close\textless{}br/\textgreater{}Risk Control]} | |
| \NormalTok{ T {-}{-}\textgreater{}|No| V[Continue Holding\textless{}br/\textgreater{}Next Bar]} | |
| \NormalTok{ U {-}{-}\textgreater{} W[Trade Logging\textless{}br/\textgreater{}Performance Tracking]} | |
| \NormalTok{ V {-}{-}\textgreater{} W} | |
| \NormalTok{ F {-}{-}\textgreater{} W} | |
| \NormalTok{ style D fill:\#fff3e0} | |
| \NormalTok{ style R fill:\#c8e6c9} | |
| \end{Highlighting} | |
| \end{Shaded} | |
\subsection{5. Discussion}\label{discussion}
| \subsubsection{5.1 Position Sizing and Risk | |
| Management}\label{position-sizing-and-risk-management} | |
| \paragraph{5.1.1 Kelly Criterion | |
| Adaptation}\label{kelly-criterion-adaptation} | |
| The position sizing incorporates a modified Kelly Criterion for optimal | |
| capital allocation: | |
| \begin{verbatim} | |
| Position Size = Account Balance × Risk Percentage × Win Rate Adjustment | |
| \end{verbatim} | |
where:
\begin{itemize}
\tightlist
\item
  \textbf{Account Balance}: Current portfolio value (\$10,000 initial)
\item
  \textbf{Risk Percentage}: 1\% per trade (conservative approach)
\item
  \textbf{Win Rate Adjustment}: √(Win Rate) for volatility scaling
\end{itemize}
\textbf{Calculated Position Size}: \$10,000 × 0.01 × √(0.854) ≈ \$92
per trade
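Evaluating the position-size formula as written, with the stated inputs, gives roughly \$92 per trade. The sketch below only reproduces that arithmetic, and the function name is hypothetical.

```python
import math

def position_size(balance, risk_pct, win_rate):
    """Position Size = Account Balance x Risk Percentage x sqrt(Win Rate)."""
    return balance * risk_pct * math.sqrt(win_rate)

print(round(position_size(10_000, 0.01, 0.854), 2))  # 92.41
```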
| \paragraph{5.1.2 Kelly Fraction Formula}\label{kelly-fraction-formula} | |
\begin{verbatim}
Kelly Fraction = (Win Rate × Odds - Loss Rate) / Odds
\end{verbatim}
where:
\begin{itemize}
\tightlist
\item
  \textbf{Win Rate (p)}: 0.854
\item
  \textbf{Odds (b)}: Average Win/Loss Ratio = 1.45
\item
  \textbf{Loss Rate (q)}: 1 - p = 0.146
\end{itemize}
\textbf{Kelly Fraction}: (0.854 × 1.45 - 0.146) / 1.45 ≈ 0.75 (reduced
to 20\% for safety)
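For reference, the classical Kelly fraction is $f^{*} = (bp - q)/b$, where $b$ is the win/loss ratio, $p$ the win rate, and $q = 1 - p$. The sketch below evaluates it with the inputs quoted above and then applies the 20\% cap. The function names and the floor at zero are assumptions.

```python
def kelly_fraction(win_rate, payoff_ratio):
    """Classical Kelly fraction f* = (b*p - q) / b, with q = 1 - p."""
    loss_rate = 1.0 - win_rate
    return (payoff_ratio * win_rate - loss_rate) / payoff_ratio

def capped_kelly(win_rate, payoff_ratio, cap=0.20):
    """Kelly fraction limited to [0, cap]; the 20% cap follows the text."""
    return max(0.0, min(cap, kelly_fraction(win_rate, payoff_ratio)))

print(round(kelly_fraction(0.854, 1.45), 3))  # 0.753
print(capped_kelly(0.854, 1.45))              # 0.2
```

With these inputs the uncapped fraction is about 75\% of capital, so the 20\% cap is the binding constraint.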
| \subsubsection{5.2 Risk-Adjusted Performance | |
| Metrics}\label{risk-adjusted-performance-metrics} | |
| \paragraph{5.2.1 Sharpe Ratio | |
| Calculation}\label{sharpe-ratio-calculation} | |
| \begin{verbatim} | |
| Sharpe Ratio = (Rp - Rf) / σp | |
| \end{verbatim} | |
where:
\begin{itemize}
\tightlist
\item
  \textbf{Rp}: Portfolio return (18.2\%)
\item
  \textbf{Rf}: Risk-free rate (0\% for simplicity)
\item
  \textbf{σp}: Portfolio volatility (12.9\%)
\end{itemize}
\textbf{Result}: 18.2\% / 12.9\% = 1.41
| \paragraph{5.2.2 Sortino Ratio (Downside | |
| Deviation)}\label{sortino-ratio-downside-deviation} | |
| \begin{verbatim} | |
| Sortino Ratio = (Rp - Rf) / σd | |
| \end{verbatim} | |
where \textbf{σd} is the downside deviation (8.7\%).

\textbf{Result}: 18.2\% / 8.7\% = 2.09
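Both ratios can also be computed from a series of periodic returns. The sketch below uses the sample standard deviation for the Sharpe ratio and measures downside deviation against the risk-free rate. These conventions, the function names, and the absence of annualization are assumptions. The last two lines reproduce the figures quoted above.

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    mean_excess = sum(excess) / len(excess)
    variance = sum((r - mean_excess) ** 2 for r in excess) / (len(excess) - 1)
    return mean_excess / math.sqrt(variance)

def sortino_ratio(returns, risk_free=0.0):
    """Mean excess return divided by downside deviation (only negative excess counts)."""
    excess = [r - risk_free for r in returns]
    mean_excess = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in excess) / len(excess))
    return mean_excess / downside if downside > 0 else math.inf

# Plugging in the reported return, volatility, and downside deviation.
print(round(0.182 / 0.129, 2))  # 1.41
print(round(0.182 / 0.087, 2))  # 2.09
```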
| \paragraph{5.2.3 Maximum Drawdown | |
| Formula}\label{maximum-drawdown-formula} | |
| \begin{verbatim} | |
| MDD = max_{t∈[0,T]} (Peak_t - Value_t) / Peak_t | |
| \end{verbatim} | |
\textbf{2018 MDD Calculation}:
\begin{itemize}
\tightlist
\item
  Peak Value: \$10,000 (Jan 2018)
\item
  Trough Value: \$9,130 (Dec 2018)
\item
  MDD: (\$10,000 - \$9,130) / \$10,000 = 8.7\%
\end{itemize}
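The drawdown formula translates into a single pass over the equity curve that tracks the running peak. In the sketch below only the peak (\$10,000) and trough (\$9,130) come from the text; the intermediate values are invented for illustration and the function name is hypothetical.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

# Peak and trough from the 2018 example; the middle values are hypothetical.
curve = [10_000, 9_800, 9_500, 9_130]
print(round(max_drawdown(curve), 3))  # 0.087
```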
| \paragraph{5.2.4 Calmar Ratio}\label{calmar-ratio} | |
| \begin{verbatim} | |
| Calmar Ratio = Annual Return / Maximum Drawdown | |
| \end{verbatim} | |
| \textbf{Result}: 3.0\% / 8.7\% = 0.34 (moderate risk-adjusted return) | |
| \subsubsection{5.3 Advanced SMC Implementation | |
| Techniques}\label{advanced-smc-implementation-techniques} | |
| \paragraph{5.3.1 Fair Value Gap Detection | |
| Algorithm}\label{fair-value-gap-detection-algorithm} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
\KeywordTok{def}\NormalTok{ advanced\_fvg\_detection(prices\_df, volume\_df, lookback}\OperatorTok{=}\DecValTok{5}\NormalTok{):}
    \CommentTok{"""}
\CommentTok{    Advanced FVG detection with volume confirmation (bullish gaps only)}
\CommentTok{    """}
\NormalTok{    fvgs }\OperatorTok{=}\NormalTok{ []}
    \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(lookback, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}}\NormalTok{ lookback):}
        \CommentTok{\# Identify potential gap}
        \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\DecValTok{1}\NormalTok{]:}
            \CommentTok{\# Check for imbalance; right\_max reads the next lookback bars,}
            \CommentTok{\# so a gap at bar i is only confirmed lookback bars later}
\NormalTok{            left\_max }\OperatorTok{=} \BuiltInTok{max}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i])}
\NormalTok{            right\_max }\OperatorTok{=} \BuiltInTok{max}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{+}\DecValTok{1}\NormalTok{:i}\OperatorTok{+}\NormalTok{lookback}\OperatorTok{+}\DecValTok{1}\NormalTok{])}
            \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ left\_max }\KeywordTok{and}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ right\_max:}
                \CommentTok{\# Volume confirmation}
\NormalTok{                avg\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].mean()}
                \ControlFlowTok{if}\NormalTok{ volume\_df.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ avg\_volume }\OperatorTok{*} \FloatTok{0.8}\NormalTok{:  }\CommentTok{\# Moderate volume}
\NormalTok{                    fvgs.append(\{}
                        \StringTok{\textquotesingle{}type\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}bullish\textquotesingle{}}\NormalTok{,}
                        \StringTok{\textquotesingle{}size\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}} \BuiltInTok{max}\NormalTok{(left\_max, right\_max),}
                        \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
                        \StringTok{\textquotesingle{}strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ volume\_df.iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ avg\_volume }\OperatorTok{*} \FloatTok{1.2} \ControlFlowTok{else} \StringTok{\textquotesingle{}moderate\textquotesingle{}}
\NormalTok{                    \})}
    \ControlFlowTok{return}\NormalTok{ fvgs}
| \end{Highlighting} | |
| \end{Shaded} | |
| \paragraph{5.3.2 Order Block Detection with Volume | |
| Profile}\label{order-block-detection-with-volume-profile} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
\KeywordTok{def}\NormalTok{ advanced\_order\_block\_detection(prices\_df, volume\_df, lookback}\OperatorTok{=}\DecValTok{20}\NormalTok{):}
    \CommentTok{"""}
\CommentTok{    Advanced Order Block detection with volume profile analysis}
\CommentTok{    """}
\NormalTok{    order\_blocks }\OperatorTok{=}\NormalTok{ []}
    \ControlFlowTok{for}\NormalTok{ i }\KeywordTok{in} \BuiltInTok{range}\NormalTok{(lookback, }\BuiltInTok{len}\NormalTok{(prices\_df) }\OperatorTok{{-}} \DecValTok{5}\NormalTok{):}
        \CommentTok{\# Volume analysis}
\NormalTok{        avg\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].mean()}
\NormalTok{        current\_volume }\OperatorTok{=}\NormalTok{ volume\_df.iloc[i]}
        \CommentTok{\# Price action analysis}
\NormalTok{        high\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{max}\NormalTok{()}
\NormalTok{        low\_swing }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i}\OperatorTok{{-}}\NormalTok{lookback:i].}\BuiltInTok{min}\NormalTok{()}
\NormalTok{        current\_range }\OperatorTok{=}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}High\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Low\textquotesingle{}}\NormalTok{].iloc[i]}
        \CommentTok{\# Order block criteria}
\NormalTok{        volume\_spike }\OperatorTok{=}\NormalTok{ current\_volume }\OperatorTok{\textgreater{}}\NormalTok{ avg\_volume }\OperatorTok{*} \FloatTok{1.5}
\NormalTok{        range\_expansion }\OperatorTok{=}\NormalTok{ current\_range }\OperatorTok{\textgreater{}}\NormalTok{ (high\_swing }\OperatorTok{{-}}\NormalTok{ low\_swing) }\OperatorTok{*} \FloatTok{0.5}
\NormalTok{        price\_rejection }\OperatorTok{=} \BuiltInTok{abs}\NormalTok{(prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{{-}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i]) }\OperatorTok{\textgreater{}}\NormalTok{ current\_range }\OperatorTok{*} \FloatTok{0.6}
        \ControlFlowTok{if}\NormalTok{ volume\_spike }\KeywordTok{and}\NormalTok{ range\_expansion }\KeywordTok{and}\NormalTok{ price\_rejection:}
\NormalTok{            direction }\OperatorTok{=} \StringTok{\textquotesingle{}bullish\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i] }\OperatorTok{\textgreater{}}\NormalTok{ prices\_df[}\StringTok{\textquotesingle{}Open\textquotesingle{}}\NormalTok{].iloc[i] }\ControlFlowTok{else} \StringTok{\textquotesingle{}bearish\textquotesingle{}}
\NormalTok{            order\_blocks.append(\{}
                \StringTok{\textquotesingle{}index\textquotesingle{}}\NormalTok{: i,}
                \StringTok{\textquotesingle{}direction\textquotesingle{}}\NormalTok{: direction,}
                \StringTok{\textquotesingle{}entry\_price\textquotesingle{}}\NormalTok{: prices\_df[}\StringTok{\textquotesingle{}Close\textquotesingle{}}\NormalTok{].iloc[i],}
                \StringTok{\textquotesingle{}volume\_ratio\textquotesingle{}}\NormalTok{: current\_volume }\OperatorTok{/}\NormalTok{ avg\_volume,}
                \StringTok{\textquotesingle{}strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}}
\NormalTok{            \})}
    \ControlFlowTok{return}\NormalTok{ order\_blocks}
| \end{Highlighting} | |
| \end{Shaded} | |
| \paragraph{5.3.3 Dynamic Threshold | |
| Adjustment}\label{dynamic-threshold-adjustment} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
\KeywordTok{def}\NormalTok{ dynamic\_threshold\_adjustment(predictions, market\_volatility, recent\_performance):}
    \CommentTok{"""}
\CommentTok{    Adjust prediction threshold based on market conditions and recent performance}
\CommentTok{    """}
\NormalTok{    base\_threshold }\OperatorTok{=} \FloatTok{0.5}
    \CommentTok{\# Volatility adjustment}
    \ControlFlowTok{if}\NormalTok{ market\_volatility }\OperatorTok{\textgreater{}} \FloatTok{0.02}\NormalTok{:  }\CommentTok{\# High volatility}
\NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{+} \FloatTok{0.1}  \CommentTok{\# More conservative}
    \ControlFlowTok{elif}\NormalTok{ market\_volatility }\OperatorTok{\textless{}} \FloatTok{0.01}\NormalTok{:  }\CommentTok{\# Low volatility}
\NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold }\OperatorTok{{-}} \FloatTok{0.05}  \CommentTok{\# More aggressive}
    \ControlFlowTok{else}\NormalTok{:}
\NormalTok{        adjusted\_threshold }\OperatorTok{=}\NormalTok{ base\_threshold}
    \CommentTok{\# Recent performance adjustment}
    \ControlFlowTok{if}\NormalTok{ recent\_performance }\OperatorTok{\textgreater{}} \FloatTok{0.6}\NormalTok{:}
\NormalTok{        adjusted\_threshold }\OperatorTok{{-}=} \FloatTok{0.05}  \CommentTok{\# More aggressive}
    \ControlFlowTok{elif}\NormalTok{ recent\_performance }\OperatorTok{\textless{}} \FloatTok{0.4}\NormalTok{:}
\NormalTok{        adjusted\_threshold }\OperatorTok{+=} \FloatTok{0.1}  \CommentTok{\# More conservative}
    \ControlFlowTok{return} \BuiltInTok{max}\NormalTok{(}\FloatTok{0.3}\NormalTok{, }\BuiltInTok{min}\NormalTok{(}\FloatTok{0.8}\NormalTok{, adjusted\_threshold))  }\CommentTok{\# Bound between 0.3{-}0.8}
| \end{Highlighting} | |
| \end{Shaded} | |
| \subsubsection{5.4 Ensemble Signal Confirmation | |
| Framework}\label{ensemble-signal-confirmation-framework} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
\ImportTok{from}\NormalTok{ statistics }\ImportTok{import}\NormalTok{ pvariance}

\KeywordTok{def}\NormalTok{ ensemble\_signal\_confirmation(ml\_prediction, technical\_signals, smc\_signals):}
    \CommentTok{"""}
\CommentTok{    Combine multiple signal sources for robust decision making}
\CommentTok{    """}
    \CommentTok{\# Weights for different signal sources}
\NormalTok{    ml\_weight }\OperatorTok{=} \FloatTok{0.6}
\NormalTok{    technical\_weight }\OperatorTok{=} \FloatTok{0.25}
\NormalTok{    smc\_weight }\OperatorTok{=} \FloatTok{0.15}
    \CommentTok{\# Normalize signals to 0{-}1 scale}
\NormalTok{    ml\_signal }\OperatorTok{=}\NormalTok{ ml\_prediction[}\StringTok{\textquotesingle{}probability\textquotesingle{}}\NormalTok{]}
\NormalTok{    technical\_signal }\OperatorTok{=}\NormalTok{ technical\_signals[}\StringTok{\textquotesingle{}composite\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{100}
\NormalTok{    smc\_signal }\OperatorTok{=}\NormalTok{ smc\_signals[}\StringTok{\textquotesingle{}strength\_score\textquotesingle{}}\NormalTok{] }\OperatorTok{/} \DecValTok{10}
    \CommentTok{\# Weighted ensemble}
\NormalTok{    ensemble\_score }\OperatorTok{=}\NormalTok{ (ml\_weight }\OperatorTok{*}\NormalTok{ ml\_signal }\OperatorTok{+}
\NormalTok{                      technical\_weight }\OperatorTok{*}\NormalTok{ technical\_signal }\OperatorTok{+}
\NormalTok{                      smc\_weight }\OperatorTok{*}\NormalTok{ smc\_signal)}
    \CommentTok{\# Confidence calculation based on signal variance}
    \CommentTok{\# (population variance assumed; the original helper is not shown)}
\NormalTok{    signal\_variance }\OperatorTok{=}\NormalTok{ pvariance([ml\_signal, technical\_signal, smc\_signal])}
\NormalTok{    confidence }\OperatorTok{=} \DecValTok{1} \OperatorTok{/}\NormalTok{ (}\DecValTok{1} \OperatorTok{+}\NormalTok{ signal\_variance)}
    \ControlFlowTok{return}\NormalTok{ \{}
        \StringTok{\textquotesingle{}ensemble\_score\textquotesingle{}}\NormalTok{: ensemble\_score,}
        \StringTok{\textquotesingle{}confidence\textquotesingle{}}\NormalTok{: confidence,}
        \StringTok{\textquotesingle{}signal\_strength\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}strong\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.65} \ControlFlowTok{else} \StringTok{\textquotesingle{}moderate\textquotesingle{}} \ControlFlowTok{if}\NormalTok{ ensemble\_score }\OperatorTok{\textgreater{}} \FloatTok{0.55} \ControlFlowTok{else} \StringTok{\textquotesingle{}weak\textquotesingle{}}
\NormalTok{    \}}
| \end{Highlighting} | |
| \end{Shaded} | |
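The function above relies on \texttt{calculate\_signal\_variance}, which is not shown. The following minimal sketch assumes it is the population variance of the three normalized signals and runs the same weighting on made-up inputs; the helper body and all input values are assumptions for illustration only.

```python
from statistics import pvariance


def calculate_signal_variance(signals):
    """Assumed helper: population variance of the normalized signals."""
    return pvariance(signals)


def ensemble_signal_confirmation(ml_prediction, technical_signals, smc_signals):
    # Normalize every source to the 0-1 range, as in the listing above
    ml_signal = ml_prediction['probability']
    technical_signal = technical_signals['composite_score'] / 100
    smc_signal = smc_signals['strength_score'] / 10

    # Weighted ensemble with the weights from the listing (0.6 / 0.25 / 0.15)
    ensemble_score = 0.6 * ml_signal + 0.25 * technical_signal + 0.15 * smc_signal

    # Agreement between sources: low variance gives confidence close to 1
    variance = calculate_signal_variance([ml_signal, technical_signal, smc_signal])
    confidence = 1 / (1 + variance)

    if ensemble_score > 0.65:
        strength = 'strong'
    elif ensemble_score > 0.55:
        strength = 'moderate'
    else:
        strength = 'weak'
    return {'ensemble_score': ensemble_score,
            'confidence': confidence,
            'signal_strength': strength}


# Illustrative inputs (not taken from the backtest)
result = ensemble_signal_confirmation({'probability': 0.72},
                                      {'composite_score': 64},
                                      {'strength_score': 7})
print(result)
```

With these inputs the score is $0.6 \times 0.72 + 0.25 \times 0.64 + 0.15 \times 0.7 = 0.697$, which falls in the strong band. Under the variance assumption used here, the variance of values in $[0, 1]$ cannot exceed 0.25, so this confidence measure never drops below 0.8 and discriminates only weakly between agreeing and disagreeing sources.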
| \subsection{6. Experimental Results}\label{experimental-results} | |
| \subsubsection{6.1 Model Performance}\label{model-performance} | |
| \paragraph{6.1.1 Training Results}\label{training-results} | |
| The model achieved 80.3\% accuracy on the test set with the following | |
| metrics: | |
| \begin{longtable}[]{@{}ll@{}} | |
| \toprule\noalign{} | |
| Metric & Value \\ | |
| \midrule\noalign{} | |
| \endhead | |
| \bottomrule\noalign{} | |
| \endlastfoot | |
| Accuracy & 80.3\% \\ | |
| Precision (Class 1) & 71\% \\ | |
| Recall (Class 1) & 81\% \\ | |
| F1-Score & 76\% \\ | |
| \end{longtable} | |
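The reported F1-score can be checked against the precision $P$ and recall $R$ in the table, since F1 is their harmonic mean:

\[
F_1 = \frac{2 P R}{P + R} = \frac{2 \times 0.71 \times 0.81}{0.71 + 0.81} = \frac{1.1502}{1.52} \approx 0.757,
\]

which rounds to the 76\% shown above.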
| \paragraph{6.1.2 Feature Importance}\label{feature-importance} | |
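| The five most important features were: | |
| \begin{enumerate} | |
| \def\labelenumi{\arabic{enumi}.} | |
| \tightlist | |
| \item | |
| Close\_lag1 (15.2\%) | |
| \item | |
| FVG\_Size (12.8\%) | |
| \item | |
| RSI (11.5\%) | |
| \item | |
| OB\_Type\_Encoded (9.7\%) | |
| \item | |
| MACD (8.9\%) | |
| \end{enumerate} | |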
| \subsubsection{6.2 Backtesting Results}\label{backtesting-results} | |
| \paragraph{6.2.1 Overall Performance}\label{overall-performance} | |
| The strategy demonstrated robust performance across the 2015-2020 | |
| period: | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| \textbf{Total Win Rate}: 85.4\% | |
| \item | |
| \textbf{Total Return}: 18.2\% | |
| \item | |
| \textbf{Sharpe Ratio}: 1.41 | |
| \item | |
| \textbf{Total Trades}: 1,247 | |
| \end{itemize} | |
| \paragraph{6.2.2 Yearly Analysis}\label{yearly-analysis} | |
| \begin{longtable}[]{@{}llll@{}} | |
| \toprule\noalign{} | |
| Year & Win Rate & Return & Trades \\ | |
| \midrule\noalign{} | |
| \endhead | |
| \bottomrule\noalign{} | |
| \endlastfoot | |
| 2015 & 62.5\% & 3.2\% & 189 \\ | |
| 2016 & 100.0\% & 8.1\% & 203 \\ | |
| 2017 & 100.0\% & 7.3\% & 198 \\ | |
| 2018 & 72.7\% & -1.2\% & 187 \\ | |
| 2019 & 76.9\% & 4.8\% & 195 \\ | |
| 2020 & 94.1\% & 6.2\% & 275 \\ | |
| \end{longtable} | |
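The yearly table can be cross-checked against the overall figures reported in Section 6.2.1. The sketch below assumes that the overall win rate is a trade-weighted average of the yearly win rates and that yearly returns compound; the paper does not state either aggregation rule, so both are assumptions.

```python
# Yearly figures copied from the table above: (win rate %, return %, trades)
yearly = {
    2015: (62.5, 3.2, 189),
    2016: (100.0, 8.1, 203),
    2017: (100.0, 7.3, 198),
    2018: (72.7, -1.2, 187),
    2019: (76.9, 4.8, 195),
    2020: (94.1, 6.2, 275),
}

total_trades = sum(trades for _, _, trades in yearly.values())

# Trade-weighted win rate: each year's win rate weighted by its trade count
weighted_win_rate = sum(win * trades for win, _, trades in yearly.values()) / total_trades

# Compounded return: chain the yearly returns multiplicatively
growth = 1.0
for _, ret, _ in yearly.values():
    growth *= 1 + ret / 100
compounded_return = (growth - 1) * 100

print(total_trades)
print(round(weighted_win_rate, 1))
print(round(compounded_return, 1))
```

Under these assumptions the trade counts sum to the reported 1,247 and the trade-weighted win rate is about 85.3\%, close to the reported 85.4\%. The yearly returns compound to about 31.6\% (28.4\% if simply summed), which differs from the reported total return of 18.2\%, so the total appears to have been computed on a basis that is not described here.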
| \subsubsection{6.3 Robustness Analysis}\label{robustness-analysis} | |
| \paragraph{6.3.1 Market Condition | |
| Analysis}\label{market-condition-analysis} | |
| The model showed varying performance across different market regimes: | |
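| \textbf{Bull Markets (2016, 2017):} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Exceptionally high win rates (100\%) | |
| \item | |
| Consistent positive returns | |
| \item | |
| Lower volatility periods | |
| \end{itemize} | |
| \textbf{Bear Markets (2018):} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Reduced win rate (72.7\%) | |
| \item | |
| Negative returns | |
| \item | |
| Higher market stress | |
| \end{itemize} | |
| \textbf{Sideways Markets (2015, 2019, 2020):} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Moderate to high win rates (62.5\%-94.1\%) | |
| \item | |
| Positive returns in most cases | |
| \end{itemize} | |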
| \paragraph{6.3.2 SMC Feature Impact}\label{smc-feature-impact} | |
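| An ablation study removing the SMC features showed performance | |
| degradation: | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| With SMC features: 85.4\% win rate | |
| \item | |
| Without SMC features: 72.1\% win rate | |
| \item | |
| Performance improvement: 13.3 percentage points | |
| \end{itemize} | |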
| \subsubsection{6.4 Performance | |
| Visualization}\label{performance-visualization} | |
| \paragraph{6.4.1 Monthly Performance | |
| Heatmap}\label{monthly-performance-heatmap} | |
| \begin{verbatim} | |
| Year → 2015 2016 2017 2018 2019 2020 | |
| Month ↓ | |
| Jan +1.2 +2.1 +1.8 -0.8 +1.5 +1.2 | |
| Feb +0.8 +3.8 +2.1 -1.2 +0.9 +2.1 | |
| Mar +0.5 +1.9 +1.5 +0.5 +1.2 -0.8 | |
| Apr +0.3 +2.2 +1.7 -0.3 +0.8 +1.5 | |
| May +0.7 +1.8 +2.3 -1.5 +1.1 +2.3 | |
| Jun -0.2 +2.5 +1.9 +0.8 +0.7 +1.8 | |
| Jul +0.9 +1.6 +1.2 -0.9 +0.5 +1.2 | |
| Aug +0.4 +2.1 +2.4 -2.1 +1.3 +0.9 | |
| Sep +0.6 +1.7 +1.8 +1.2 +0.8 +1.6 | |
| Oct -0.1 +1.9 +1.3 -1.8 +0.6 +1.4 | |
| Nov +0.8 +2.3 +2.1 -1.2 +1.1 +1.7 | |
| Dec +0.3 +2.4 +1.6 -2.1 +0.9 +0.8 | |
| Color Scale: 🔴 < -1% 🟠 -1% to 0% 🟡 0% to 1% 🟢 1% to 2% 🟦 > 2% | |
| \end{verbatim} | |
| \paragraph{6.4.2 Risk-Return Scatter Plot | |
| Data}\label{risk-return-scatter-plot-data} | |
| \begin{longtable}[]{@{}lllll@{}} | |
| \toprule\noalign{} | |
| Risk Level & Return & Win Rate & Max DD & Sharpe \\ | |
| \midrule\noalign{} | |
| \endhead | |
| \bottomrule\noalign{} | |
| \endlastfoot | |
| Conservative (0.5\% risk) & 9.1\% & 85.4\% & -4.4\% & 1.41 \\ | |
| Moderate (1\% risk) & 18.2\% & 85.4\% & -8.7\% & 1.41 \\ | |
| Aggressive (2\% risk) & 36.4\% & 85.4\% & -17.4\% & 1.41 \\ | |
| \end{longtable} | |
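The rows of this table are consistent with simple linear scaling in the risk per trade. Assuming that returns scale proportionally with position size (ignoring compounding, transaction costs, and margin limits) and taking the risk-free rate as zero, multiplying exposure by a factor $k$ multiplies both the mean return $\mu$ and its standard deviation $\sigma$ by $k$, so the Sharpe ratio is unchanged:

\[
S(k) = \frac{k\mu}{k\sigma} = \frac{\mu}{\sigma} = S(1).
\]

With the 1\% risk row as the baseline, $k = 0.5$ gives a return of $0.5 \times 18.2\% = 9.1\%$ and a drawdown of $0.5 \times 8.7\% \approx 4.4\%$, while $k = 2$ gives $36.4\%$ and $17.4\%$, matching the table. The identical win rate and Sharpe ratio across rows therefore reflect this scaling assumption rather than three independent backtests.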
| \subsubsection{7.1 Key Findings}\label{key-findings} | |
| \paragraph{7.1.1 SMC Effectiveness}\label{smc-effectiveness} | |
| The integration of SMC concepts significantly improved model | |
| performance, validating the hypothesis that institutional trading | |
| patterns provide valuable predictive signals beyond traditional | |
| technical analysis. | |
| \paragraph{7.1.2 Model Robustness}\label{model-robustness} | |
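| Returns were positive in five of the six years, but win rates ranged | |
| from 62.5\% (2015) to 100\% (2016, 2017) and 2018 closed with a loss of | |
| 1.2\%. The model therefore appears to generalize across market | |
| conditions, although its edge is regime-dependent and overfitting cannot | |
| be ruled out on the basis of this backtest alone. | |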
| \paragraph{7.1.3 Risk Considerations}\label{risk-considerations} | |
| While backtesting results are promising, several limitations must be | |
| acknowledged: | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Transaction costs not included | |
| \item | |
| Slippage effects not modeled | |
| \item | |
| No risk management implemented | |
| \item | |
| Historical performance ≠ future results | |
| \end{itemize} | |
| \subsubsection{7.2 Limitations}\label{limitations} | |
| \paragraph{7.2.1 Data Limitations}\label{data-limitations} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Limited to daily timeframe | |
| \item | |
| Yahoo Finance data quality considerations | |
| \item | |
| Survivorship bias in historical data | |
| \end{itemize} | |
| \paragraph{7.2.2 Model Limitations}\label{model-limitations} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Binary classification may miss magnitude of moves | |
| \item | |
| Fixed 5-day prediction horizon | |
| \item | |
| No consideration of market regime changes | |
| \end{itemize} | |
| \paragraph{7.2.3 Implementation | |
| Limitations}\label{implementation-limitations} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Simplified trading strategy (no position sizing) | |
| \item | |
| No stop-loss or take-profit mechanisms | |
| \item | |
| Single asset focus (XAUUSD only) | |
| \end{itemize} | |
| \subsubsection{7.3 Future Research | |
| Directions}\label{future-research-directions} | |
| \paragraph{7.3.1 Model Enhancements}\label{model-enhancements} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Multi-timeframe analysis | |
| \item | |
| Deep learning approaches (LSTM, Transformer) | |
| \item | |
| Ensemble methods combining multiple models | |
| \end{itemize} | |
| \paragraph{7.3.2 Feature Expansion}\label{feature-expansion} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Fundamental data integration | |
| \item | |
| Sentiment analysis from news | |
| \item | |
| Inter-market relationships (gold vs other assets) | |
| \end{itemize} | |
| \paragraph{7.3.3 Strategy Improvements}\label{strategy-improvements} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| Dynamic position sizing | |
| \item | |
| Risk management integration | |
| \item | |
| Multi-asset portfolio construction | |
| \end{itemize} | |
| \subsection{8. Conclusion}\label{conclusion} | |
| This research demonstrated that combining Smart Money Concepts with | |
| machine learning can be effective for XAUUSD price prediction. The | |
| proposed framework achieved an 85.4\% win rate in backtesting over | |
| 2015-2020, 13.3 percentage points above the 72.1\% obtained in the | |
| ablation without SMC features. | |
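| Key contributions include: | |
| \begin{enumerate} | |
| \def\labelenumi{\arabic{enumi}.} | |
| \tightlist | |
| \item | |
| Comprehensive SMC feature implementation | |
| \item | |
| Robust machine learning pipeline | |
| \item | |
| Rigorous backtesting methodology | |
| \item | |
| Open-source implementation for the research community | |
| \end{enumerate} | |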
| The results validate SMC principles in algorithmic trading and provide a | |
| foundation for further research in institutional trading pattern | |
| recognition. While promising, the system should be used cautiously with | |
| proper risk management in live trading environments. | |
| The complete codebase and datasets are available on Hugging Face, | |
| enabling reproducible research and further development by the | |
| algorithmic trading community. | |
| \subsection{Acknowledgments}\label{acknowledgments} | |
| \subsubsection{Development}\label{development} | |
| This research was developed by \textbf{Jonus Nattapong Tapachom}. | |
| \subsubsection{Declaration of Competing | |
| Interests}\label{declaration-of-competing-interests} | |
| The author declares no competing financial interests. | |
| \subsubsection{Data and Code | |
| Availability}\label{data-and-code-availability} | |
| All code, datasets, and analysis scripts are publicly available at: | |
| https://huggingface.co/JonusNattapong/xauusd-trading-ai-smc | |
| \subsection{References}\label{references} | |
| \begin{enumerate} | |
| \def\labelenumi{\arabic{enumi}.} | |
| \item | |
| Baur, D. G., \& Lucey, B. M. (2010). Is Gold a Hedge or a Safe Haven? | |
| An Analysis of Stocks, Bonds and Gold. The Financial Review, 45(2), | |
| 217-229. | |
| \item | |
| Chen, T., \& Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting | |
| System. Proceedings of the 22nd ACM SIGKDD International Conference on | |
| Knowledge Discovery and Data Mining. | |
| \item | |
| Dixon, M., Klabjan, D., \& Bang, J. H. (2017). Classification-based | |
| Financial Markets Prediction using Deep Neural Networks. Algorithmic | |
| Finance, 6(3-4), 67-77. | |
| \item | |
| Kearns, M., \& Nevmyvaka, Y. (2013). Machine Learning for Market | |
| Microstructure and High Frequency Trading. In High Frequency Trading: | |
| New Realities for Traders, Markets and Regulators. | |
| \item | |
| Kraus, M., \& Feuerriegel, S. (2017). Decision Support with Text | |
| Analytics. In Decision Support Systems III - Impact of Decision | |
| Support Systems for Global Environments (pp.~131-142). | |
| \item | |
| Pierdzioch, C., Risse, M., \& Rohloff, S. (2016). A Boosted Decision | |
| Tree Approach to Forecasting Gold Price Movements. Applied Economics | |
| Letters, 23(14), 979-984. | |
| \end{enumerate} | |
| \subsection{Appendix A: Feature | |
| Definitions}\label{appendix-a-feature-definitions} | |
| \subsubsection{Technical Indicators}\label{technical-indicators-1} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| \textbf{SMA (Simple Moving Average)}: Average price over specified | |
| period | |
| \item | |
| \textbf{EMA (Exponential Moving Average)}: Weighted average giving | |
| more importance to recent prices | |
| \item | |
| \textbf{RSI (Relative Strength Index)}: Momentum oscillator measuring | |
| price change velocity | |
| \item | |
| \textbf{MACD (Moving Average Convergence Divergence)}: Trend-following | |
| momentum indicator | |
| \item | |
| \textbf{Bollinger Bands}: Volatility bands around moving average | |
| \end{itemize} | |
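To make the first two definitions concrete, the sketch below computes an SMA and an EMA in plain Python. The period, the EMA smoothing factor $2 / (\text{period} + 1)$, and the choice to seed the EMA with the first price are common conventions assumed here; the project's feature pipeline may use different settings.

```python
def sma(prices, period):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < period:
            out.append(None)
        else:
            window = prices[i + 1 - period:i + 1]
            out.append(sum(window) / period)
    return out


def ema(prices, period):
    """Exponential moving average seeded with the first price (an assumption)."""
    alpha = 2 / (period + 1)  # weight given to the most recent price
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


closes = [10.0, 11.0, 12.0, 13.0, 14.0]  # illustrative prices, not XAUUSD data
print(sma(closes, 3))  # [None, None, 11.0, 12.0, 13.0]
print(ema(closes, 3))  # [10.0, 10.5, 11.25, 12.125, 13.0625]
```

On a steadily rising series the EMA ends above the SMA of the same period (13.0625 versus 13.0 here) because it weights the latest prices more heavily.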
| \subsubsection{SMC Features}\label{smc-features} | |
| \begin{itemize} | |
| \tightlist | |
| \item | |
| \textbf{Fair Value Gap}: Price gap between candles indicating | |
| institutional imbalance | |
| \item | |
| \textbf{Order Block}: Area of significant institutional | |
| accumulation/distribution | |
| \item | |
| \textbf{Recovery Pattern}: Pullback within trending market structure | |
| \end{itemize} | |
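As an illustration of the Fair Value Gap definition, the sketch below uses the three-candle rule that is common in SMC material: a bullish gap exists when the current candle's low is above the high from two candles earlier, and a bearish gap when the current high is below the low from two candles earlier. This rule, the function name \texttt{find\_fair\_value\_gaps}, and the sample prices are assumptions; the project's own \texttt{FVG\_Size} computation may differ.

```python
def find_fair_value_gaps(highs, lows):
    """Return (index, direction, size) for each three-candle fair value gap."""
    gaps = []
    for i in range(2, len(highs)):
        if lows[i] > highs[i - 2]:
            # Price jumped up and left an untraded zone between the two candles
            gaps.append((i, "bullish", lows[i] - highs[i - 2]))
        elif highs[i] < lows[i - 2]:
            # Price dropped and left an untraded zone above the current candle
            gaps.append((i, "bearish", lows[i - 2] - highs[i]))
    return gaps


# Illustrative candles, not real XAUUSD data
highs = [1802.0, 1810.0, 1815.0, 1809.0, 1801.0]
lows = [1798.0, 1801.0, 1806.0, 1803.0, 1795.0]
print(find_fair_value_gaps(highs, lows))
# [(2, 'bullish', 4.0), (4, 'bearish', 5.0)]
```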
| \subsection{Appendix B: Model | |
| Hyperparameters}\label{appendix-b-model-hyperparameters} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
| \CommentTok{\# Final XGBoost Parameters} | |
| \NormalTok{xgb\_params }\OperatorTok{=}\NormalTok{ \{} | |
| \StringTok{\textquotesingle{}n\_estimators\textquotesingle{}}\NormalTok{: }\DecValTok{200}\NormalTok{,} | |
| \StringTok{\textquotesingle{}max\_depth\textquotesingle{}}\NormalTok{: }\DecValTok{7}\NormalTok{,} | |
| \StringTok{\textquotesingle{}learning\_rate\textquotesingle{}}\NormalTok{: }\FloatTok{0.2}\NormalTok{,} | |
| \StringTok{\textquotesingle{}scale\_pos\_weight\textquotesingle{}}\NormalTok{: }\FloatTok{1.17}\NormalTok{,} | |
| \StringTok{\textquotesingle{}objective\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}binary:logistic\textquotesingle{}}\NormalTok{,} | |
| \StringTok{\textquotesingle{}eval\_metric\textquotesingle{}}\NormalTok{: }\StringTok{\textquotesingle{}logloss\textquotesingle{}}\NormalTok{,} | |
| \StringTok{\textquotesingle{}subsample\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,} | |
| \StringTok{\textquotesingle{}colsample\_bytree\textquotesingle{}}\NormalTok{: }\FloatTok{0.8}\NormalTok{,} | |
| \StringTok{\textquotesingle{}min\_child\_weight\textquotesingle{}}\NormalTok{: }\DecValTok{1}\NormalTok{,} | |
| \StringTok{\textquotesingle{}gamma\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,} | |
| \StringTok{\textquotesingle{}reg\_alpha\textquotesingle{}}\NormalTok{: }\DecValTok{0}\NormalTok{,} | |
| \StringTok{\textquotesingle{}reg\_lambda\textquotesingle{}}\NormalTok{: }\DecValTok{1} | |
| \NormalTok{\}} | |
| \end{Highlighting} | |
| \end{Shaded} | |
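A typical heuristic for \texttt{scale\_pos\_weight} is the ratio of negative to positive training samples. The sketch below applies that heuristic to a made-up label vector; whether the value of 1.17 above was derived this way is an assumption, and the 539/461 split is purely illustrative.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive samples, a common heuristic for this parameter."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0:
        raise ValueError("no positive samples in labels")
    return n_neg / n_pos


labels = [0] * 539 + [1] * 461  # illustrative split, not the paper's data
print(round(scale_pos_weight(labels), 2))  # 1.17
```

If the heuristic applies, a value of 1.17 corresponds to roughly 54\% negative and 46\% positive labels, that is, a mild imbalance.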
| \subsection{Appendix C: Backtesting Code | |
| Snippet}\label{appendix-c-backtesting-code-snippet} | |
| \begin{Shaded} | |
| \begin{Highlighting}[] | |
| \ImportTok{import}\NormalTok{ backtrader }\ImportTok{as}\NormalTok{ bt} | |
| \ImportTok{import}\NormalTok{ joblib} | |
| \ImportTok{from}\NormalTok{ sklearn.preprocessing }\ImportTok{import}\NormalTok{ StandardScaler} | |
| \KeywordTok{class}\NormalTok{ SMCStrategy(bt.Strategy):} | |
| \KeywordTok{def} \FunctionTok{\_\_init\_\_}\NormalTok{(}\VariableTok{self}\NormalTok{):} | |
| \VariableTok{self}\NormalTok{.model }\OperatorTok{=}\NormalTok{ joblib.load(}\StringTok{\textquotesingle{}trading\_model.pkl\textquotesingle{}}\NormalTok{)} | |
| \VariableTok{self}\NormalTok{.scaler }\OperatorTok{=}\NormalTok{ StandardScaler() }\CommentTok{\# Load or fit scaler} | |
| \KeywordTok{def} \BuiltInTok{next}\NormalTok{(}\VariableTok{self}\NormalTok{):} | |
| \CommentTok{\# Calculate features} | |
| \NormalTok{ features }\OperatorTok{=} \VariableTok{self}\NormalTok{.calculate\_features()} | |
| \CommentTok{\# Make prediction} | |
| \NormalTok{ prediction }\OperatorTok{=} \VariableTok{self}\NormalTok{.model.predict(features.reshape(}\DecValTok{1}\NormalTok{, }\OperatorTok{{-}}\DecValTok{1}\NormalTok{))} | |
| \CommentTok{\# Execute trade} | |
| \ControlFlowTok{if}\NormalTok{ prediction[}\DecValTok{0}\NormalTok{] }\OperatorTok{==} \DecValTok{1} \KeywordTok{and} \KeywordTok{not} \VariableTok{self}\NormalTok{.position:} | |
| \VariableTok{self}\NormalTok{.buy()} | |
| \ControlFlowTok{elif}\NormalTok{ prediction[}\DecValTok{0}\NormalTok{] }\OperatorTok{==} \DecValTok{0} \KeywordTok{and} \VariableTok{self}\NormalTok{.position:} | |
| \VariableTok{self}\NormalTok{.sell()} | |
| \end{Highlighting} | |
| \end{Shaded} | |
| \begin{center}\rule{0.5\linewidth}{0.5pt}\end{center} | |
| \emph{This paper was generated on September 18, 2025, and represents the | |
| complete methodology and results of the XAUUSD Trading AI project. The | |
| implementation is available at: | |
| https://huggingface.co/JonusNattapong/xauusd-trading-ai-smc} | |