# PPO Model for XAUUSD Gold Trading
This repository contains a Reinforcement Learning model trained using Proximal Policy Optimization (PPO) for trading XAUUSD (Gold vs US Dollar) on 15-minute timeframes.
## Model Details
- Model Type: PPO (Proximal Policy Optimization)
- Framework: Stable-Baselines3
- Environment: Custom Gym environment for XAUUSD trading (interface sketched below)
- Training Data: Historical XAUUSD data from 2004 to 2025 (resampled to 15-min bars)
- Total Timesteps: 1,000,000
- Position Sizing: Base 5.0 oz, Max 7.5 oz
- Initial Capital: 200 USD
- Transaction Cost: 0.65 USD per oz
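
The custom trading environment itself is not included in the files distributed here. As a hypothetical sketch only, the outline below shows what such an environment's interface could look like, reusing the capital, position sizing, and transaction-cost figures above; the observation/action spaces, reward, and method internals are assumptions, not the original implementation.

```python
import gymnasium as gym
import numpy as np

# Hypothetical sketch -- the real XAUUSDTradingEnv is not bundled with this card,
# and its spaces, reward, and internals are assumptions.
class XAUUSDTradingEnv(gym.Env):
    def __init__(self, features, initial_capital=200.0,
                 base_size_oz=5.0, max_size_oz=7.5, cost_per_oz=0.65):
        super().__init__()
        self.features = features              # 2D array of engineered 15-min features
        self.initial_capital = initial_capital
        self.base_size_oz = base_size_oz
        self.max_size_oz = max_size_oz
        self.cost_per_oz = cost_per_oz
        self.observation_space = gym.spaces.Box(
            -np.inf, np.inf, shape=(features.shape[1],), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(3)   # e.g. hold / long / flat (assumed)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t, self.equity = 0, self.initial_capital
        return self.features[self.t].astype(np.float32), {}

    def step(self, action):
        # A real implementation would update the position, charge cost_per_oz
        # on size changes, and reward the resulting change in equity.
        self.t += 1
        terminated = self.t >= len(self.features) - 1
        reward = 0.0
        return self.features[self.t].astype(np.float32), reward, terminated, False, {}
```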
## Performance Metrics (Test Set)
- Average Daily Profit: 51.46 USD
- Win Rate: 69.0%
- Max Drawdown: 12.0%
- Sharpe Ratio: 7.56
- Average Trades per Day: 2.66
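
For reference, the snippet below shows one common way to derive a Sharpe ratio and maximum drawdown from per-bar returns and an equity curve. The exact aggregation and annualization behind the figures above are not documented here, so the constants in this sketch are assumptions.

```python
import numpy as np

def sharpe_ratio(returns, periods_per_year=252 * 26):
    # 26 fifteen-minute bars per session is an assumption; the card does not
    # state how the reported Sharpe ratio was annualized.
    returns = np.asarray(returns)
    return np.sqrt(periods_per_year) * returns.mean() / returns.std()

def max_drawdown(equity):
    # Largest peak-to-trough decline, as a fraction of the running peak
    equity = np.asarray(equity)
    running_peak = np.maximum.accumulate(equity)
    return ((running_peak - equity) / running_peak).max()
```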
## Features Used
- Log Return
- RSI (14-period)
- Moving Averages (short/long)
- Bollinger Bands
- MACD
- Volume indicators
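
The list above names the indicator families but not every parameter. As a sketch, one plausible way to compute these features from an OHLCV DataFrame with pandas is shown below; all window lengths other than the stated 14-period RSI are assumptions.

```python
import numpy as np
import pandas as pd

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    out["log_return"] = np.log(df["close"]).diff()

    # RSI (14-period), Wilder-style smoothing
    delta = df["close"].diff()
    gain = delta.clip(lower=0).ewm(alpha=1 / 14, adjust=False).mean()
    loss = (-delta.clip(upper=0)).ewm(alpha=1 / 14, adjust=False).mean()
    out["rsi_14"] = 100 - 100 / (1 + gain / loss)

    # Moving averages (short/long windows are assumptions)
    out["sma_short"] = df["close"].rolling(10).mean()
    out["sma_long"] = df["close"].rolling(50).mean()

    # Bollinger Bands (20-period, 2 standard deviations)
    mid = df["close"].rolling(20).mean()
    std = df["close"].rolling(20).std()
    out["bb_upper"] = mid + 2 * std
    out["bb_lower"] = mid - 2 * std

    # MACD (12/26 EMA difference, 9-period signal line)
    ema12 = df["close"].ewm(span=12, adjust=False).mean()
    ema26 = df["close"].ewm(span=26, adjust=False).mean()
    out["macd"] = ema12 - ema26
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()

    # Simple volume indicator: rolling z-score of volume
    vol = df["volume"]
    out["volume_zscore"] = (vol - vol.rolling(50).mean()) / vol.rolling(50).std()
    return out.dropna()
```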
## Usage

### Loading the Model

Below are two ways to load the trained policy, depending on which files you have available.
#### Option A: Load the full Stable-Baselines3 model (.zip)
```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecNormalize
import os

# Create or reconstruct an environment similar to the one used for training,
# e.g. `env = make_your_env(...)` - replace with your env factory
env = ...

# If you saved VecNormalize separately, load and wrap your env first
if os.path.exists("models/vecnormalize.pkl"):
    vec = VecNormalize.load("models/vecnormalize.pkl", env)
    vec.training = False
    vec.norm_reward = False
    env = vec

# Load the full model (policy + optimizer state)
model = PPO.load("models/ppo_xauusd.zip", env=env)
```
#### Option B: Load weights saved as SafeTensors into a fresh PPO policy
```python
from safetensors.torch import load_file
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecNormalize
import os

# Create or reconstruct the same environment used for training
env = ...

# If you have VecNormalize statistics, load them and wrap the env
if os.path.exists("models/vecnormalize.pkl"):
    vec = VecNormalize.load("models/vecnormalize.pkl", env)
    vec.training = False
    vec.norm_reward = False
    env = vec

# Instantiate a PPO model with the same policy architecture as in training
model = PPO("MlpPolicy", env)

# safetensors.torch.load_file already returns a dict of torch.Tensor values
state_dict = load_file("models/ppo_xauusd.safetensors")

# Load the weights into the policy
model.policy.load_state_dict(state_dict)

# Keep the wrapped env attached for inference
model.set_env(env)
```
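
Whichever option you use, a quick smoke test confirms the loaded policy produces actions. This assumes `env` is a VecEnv, e.g. wrapped in VecNormalize as above.

```python
# Quick smoke test: the loaded policy should return an action for a reset observation
obs = env.reset()                       # VecEnv reset returns only the observation batch
action, _states = model.predict(obs, deterministic=True)
print("Sample action:", action)
```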
Notes:

- Option A is preferred when `ppo_xauusd.zip` is available (it contains the entire SB3 model).
- Option B is useful when only the policy weights were exported as SafeTensors. Ensure the policy architecture and observation/action spaces match the original training setup.
- Always set `vec.training = False` and `vec.norm_reward = False` when running inference.
### For Full Inference
To use the model for trading, you'll need to:

- Set up the trading environment (`XAUUSDTradingEnv`)
- Load the VecNormalize stats
- Run predictions (a sketch of the full loop follows below)
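
As a rough sketch of those steps put together, assuming `XAUUSDTradingEnv` and a prepared feature array are available on your side (neither ships with this card):

```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# Wrap the trading env, restore normalization stats, and freeze them for inference
env = DummyVecEnv([lambda: XAUUSDTradingEnv(features)])
env = VecNormalize.load("models/vecnormalize.pkl", env)
env.training = False
env.norm_reward = False

model = PPO.load("models/ppo_xauusd.zip", env=env)

obs = env.reset()
done = False
while not done:
    action, _states = model.predict(obs, deterministic=True)
    obs, rewards, dones, infos = env.step(action)
    done = dones[0]        # DummyVecEnv returns vectorized (batched) results
```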
Note: This is a simulation model. Use with caution in real trading.
## Training Configuration
- Learning Rate: 0.0003
- Batch Size: 256
- Gamma: 0.99
- GAE Lambda: 0.95
- Clip Range: 0.2
- Entropy Coefficient: 0.01
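
These values map directly onto Stable-Baselines3's PPO constructor. The sketch below shows how training with them could be launched; the env factory, normalization settings, and any argument not listed above are assumptions.

```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# Assumes XAUUSDTradingEnv and a training feature array are available on your side
train_env = DummyVecEnv([lambda: XAUUSDTradingEnv(train_features)])
train_env = VecNormalize(train_env, norm_obs=True, norm_reward=True)

model = PPO(
    "MlpPolicy",
    train_env,
    learning_rate=3e-4,
    batch_size=256,
    gamma=0.99,
    gae_lambda=0.95,
    clip_range=0.2,
    ent_coef=0.01,
    verbose=1,
)
model.learn(total_timesteps=1_000_000)

model.save("models/ppo_xauusd")            # writes ppo_xauusd.zip
train_env.save("models/vecnormalize.pkl")  # saves normalization statistics
```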
## Files

- `ppo_xauusd.safetensors`: Model weights in SafeTensors format
- `vecnormalize.pkl`: VecNormalize statistics for observation normalization
## License
MIT License
## Disclaimer
This model is for educational and research purposes only. Trading involves risk, and past performance does not guarantee future results. Always backtest and validate before using in live trading.