tags:
- Humanoid-v5
- reinforcement-learning
- sac
- humanoid
- mujoco
- gymnasium
- pytorch
model-index:
- name: SAC-MuJoCo-Humanoid-v5
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: Humanoid-v5
      type: Humanoid-v5
    metrics:
    - type: mean_reward
      value: 5718.56 +/- 5.62
      name: mean_reward
      verified: false
Soft Actor-Critic (SAC) Agent playing Humanoid-v5
This is a trained Soft Actor-Critic (SAC) agent for the MuJoCo Humanoid-v5 environment.
Model Details
The model was trained using the code available here.
Usage
To load and use this model for inference:
import json

import gymnasium as gym
import torch

from agent import SAC
from environment import make_env

# Load the configuration
with open("config.json", "r") as f:
    config = json.load(f)

env_name = config["env_name"]
hidden_dim = config["hidden_dim"]

# Create the environment and get the state and action dimensions
env, state_dim, action_dim = make_env(
    env_name,
    render_mode="human",
)

# Instantiate the agent and load the trained policy network
agent = SAC(state_dim, action_dim, hidden_dim)
agent.actor.load_state_dict(torch.load("model.pt"))

# Enjoy the agent!
state, _ = env.reset()
done = False
while not done:
    # Use the deterministic policy output for evaluation
    action_tensor = agent.select_action(state, deterministic=True)
    action = action_tensor.cpu().numpy().flatten()
    state, reward, terminated, truncated, _ = env.step(action)
    # An episode ends on termination (e.g. a fall) or on time-limit truncation
    done = terminated or truncated

env.close()
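
The loop above plays a single episode. A figure such as the reported mean_reward of 5718.56 +/- 5.62 is usually obtained by averaging the undiscounted return over several episodes. The card does not state how many episodes were used or which standard deviation was taken, so the following is only a minimal sketch under those assumptions. The `evaluate` helper and its `policy` argument are hypothetical names and are not part of the original repository; `env` is assumed to follow the Gymnasium `reset`/`step` API.

```python
import statistics


def evaluate(env, policy, n_episodes=10):
    """Run `policy` for `n_episodes` and return (mean, std) of episode returns.

    `env` is assumed to follow the Gymnasium API (reset/step);
    `policy` is any callable mapping an observation to an action.
    Both are illustrative and not taken from the original code base.
    """
    returns = []
    for _ in range(n_episodes):
        state, _ = env.reset()
        done = False
        episode_return = 0.0
        while not done:
            action = policy(state)
            state, reward, terminated, truncated, _ = env.step(action)
            episode_return += float(reward)
            done = terminated or truncated
        returns.append(episode_return)
    # Population standard deviation; the card does not say whether its
    # figure uses this or the sample standard deviation.
    return statistics.fmean(returns), statistics.pstdev(returns)
```

With the agent loaded as above, `policy` could be something like `lambda s: agent.select_action(s, deterministic=True).cpu().numpy().flatten()`. For a batch evaluation it is probably preferable to create the environment without rendering, if `make_env` allows it.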