Soft Actor-Critic (SAC) Agent playing Humanoid-v5

This is a trained Soft Actor-Critic (SAC) agent for the MuJoCo Humanoid-v5 environment.

Model Details

The model was trained using the code available here.

Usage

To load and use this model for inference:

import torch
import json
import gymnasium as gym

# SAC and make_env are local modules from the training code linked above
from agent import SAC
from environment import make_env

# Load the configuration
with open("config.json", "r") as f:
    config = json.load(f)

env_name = config["env_name"]
hidden_dim = config["hidden_dim"]

# Create the environment and get the state and action dimensions
env, state_size, action_size = make_env(
    env_name,
    render_mode="human",
)

# Instantiate the agent and load the trained policy network
agent = SAC(state_size, action_size, hidden_dim)

# map_location lets the checkpoint load on machines without a GPU
agent.actor.load_state_dict(torch.load("model.pt", map_location="cpu"))

# Enjoy the agent!
state, _ = env.reset()
done = False

while not done:
    action_tensor = agent.select_action(state, deterministic=True)
    action = action_tensor.cpu().numpy().flatten()
    
    state, reward, terminated, truncated, _ = env.step(action)

    done = terminated or truncated

env.close()
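
The loop above calls `select_action` with `deterministic=True`. The internals of that method are not shown on this card, but in most SAC implementations the actor outputs the mean and log standard deviation of a Gaussian, and the action is the tanh of either the mean (deterministic) or a sample (stochastic). The following is a minimal sketch under that assumption; `squashed_gaussian_action` is a hypothetical helper, not part of the `agent` module, and any rescaling to the environment's action bounds is omitted.

```python
import math
import random


def squashed_gaussian_action(mean, log_std, deterministic=False, rng=random):
    """Hypothetical helper: pick one action from a tanh-squashed Gaussian.

    mean and log_std are per-dimension lists, as a SAC actor head would emit.
    """
    if deterministic:
        # Evaluation mode: skip the noise and squash the mean directly.
        return [math.tanh(m) for m in mean]
    # Training/exploration mode: sample from N(mean, std^2), then squash.
    return [
        math.tanh(m + math.exp(s) * rng.gauss(0.0, 1.0))
        for m, s in zip(mean, log_std)
    ]


greedy = squashed_gaussian_action([0.5, -1.0], [-1.0, -1.0], deterministic=True)
sampled = squashed_gaussian_action([0.5, -1.0], [-1.0, -1.0])
```

Using the deterministic action at inference time removes exploration noise, which usually gives more stable episode returns than sampling.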

Evaluation results

mean_reward on Humanoid-v5 (self-reported): 5718.56 +/- 5.62
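
The card does not say how this figure was computed. A common convention is the mean and standard deviation of undiscounted episode returns over several evaluation episodes. The sketch below follows that convention; `evaluate` and its `n_episodes` default are assumptions for illustration, and `env` is any object with the Gymnasium `reset`/`step` interface.

```python
import statistics


def evaluate(env, policy, n_episodes=10):
    """Return (mean, std) of undiscounted episode returns.

    policy maps an observation to an action; n_episodes is an arbitrary default.
    """
    returns = []
    for _ in range(n_episodes):
        state, _ = env.reset()
        done, total = False, 0.0
        while not done:
            state, reward, terminated, truncated, _ = env.step(policy(state))
            total += float(reward)
            done = terminated or truncated
        returns.append(total)
    # Population standard deviation across the evaluated episodes.
    return statistics.mean(returns), statistics.pstdev(returns)
```

With the agent from the Usage section, `policy` could be `lambda s: agent.select_action(s, deterministic=True).cpu().numpy().flatten()`. Creating the environment without `render_mode="human"` makes the evaluation run much faster.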
