---
license: mit
language:
- en
library_name: stable-baselines3
tags:
- reinforcement-learning
- LunarLander-v3
model-index:
- name: DQN
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: LunarLander-v3
      type: LunarLander-v3
    metrics:
    - type: mean_reward
      value: 218.56 +/- 63.62
      name: mean_reward
      verified: false
---

# **DQN** Agent playing **LunarLander-v3**

- [Github Repository](https://github.com/kuds/rl-lunar-lander)
- [Google Colab Notebook](https://colab.research.google.com/github/kuds/rl-lunar-lander/blob/main/%5BLunar%20Lander%5D%20Deep%20Q-Network%20%28DQN%29.ipynb)
- [Finding Theta - Blog Post](https://www.findingtheta.com/blog/solving-gymnasiums-lunar-lander-with-deep-q-learning-dqn)

If you have the saved model file locally, you can load it with the following Python code:

```python
import gymnasium as gym

from stable_baselines3 import DQN

# Load the trained model
model = DQN.load("best-model.zip")

# Create the environment with on-screen rendering
env = gym.make("LunarLander-v3", render_mode="human")

# Reset the environment
obs, info = env.reset()

# Enjoy the trained agent
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```

### Hugging Face Hub

You can also download the model directly from the Hugging Face Hub. First, install the Hugging Face Hub library:

```bash
pip install huggingface_hub
```

Then, download and load the model from the Hub using the following code:

```python
from huggingface_hub import hf_hub_download
from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_vec_env

# Download the model from the Hub
model_path = hf_hub_download(repo_id="kuds/lunar-lander-dqn", filename="best-model.zip")

# Load the model
model = DQN.load(model_path)

# Create a single vectorized environment with on-screen rendering
env = make_vec_env("LunarLander-v3", n_envs=1, env_kwargs={"render_mode": "human"})

# Enjoy the trained agent (a VecEnv resets itself automatically when an episode ends)
obs = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, rewards, dones, infos = env.step(action)
    env.render()

env.close()
```
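
### Evaluating the Agent

The mean reward reported at the top of this card can be re-estimated with stable-baselines3's built-in `evaluate_policy` helper. The snippet below is a minimal sketch: the number of evaluation episodes is an assumption, not necessarily the setting used to produce the reported 218.56 +/- 63.62.

```python
import gymnasium as gym

from stable_baselines3 import DQN
from stable_baselines3.common.evaluation import evaluate_policy

# Load the trained model (assumes best-model.zip is in the working directory)
model = DQN.load("best-model.zip")

# No rendering is needed for scoring
env = gym.make("LunarLander-v3")

# n_eval_episodes=100 is an assumed value; results will vary with the
# number of episodes and the environment seed
mean_reward, std_reward = evaluate_policy(
    model, env, n_eval_episodes=100, deterministic=True
)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")

env.close()
```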