Stable Baselines3 PPO Training Issues with Car Racing Environment

I’m trying to build a reinforcement learning agent using stable-baselines3 PPO for the car racing environment in OpenAI Gym. I keep running into compatibility problems and errors with different package versions.

Here’s my current test code with random actions:

import gym
from stable_baselines3 import PPO

game_env = "CarRacing-v0"
racing_env = gym.make(game_env)

num_episodes = 5
for ep in range(1, num_episodes + 1):
    observation = racing_env.reset()
    finished = False
    total_reward = 0
    
    while not finished:
        racing_env.render()
        random_action = racing_env.action_space.sample()
        next_obs, reward_val, finished, episode_info = racing_env.step(random_action)
        total_reward += reward_val
    print(f'Episode: {ep}, Total Reward: {total_reward}')
racing_env.close()

I’m working on Ubuntu 20.04 using VSCode with Jupyter notebooks in a conda environment. The error happens right at the observation = racing_env.reset() line even with just random actions.

I’ve tried multiple versions of gym and related packages but can’t get a stable setup. Can someone help me get this working first with random actions and then move to PPO training? I just need a working configuration regardless of specific package versions.

I had major issues with the car racing env too. Switching to CarRacing-v2 helped a ton. Also, make sure you install gym[box2d] instead of plain gym; that fixed it for me.
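Building on that, here is a minimal install sketch. The exact pins are suggestions rather than the only working combination, and note that gym 0.21 is known to fail to install under newer pip/setuptools, so option B may need an older pip:

```shell
# Two possible setups (version pins are suggestions, not the only working ones):

# Option A: newer gym with the CarRacing-v2 env and the new reset/step API
pip install "gym[box2d]==0.26.2" "stable-baselines3==1.8.0"

# Option B: older pinned pairing for the old 4-value step API
# (gym 0.21 may need an older pip/setuptools to install cleanly)
pip install "gym[box2d]==0.21.0" "stable-baselines3==1.6.2"
```

The quotes around gym[box2d] matter in shells like zsh, which otherwise try to expand the square brackets.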

The reset error is almost certainly caused by your gym version. In gym >= 0.26 the API changed: reset() now returns a tuple, so you need observation, info = racing_env.reset(), and step() returns five values (obs, reward, terminated, truncated, info) instead of four, so your while loop will break next unless you unpack the extra truncated flag as well. You have two options: update your code to the new API, or downgrade to gym==0.21.0 together with stable-baselines3==1.6.2, which is a known-stable pairing for the old four-value API. In either case, run pip install "gym[box2d]" so the Box2D dependencies CarRacing needs are included. I faced similar challenges, and the downgrade made my PPO training run smoothly in the CarRacing environment.