I’m working on a reinforcement learning project and running into speed problems with my GPU setup.
Here’s what I’m using:
Set up a cloud VM with deep learning tools pre-installed
Added keras-rl and OpenAI gym libraries
Running the basic CartPole DQN example
Using an NVIDIA K80 GPU with the proper drivers installed
The problem is that my GPU utilization only reaches about 20% and training runs at roughly 100 steps per second. That's actually about 3 times slower than the same code on my laptop's CPU (an i7-8750H). I turned off visualization to make sure that wasn't the issue.
I checked the usual suspects like CPU load, RAM usage, and disk activity but everything looks normal there. Has anyone else seen this kind of performance gap between GPU and CPU for basic RL examples? Any ideas what might be causing the GPU to underperform so badly?
Yes, this is a common issue with DQN on simple tasks like CartPole. The network is so small that it can't keep the GPU's cores busy, so every training step still pays the cost of transferring data between CPU and GPU memory without getting any compensating speedup from the computation itself. I've seen similar slowdowns with my RTX 2070 on basic examples, and the older K80's higher latency makes it worse. CPUs simply handle small batches and simple operations more efficiently. To actually benefit from the GPU, increase the batch size or move to tasks that require larger networks. For CartPole itself, you're better off staying on the CPU.
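To make the overhead argument concrete, here's a toy throughput model. The overhead and per-sample costs below are made-up illustrative numbers, not measurements from a K80:

```python
def samples_per_second(batch_size, fixed_overhead_s=200e-6, per_sample_s=1e-6):
    """Effective throughput when every GPU call pays a fixed
    launch/transfer overhead before any useful work happens.
    Both cost figures are hypothetical, for illustration only."""
    call_time_s = fixed_overhead_s + batch_size * per_sample_s
    return batch_size / call_time_s

small = samples_per_second(32)    # fixed overhead dominates the call
large = samples_per_second(1024)  # overhead amortized over many samples
print(f"batch 32:   {small:,.0f} samples/s")
print(f"batch 1024: {large:,.0f} samples/s")
```

With these assumed costs the big batch comes out roughly six times faster per sample, which is exactly why bumping the batch size (or using a network big enough to do real work per transferred byte) is the standard fix.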
The K80 is pretty old, and for something simple like CartPole the CPU might honestly be fine. Try a heavier task like an Atari game and you might see your GPU actually show its strength!
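To put some rough numbers on it, compare parameter counts for a tiny CartPole-sized MLP (assuming something like 4→16→16→2, which is about the size of the stock example) against the Nature-DQN-style conv net commonly used for Atari. These are ballpark figures, not exact counts for any particular script:

```python
def dense_params(n_in, n_out):
    # weights + biases of a fully connected layer
    return n_in * n_out + n_out

def conv_params(c_in, c_out, k):
    # weights + biases of a conv layer with k x k kernels
    return c_in * k * k * c_out + c_out

# Tiny CartPole MLP: 4 inputs -> 16 -> 16 -> 2 actions
cartpole_mlp = dense_params(4, 16) + dense_params(16, 16) + dense_params(16, 2)

# Nature-DQN-style Atari net on 84x84x4 frames (4 actions assumed, e.g. Breakout):
# conv 8x8x32, conv 4x4x64, conv 3x3x64, then 7*7*64 -> 512 -> actions
atari_cnn = (conv_params(4, 32, 8) + conv_params(32, 64, 4) +
             conv_params(64, 64, 3) + dense_params(64 * 7 * 7, 512) +
             dense_params(512, 4))

print(f"CartPole MLP: ~{cartpole_mlp:,} params")
print(f"Atari CNN:    ~{atari_cnn:,} params")
```

A few hundred parameters versus well over a million: the Atari net actually gives the GPU something to chew on, while the CartPole net finishes each forward pass before the launch overhead is even paid off.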