Created a GPU-parallelized Gomoku environment for a faster reinforcement learning training pipeline.
Implemented PPO, DQN, independent RL, and PSRO.
Trained agents to achieve human-level proficiency in Gomoku on a 15 × 15 board within hours.
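As a rough illustration of what a batch-parallel Gomoku environment can look like, here is a minimal sketch that steps many boards at once with array operations. It is not the project's code: the project's framework is not stated, so NumPy stands in for a GPU array library, and the names `has_five` and `step`, the 0/1/2 board encoding, and the freestyle rule (five or more in a row wins) are all assumptions made for the example.

```python
import numpy as np

BOARD_SIZE = 15   # 15 x 15 board, as in the description
WIN_LENGTH = 5    # assumption: freestyle rule, five or more in a row wins


def has_five(stones: np.ndarray) -> np.ndarray:
    """Check a whole batch for a winning line.

    stones: bool array of shape (batch, n, n), True where one player has a stone.
    Returns a bool array of shape (batch,).
    """
    b, n, _ = stones.shape
    won = np.zeros(b, dtype=bool)
    # Directions: horizontal, vertical, diagonal, anti-diagonal.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        rows = n - dr * (WIN_LENGTH - 1)
        cols = n - abs(dc) * (WIN_LENGTH - 1)
        run = np.ones((b, rows, cols), dtype=bool)
        for k in range(WIN_LENGTH):
            r0 = k * dr
            # For the anti-diagonal, the k-th stone sits k columns to the left.
            c0 = k * dc if dc >= 0 else WIN_LENGTH - 1 - k
            run &= stones[:, r0:r0 + rows, c0:c0 + cols]
        won |= run.any(axis=(1, 2))
    return won


def step(boards: np.ndarray, to_move: np.ndarray, actions: np.ndarray):
    """Advance every game in the batch by one move.

    boards:  int8 array (batch, n, n); 0 = empty, 1 = black, 2 = white (hypothetical encoding).
    to_move: int8 array (batch,), the player (1 or 2) about to move in each game.
    actions: int array (batch,), flat cell indices in [0, n * n).
    Returns (new_boards, next_to_move, won, legal).
    """
    b, n, _ = boards.shape
    rows, cols = np.divmod(actions, n)
    idx = np.arange(b)
    legal = boards[idx, rows, cols] == 0          # illegal moves are ignored
    new_boards = boards.copy()
    new_boards[idx[legal], rows[legal], cols[legal]] = to_move[legal]
    won = has_five(new_boards == to_move[:, None, None]) & legal
    next_to_move = np.where(legal, 3 - to_move, to_move).astype(boards.dtype)
    return new_boards, next_to_move, won, legal
```

Because every operation acts on the leading batch axis, thousands of games advance in a single call; on an accelerator the same slicing and masking pattern would run on device arrays instead of NumPy arrays.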