OpenEnv-WolfeClick

Watch how our fine-tuned model plays Pokemon

Start the replay to watch the full battle automatically, or switch to frame-by-frame mode to inspect the exact state, legal actions, and decisions on each turn.

Environment Design

OpenEnv-WolfeClick wraps competitive Pokemon Showdown as an OpenEnv-compatible environment.

The model receives:

the active field
its own full roster
the opponent information revealed so far
the exact legal actions for the turn
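
As a rough illustration of the four inputs listed above, the sketch below groups them into one observation object and renders it as prompt text. The class names, field names, and prompt layout are all hypothetical; the project's actual observation format may look quite different.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class LegalAction:
    action: str  # "move" or "switch"
    choice: str  # exact move or Pokemon name


@dataclass
class Observation:
    # Hypothetical field names mirroring the four items the model receives.
    active_field: dict
    self_roster: List[dict]
    opponent_revealed: List[dict]
    legal_actions: List[LegalAction] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the observation as plain text for the model (assumed layout)."""
        lines = [
            "ACTIVE FIELD: " + json.dumps(self.active_field, sort_keys=True),
            "YOUR TEAM: " + json.dumps(self.self_roster),
            "OPPONENT REVEALED: " + json.dumps(self.opponent_revealed),
            "LEGAL ACTIONS:",
        ]
        lines += [f"- {a.action}: {a.choice}" for a in self.legal_actions]
        return "\n".join(lines)


obs = Observation(
    active_field={"self": "Pikachu", "opponent": "Garchomp"},
    self_roster=[{"name": "Pikachu", "hp": 100}],
    opponent_revealed=[{"name": "Garchomp", "moves": ["Earthquake"]}],
    legal_actions=[LegalAction("move", "Thunderbolt")],
)
prompt = obs.to_prompt()
```

Listing the legal actions explicitly in the prompt keeps the model's choice constrained to names it can copy verbatim.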

The model must emit exactly one JSON action:

{"action": "move" | "switch", "choice": "Exact Name of Move or Pokemon"}
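
A minimal sketch of how a reply in this format could be checked against the turn's legal actions. The `parse_action` helper is hypothetical, and returning `None` for an invalid reply is an assumption; the environment may handle malformed or illegal output differently.

```python
import json
from typing import List, Optional, Tuple

Action = Tuple[str, str]  # (action, choice)


def parse_action(raw: str, legal_actions: List[Action]) -> Optional[Action]:
    """Return (action, choice) if raw is a single legal JSON action, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Require exactly the two expected keys.
    if not isinstance(data, dict) or set(data) != {"action", "choice"}:
        return None
    action, choice = data["action"], data["choice"]
    if action not in ("move", "switch") or not isinstance(choice, str):
        return None
    # Names must match exactly, including case.
    return (action, choice) if (action, choice) in legal_actions else None


legal = [("move", "Thunderbolt"), ("switch", "Garchomp")]
ok = parse_action('{"action": "move", "choice": "Thunderbolt"}', legal)
bad_case = parse_action('{"action": "move", "choice": "thunderbolt"}', legal)
bad_json = parse_action("use Thunderbolt", legal)
```

Here `ok` is `("move", "Thunderbolt")`, while `bad_case` and `bad_json` are both `None`.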

The training loop then collects real trajectories from live battles and applies GRPO (Group Relative Policy Optimization) to improve the action policy on those rollouts.
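
For intuition, GRPO as commonly described scores each rollout relative to the other rollouts in its sampling group rather than against a learned value function. The sketch below shows only that normalization step; the function name is hypothetical, and how this project groups turns or battles and shapes rewards is not stated here.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts sampled for the same state: two wins (1.0), two losses (0.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# A group with identical rewards carries no learning signal.
flat = group_relative_advantages([2.0, 2.0, 2.0])
```

Rollouts that beat their group average get positive advantages and the rest get negative ones, so the policy update pushes toward the better actions within each group.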

Get Started

Record a real battle

python record_battle.py --revision grpo-qwen3-4b-run3 --output battle_logs/raw_battle.json

Convert it for replay

python convert_battle_log.py --input battle_logs/raw_battle.json --output battle_logs/replay_battle.json

Launch locally

uv venv
source .venv/bin/activate
uv pip install -r requirements_space.txt
uvicorn space_app:app --reload --host 0.0.0.0 --port 7860

Included notebooks

trainer.ipynb
watch_battle.ipynb
benchmarks/benchmark.ipynb