Building a WoW Rotation Helper with OpenCV

I love World of Warcraft Classic. I hate pressing the same rotation keys 10,000 times a night. So I built a tool that watches my screen and presses keys for me.

Disclaimer: Yes, this probably violates ToS. No, I don't recommend using it on your main account. But building it taught me more about computer vision than any textbook.

The Problem

In WoW Classic, optimal DPS requires hitting abilities in a specific order. Addons like ConROc show you which spell to cast next with an icon overlay. But you still have to watch that icon and press the right key manually.

What if the computer could watch the icon instead?

OpenCV to the Rescue

The core idea is simple: capture a screen region, compare it against saved spell icons using template matching, and press the corresponding keybind when there's a match.

[Fig 3.1: Distributed State Conflict. Client A and Client B, connected through Firestore, both write "GUESS" at t=100ms (race condition); the solution shown is an atomic transaction.]

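# Core detection loop
# capture_screen_region() and press_key() are helper functions not shown here.
import cv2

while running:
    # Capture the detection region (it must be at least as large as each icon)
    screenshot = capture_screen_region(x, y, width, height)

    # Compare against each configured spell icon
    for spell in configured_spells:
        # result holds one similarity score per candidate position
        result = cv2.matchTemplate(screenshot, spell.icon,
                                   cv2.TM_CCOEFF_NORMED)
        _, max_val, _, _ = cv2.minMaxLoc(result)

        if max_val > threshold:
            # Found a match! Press the keybind and skip the remaining icons
            press_key(spell.keybind)
            break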

Building the GUI

The hardest part wasn't the detection—it was making the tool user-friendly. I built a full GUI with:

Region Selection Tool: Click and drag to define the screen area to monitor
Image Capture Tool: Capture spell icons directly from your screen
Spell Configuration: Map spell names to keybinds
Live Preview: See what the detector sees in real-time
Monitoring Tab: View detection logs and statistics

Config is stored in spell_config.json:

{
  "spells": [
    {
      "name": "Fireball",
      "keybind": "1",
      "icon_path": "icons/fireball.png"
    },
    {
      "name": "Pyroblast",
      "keybind": "shift+2",
      "icon_path": "icons/pyroblast.png"
    }
  ],
  "detection_area": { "x": 960, "y": 540, "w": 64, "h": 64 },
  "threshold": 0.85,
  "hotkey": "ctrl+shift+x"
}
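
A config like this can be loaded and sanity-checked in a few lines. The sketch below is one way to do it with only the standard library. The names Spell, parse_keybind, and load_config are hypothetical, and the set of accepted modifiers is an assumption, since the file format above doesn't spell it out.

```python
import json
from dataclasses import dataclass

MODIFIERS = {"ctrl", "shift", "alt"}  # assumed set of modifier names

@dataclass
class Spell:
    name: str
    keybind: str
    icon_path: str

def parse_keybind(keybind):
    """Split 'shift+2' into (['shift'], '2'); the last token is the key itself."""
    *mods, key = [part.strip().lower() for part in keybind.split("+")]
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown or not key:
        raise ValueError(f"bad keybind: {keybind!r}")
    return mods, key

def load_config(text):
    """Parse the JSON text and return (spells, region, threshold)."""
    raw = json.loads(text)
    spells = [Spell(**entry) for entry in raw["spells"]]
    threshold = float(raw["threshold"])
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    area = raw["detection_area"]
    region = (area["x"], area["y"], area["w"], area["h"])
    return spells, region, threshold

EXAMPLE = """
{
  "spells": [
    {"name": "Fireball", "keybind": "1", "icon_path": "icons/fireball.png"},
    {"name": "Pyroblast", "keybind": "shift+2", "icon_path": "icons/pyroblast.png"}
  ],
  "detection_area": {"x": 960, "y": 540, "w": 64, "h": 64},
  "threshold": 0.85,
  "hotkey": "ctrl+shift+x"
}
"""

spells, region, threshold = load_config(EXAMPLE)
for spell in spells:
    print(spell.name, parse_keybind(spell.keybind))
```

Validating keybinds at load time means a typo in the JSON fails immediately instead of during a fight.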

The Threshold Problem

Template matching sounds simple, but getting the threshold right is tricky. Set it too low and you get false positives (the wrong spell is detected). Set it too high and you miss real matches whenever the icon renders slightly differently, for example after lighting changes or icon scaling.
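
It helps to see what the score measures. The sketch below is a simplified pure-Python, 1-D version of the zero-mean normalized correlation that cv2.TM_CCOEFF_NORMED computes at a single position. The pixel values are made up for illustration and ccoeff_normed is a hypothetical name. In this toy case a uniform brightness shift leaves the score at about 1.0, while a one-pixel misalignment pushes it far below a 0.85 threshold. Changes that distort the icon's shape, such as scaling, probably behave more like the second case.

```python
import math

def ccoeff_normed(patch, template):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists."""
    mean_p = sum(patch) / len(patch)
    mean_t = sum(template) / len(template)
    dp = [p - mean_p for p in patch]
    dt = [t - mean_t for t in template]
    num = sum(a * b for a, b in zip(dp, dt))
    den = math.sqrt(sum(a * a for a in dp) * sum(b * b for b in dt))
    return num / den if den else 0.0  # flat patches carry no signal

template = [10, 200, 30, 180, 60, 220, 90, 40]  # made-up grayscale pixels
brighter = [p + 30 for p in template]            # uniform brightness shift
misaligned = template[1:] + template[:1]         # same pixels, shifted by one

score_same = ccoeff_normed(template, template)
score_brighter = ccoeff_normed(brighter, template)
score_misaligned = ccoeff_normed(misaligned, template)
print(score_same, score_brighter, score_misaligned)
```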

I ended up implementing a confidence system where users can tune the threshold per-spell, plus a "minimum duration" setting to avoid rapid-fire keypresses.
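
One way to combine those two settings is a small gate in front of the keypress. This is a sketch under assumptions: it reads "minimum duration" as a minimum interval between any two keypresses, and PressGate, should_press, and the default values are hypothetical. The current time is passed in so the logic can be checked without waiting; in a real loop it would come from something like time.monotonic().

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class PressGate:
    default_threshold: float = 0.85   # used when a spell has no override
    min_interval: float = 0.25        # assumed: seconds between keypresses
    per_spell: Dict[str, float] = field(default_factory=dict)
    _last_press: Optional[float] = None

    def should_press(self, spell, score, now):
        """Return True if this match is strong enough and not too soon."""
        threshold = self.per_spell.get(spell, self.default_threshold)
        if score < threshold:
            return False
        if self._last_press is not None and now - self._last_press < self.min_interval:
            return False
        self._last_press = now
        return True

gate = PressGate(per_spell={"Pyroblast": 0.92})
decisions = [
    gate.should_press("Fireball", 0.90, now=0.00),   # strong match, first press
    gate.should_press("Fireball", 0.90, now=0.10),   # too soon after the last press
    gate.should_press("Pyroblast", 0.90, now=0.50),  # below Pyroblast's own threshold
    gate.should_press("Pyroblast", 0.95, now=0.60),  # passes both checks
]
print(decisions)
```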

What I Learned

Building ConROc Helper taught me that computer vision is often about constraints, not algorithms. The template matching itself is one line of OpenCV. Everything else—region selection, threshold tuning, keybind timing—is about making the system robust in the real world.

Also: automating games is surprisingly addictive. Once you start, you see automation opportunities everywhere.
