This blog post is my study notes on OpenAI Five. I am not involved in the research effort.

Why Dota?

Previous work:

What is OpenAI Five?

Training Methods

Reward Functions

Training progression

Teams:

1. Best OpenAI employee team: 2.5k MMR (46th percentile).
2. Best audience players watching OpenAI employee match (including Blitz, who commentated the first OpenAI employee match): 4-6k MMR (90th-99th percentile), though they’d never played as a team.
3. Valve employee team: 2.5-4k MMR (46th-90th percentile).
4. Amateur team: 4.2k MMR (93rd percentile), trains as a team.
5. Semi-pro team: 5.5k MMR (99th percentile), trains as a team.

Versions:

* The April 23rd version of OpenAI Five was the first to beat OpenAI's scripted baseline.
* The May 15th version of OpenAI Five was evenly matched versus team 1, winning one game and losing another.
* The June 6th version of OpenAI Five decisively won all its games versus teams 1-3. OpenAI set up informal scrims with teams 4 and 5, expecting to lose soundly, but OpenAI Five won two of its first three games versus both.

Observations of OpenAI Five's play:

* Repeatedly sacrificed its own safe lane (top lane for dire; bottom lane for radiant) in exchange for controlling the enemy’s safe lane, forcing the fight onto the side that is harder for the opponent to defend. This strategy emerged in the professional scene in the last few years, and is now considered to be the prevailing tactic.
* Pushed the transition from early- to mid-game faster than its opponents. It did this by (1) setting up successful ganks (when players move around the map to ambush an enemy hero) when opponents overextended in their lane, and (2) grouping up to take towers before the opponents could organize a counterplay.
* Deviated from the current playstyle in a few areas, such as giving support heroes (which usually do not take priority for resources) lots of early experience and gold. OpenAI Five’s prioritization allows its damage to peak sooner and lets it push its advantage harder, winning team fights and capitalizing on mistakes to ensure a fast win.

Difference versus humans

Surprising Findings

Next steps

Discussions:

* OpenAI Five | Hacker News
* https://www.reddit.com/r/programming/comments/8tse7u/openai_five_5v5_dota_2_bots/