Friday, March 11, 2016

SmartGo article: AlphaGo Don't Care

Bob Hearn shared this wonderful article on smartgo.com by Anders Kierulf: AlphaGo don't care

I loved this piece, mostly because it really hits on an important aspect of the theoretical study of combinatorial games.  There is a real "psychology" aspect to a lot of the strategies.  I am not familiar with the Go terminology of the different move/structure types in the article (e.g. "tenuki", "extend at the bottom", and "peep").  These are human ways to describe different strategies and patterns. 

As the article reiterates: "AlphaGo don't care."  It doesn't lose its global focus by getting stuck in local duels.  Each turn, it reevaluates the board as a whole.  It doesn't care about the order of the previous plays, data that doesn't change the outcome of the game.

The same is true in CGT.  The options from a position depend only on the information in that position, not on the history of plays or the psychological battle between the two players.  The value of the game is all that matters (though it may be very hard to calculate exactly).

From the article:
Lee Sedol threatens the territory at the top with 166? AlphaGo don’t care, it just secures points in the center instead. Points are points, it doesn’t matter where on the board they are.
It doesn't matter that Lee Sedol last moved near the top; AlphaGo just goes wherever it thinks it can amass the most points, not where it thinks the other player is going to focus their efforts.  The actual temperature of each region is more important than where the most recent plays were made.

The third game is tonight!

3 comments:

  1. "the order of the previous plays ... doesn't change the outcome of the game" is actually not quite true. In a ko fight, the previous moves (and positions) *can* matter, which is why Go is EXPTIME-complete instead of something easier like PSPACE- or NP-complete.

  2. Good point. That can definitely matter.

  3. Something that I never understood as well as I wanted about CGT is the difference in hot games between hotstrat and sentestrat. In hotstrat, you move in the hottest component. In sentestrat, you have some number called the ambient temperature, and you move in the hottest component unless the previous player's move raised the temperature of that component above the ambient temperature, in which case you respond in the same component. It turns out that hotstrat can be arbitrarily bad, whereas sentestrat, while not optimal, at least gives moves that are boundedly close to optimal.

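
The two selection rules described in the third comment can be sketched in a few lines of Python. This is only an illustration of the rules as worded there, not a full CGT implementation: it assumes each component's temperature has already been computed somehow, and the `Component` class, the region names, and the numbers are all hypothetical. It also takes the ambient temperature as a given input rather than tracking it over the course of a game.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Component:
    """One region of the sum; `temperature` is assumed to be precomputed."""
    name: str
    temperature: float


def hotstrat(components: Sequence[Component]) -> int:
    """Return the index of the hottest component."""
    return max(range(len(components)), key=lambda i: components[i].temperature)


def sentestrat(components: Sequence[Component],
               ambient: float,
               last_moved: Optional[int]) -> int:
    """Respond locally if the opponent's last move left that component
    hotter than the ambient temperature; otherwise play hotstrat."""
    if last_moved is not None and components[last_moved].temperature > ambient:
        return last_moved
    return hotstrat(components)


# Hypothetical position: three regions with made-up temperatures.
board = [Component("top", 2.0), Component("center", 5.0), Component("corner", 3.5)]

hot_choice = hotstrat(board)                                  # hottest region
sente_choice = sentestrat(board, ambient=3.0, last_moved=2)   # corner is above ambient
quiet_choice = sentestrat(board, ambient=4.0, last_moved=2)   # corner is below ambient
```

With these numbers, hotstrat always picks the center. Sentestrat answers in the corner when the opponent's last move there sits above the ambient temperature of 3.0, and falls back to the center when the ambient temperature is 4.0.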