P != 1.0

Stockfish is the only unbeaten engine after 8 rounds of the 20th annual Top Chess Engine Championship, the computer chess tournament that most people forgot about after the effing things got so good.

The sponsoring organization put one rule in place to restore some interest for human observers: the computers begin play from positions that some human selected. That’s a really good idea; for example, a few days ago, I dug into the Stockfish-v-AlphaZero games from 2018 where they had to play the Bishop’s Gambit. Would they play that fascinating opening on their own? No way.

The other thing that gives the current TCEC ‘season’ some actual drama is that while Stockfish has won the thing six out of the last eight years, the other two were won by the plucky underdog LC Zero, the open source descendant of AlphaZero.

If you’re not rooting for the open source engine for at least one of two reasons, you suck.

The first reason: open source.

The second reason is that the neural networks on which LC and Alpha rely are based on the ‘experience of their lives’, while Stockfish is a throwback (imagine that, talking about chess engines like old news) that makes a zillion evaluations of a tree search, and after 30 or 50 or whatever ply, it spits out a move that’s based on the inherently-flawed ‘pawn == 1’ table.

If you go back 40 years, when chess computers were edging into mainstream consciousness, the computers were terrible, because evaluation algorithms were still in their relative infancy: P == 1.0, Q == 9.0, K == 327.67.

The trouble with that (and this is why so many human players are terrible at evaluating positional imbalances) is: P != 1.0.

P never equals 1.0, so any evaluation algorithm that starts with that is flawed from the start. P == 1.0 is a bug.
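To make the bug concrete, here’s a minimal sketch of what a fixed-table evaluation amounts to. It is not code from any real engine: the P, Q and K values are the ones quoted above, the N, B and R values are the usual textbook numbers (my assumption), and `material_balance` is a name made up for this illustration. It reads only the piece-placement field of a FEN string.

```python
# Fixed piece values. P, Q and K are the figures quoted in the post;
# N, B and R are conventional textbook values, assumed for illustration.
PIECE_VALUES = {"p": 1.0, "n": 3.0, "b": 3.0, "r": 5.0, "q": 9.0, "k": 327.67}


def material_balance(fen_placement: str) -> float:
    """Return White's material minus Black's, in 'pawns'.

    `fen_placement` is the first field of a FEN string. Uppercase letters
    are White pieces, lowercase letters are Black pieces.
    """
    score = 0.0
    for ch in fen_placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits (empty squares) and '/' rank separators
        score += value if ch.isupper() else -value
    return score


# The starting position, and the same position with Black's d-pawn removed.
START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
PAWN_UP = "rnbqkbnr/ppp1pppp/8/8/8/8/PPPPPPPP/RNBQKBNR"

if __name__ == "__main__":
    print(round(material_balance(START), 2))
    print(round(material_balance(PAWN_UP), 2))
```

Every pawn counts the same here no matter where it stands or what it’s doing, which is exactly the assumption being complained about.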

AlphaZero and its offspring Leela Chess Zero don’t suffer from the P == 1.0 bug. The Zeros evaluate their tree as “this is likely to be successful 75% of the time” or “this should work 83% of the time”. When Stockfish and other old-fashioned engines return an evaluation, it’s expressed in “pawns”, but if you understand P != 1.0, you’ll agree that an evaluation in terms of P is, at heart, untrue.

The Zeros ‘think’ like humans; they don’t know how it’ll play out at the end, but their experience tells them ‘what the heck, this looks like a fun move’. Not only should we be rooting for the plucky #2 seed LC Zero for its human aspect that allows for creative play, we should be contributing to the project so it might overtake Stockfish and stay there.

This is from the Leela Chess Zero blog. Exactly what I’m talking about:

[Chess engine evaluation in terms of pawns] is however not the most logical way to score a position. Using pawns sounds arbitrary, and it’s also not linear and the same step has different meanings: while going from 0 to 1 pawn advantage keeps the position quite holdable, going from 1 to 2 usually makes it mostly lost. Going from 10 to 11 makes no difference at all.
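To put rough numbers on that quote, here’s a toy sketch that maps a pawn advantage to a chance of winning. To be loud about it: the logistic shape, the `midpoint` of 1.5 pawns and the `steepness` of 3.0 are numbers I made up to mirror the quote. This is not the formula Leela, Stockfish, or any other engine uses, and `win_probability` is a hypothetical name.

```python
import math


def win_probability(pawns: float, midpoint: float = 1.5, steepness: float = 3.0) -> float:
    """Toy logistic curve from a pawn advantage to a chance of winning.

    `midpoint` and `steepness` are invented for illustration only; they are
    not taken from any engine.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (pawns - midpoint)))


def step(from_pawns: float, to_pawns: float) -> float:
    """How much the winning chance moves for a given change in pawn score."""
    return win_probability(to_pawns) - win_probability(from_pawns)


if __name__ == "__main__":
    # The same one-pawn step, taken at three different places on the scale.
    for start in (0, 1, 10):
        print(f"{start} -> {start + 1} pawns: {step(start, start + 1):+.3f}")
```

On this made-up curve the step from 1 to 2 pawns moves the winning chance far more than the step from 0 to 1, and the step from 10 to 11 barely moves it at all. Same one-pawn difference, three different meanings.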
