How Many % Does A 2-Point Mistake Cost? (Research)

Today we have a very special article written by Viktor Lin, 6D EGF player and manager of the EGF Academy. He chose probably one of the hardest topics out there, but also one that piques many people's curiosity. So here you are, enjoy reading!

Recently, it was stated that a 2-point loss in fuseki translates to a 10% drop in Leela:

While this observation comes from an arguably reliable source, we¹ found this statement overgeneralised and hardly applicable as a rule of thumb. The example shown above comes with various additional factors that interfere with the point-to-percentage correlation that, at first glance, seems to make a lot of sense:

-) Having the exchange also increases White’s eyespace, albeit very slightly.

-) It is not obvious at all to us mortals that the exchange is worth 2 points.

-) If one side keeps losing 2 points, they will eventually run out of percentages to drop. We therefore expect the size of the drop to depend on the current win rate.

Therefore, we researched the percentage loss further with adjustable control variables in an artificial environment.

Approach #1:

For our research, we used Leela version 0.16 – 15 block trained on 40 blocks with Lizzie 0.7. We let it judge an artificial position in which one colour (White) controls a closed-off territory that can be adjusted without having to take additional factors into account. The rest of the board is left mostly open, so that the position can still be considered to be in the fuseki stage.

To make Leela's judgement of each position as fair as possible, we let it run to roughly the same number of visits (50k) on the move that received the most visits.

First we started with an overwhelming advantage for White, at 92.7%. As the control factor, we added stones one-by-one to White’s territory with Black passing in-between, to simulate a loss of 1 point per stone.

After the first stone, White’s percentage actually went up to 94.5%. This can be interpreted as White holding onto the advantage, with the extra stone resolving the uncertainty Leela previously had about a black invasion.

However, several stones later, White’s win rate had still not dropped and we started doubting our sanity.

This is when it dawned on us that, under Chinese rules, filling your territory does not actually lose points, provided that the opponent plays an equally useless move (such as passing). We thus declared the first experiment a failure and moved on to a different method.
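The counting behind this realisation can be sketched in a few lines. This is a simplified model under our own assumptions: White's closed-off region is described only by its number of stones and empty points, and the function names are hypothetical, not part of any Go library.

```python
def area_score(stones: int, empty_points: int) -> int:
    """Chinese-style area score: living stones plus surrounded empty points."""
    return stones + empty_points


def territory_score(empty_points: int, prisoners: int = 0) -> int:
    """Japanese-style territory score: surrounded empty points plus prisoners."""
    return empty_points + prisoners


def fill_own_territory(stones: int, empty_points: int, moves: int) -> tuple[int, int]:
    """Play `moves` stones inside one's own territory while the opponent passes."""
    if moves > empty_points:
        raise ValueError("cannot play more stones than there are empty points")
    # Each stone turns one empty point into one stone; nothing else changes.
    return stones + moves, empty_points - moves
```

With, say, 20 stones around 40 empty points (made-up numbers), filling in 5 stones leaves the area score at 60, while the territory score would fall from 40 to 35. Since the opponent's passes change nothing on the board, the area-score margin stays the same, which matches what Leela reported.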

Approach #2:

For our second experiment, we kept the same setup as in approach #1. This time we manipulated White’s territory by letting Black dent it on the first line, so that each position is 2 points worse for White than the previous one. (We have not yet found a method to reliably lose only 1 point under Chinese rules. You are welcome to repeat our experiment and share your results~)

This time, we see a drop already after the first manipulation:

A drop from 92.5% to 92% is tiny, and we think it means that Leela values the two extra points in P0 almost the same as the better aji in P-2. (Here P0 is the starting position and P-n is the position in which White has lost n points.)

The perceivable drops happen from P-4 onwards:

We will spare you the plethora of screenshots we took and get right to the point. Below you will find a table of Leela’s percentages in various states of the board:

As we had expected, the closer the win rate gets to 0% or 100%, the less significant a 2-point loss is, while the largest drops can be found around the centre of the scale (in this case from 39.8% to 28.2%). We decided to stop at P-30 because the win rate has got reasonably low and the stones form a pretty rectangle.

If we had gone further (in both directions), the differences in win rate would have, presumably, gradually approached 0, until the non-deterministic nature of Leela judgements randomly made a few drops positive.

This suggests that, if plotted on a graph and paired with a moderate amount of experimenter bias, the drops in win rate resemble a Gaussian distribution with a peak around 10%.
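One way to picture why the drops would form a bell shape is to assume that the win rate follows a logistic curve in the point lead. This is purely an illustrative assumption on our part, not something we measured or know about Leela's internals. The steepness `K` below is a hypothetical value, picked so that the largest 2-point drop comes out near 10%.

```python
import math

K = 0.2  # hypothetical steepness per point; chosen so the peak drop is near 10%


def win_rate(lead: float, k: float = K) -> float:
    """Assumed logistic mapping from point lead to win rate (0..1)."""
    return 1.0 / (1.0 + math.exp(-k * lead))


def drop_for_loss(lead: float, loss: float = 2.0, k: float = K) -> float:
    """Win-rate drop when the lead shrinks from `lead` to `lead - loss`."""
    return win_rate(lead, k) - win_rate(lead - loss, k)


# Print the drop caused by a 2-point loss at various leads.
for lead in range(15, -14, -2):
    print(f"lead {lead:+3d}: win rate {win_rate(lead):6.1%}, "
          f"2-point drop {drop_for_loss(lead):5.1%}")
```

Under this assumption the drop is largest when the 2-point loss straddles 50% and shrinks symmetrically towards both ends of the scale. Our measured numbers need not follow such a tidy curve.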

We should stress here that, in order to reach a statistically sound conclusion, it is necessary to average the results from several repetitions and switch the colours, perhaps also consult different networks and vary the position, none of which we did due to limited resources. The win rates we generated with Leela should be seen as random numbers that are somewhat close to what they represent.
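For readers who want to repeat the experiment, the averaging step could look roughly like this. It is a minimal sketch under our own assumptions: each repetition yields one win-rate reading in percent for the same side, and the function names and the threshold `z` are hypothetical choices, not an established procedure.

```python
from math import sqrt
from statistics import mean, stdev


def summarise(win_rates: list[float]) -> tuple[float, float]:
    """Return (mean, standard error) of repeated win-rate readings in percent."""
    if len(win_rates) < 2:
        raise ValueError("need at least two readings to estimate the spread")
    return mean(win_rates), stdev(win_rates) / sqrt(len(win_rates))


def significant_drop(before: list[float], after: list[float], z: float = 2.0) -> bool:
    """Rough check: does the mean drop exceed z combined standard errors?"""
    m0, se0 = summarise(before)
    m1, se1 = summarise(after)
    return (m0 - m1) > z * sqrt(se0 ** 2 + se1 ** 2)
```

Feeding in several readings per position, rather than a single one, makes it possible to tell a real drop from the noise in Leela's judgement.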

So far we have vaguely confirmed the original statement, but only when the game is still mostly even. The more Leela favours one side, the less important a 2-point loss becomes. Thus, a 10-point mistake would not cost 50%, but less. E.g., going from P0 to P-10, then to P-20 and to P-30 in this particular position costs 21.1, 43.2 and 25.8 percentage points respectively.
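The arithmetic behind these figures can be retraced from the numbers quoted in the text. The sketch below assumes that P0 sits at the 92.5% mentioned earlier; the variable names are ours.

```python
P0 = 92.5                   # White's win rate at P0 in percent (assumed from the text)
drops = [21.1, 43.2, 25.8]  # quoted cost of each successive 10-point loss

# Subtract the drops one after another to recover the win rates.
win_rates = [P0]
for d in drops:
    win_rates.append(round(win_rates[-1] - d, 1))

for label, w in zip(["P0", "P-10", "P-20", "P-30"], win_rates):
    print(f"{label}: {w:.1f}%")

# A naive "2 points = 10%" rule would predict 50% per 10-point loss.
naive_drop = 5 * 10.0
print("every 10-point drop below 50%:", all(d < naive_drop for d in drops))
```

The recovered value for P-20 is 28.2%, which agrees with the lower end of the largest 2-point drop quoted above (39.8% to 28.2%). The three drops add up to 90.1 percentage points, far short of the 150 the naive rule would demand, which is more than there is to lose in the first place.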

That said, we made another observation during our experiment that the readers may find more useful: From P-10 to P-12 in this particular position, Leela transitions its primary choice from high to low kakari.

Before the transition, Leela consistently favours the high kakari; after it, the low one. Can the high kakari be seen as a safer alternative when you have the edge, and should you play the low one when you’re not ahead yet?

This raises another question: Can this lab-created position really be seen as in fuseki? If not, how far into the game should we consider it to be?

¹ Pluralis maiestatis.