“I experimented with many formulas for calculating ratings for the period 1994-2001.
MAIN CONCLUSIONS
I have four main recommendations:
1. Use a more dynamic K-factor.
I believe that the FIDE formula for calculating ratings is logical enough and does not need to be changed. Instead, the conservative K-factor of 10 currently in use should be raised to 24. This will make FIDE ratings roughly twice as dynamic. In addition, the value of 24 is the most accurate: rating formulas using other K-factor values are worse at predicting the outcomes of classical games.
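To illustrate what a larger K-factor does, here is a minimal sketch of the usual update rule, in which the rating change is K times the difference between the actual and the expected score. The function name and the nine-game event are hypothetical, chosen only for illustration.

```python
def rating_change(k_factor, actual_score, expected_score):
    """Rating change = K * (actual score - expected score)."""
    return k_factor * (actual_score - expected_score)


# Hypothetical event: a player expected to score 4.5 points scores 6.0.
expected_total, actual_total = 4.5, 6.0

for k in (10, 24):
    delta = rating_change(k, actual_total, expected_total)
    print(f"K={k}: rating change {delta:+.1f}")
```

The same over-performance moves the rating 2.4 times as far with K = 24 as with K = 10.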
2. Get rid of the confusing Elo table.
The complex and confusing Elo table should be abandoned in favor of a simple linear model in which White's expected score is 100% with a rating advantage of 390 points (or more) and 0% with a rating deficit of 460 points (or more). The expected scores in between can be interpolated along a straight line. Note that this gives the player with the white pieces a bonus worth 35 points. In other words, White's expected score is 50% with a rating deficit of 35 points, and 54% if the players' ratings are equal. This model is much more accurate than the Elo table. Elo's theoretical calculations do not match the empirical results, and they do not take the color of the pieces into account. They also show a statistical bias against higher-rated players.
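The linear model described above can be sketched in a few lines. This is only an illustration built from the numbers in the text (390 and 460 points); the function and constant names are assumptions, not part of the original proposal.

```python
WHITE_FULL = 390   # White's rating advantage at which the expected score reaches 100%
WHITE_ZERO = -460  # White's rating deficit at which the expected score reaches 0%


def expected_score_white(white_rating, black_rating):
    """Linear expected score for White, clamped to the range [0, 1]."""
    diff = white_rating - black_rating
    fraction = (diff - WHITE_ZERO) / (WHITE_FULL - WHITE_ZERO)
    return min(1.0, max(0.0, fraction))


# White is rated 35 points lower, equal, and 390 points higher.
for white, black in [(2565, 2600), (2600, 2600), (2990, 2600)]:
    score = expected_score_white(white, black)
    print(f"{white} vs {black}: {score:.2f}")
```

The line spans 850 rating points, so each rating point is worth 1/850 of a game point, and the 50% mark falls at a 35-point deficit for White, as stated in the text.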
3. Include games with faster time controls in the calculation, but give them less weight than classical games.
Classical games should keep their full weight. Games played at the “modern” FIDE time control are not as significant as seven-hour games; they can be assigned a weight of 83%. Correspondingly, rapid games get a weight of 29% and blitz games 18%. Incidentally, including all time controls in the calculation makes it possible to predict the results of classical games more accurately. The values of 83%, 29%, and 18% were optimized for the most accurate prediction of classical game results.
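One way to read these weights is as a multiplier on the K-factor for each game. The sketch below assumes exactly that, together with a weight of 100% for classical games and K = 24; the dictionary keys and the function name are hypothetical.

```python
# Weights from the text; the 1.00 for classical games is an assumption.
WEIGHTS = {"classical": 1.00, "modern": 0.83, "rapid": 0.29, "blitz": 0.18}
K_FACTOR = 24


def weighted_change(time_control, actual_score, expected_score, k=K_FACTOR):
    """Rating change for one game, scaled by the weight of its time control."""
    return k * WEIGHTS[time_control] * (actual_score - expected_score)


# A win against an equally matched opponent (expected score 0.5).
for control in WEIGHTS:
    delta = weighted_change(control, 1.0, 0.5)
    print(f"{control:>9}: {delta:+.2f}")
```

Under this reading a blitz win still moves the rating, but by less than a fifth of what the same win would be worth in a classical game.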
4. Calculate the ratings every month, not quarterly.
I see no sense in publishing out-of-date ratings. A monthly interval is quite practical, especially since calculating the ratings takes very little time. The popularity of professional ratings shows that chess players prefer a more dynamic and more frequently updated rating list.
According to a database of 266,000 games from 1994 to 2001, a straight line predicts results better than the Elo table.
Fig. 6.1. Accuracy of the forecast in chess
Look at the blue line on the chart. This straight line is backed by actual games and describes the situation more accurately than the Elo curve. Unfortunately, there is not enough data to draw conclusions about results beyond the range of +/- 400 points, but 99% of all official games fall within that interval. I have my own theory about where the error in the Elo calculations lies. Be that as it may, one thing is quite obvious: the Elo formula can be significantly improved.
Fig. 6.2. Improved forecast
Why does this matter so much? A chess player's rating goes up or down depending on whether the player performs better or worse than expected. If you play opponents of equal strength, you should score about 50%. If you score more, your rating increases, and vice versa. But what if your opponents are rated 80-120 points below you? Is a result of 60-65% better or worse than expected? More than half of the top 200 players have a rating advantage of 80-120 points over their opponents, so this is not an idle question.
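The question can be made concrete with a rough comparison. The sketch below rests on two assumptions that are not in the text: the linear model is averaged over an equal number of games with White and Black, which reduces it to 0.5 + d/850, and the Elo table is stood in for by the common logistic approximation 1 / (1 + 10^(-d/400)).

```python
def linear_expected(advantage):
    """Linear model averaged over both colors (assumes equal White/Black games)."""
    return min(1.0, max(0.0, 0.5 + advantage / 850))


def elo_logistic(advantage):
    """Common logistic approximation of the Elo expectation (not the FIDE table itself)."""
    return 1.0 / (1.0 + 10 ** (-advantage / 400))


for advantage in (80, 100, 120):
    print(f"+{advantage}: linear {linear_expected(advantage):.3f}, "
          f"logistic {elo_logistic(advantage):.3f}")
```

Under these assumptions the straight line expects roughly 59-64% across this range, while the logistic curve expects roughly 61-67%, so the same 60-65% result can count as over-performance under one model and under-performance under the other.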
Let's zoom in on the previous chart (games with White and Black will be considered together). The white curve shows the expected result based on the Elo table. Here we are looking at rating advantages of 200 points or less. The results of 266,000 games from 1994-2001 are superimposed on this line. The colors are the same as in the previous chart. Predictions based on the Elo rating are distorted in favor of the lower-rated player.”
Polozov A.A. Rating in Sport: Yesterday, Today, Tomorrow. Moscow: Soviet Sport, 2007. 316 p.
Polozov A.A. Encyclopedia of Ratings: Economy, Sport, Society / A.M. Karminsky, A.A. Polozov, S.P. / Moscow: Economics and Life. 455 p.