A fragment of the book
Handbook of Ratings: Approaches to Ratings in the Economy, Sports, and Society / A. Karminsky, A. Polozov. Springer International Publishing, 2016. 366 p.
CHAPTER 8. A UNIVERSAL SOLUTION TO THE PROBLEM OF RATINGS AND RANKINGS IN SPORTS
8.1. CONCEPT OF RATING IN SPORTS AND PRINCIPLES OF FORMING THE RATING SCALE
Hereinafter, a rating is understood as the result achieved by an athlete in a general hypothetical round-robin macrotournament lasting one year, with this result shifted into the range of positive integers.
At the same time, we believe that the following principles are applicable.
Principle 1. The goal shall prevail over the point. The information basis for the rating comprises the primary parameters of gaming activity as defined in the official rules of the competition: scored (S) and conceded (C) goals, the number of implemented actions, and so on. If the final score of a bout between boxers A and B was 12:8, the rating may be calculated either from the score 12:8 or from the outcome 1:0, that is, simply from the fact that boxer A won. It is illogical to discard the more detailed information when it is available. A win can come by a score of 1:0 or by 11:0. In the first case the opponents are nearly equal; in the second, one boxer is simply battering the other. Yet counted as wins, the two results are identical. Such coarsening of the assessment leaves large groups of boxers with similar ratings. To differentiate them, the boxers would have to be encouraged to take part in many tournaments, solely because the balance of forces cannot be finely evaluated from coarsened assessments. It must be assumed that tens of millions of people could participate in the macrotournament, and every one of them has to be assessed.
Principle 2. The choice of the type of functional dependency. The function should:
2.1. Possess the property of anticommutation: F(S, C) = −F(C, S).
2.2. Operate within the selected numerical interval, and not within the entire scale.
Results of games between opponents whose ratings differ by more than 1000 points must be discarded.
2.3. Not extend beyond the four arithmetic operations, and require a minimum number of arithmetic operations when recalculating the rating.
2.4. Minimize the total difference between the results of the participants in the head-to-head game and their total results.
The first three clauses are a filter for functions, the last clause is a condition.
It is proposed to discard the results of games between opponents whose ratings differ too much (the threshold used here is a difference of more than 1000 points). In such games there is no real struggle for the result, and the weaker participants receive an undeservedly inflated assessment, which distorts the balance of forces. The distortions are usually large because such games are high-scoring. If we want to know the real balance of forces between such participants, it must be established through participants of intermediate playing level, against whom the struggle for the result is more realistic. If the source data contain results of matches between opponents whose obtained ratings differ by more than 1000 points, these results should be excluded.
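The exclusion rule can be sketched in a few lines of Python. The function name `drop_mismatches`, the tuple layout of a result, the sample ratings and the `max_gap` parameter are illustrative assumptions rather than part of the original method; only the threshold of 1000 points comes from the text.

```python
def drop_mismatches(results, ratings, max_gap=1000):
    """Keep only results between opponents whose ratings differ by at most max_gap.

    results: list of (player, opponent, scored, conceded) tuples (hypothetical layout).
    ratings: dict mapping a participant's name to an already obtained rating.
    """
    return [
        r for r in results
        if abs(ratings[r[0]] - ratings[r[1]]) <= max_gap
    ]


# Hypothetical data: A and C are 1200 points apart, so their game is dropped.
ratings = {"A": 2900, "B": 2500, "C": 1700}
results = [("A", "B", 7, 3), ("A", "C", 10, 0), ("B", "C", 8, 2)]
kept = drop_mismatches(results, ratings)
print(kept)
```

Because the filter needs ratings that have already been obtained, in practice it would be applied after a first pass of the calculation and the ratings then recomputed.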
If the first three clauses are a filter for functions, the last clause is a condition for the stability of the rating's behaviour. We need a rating model in which the difference in the nominal ratings of two opponents and the actual difference in their head-to-head game are as close as possible. If this parameter turns out to be the same for several functions, preference is given to the one with the smallest number of arithmetic operations, in order to minimize the work of recalculating the rating.
Let us take the table of a round-robin microtournament as a model of the macrotournament, and compare the head-to-head results with the parameters of overall performance in three sports with different scoring levels. We are interested in how closely the specific result of the match between participants A and B agrees with their overall tournament achievements. If, for instance, player A beat player B by a score of 3:1, then the overall balances of goals scored and conceded by players A and B over the season should stand in a similar ratio. If we find a function for which the general and specific balances of scored and conceded goals agree, the task is accomplished.
The candidate functions were selected from the reference book (Rybasenko et al., 1986), by enumerating previously suggested functions, and by enumerating the simplest possible function structures of scored (S) and conceded (C) goals (efficient actions). A.A. Polozov (2007) compared the levels of certainty of the match result for futsal, hockey and football on the basis of various dependencies selected in accordance with principles 1 and 2. It is shown that the following functional dependency Δ can be chosen:
$$\Delta = 1000\,\frac{S - C}{S + C},$$
where the factor 1000 sets the range of the rating scale.
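A minimal sketch of this dependency in Python follows. It assumes the form Δ = 1000(S − C)/(S + C), which agrees with the worked example later in the chapter (6:4 corresponds to 200 points, 7:3 to 400 points); the function name `rating_difference` and the `scale` parameter are illustrative.

```python
def rating_difference(scored, conceded, scale=1000):
    """Rating difference implied by a score, assuming scale * (S - C) / (S + C)."""
    total = scored + conceded
    if total == 0:
        raise ValueError("no scored or conceded goals: the result carries no information")
    return scale * (scored - conceded) / total


print(rating_difference(6, 4))   # 200.0
print(rating_difference(7, 3))   # 400.0
print(rating_difference(4, 6))   # -200.0, the anticommutation property of clause 2.1
```

The function uses only the four arithmetic operations, as clause 2.3 requires, and its value always stays within the interval from −1000 to 1000.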
Principle 3. The transitivity principle. If participant A is preferable to participant B according to the total of the results recorded within the previous year, and participant B is likewise preferable to participant C, then the level of participant A is higher than that of participant C.
This principle allows a tournament to be conducted without obligatory all-play-all matches. It thus becomes possible to turn the round-robin macrotournament into a hypothetical one, in which not all games have to be played in order to compare the participants. The playing level determined from the results actually obtained is extrapolated to the full set of games. Without this principle, each participant would have to play against every other participant of the macrotournament, which is not feasible.
Principle 4. The principle of in-depth translation is intended to ensure the permanence and continuity of the way ratings are recalculated when moving from the macrolevel to the underlying levels: from the level of teams to the level of their players, from the level of players to the level of the basic components of the game, and vice versa. It implies the possibility of replacing several opponents with a single equivalent opponent:
$$S - C = (S_1 - C_1) + \dots + (S_n - C_n).$$
The value $\gamma_i = (S_i + C_i)/(S + C)$ is the share of the given result in the overall assessment. Since by our definition a rating is a positive number, the numeric scale must be shifted upward by an amount that makes the rating of the weakest participant positive.
Similarly, the overall team rating is decomposed into the ratings of its players. Thus the form of recalculation is retained on moving to each successive level. Rejecting this principle leads to the loss of the connection between the levels.
Principle 5. The principle of asymptotic stability of results means that, regardless of the initial rating values, the results obtained lead to only one solution for the distribution of ratings.
Any classification gives a formula for calculating the rating of the i-th participant. Writing this formula out for each of the n participants in turn, we obtain a system of linear equations (SLE), which may or may not have a solution. More often than not it does, when the competitions are long and the entry list is limited. It is this solution of the SLE, even when used only implicitly, that actually provides some convergence in most rating proposals.
The most convenient way to implement this principle is to set up and then solve the corresponding SLE. When the determinant of the SLE is nonzero, the SLE has exactly one solution. Without this principle, multiple solutions exist for the same macrotournament results, which is equivalent to having no actual solution.
Let us consider a completely filled table of an arbitrary macrotournament and strike out any one row, treating it as unknown. The missing information can be recovered from the corresponding column. This means that the equations of the SLE corresponding to the whole table are not independent, so the SLE has multiple solutions. For the SLE to have only one solution, it is necessary either to replace one of its equations with a new one or simply to add the new equation to those already available.
In practice, it is preferable to use an (n + 1)-th equation that defines the average rating of the tournament through the ratings of all (or some) of its participants; written for all n participants it reads
$$\overline{Rt} = \frac{1}{n}\sum_{i=1}^{n} Rt_i.$$
The SLE obtained by adding this (n + 1)-th equation to the existing n has only one solution.
To explain the approach, let us consider a practical example of a round-robin tournament, the results of which are given in Table 8.1.
Table 8.1. Round-robin tournament table
Team    vs A    vs B    vs C    S:C      Rt
A       -       6:4     7:3     13:7     2200
B       4:6     -       6:4     10:10    2000
C       3:7     4:6     -       7:13     1800
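The obtained ratings of the participants are Rt(A) = 2200, Rt(B) = 2000 and Rt(C) = 1800. The solution can be checked against the differences in ratings: player A beat player B by a score of 6:4, which corresponds to a difference of 200 points, and beat player C by a score of 7:3, which corresponds to a difference of 400 points. With each participant's two opponents taken in equal shares and the average rating of the tournament equal to 2000, the corresponding system of equations has the following form:
$$\begin{aligned}
Rt_A &= 1000\,\frac{13 - 7}{13 + 7} + \frac{Rt_B + Rt_C}{2},\\
Rt_B &= 1000\,\frac{10 - 10}{10 + 10} + \frac{Rt_A + Rt_C}{2},\\
Rt_C &= 1000\,\frac{7 - 13}{7 + 13} + \frac{Rt_A + Rt_B}{2},\\
2000 &= \frac{Rt_A + Rt_B + Rt_C}{3}.
\end{aligned}$$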
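The system for Table 8.1 can be solved directly, as in the following sketch. It rests on several assumptions: each participant's equation has the form "own rating = 1000(S − C)/(S + C) + average rating of the opponents", the third equation is replaced by the condition that the average rating equals 2000, and `solve_linear_system` is a small illustrative helper, not a routine from the book.

```python
from fractions import Fraction


def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with exact fractions.

    A sketch only: it assumes the system has exactly one solution.
    """
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]


half = Fraction(1, 2)
# Unknowns: Rt(A), Rt(B), Rt(C).
a = [
    [1, -half, -half],   # Rt(A) - (Rt(B) + Rt(C)) / 2 = 1000 * (13 - 7) / 20 = 300
    [-half, 1, -half],   # Rt(B) - (Rt(A) + Rt(C)) / 2 = 1000 * (10 - 10) / 20 = 0
    [1, 1, 1],           # Rt(A) + Rt(B) + Rt(C) = 3 * 2000 (replaces the equation for C)
]
b = [300, 0, 3 * 2000]
rt_a, rt_b, rt_c = solve_linear_system(a, b)
print(rt_a, rt_b, rt_c)  # 2200 2000 1800
```

Keeping all three participant equations instead of the averaging condition would make the matrix singular, which is the situation described under Principle 5.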
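In principle, the rating can also be calculated by a simpler formula. Let us divide the macrotournament into two arbitrary microtournaments, find the ratings of the participants from the corresponding SLE, and combine the results on the basis of the principle of in-depth translation:
$$Rt_i = \gamma_{i1} Rt_{i1} + \gamma_{i2} Rt_{i2}, \qquad (8.5)$$
where $Rt_{i1}$ and $Rt_{i2}$ are the ratings of the i-th participant in the two microtournaments and $\gamma_{i1}$, $\gamma_{i2}$ are the shares of these microtournaments in the participant's overall result.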
It is mathematically proven that the solutions obtained by solving the SLE for the individual microtournaments and then combined on the basis of the principle of in-depth translation are equivalent to the general solution of the SLE for the whole macrotournament. This allows us to calculate ratings by the method of successive approximations. Suppose there are j microtournaments for which the solution of the respective SLE for the i-th player is $Rt_{ij}$, and a new, (j + 1)-th microtournament with the solution $Rt_{i(j+1)}$. Then
$$Rt_i = \frac{\gamma_{ij} Rt_{ij} + \gamma_{i(j+1)} Rt_{i(j+1)}}{\gamma_{i(j+1)} + \gamma_{ij}} = Rt_{ij} + \frac{\gamma_{i(j+1)}}{\gamma_{i(j+1)} + \gamma_{ij}}\left(Rt_{i(j+1)} - Rt_{ij}\right), \qquad (8.6)$$
where $\gamma_{ij}$ and $\gamma_{i(j+1)}$ are the weights of the earlier results and of the new microtournament. The value $\gamma_{i(j+1)} + \gamma_{ij}$ shall be equal to the average number of official matches per season. Solving the same practical example with players A, B and C, with the initial data from Table 8.1, we obtain the same result: Rt(A) = 2200; Rt(B) = 2000; Rt(C) = 1800.
Such rapid convergence should not, however, be expected in everyday practice. Nevertheless, by the end of the season the results of successive “manual” recalculation should not differ substantially from the solution of the SLE. We have thus arrived at a formula similar to that of A. Elo, but without magic numbers, because this formula defines the system of linear equations in an implicit form.
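The “manual” recalculation of formula (8.6) can be sketched as a running weighted average. The function name `update_rating`, the use of S + C as the weight of a microtournament, and the sample numbers are assumptions made for illustration only.

```python
def update_rating(old_rating, old_weight, new_rating, new_weight):
    """Formula (8.6): move the accumulated rating towards the newest result."""
    share = new_weight / (old_weight + new_weight)
    return old_rating + share * (new_rating - old_rating)


# Hypothetical season: (rating obtained in a microtournament, its weight).
microtournaments = [(2300, 10), (2100, 10), (2200, 20)]

season, weight = microtournaments[0]
for rating, w in microtournaments[1:]:
    season = update_rating(season, weight, rating, w)
    weight += w

print(season)  # 2200.0, the same as the weighted mean of all three results
```

Each step needs only the accumulated rating and its accumulated weight, so earlier results never have to be stored individually.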
A third way is also possible, involving neither “manual” recalculation nor a pure solution of the SLE. If there are too many participants and the tournaments are held irregularly, then at some points the SLE may be difficult to solve. In this case an intermediate solution is possible, in which the rating is calculated at the federation level. Here the SLE are solved for specific microtournaments, and the ratings of all participants are obtained. The ratings of the opponents are then substituted into the equation of the i-th player at the level of the macrotournament, that is, of the sports federation. The SLE of the macrotournament is thus solved by the method of successive approximations, which is justified when the number of participants is very large.
Principle 6. The average rating of the macrotournament is set so that the rating of the weakest participant is positive. The progress of many different participants is never simultaneous. The average rating of the macrotournament is adjusted in line with the change in the average density of the participants' positions on the rating scale, which grows according to a logistic dependence in each sport. A new participant is assigned a rating equal to the average rating of the macrotournament.
Let us illustrate this with the analytical data for all previous FIFA World Cups (A.A. Polozov, 1995). Here the density increases according to a logistic type of dependence at the initial stage of development of the sport. Figure 8.1 shows the ratings of the FIFA World Cup winners.
Principle 7. Factor compensation. There are factors that affect the final result and create unequal conditions for the participants. The value of any factor is identified by comparing the participant's results before and after its impact, all other factors being excluded. The compensation for a sum of independent, non-interacting factors should be equal to the sum of their individual compensations.
The rating of the participant compensated for all the selected factors is then the official outcome of the competition. Home advantage in team games, the white pieces in chess, the serve in tennis, the handicap in go, gender and age are examples of factors that create inequalities.
J. Sonas estimates the advantage of playing the white pieces in chess at 35 rating points. In cross-country skiing, the athlete who starts later has the advantage. The format of the final stages of football competitions has one obvious imperfection. The final stage is played in a knockout format. Hence, by playing defensively, the weaker team can put psychological pressure on its opponent with the prospect of a penalty shootout, where, as we know, the chances of the two teams are nearly equal. The stronger team has to go forward and open up in order to avoid the postgame lottery. The compensation is symmetrical: the amount added for playing away is equal to the amount deducted from the opponent.
Let us mention the following conditions of correctness of the macrotournament results:
1. The absence of isolated microtournaments.
2. Results between participants with a rating difference of $|Rt_i - Rt_j| \ge 1000$ are excluded from consideration.
3. The macrotournament continues until the average density of the results becomes stable.
4. The participant's rating error, 2000/(S + C), should be smaller than ρ, the average interval between participants on the rating scale: 2000/(S + C) < ρ.
5. The results are rounded to the values corresponding to the density.
In conclusion, let us summarize the rating calculation options.
1. Direct solution of a system of linear equations. Problems with obtaining such a solution may arise in certain gaming environments and when the number of participants is too large.
2. Iterative solution of a system of linear equations. This is a successive manual recalculation of each subsequent rating on the basis of the previous ones, that is, a procedure for averaging the latest result with all the earlier results of the season:
$$Rt_i = \frac{\gamma_{ij} Rt_{ij} + \gamma_{i(j+1)} Rt_{i(j+1)}}{\gamma_{i(j+1)} + \gamma_{ij}} = Rt_{ij} + \frac{\gamma_{i(j+1)}}{\gamma_{i(j+1)} + \gamma_{ij}}\left(Rt_{i(j+1)} - Rt_{ij}\right),$$
where $\gamma_{ij}$ and $\gamma_{i(j+1)}$ are the weights of the earlier results and of the latest one.
3. Solution of the system of linear equations within the local microtournaments, with the obtained solutions merged for the macrotournament as a whole on the basis of the principle of in-depth translation:
$$Rt_i = \gamma_{i1} Rt_{i1} + \gamma_{i2} Rt_{i2}. \qquad (8.7)$$
4. Another version of the iterative solution: solving the system of linear equations by simple substitution of the current ratings of the opponents into each equation. Successively substituting the opponents' current ratings into the equation of a participant whose results have changed after another official competition gives almost the same results as a pure solution of the SLE, but without the costs that arise when solving an SLE for a very large number of participants.
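The fourth option can be sketched on the data of Table 8.1. The sketch assumes that each participant's equation has the form "own rating = 1000(S − C)/(S + C) + weighted average rating of the opponents", that everyone starts from the tournament average of 2000, and that the mean is held at that value after every sweep; the function name `iterate_ratings` and its arguments are illustrative.

```python
def iterate_ratings(offsets, opponents, average, sweeps=60):
    """Successive approximations: substitute current opponent ratings into each equation.

    offsets:   dict name -> 1000 * (S - C) / (S + C) over all the participant's games.
    opponents: dict name -> {opponent: share of that opponent in the participant's games}.
    average:   starting rating for everyone, also kept as the tournament mean.
    """
    ratings = {name: float(average) for name in offsets}
    for _ in range(sweeps):
        new = {
            name: offsets[name] + sum(w * ratings[o] for o, w in opponents[name].items())
            for name in offsets
        }
        shift = average - sum(new.values()) / len(new)  # keep the mean fixed
        ratings = {name: value + shift for name, value in new.items()}
    return ratings


# Table 8.1: A scored 13:7, B scored 10:10, C scored 7:13 overall.
offsets = {"A": 300.0, "B": 0.0, "C": -300.0}
opponents = {
    "A": {"B": 0.5, "C": 0.5},
    "B": {"A": 0.5, "C": 0.5},
    "C": {"A": 0.5, "B": 0.5},
}
ratings = iterate_ratings(offsets, opponents, average=2000)
print({name: round(value, 3) for name, value in ratings.items()})
```

For this small table the approximations oscillate around the direct solution and settle on Rt(A) = 2200, Rt(B) = 2000, Rt(C) = 1800; no matrix is ever inverted, which is the appeal of the approach for very large numbers of participants.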
8.2. PRACTICAL EXAMPLE OF USING A UNIVERSAL SYSTEM
Sport implies a struggle for the result. If the result is understood as the difference between scored (S) and conceded (C) goals (implemented actions), then the point of the game is to create a positive difference. The participants can be arranged in descending order of the difference they create in a game against an average virtual opponent. The difference in their ratings should correspond to the actual result of their head-to-head game, and the rating model should provide for this convergence. For this purpose, the participant's rating should be represented as his result in the yearly macrotournament, which actually consists of a set of microtournaments.
Imagine that all those involved in a given sport in Holland, France, Russia, Georgia, the United States and other countries have played each other. A tournament within one country or a division into weight categories are examples of macrotournament prototypes. The macrotournament is a hypothetical concept: it cannot be implemented as an actual round-robin. Let us use the principle of transitivity. If player A has been playing better than players B and C for the whole season, and player B has been playing better than player C, then player A will stand above players B and C in the final ranking list. In that case, why should player A play against player C?
The current results of a participant make it possible to determine his playing level, from which it is easy to predict the results of the macrotournament matches that have not been played. There will always be a doubting Thomas, though, who wants to play the match and check the convergence of the actual and expected results.
The form of the rating's functional dependency plays a crucial role in the convergence of the model. Searching for the function with the greatest convergence across various sports is the most time-consuming part of the work. As a result, we managed to surpass Elo's famous table of factors in this respect, though not by much.
The convergence also depends on the form of recalculation. If we write the formula for one participant, then for another, and another, we successively write down a system of linear equations, which either has a solution or not. Many calculation schemes (Elo's rating, the tennis rating) achieve a relative, spontaneous convergence thanks to a kind of safety net provided by the system of equations that is actually obtained. Let us illustrate this with a few examples.
Let us consider an example from checkers. Suppose you have played 10 matches in a tournament. If some of your opponents have not been rated yet, each of them is assigned the average value of 2200. Let the average rating of your opponents be 2500, and suppose you won by a score of 7:3. Your rating for this tournament will be:
$$Rt = 2500 + 1000\,\frac{7 - 3}{7 + 3} = 2900.$$
But this is your rating in this particular tournament. Suppose you have already played 30 matches this season, with a rating of 2700. Then your rating for the season will be:
$$Rt = \frac{30 \cdot 2700 + 10 \cdot 2900}{30 + 10} = 2750.$$
Now suppose your team has played some hockey matches. The sum of scored and conceded goals in the previous matches of the season is 30, and the rating for the season is 2700. You win another match, against an opponent with a rating of 2500, by a score of 7:3. The seasonal rating of your team changes in exactly the same way after the match.
Suppose you are a boxer who has not calculated a rating during the season but now wants to do so. You had three bouts against a boxer whose current rating is 2500, two bouts against a boxer with a rating of 2420, and one bout against a boxer with a rating of 2300. The total score of all six bouts, as recorded by the referees, is 30:20. Your opponent was, in effect, a virtual boxer composed of a 3/6 share of the rating 2500, a 2/6 share of the rating 2420 and a 1/6 share of the rating 2300. The strength of this generalized virtual opponent is equal to
$$\frac{3}{6} \cdot 2500 + \frac{2}{6} \cdot 2420 + \frac{1}{6} \cdot 2300 = 2440.$$
You are stronger than this virtual opponent by 200 points.
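The boxing example can be checked with a short calculation. The sketch weights each opponent by the number of bouts, as in the text, and assumes that the 200-point advantage comes from 1000(S − C)/(S + C) applied to the total score of 30:20; the function name `virtual_opponent` is illustrative.

```python
def virtual_opponent(bouts):
    """Weighted average rating of the opponents.

    bouts: list of (opponent_rating, number_of_bouts) pairs.
    """
    total = sum(count for _, count in bouts)
    return sum(rating * count for rating, count in bouts) / total


opponent = virtual_opponent([(2500, 3), (2420, 2), (2300, 1)])
advantage = 1000 * (30 - 20) / (30 + 20)
own_rating = opponent + advantage

print(opponent)    # 2440.0
print(advantage)   # 200.0
print(own_rating)  # 2640.0
```

Under these assumptions the boxer's own rating for the six bouts comes out at 2640.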
This is how the participants of a competition will calculate their ratings. The organisers would do better to create a programme that solves the SLE consisting of typical equations like those in the examples above and publishes the results daily on the Internet. The system is transparent: at any time, any participant can check the organisers by writing his own equation, as in the previous example, and comparing the rating he calculates with the official data. There is no need to invent arbitrary adjustable factors, which enhances the quality of the model.
In boxing, bouts can be subdivided into long-range fighting, middle-range fighting and infighting. These are specific components of the game. If a boxer's overall score of 30:20 is distributed by distance as 10:2, 10:10 and 10:8 respectively, it is clear that fighting at long range is preferable for him. The agreement between the general and specific sums ensures the continuity of the general and specific ratings, which are calculated in the same way. The only difference is that the athlete or his coach will have to collect the information themselves.
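The continuity between specific and general ratings can be illustrated on these numbers. The sketch assumes the dependency 1000(S − C)/(S + C) for each component and the share (S_i + C_i)/(S + C) as its weight; the names `delta`, `components` and `combined` are illustrative.

```python
def delta(scored, conceded):
    """Rating difference implied by a score, assuming 1000 * (S - C) / (S + C)."""
    return 1000 * (scored - conceded) / (scored + conceded)


# Score by bout distance, taken from the example in the text.
components = {"long range": (10, 2), "middle range": (10, 10), "infighting": (10, 8)}

total_s = sum(s for s, _ in components.values())  # 30
total_c = sum(c for _, c in components.values())  # 20

combined = 0.0
for name, (s, c) in components.items():
    share = (s + c) / (total_s + total_c)  # weight of this component
    combined += share * delta(s, c)
    print(f"{name}: delta = {delta(s, c):.1f}, share = {share:.2f}")

overall = delta(total_s, total_c)
print(round(combined, 6), overall)  # both equal 200 up to rounding
```

The long-range component gives by far the largest specific difference (about 667 points against 0 and about 111), which is the numerical form of the statement that fighting at long range is preferable for this boxer.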
It is mathematically proven that the results are the same whether the SLE is solved for the macrotournament as a whole or calculated for each tournament (component) separately, with the results then combined in proportion to their weights. By calculating the specific ratings of the opponents, it is possible to estimate the ratio of punches in a bout that has not yet taken place, and to make adjustments. In wrestling, holds, throws and pins can be considered in a similar way. In the same way one can find out who is the champion in chess in, for example, the Sicilian Defence.
In team sports, the rating of a team is similarly decomposed first by players and then by components. The difference in the ratings of players A and B corresponds to the difference in the score between a team consisting only of players A and a team consisting only of players B.
Scoring in most sports tends to decline. The first final match of the FIFA World Cup ended with a score of 8:3. Nowadays goalless draws in football have become the norm. The situation in boxing is similar: since the beginning of the century, the number of punches recorded by the referees in Olympic boxing has fallen by fifty percent. How will the rating react to such a change? Analysing the number of active actions (punches) can be considered one way of responding. If the ratio of punches does not change, the rating will not be affected in any way. What does change is that the price of a single punch increases.