A fragment of the book

Handbook of Ratings. Approaches to Ratings in the Economy, Sports, and Society / A. Karminsky, A. Polozov / Springer International Publishing, 2016, 366 p.

CHAPTER 7. EVOLUTION OF IDEAS ABOUT RATING AND RANKING IN SPORTS

7.1. HOW DO DIFFERENT CLASSIFICATIONS DEFINE THE CONCEPT OF RATING?

Richard Feynman, a Nobel Prize Winner, said that two-thirds of any science lies in the concepts it uses. In the film “Vermont Recluse” directed by Stanislav Govorukhin, which tells about Alexander Solzhenitsyn, the protagonist of the film talks about the words that pollute the Russian language. And the word “rating” is the first word in his list.

The majority of the people understand rating as empirical estimates or some quantitative parameters of the objects being ranked. Here are the most common definitions of rating in sports. It is natural that the corresponding analogies should be also possible in other types of activity.

Rating is an individual numerical factor (Elo, 1963). “The individual factor (IF) of a chess player is a measure of his practical strength expressed in numerical form” (Chess, 2003).

Rating is a result in a total macrotournament. This is the result of a participant of a general hypothetical round-robin yearly macrotournament, shifted to the area of positive integers (Polozov, 1995).

Rating is the playing strength, the skill. Such an understanding of the rating is fixed in the regulations on table tennis, gliding, chess, ice climbing, etc. (Polozov, 2007). The rating of a player is a numerical expression of the playing strength, in which a higher rating implies a stronger game. “One of the greatest fascinations of tournament chess players and competitors of other games is the measurement of playing strength” (Glickman, 1998).

Rating is a public recognition. As a tool to assess one player in relation to another, rating is used in many industries and kinds of activity. (Bakhareva, 2003). Rating is the evaluation of public recognition of the business consistency of the person (Malygin, 2003).

Rating is the average score given by the group of experts. In many spheres of human activity we can come across values (features, options) that allegedly possess numerical nature, however, the exact meaning of these values cannot be directly measured physically. Such values need expert evaluation methods, when a group of “experts” gives an opinion on the distribution of the value on a selected scale of numerical values. Here we can draw the examples not only of the assessment of athletes in such competitions as artistic gymnastics and rhythmic gymnastics, figure skating, diving, freestyle, but of the popularity ratings for politicians, particular actors and on-stage performance groups, as well as evaluation of academic progress in education (Pavlov, 2004).

Rating is the share of the conquered information space. The rating of a TV programme (e.g., sports programme) is the ratio of audience of the given programme to the total number of viewers at the given moment.

Rating is the labour input. Rating determines the quality of preparing an athlete (student) in all areas (events), considering them as being of equal importance.

Rating is an incentive. The following principle serves as a basis for the R-rating: success or failure of teams in the competitions that are over should not be fixed, but instead the increase of the teams’ class in the current competitions should be encouraged. For example, in badminton a player’s position in the ranking list is determined “in order to assist the organizers of the competition in preparing tables, casting lots, determining the order of numbers in teams, and stimulating athletes to compete and improve their skills”.

Rating is the position occupied by an athlete. Rating is the ranking of the athletes according to the level of the sports results they had demonstrated (Krasilnikov, 1998). For example, according to the regulations on ice climbing, the rating is set to determine the order of distributing Russian athletes in terms of their skills, according to difficulty and speed of the competition respectively. Ratings should reflect the achievements of teams not for one last month or year, but at least for a few last years (Bozhkov, 2004).

Rating is the process of revealing the strongest athletes for the team. The objective of the rating is “to define a group of the strongest Russian athletes according to the results of foot orienteering competitions”.

Similarly to physical measurements rating in sports can be considered as an evaluation of some random parameter that reflects the playing strength and the skill level of an athlete or a team. Here the evaluation usually “estimates” (in some sense approximates) a particular parameter of the distribution of the variable in question.

Rating helps to establish a specific internal order. That has always been a function and an attribute of some authoritarian power. Obviously, Player A rated 2398 is unlikely to think that he is weaker than Player B rated 2403. But if, by decision of the authorities, certain benefits are cut off at the value of 2400, and the same power determines, according to the pre-defined rules, that player A’s rating is 2398 whereas player B’s rating is 2403, then player A can only complain about luck, and finally about himself, but eventually he still has to accept the situation. And this is so despite the fact that everyone (players A and B and the authorities alike) understands that the rating is a rather inaccurate thing and cannot be essentially precise. But even with all its disadvantages, the rating system, combined with the authoritarian power, provides for this order (Korsak, 2004).

Therefore, rating is both a measure of fitness, and a self-assessment tool, and a reference point in terms of improving sporting skills. On the other hand, rating provides for an objective criterion for coaches and experts for them to be able to select players for various teams, or candidates for participating in the prestigious tournaments. Ranking helps the organisers of tournaments to form the initial groups as per the playing strength, to cast lots in the tournament and in general to create as equal performance conditions for all participants as possible, thereby enhancing the quality of refereeing and organising tournaments on the whole. And finally, rating helps everyone – both experts, athletes, and spectators – to predict the results of the players’ performance in the competition. (Pavlov, 2004).

Rating systems are needed to reflect the balance of forces, to “rank” the competitors, and to dynamically track changes in this ratio, expressed in the distribution of the numerical values of some conventional parameter when there are no direct methods of physical measurement of the assessed value in a particular sphere of activity. Another equally important goal of the rating system is the prediction of future results, that is, mathematically reasonable prediction, with which the Elo rating system has been successfully coping for half a century throughout its existence (Cipli, 2003).

Summing up, we can say that there is a set of definitions of rating, their components reflecting various aspects of this integrated concept. But an integrative component is lacking in almost all of them. The definition should form the main key meaning of the word “rating” and thus predetermine the direction of the development of this topic. It should lead us to an information reference point and therefore should not be mysterious.

All of the definitions given above characterise rating in some way. However, most of them look too specific and do not solve the main problem. Rating should be the expert opinion only when the solution to a problem is unknown. Rating also cannot be represented by some unknown specific numerical factor. Of course, rating is a public recognition. But first you need to get the rating, and the recognition will be its consequence.

A similar thing can be said of rating as the process of conquering a certain information space. Rating can be an incentive, provided it is clear what should be stimulated. Rating can by no means represent the position occupied by the athletes: position is determined according to the rating, not vice versa. Assistance in selecting the athletes for the team is not a definition either, but a consequence. Rating as an internal order is more like a spell, for which the order first needs to be determined. Thus, none of the aforementioned ideas about rating bring us any closer to unveiling the secret of this phenomenon.

The idea that rating is the playing strength, or the skill, can be considered more acceptable. However, this definition, though essentially correct, does not give us anything practical. Defining rating by the playing strength, or the skill, is the right direction for further reflection, but not its final result. It is just an intermediate stage.

With regard to sports, definition of rating as a result of the total macrotournament participant summarizes all of the opinions above. Rating is both playing strength, and recognition, and selection to the team, and position of an athlete, etc. At the same time, using the word “macrotournament” gives us the opportunity to use the existing knowledge on the basis of local tournaments. The space for further creativity remains wide, because macrotournament can be imagined in different ways. However, the field for searching possible choices for alternatives narrows significantly.

7.2. CORRELATION BETWEEN EXPECTED AND ACTUAL RESULTS AS THE MAIN CRITERION OF QUALITY OF THE RATING MODEL

The problem of quality can be solved in numerous ways. Both mathematical and qualitative criteria can be used. Chess is number one in terms of the number of innovations and developments. This sport unites the intellectual sporting elite, and this is where the most well-founded rankings are tried.

Let’s take the FIFA / Coca-Cola ranking as an example to illustrate the problem of correlation of existing rating models. The 2004 UEFA European Championship, like the previous ones, outlined a great number of problems. The FIFA system of ranking the participants (the so-called FIFA / Coca-Cola ranking), which was left behind the scenes, became the biggest misunderstanding of this Championship. The teams of Portugal and Greece ranked by FIFA as 22nd and 35th respectively met in the final match. Despite the apparent improvement of playing strength, the Russian team for some reason did not rise to higher level after G. Yartsev became their coach. On the contrary, the Russian team fell by 7 positions compared with the end of last year. How then could the 31st strongest team (21st strongest, if we do not count non-European teams) become one out of 16 strongest national teams?

As you know, the European Championship draw was conducted by allocating the teams to pots, whose contents were defined on the basis of that very same rating. Can our group be considered equal to the other groups if, besides the Russian team, it included the two future finalists and the national team of Spain, the third strongest team? How is it that the Czech Republic, whose reserve team beats Germany, is nevertheless lower in the ranking?

It is technically impossible to compare the teams that had not officially played any matches against each other. If the teams from Africa have not played against the teams from Europe and America for four years, then there is no cause for including them into the general list of teams. The suggested ranking did not have any statistical justification. According to E. Potyomkin, the FIFA / Coca-Cola ranking is about the same as selecting the winner of the beauty contest just with the help of weighing.

In tennis the rankings are changed frequently, but it’s good. They are looking for such an option, which would imply that in ninety-nine cases out of a hundred eight strongest players of the tournament would qualify for the quarter finals. They are concerned with the convergence of the chosen model. The FIFA / Coca-Cola Ranking is not concerned with convergence and once again brought discredit on itself by the superiority of lower rated opponents over the higher rated ones. The amateurish level of ranking does not correspond to the level of competitions. Why should the teams pay for the commercial interests of FIFA, and its flirtation with powerful corporations in the form of giving them the possibility of “ruling” with the help of ranking?

Market issues and the availability of bonuses for a higher rating, both for athletes and for coaches, have a considerable influence on the issue of ranking. The position of an expert coach allows him to promote the right people. The representatives of weaker teams should really be happy with the fact that ranking creates “groups of death” and groups that are easy to get out of. For such weaker teams it is a chance to outflank more powerful opponents. This is why weak rankings may exist for a very long time, despite a large number of blunders that are obvious to everybody.

The actual value of a ranking is determined by the convergence of expected and actual results. It is obvious that absolute convergence will never be possible. Sport is especially good when there are surprises in it. If the strongest players always won, any sport would just die. In addition, too many factors affect the results in sports. Rating can be calculated in many ways, and with the help of different models. But only the more universal and convergent model will remain. Let’s assume that rating is the result of a participant in the global macrotournament. However, an all-play-all round-robin tournament is impossible, because there are too many participants. Therefore a ranking model is needed which would reflect all the results of the macrotournament on the basis of a part of its results. In this case the question arises: how accurate is the reproduction of the part of the macrotournament that has not been played? Can we trust a model which assigns you a defeat in cases where you won? The quality of the model is judged by the convergence of expected and actual results. The participant of the macrotournament is interested in the ranking system that assesses him with maximum accuracy, or, in other words, with minimum errors. The models are changing towards the greatest convergence of expected and actual results.

7.3. DEVELOPMENT OF RATING CLASSIFICATIONS IN SPORTS

Now understanding the words of Helvetius that the knowledge of some principles easily compensates for the ignorance of some facts, let’s look at how rating classifications were developed.

RATING AS AN EXPERT GROUP JUDGEMENT

According to this approach, a specific group of experts is gathered for each event, and they “weigh” the participants of this event. Thus, for each boxer the rating factor is calculated as a ratio of the sum of all the wins of the opponents he had overcome, to the sum of all their defeats. (Telebox, 2004).
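The boxing factor described above can be sketched in a few lines. This is only an illustration of the ratio as worded in the text; the function name, the record format and the sample numbers are hypothetical, and the handling of undefeated opposition is an assumption.

```python
def boxer_rating_factor(beaten_opponents):
    """Ratio of the total wins to the total defeats of the opponents a boxer has beaten.

    beaten_opponents: list of (wins, defeats) career records, one per beaten opponent.
    """
    total_wins = sum(wins for wins, _ in beaten_opponents)
    total_defeats = sum(defeats for _, defeats in beaten_opponents)
    if total_defeats == 0:
        # Assumption: the source does not say what happens when the beaten
        # opponents have no defeats at all, so the factor is left unbounded.
        return float("inf")
    return total_wins / total_defeats


# Hypothetical records (wins, defeats) of three beaten opponents.
records = [(20, 5), (15, 10), (10, 5)]
factor = boxer_rating_factor(records)  # 45 wins / 20 defeats
```

Beating opponents with many wins and few defeats raises the factor, which is the sense in which the real opponents, rather than impartial judges, “weigh” the boxer.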

Such classifications can be described with the words “a little bird told me”. The only difference is that not only the impartial judges but the real opponents can perform the role of such a “bird”. The expert group judgement is used where the algorithm for solving the problem is not even noticeable. The subjective opinion of the referees is used in gymnastics, figure skating, and other kinds of sports.

INFORMATION MIXTURE

According to this approach, all available information about an object is dumped into a total stack, and priority will be given to the object with more information. Selection of such information and specific weights of certain parameters is usually done by the expert group.

The number of points N, gained by the team for the match is calculated by the following formula (Bozhkov, 2004)

N = M * P * R + B, (7.1)

where M is the number of points gained for the result of the match (a positive number for a win or a draw in away matches, and a negative number for a loss or a draw in home matches),

P is a factor that takes into account the place where the match was played (i.e. whether it was a home match, an away match, or a match played on neutral ground), R is a factor that takes into account the goal difference, and B stands for bonus points that take into account the tournament level and round (final, semi-final, etc.).
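A minimal sketch of formula (7.1) may help to see how its ingredients combine. The source gives no numerical values for M, P, R or B, so every number below is hypothetical and chosen only for illustration.

```python
def match_points(m, p, r, b):
    """Points for one match by formula (7.1): N = M * P * R + B."""
    return m * p * r + b


# Hypothetical values: an away win by a clear margin in a late tournament round.
away_win = match_points(m=3, p=1.5, r=1.25, b=2)

# Hypothetical values: a home loss in the same round. The bonus B is added
# after the product, so it can outweigh the negative result term.
home_loss = match_points(m=-3, p=0.5, r=1.25, b=2)
```

With these made-up inputs the home loss still yields a small positive number of points, which shows how a multiplicative core with an additive bonus can produce totals that are hard to interpret.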

The main problem of such classifications is that the rating has no physical meaning and its components tend to interact non-linearly, pushing now one team, now another to the top. In 1998 the International Federation of Football History and Statistics, based in Germany, took such a rating as a basis and ranked FC Barcelona first among the Spanish football clubs, although that year the team had lost both matches against “Dynamo Kiev” (0:3 and 0:4) and nearly all its matches in the Champions League.

BONUS RATING CLASSIFICATIONS

Under the bonus approach, points are given for each place taken by the athlete in a competition, and at the end of the year all the points are summed up. Thus the final rating is formed. Let’s consider a few examples. Table 7.1 shows the bonuses in bowling. “6. The results of rock climbers are assessed according to table 9.2” (Rock Climbing, 2004).

Table 7.1. Bonuses in bowling (Ukraine): scoring system (place-points)

Place   Women   Men
1       20      40
2       19      39
3       18      38
4       14      31
5       13      30
6       12      29
7       8       28
8       7       27
9       6       26
10      5       25
11      4       24
12      3       23
13      -       16
14      -       15
15      -       14
16      -       13

If we are talking about the bonus approach, the scoring system is transformed into the bonus system. This is a more differentiated approach. Its main disadvantage is that the rating is determined according to the position whereas it should be vice versa – the position should be determined in accordance with the rating. On the other hand, such classifications are only for the narrow group of elite. Other participants remain completely unrated.
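The yearly summation of bonuses can be sketched as follows. The place-points mapping is copied from the men’s column of Table 7.1; the function name, the sample season and the rule that places outside the table earn nothing are assumptions made for the illustration.

```python
# Place -> points, men's column of Table 7.1 (places 1-16).
MEN_POINTS = {
    1: 40, 2: 39, 3: 38, 4: 31, 5: 30, 6: 29, 7: 28, 8: 27,
    9: 26, 10: 25, 11: 24, 12: 23, 13: 16, 14: 15, 15: 14, 16: 13,
}


def yearly_bonus_rating(places, table):
    """Sum the bonus points for every place taken during the year.

    Assumption: a place that is not listed in the table earns no points.
    """
    return sum(table.get(place, 0) for place in places)


season = [1, 4, 17, 2]  # hypothetical finishing places in four competitions
rating = yearly_bonus_rating(season, MEN_POINTS)  # 40 + 31 + 0 + 39
```

The 17th place contributes nothing, which illustrates the remark above that participants outside the narrow elite remain unrated.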

Tennis is another kind of sport where rankings are used very extensively. Atari-ATP, the most famous international ranking, has been used in professional tennis since 1979. Here we are talking about implementing a system of bonuses, typically used in business, in sports rating.

Each player is originally assessed by the number of points gained, divided by the number of tournaments played. These points depend both on the tournament prize money and on the entry list. The richest “harvest” is reaped in Grand Slam tournaments. Besides, a tennis player can get the so-called bonuses. Having beaten the World No. 1, the player receives an extra 50 points. For beating an opponent ranked second to fifth in the world list, the player is awarded 45 points. However, for beating an opponent ranked 150-200 in the world list, he gains just one point.

Every participant of the qualifying tournament who qualifies for the main draw gains one point. He is awarded another point each time he beats an opponent ranked among the first 150 in the world list in the qualifying tournament. However, the total of points after the qualifying tournament may not exceed three. In the satellite tournaments bonus points are awarded only to the players who qualify for the finals, regardless of their categories.

A tennis player can play as much as he wishes, but only the results he has shown in his 14 most successful competitions over the past 52 weeks count. The points are kept for a year. So when E. Kafelnikov did not play for three months in early 1997, his ranking did not change for the worse. Note that the Atari-ATP ranking system is a kind of transformation of the traditional ranking system, and points there are awarded before the bonus classification.
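The “14 best results over 52 weeks” rule can be sketched like this. The function name and the sample point totals are hypothetical, and the list is assumed to contain only tournaments that already fall inside the 52-week window.

```python
def best_results_total(points_by_tournament, best_n=14):
    """Sum of the best_n largest tournament results.

    Assumption: points_by_tournament already holds only the results
    from the past 52 weeks; filtering by date is left out of the sketch.
    """
    return sum(sorted(points_by_tournament, reverse=True)[:best_n])


# Hypothetical points from 16 tournaments played within the window.
results = [100, 80, 60, 50, 50, 40, 40, 30, 30, 20, 20, 10, 10, 10, 5, 1]
total = best_results_total(results)  # the two weakest results (5 and 1) are dropped
```

Because weak results beyond the fourteenth are simply ignored, playing more tournaments cannot lower the total, and a pause of a few months leaves it unchanged as long as the counted results stay inside the window.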

RATING AS A RESULT OF THE FORMULA FOR SUCCESS

The gist of this approach is that it takes the success indicators which are measured cumulatively according to some formula (conventionally named “formula for success”). This approach differs from the “information mixture” because its formula is based on multiple regression from the parameters correlated with the collective success. These formulas should be controlled and changed in the course of time, otherwise their performance gradually decreases.

Such an approach was used in basketball in Russia. A record of the match is kept in which the following parameters are logged (the weighting factors with which each parameter enters the total score are given in brackets): points scored (1), assists (1), steals (1.4), blocked shots (1.2), defensive rebounds (1.2), offensive rebounds (1.4), opponents’ fouls (0.5), missed two-point shots (-1), missed three-point shots (-1.5), missed free throws (-0.8), turnovers (-1.4), technical fouls (-1), fouls (-1). The result is divided by the time the player spent on the court, which gives the player’s efficiency per minute on the court.
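The weighted record described above translates directly into a small calculation. The weights are the ones listed in the text; the dictionary keys, the function name and the sample stat line are hypothetical.

```python
# Weighting factors from the text; the key names are illustrative.
WEIGHTS = {
    "points": 1.0, "assists": 1.0, "steals": 1.4, "blocks": 1.2,
    "def_rebounds": 1.2, "off_rebounds": 1.4, "fouls_drawn": 0.5,
    "missed_two": -1.0, "missed_three": -1.5, "missed_free_throws": -0.8,
    "turnovers": -1.4, "technical_fouls": -1.0, "fouls": -1.0,
}


def efficiency_per_minute(stats, minutes):
    """Weighted sum of the recorded parameters divided by minutes on the court."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    total = sum(WEIGHTS[name] * value for name, value in stats.items())
    return total / minutes


# Hypothetical stat line for one player in one match.
game = {"points": 20, "assists": 5, "steals": 2,
        "missed_two": 4, "turnovers": 3, "fouls": 2}
eff = efficiency_per_minute(game, minutes=30)
```

The weights are fixed constants here, which is exactly why such a formula needs periodic re-fitting: nothing in the calculation reacts when the game itself changes.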

The analysis shows that “formulas for success” can only work in a field where nothing changes for a long time, because they have no feedback from these changes.

CONSEQUENT RECALCULATION OF THE RATING TOWARDS THE GREATEST BALANCE

The classifications similar to the Elo system implicitly use a solution of a system of linear equations. S.V. Pavlov, the Chairman of the Russian Go Federation Rating Commission (2003), managed to improve the suggestion of A. Elo towards even greater convergence of results. He suggested using the generalized formula by A. Elo to recalculate the rating:

RK = RKstart + SUM (Ki · (Ri – Pi)), (7.2)

where RK is the recalculated rating, RKstart is the starting rating, Ri is the result of the i-th game (1 or 0), Pi is the probability of winning this game, and Ki is the dynamic factor for the given game.
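Formula (7.2) can be sketched as follows. The text does not say how Pi is obtained, so the sketch assumes the usual logistic Elo curve on a 400-point scale; that curve, the function names and the sample games are assumptions, not part of the source.

```python
def win_probability(rating, opponent_rating):
    """Assumed logistic Elo curve; the source does not specify how Pi is computed."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def recalculate(rating_start, games):
    """Formula (7.2): RK = RKstart + SUM(Ki * (Ri - Pi)).

    games: iterable of (opponent_rating, result, k), where result is 1 or 0
    and k is the dynamic factor Ki chosen for that game.
    """
    return rating_start + sum(
        k * (result - win_probability(rating_start, opponent))
        for opponent, result, k in games
    )


# Hypothetical games: a win and a loss against equals (K = 10),
# then a win over a much weaker opponent in a game weighted with K = 20.
new_rating = recalculate(2400, [(2400, 1, 10), (2400, 0, 10), (2000, 1, 20)])
```

The two games against equals cancel out, and the win over the weaker opponent adds less than two points because it was almost fully expected.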

Let’s take the so-called “popular rating” by E.I. Potyomkin (2004) as another example. It is called “popular” because only two mathematical operations, addition and subtraction, are needed to calculate it, plus striking out the last significant figure to define the bet for the game. Each of the teams has 100 points at the beginning of the championship. This is their starting rating, or strength. The teams make a bet for each game, the bet being equal to one tenth of their strength. In the first round all ratings are equal and the bets are equal, too: from 100 rating points each team bets ten points. The winner takes the bet of the loser. After the first round all the winners have 110 points, and all the losers have 90 points. In the second round the winners bet 11 points, and the losers bet only 9 points. In case of a draw the teams exchange bets. So if a winner and a loser of the first round meet in the second round, the first team, rated 110 points, bets 11 points, whereas the second team, rated 90 points, bets just 9 points.
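The betting rule of the “popular rating” can be sketched with integer arithmetic only. The function names and the two sample games are hypothetical; “striking out the last figure” is read here as integer division by ten.

```python
def bet(rating):
    """One tenth of the rating with the last digit struck out."""
    return rating // 10


def play(ratings, first, second, result):
    """Update ratings in place; result is 'first', 'second' or 'draw'."""
    bet_first, bet_second = bet(ratings[first]), bet(ratings[second])
    if result == "first":        # the winner takes the bet of the loser
        ratings[first] += bet_second
        ratings[second] -= bet_second
    elif result == "second":
        ratings[second] += bet_first
        ratings[first] -= bet_first
    elif result == "draw":       # the teams exchange bets
        ratings[first] += bet_second - bet_first
        ratings[second] += bet_first - bet_second
    else:
        raise ValueError("result must be 'first', 'second' or 'draw'")


ratings = {"A": 100, "B": 100}
play(ratings, "A", "B", "first")   # round 1: A wins and takes B's 10 points
after_round_1 = dict(ratings)
play(ratings, "A", "B", "draw")    # round 2: A stakes 11, B stakes 9, bets are exchanged
after_round_2 = dict(ratings)
```

A draw moves two points from the stronger side to the weaker one, and the total number of points in the championship never changes.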

If such a classification were used, then according to the popular (proportional) rating, “Lokomotiv” would have become the winner of the football national championship in 2003. This reflects very good results of this team in the last rounds. The team was trying to prove to themselves and to their fans that they deserve more than just a formal fourth place, which had been determined in accordance with the gained points.

Classifications of this kind are made with an attempt to “improve”, to “master” the formula by A. Elo. As a result they resemble a kind of hut made of patches. Everybody wants to repair. And who is going to build?

RATING AS A RESULT OF A PARTICIPANT OF A HYPOTHETICAL GLOBAL MACROTOURNAMENT

The result of the participant of the hypothetical global chaotic macrotournament is defined through the explicit solution of a system of linear equations (hereinafter referred to as SLE), in which the participant is compensated for all the factors that create unequal conditions. The suggestion made by A. Elo in 1963 in the “Chess Life” magazine is a way to solve the system of linear equations by the method of successive approximations, or recalculations. Researchers of ratings tend to forget that, by writing out the equations for the participants successively, they are using a system of linear equations, which may or may not have a solution.

A. Sukhov, the creator of the rating classification in table tennis in the Russian Federation, used graph theory instead of SLE. Joint studies revealed differences of no more than 3-5% between the solutions in similar situations. Because SLE was used, a linear solution was found for a typically non-linear problem.

Here is an example of SLE constructed by E.L. Potyomkin. In this example the author avoided the need to define the type of functional dependency by identifying the ratings with the probability of winning a head-to-head game. The connection between the pair ratings and the numbers of wins and defeats of each opponent is defined as

Rij / Rji = Wij/Wji, (7.3)

where Wij stands for the wins of the i-th opponent over the j-th one. The absolute value of the pair ratings Rij and Rji has not been determined and is irrelevant yet.
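Relation (7.3) fixes only the ratio of the two pair ratings. As a hedged illustration, the sketch below adds one extra convention that is not in the source: the two pair ratings are scaled to sum to 1, so that each can be read as a share of the decisive games. The function name and the sample score are hypothetical.

```python
def pair_ratings(wins_ij, wins_ji):
    """Return (Rij, Rji) satisfying Rij / Rji = Wij / Wji.

    Assumption: the pair is normalised so that Rij + Rji = 1;
    relation (7.3) itself leaves the absolute scale open.
    """
    total = wins_ij + wins_ji
    if total == 0:
        raise ValueError("no decisive games between the two opponents")
    return wins_ij / total, wins_ji / total


# Hypothetical head-to-head record: opponent i has 6 wins, opponent j has 2.
r_ij, r_ji = pair_ratings(6, 2)
```

Any other scale would serve equally well, since only the ratio of the two values carries information at this stage.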

The described approaches represent an attempt to reduce the problem to the linear model. A large number of options for the compilation of a linear system of equations did not lead to filling the concept of rating with specific physical meaning.

REAL GLOBAL MACROTOURNAMENT

All kinds of sports evolve towards the international championship. However, the formula needed for it does not exist yet. An all-play-all round-robin global macrotournament is impossible, because there are too many participants. Therefore a ranking model is needed which would reproduce the playing level (ranking) of the participants on the basis of a part of the results of the macrotournament. From the correlation of these data, the results of all matches, both played and unplayed, could be inferred. The difference between the obtained rankings of two participants corresponds to the result of their head-to-head game. The Swiss system is a prototype of such a macrotournament.
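One possible reading of “the difference of the rankings corresponds to the result of the head-to-head game” is a linear system with one equation per played match. The sketch below solves such a system by successive approximations and then reads off an unplayed result. It is an illustration under stated assumptions, not the authors’ model: the result of a match is taken to be its score margin, the average rating is pinned at zero, and all names and numbers are hypothetical.

```python
def solve_ratings(matches, sweeps=200):
    """Find ratings r with r[i] - r[j] close to the margin of every played match.

    matches: list of (team_i, team_j, margin), meaning team_i beat team_j by margin.
    Each sweep replaces a team's rating by the average of the values implied by
    its matches (in-place successive approximations), then re-centres the mean at 0.
    """
    teams = sorted({team for i, j, _ in matches for team in (i, j)})
    ratings = {team: 0.0 for team in teams}
    for _ in range(sweeps):
        for team in teams:
            implied = []
            for i, j, margin in matches:
                if i == team:
                    implied.append(ratings[j] + margin)
                elif j == team:
                    implied.append(ratings[i] - margin)
            ratings[team] = sum(implied) / len(implied)
        mean = sum(ratings.values()) / len(ratings)
        for team in teams:
            ratings[team] -= mean
    return ratings


# Hypothetical part of a macrotournament: A beat B by 1, B beat C by 1,
# and the match A - C was never played.
played = [("A", "B", 1), ("B", "C", 1)]
r = solve_ratings(played)
predicted_a_vs_c = r["A"] - r["C"]   # expected margin of the unplayed match
```

The prediction for the unplayed match is only meaningful while the teams are linked through common opponents; two groups with no matches between them would get ratings that cannot be compared.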

The real global macrotournament will “take place” only provided there is a provision for convergence of expected and actual results. If the difference in ratings makes it clear that you will beat the opponent by a score of 2:1 and you really beat the opponent by this score, then there arises a question – what was the use of playing? The convergence allows you to not play some games of the macrotournament and due to that make it real.

The rating formula suggested in (Polozov A.A., 1996) (described below) is similar to the Swiss system. But according to this classification, at the next round of the tournament the matches are held not just between the participants that have most similar forces. Here the pair matches of all participants of two microtournaments take place, these microtournaments having been isolated before. Here the ratings of all players can be calculated in the team sports.

7.4. STRUCTURAL CONTRADICTIONS OF MODERN CLASSIFICATIONS

Modern classifications of ratings are quite diverse. Sports differ from other spheres where ratings are applied because the results here are more transparent and that holds out a hope of making a clearer decision compared to areas where it will be hardly possible to get rid of the expert features for a long time to come.

DISCUSSION ON THE STRUCTURAL FEATURES OF MODERN CLASSIFICATIONS

Besides the definition of the concept of rating, the major differences between the existing rating classifications are centred on the following questions and answers:

Table 7.2. Basic contradictions in the classifications and the expected answers to them

Basic contradiction | Possible answer
What should be taken as the information system of the ranking? | The balance between scored and conceded goals is suggested, etc.
What is the scope of rating scale operation: every point gained, game, set, match, tournament…? | It is suggested that it should be each gained point.
What period of the competition should be assessed with a rating: a month, a year, or a decade? | It is suggested that it should be one year.
What properties should the function selected for calculating the rating possess? | It is suggested that it should be anticommutativity.
Should the distribution necessary for calculating the rating be specified as a function or as a table of values? | It is suggested that it should be a function.
Is the distribution of the function for the probability of winning “normal”? | Yes.
What is the minimum number of games to be played by the participant to obtain a ranking? | Enough for the rating determination error not to exceed the required level.
Are the increase in the ranking of one opponent and the decrease in the ranking of the other equal? | Yes.
Is it possible to calculate the ratings of all participants of two isolated events? | No.
Should the calculation take into account the results of matches between opponents of different strength? | No.
What is the original average value of rankings in different classifications? | Such that the rating of the weakest participant is above zero.
Should the average rating of all participants be adjusted, or should it always be permanent? | It changes because of the development of sport.
Is the transitivity principle applicable to rating in sports? | Yes, if this principle is generalised for all the results for the year.
How can a unique distribution of ratings be ensured? | It is supposed to be done by solving a system of linear equations.
Are changes in the rating defined after each match, or after the competition as a whole? | They are defined on-line after each serve.
Is it necessary to consider in the rating those factors that create unequal conditions? | Yes. It is necessary to make conditions equal for the participants.
Is the value of the rating expressed by an exact number, or are we talking about a certain range of the rating scale? | By an exact number.
Do the results of official participants of the competition correspond to the position of participants in the ranking? | It depends on the correctness of the ranking itself.
Should the rules be changed in order to achieve more reliable ranking results? | It will have to be done.
Do isolated microtournaments create any distortions? | Yes.
What kind of paradoxes arise while calculating the rating? | Those that use a gradual recalculation of ratings.
Can the age factor be compensated for? | It is possible, but not feasible.
Should the participant be punished by lowering the rating for skipping another competition? | No.
Should there be a connection between categories, titles and rankings? | Rating is a more accurate assessment.
If we assume that rating is the product of parameters of all aspects of the competition, which of these aspects are included in the product most often? | The difference between scored and conceded goals.
What should the author of the rating provide the public with in the first place: calculation formulas or calculation principles? | Principles.
What is the connection between the ratings of participants and the result of their head-to-head game? | It is on this basis that the formula is selected.

PARADOXES IN CALCULATING THE RATING

1. Unreasonable simplification of the rules for counting the expected result. Here is an example from chess. Instead of summing the expected results over all games, an average rating of the opponents is calculated, and all games are treated as if played against such an “average” opponent. This violates the “conservation” law: the sums of ratings before and after the tournament are not equal (even apart from rounding). Imagine a hypothetical tournament of three athletes, two of whom have equal ratings, whereas the rating of the third is much lower. After such a tournament the total of the ratings will decrease by an amount close to 5 points (if the rating of the third participant decreases).

2. With a large number of games the rating can change without limit. Let two players rated 2400 play a match consisting of a large number of games, the first player gaining 1.5 points in every two games. Then, if the match is calculated as a whole, his rating increases by 5 points for every two games and reaches 3000 points after 240 games. A fair conclusion suggests itself: the rating should be recalculated after each game (this applies not only to matches, but also to tournaments). Then in this example the stronger player’s rating will settle at about 2500, and the weaker player’s rating at about 2300.

Of course, hardly anyone will calculate the ratings after every single game, and the order of the games cannot always be established. Therefore a simple way out can be found: the match (tournament) is still calculated as a whole, but in n runs rather than in a single run, with the number 10 in the formula for the rating change, R = 10 * (P-E), replaced by 10/n (n is the maximum number of games played by one person in the tournament or match). The programme written by V. Shulyupov calculates the ratings to an accuracy of 0.1 (Stepanchuk, 2004). However, if the number of games in the match or tournament does not exceed 20-25 (and it almost never does), no misunderstandings arise (Gik, 1976).

3. Anecdotal results. In the calculation of ratings for the teams of the 1982 World Cup, Italy shared a preliminary round group with Cameroon. During the preliminary round the Italian team drew matches 0:0 and 1:1, including the head-to-head game with the national team of Cameroon, which ended 1:1. Cameroon’s two other matches in the round both ended 0:0. For the calculation, such scores are naturally equivalent to the Cameroon team not having played against anyone except the Italian team. And since that game ended 1:1, the Cameroon team was doomed to have the same rating as the Italian team. As we know, the Italian team became champions that year. So, from the point of view of the macrotournament, first place in the calculations for 1982 was shared by the teams of Italy and Cameroon (Polozov, 2004).

4. Doubling the parameters for the gaming activities. If you consider the difference between scored and conceded goals, and also the gained points, then the dependency of the rating becomes non-linear due to high degree of correlation between these parameters. This can lead to instability of the final outcomes. In the end, changing the factors inevitably turns into a permanent process. Such informal classifications can be considered as a temporary compensation for the lack of the official ones (Polozov, 2000).
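Paradox 1 above can be checked numerically. The sketch compares the game-by-game expectation with the “average opponent” shortcut for a three-player tournament. It assumes the logistic Elo curve and a factor of 10; the ratings and scores are hypothetical, and the function names are illustrative.

```python
def expected(rating, opponent):
    """Assumed logistic Elo curve on a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def total_change_per_game(ratings, scores, k=10):
    """Sum of all rating changes when the expectation is summed game by game."""
    total = 0.0
    for i, rating in enumerate(ratings):
        exp = sum(expected(rating, opp) for j, opp in enumerate(ratings) if j != i)
        total += k * (scores[i] - exp)
    return total


def total_change_average_opponent(ratings, scores, k=10):
    """Same sum when every game is treated as played against the average opponent."""
    total = 0.0
    for i, rating in enumerate(ratings):
        others = [opp for j, opp in enumerate(ratings) if j != i]
        average = sum(others) / len(others)
        total += k * (scores[i] - len(others) * expected(rating, average))
    return total


ratings = [2400, 2400, 2000]   # two equal players and a much weaker third
scores = [1.5, 1.5, 0.0]       # hypothetical: the leaders draw and both beat the third
exact = total_change_per_game(ratings, scores)
simplified = total_change_average_opponent(ratings, scores)
```

With game-by-game expectations the changes cancel out, while the shortcut makes the total of the ratings shrink. The size of the loss depends on the rating gap and on the curve used, so this sketch need not reproduce the figure of about 5 points quoted in the text.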

All the paradoxes of the ratings’ recalculation may be connected with the absurdity of the situation itself, or with the inadequate use of fixed factors in the formula and with the fact of going beyond the respective system of linear equations. It is the arbitrariness in the arrangement and the recalculation that results in the loss of the correct decision.
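Paradox 2 and the n-run remedy can be reproduced in a short sketch. It assumes the logistic Elo curve, a factor of 10, and a symmetric change for the second player; the function names are illustrative and the numbers follow the example of two players rated 2400 and a 240-game match.

```python
def expected(rating, opponent):
    """Assumed logistic Elo curve on a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def whole_match(r1, r2, games, points, k=10):
    """Single run: R = k * (P - E), with E taken from the initial ratings."""
    change = k * (points - games * expected(r1, r2))
    return r1 + change, r2 - change


def in_n_runs(r1, r2, games, points, k=10):
    """n runs with the factor k / n, where n is the number of games.

    The expectation E is refreshed from the current ratings before every run.
    """
    n = games
    for _ in range(n):
        change = (k / n) * (points - games * expected(r1, r2))
        r1, r2 = r1 + change, r2 - change
    return r1, r2


# The first player scores 1.5 points in every two games: 180 points out of 240.
single = whole_match(2400, 2400, games=240, points=180)
stepwise = in_n_runs(2400, 2400, games=240, points=180)
```

The single run sends the first player to 3000, while the stepwise calculation stops growing once the rating gap explains a 75% score, which under the assumed curve happens a little below the 2500 and a little above the 2300 mentioned in the text.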