Using world ranking to predict the results of the 2019 Rugby World Cup pool stages

This year (2019) is a Rugby World Cup year. I like data visualisation, and I like rugby. So here's my primitive attempt to calculate the results of the pool stages of the 2019 World Cup. My simple heuristic is that world ranking (as of Aug 22nd 2019) is a predictor of a team's success. In effect, my simple algorithm states that a team with a higher ranking will always beat a team with a lower ranking.

Just here for the predictions? Skip to the end.

Note: I'm not trying to predict who will win a match, just who is expected to win. When the inevitable "upsets" occur (like Japan vs. South Africa in 2015), I want to be able to say "wow! They only had a {N}% chance of winning, but they did it!"

So let's translate that calculation into code:

const chanceOfWinning = (teamRank, oppositionRank) => {
    const combinedRanks = teamRank + oppositionRank;
    // Invert the ranking, because a rank of 1 is the best
    const invertedRanking = combinedRanks - teamRank;
    const percentage = (invertedRanking / combinedRanks) * 100;
    return percentage.toFixed(1); // Round to 1 decimal place (returns a string)
};

chanceOfWinning(2, 13); // "86.7"
// NZL are ranked `2`, and ITA are ranked `13`
// Therefore NZL have an 86.7% chance of beating ITA

With this primitive algorithm, I can produce a "likelihood of winning" percentage for any pairing of teams. And on first inspection it looks pretty good (based on my own subjective opinion of who should win a given match). New Zealand are currently ranked #2 in the world, and should be expected to crush a #13 side like Italy. Samoa and Russia (#16 and #20 respectively) should be a much closer match, but you'd expect Samoa to emerge victorious.

#2 NZL 86.7% - 13.3% ITA #13
#16 SAM 55.6% - 44.4% RUS #20

There are problems with using rank

But this method starts to look a bit shaky when we include the #1 ranked team in the world (a crown recently claimed by Wales at the time of writing). Not because Wales are particularly special, but because this algorithm massively favours the teams with the best (numerically lowest) rankings. I'd expect Wales to crush Uruguay (#19 in the world), but would not expect them to have such an easy time against Australia (ranked #6). The ranking-based algorithm predicts both matches would be walkovers:

#1 WAL 95% - 5% URA #19
#1 WAL 85.7% - 14.3% AUS #6
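
To make the bias towards the #1 side concrete, here is a minimal sketch that restates the rank-based calculation and applies it to the Wales fixtures. The name `rankChance` is my own, and it returns a number rather than a formatted string, purely for convenience in this example.

```javascript
// Rank-based chance of winning, as described above.
// Hypothetical helper: returns a number instead of a formatted string.
const rankChance = (teamRank, oppositionRank) =>
    (oppositionRank / (teamRank + oppositionRank)) * 100;

// Ranks taken from the article: WAL #1, AUS #6, URA #19
console.log(rankChance(1, 19).toFixed(1)); // "95.0"
console.log(rankChance(1, 6).toFixed(1));  // "85.7"

// Against the #1 side, an opponent's chance is 1 / (1 + its own rank),
// so even the #2 side would only be given a 33.3% chance.
console.log(rankChance(2, 1).toFixed(1)); // "33.3"
```

The last line shows the problem most clearly: the two best teams in the world would be treated as a 2-to-1 mismatch.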

And there's another problem with using world rankings. Rankings, by their very nature, are ordinal. By ranking alone, the difference between #1 and #2 is the same as the difference between #2 and #3, and so on... Whereas in reality, some teams are much closer than their mere ranking would suggest.

Using points rather than ranking

A better metric to use as the base for our calculation would be points. World Rugby, the sport's governing body, uses a points system to determine the world rankings. These points are based on match performance, and range from zero to one hundred (the top side generally has a rating of somewhere near 90 points). In late August 2019, Wales have 89.43 points and New Zealand have 89.40 - it's tight at the top! Australia are on 84.05 and Uruguay have 65.18 points.

Using points rather than rank changes our algorithm slightly (we no longer need to invert the team's value, as higher points are better).

const chanceOfWinning = (teamPoints, oppositionPoints) => {
    const combinedPoints = teamPoints + oppositionPoints;
    const percentage = (teamPoints / combinedPoints) * 100;
    return percentage.toFixed(1);
};

Plugging our examples into this calculation produces a much tighter set of matches. The end results are still the same (in this system, a team with higher points will always beat a team with lower points, in just the same way as the team with the better ranking always wins).

#2 NZL 55.4% - 44.6% ITA #13
#16 SAM 51.6% - 48.4% RUS #20
#1 WAL 57.8% - 42.2% URA #19
#1 WAL 51.6% - 48.4% AUS #6
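
The two Wales figures can be reproduced from the points quoted earlier. This is only a sketch: `pointsChance` is a hypothetical name, it returns a number rather than a string, and it uses just the four point values given in this article.

```javascript
// Points-based chance of winning: higher points are better, so no inversion.
// Hypothetical helper returning a number rather than a formatted string.
const pointsChance = (teamPoints, oppositionPoints) =>
    (teamPoints / (teamPoints + oppositionPoints)) * 100;

// World Rugby points quoted in the article (late August 2019)
const points = { WAL: 89.43, NZL: 89.40, AUS: 84.05, URA: 65.18 };

console.log(pointsChance(points.WAL, points.URA).toFixed(1)); // "57.8"
console.log(pointsChance(points.WAL, points.AUS).toFixed(1)); // "51.6"

// The top two sides are separated by 0.03 points, so this is a coin toss.
console.log(pointsChance(points.WAL, points.NZL).toFixed(1)); // "50.0"
```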

These results look a little better than the ranking-only method. The delta between WAL/URA and WAL/AUS looks more realistic, and whoever is in the #1 spot has less of an unfair advantage. But now the amounts look wrong. Any theory that gives Italy a 44.6% chance of beating New Zealand must be inaccurate.

Increasing the weighting

The points-based system is a better reflection of the team's relative chance of winning, but to my eyes the results aren't extreme enough. It gives too much credit to the lower-tier teams, and not enough to the top-tier ones. For the calculation to better match my expectations, it needs to favour the teams at the top of the rankings. Not only that, but it needs to do it progressively - so a team in the middle gets a bit of a boost, but not as much as those at the top get.

I need to write a function that will adjust the points value of each team. The easiest way to get the result I'm after is to raise each team's points to a power.

const adjustment = num => Math.pow(num, 5);

[Figure: Linear (n) vs. power (n^5) rankings. Series: actual rankings; adjusted rankings (normalized).]
[Figure: Simple linear (n) vs. power (n^5). Series: linear (n); power (n^5) (normalized).]

const chanceOfWinning = (teamPoints, oppositionPoints) => {
    // Adjust the team's points once and reuse the result
    const teamAdjusted = adjustment(teamPoints);
    const combinedPoints = teamAdjusted + adjustment(oppositionPoints);
    const percentage = (teamAdjusted / combinedPoints) * 100;
    return percentage.toFixed(1);
};

I started with 2 as the exponent, and that was better than nothing, but still not enough. 10 was too extreme, and in the end I settled on 5. Raising each team's points to the power of 5 gave me a set of probabilities that looked about right. That formula added just enough of a notch in the middle of the graph, thereby increasing the likelihood of a top-tier team beating a lower-tier one.
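
For anyone who wants to repeat that trial and error, here is a rough sketch that makes the exponent a parameter. The helper name `chanceWithExponent` is my own invention for this example, and the only inputs are the points for Wales, Australia and Uruguay quoted earlier.

```javascript
// Chance of winning with a configurable exponent.
// Hypothetical helper for experimenting; returns a number, not a string.
const chanceWithExponent = (teamPoints, oppositionPoints, exponent) => {
    const team = Math.pow(teamPoints, exponent);
    const opposition = Math.pow(oppositionPoints, exponent);
    return (team / (team + opposition)) * 100;
};

// World Rugby points quoted in the article (late August 2019)
const WAL = 89.43;
const AUS = 84.05;
const URA = 65.18;

// Exponent 1 is the plain points-based calculation; 2, 5 and 10 are the values tried above
for (const exponent of [1, 2, 5, 10]) {
    const vsUra = chanceWithExponent(WAL, URA, exponent).toFixed(1);
    const vsAus = chanceWithExponent(WAL, AUS, exponent).toFixed(1);
    console.log(`n^${exponent}: WAL v URA ${vsUra}%, WAL v AUS ${vsAus}%`);
}
```

With an exponent of 5 this gives 82.9% for Wales against Uruguay and 57.7% against Australia.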

#2 NZL 74.6% - 25.4% ITA #13
#16 SAM 57.9% - 42.1% RUS #20
#1 WAL 82.9% - 17.1% URA #19
#1 WAL 57.7% - 42.3% AUS #6

Results for all the pools

This is of course only based on my experience of rugby and my own highly subjective opinions. But it is still anchored in reality because I'm using the points as a starting point, and treating each team equally (as much as I want to give England a boost, the algorithm doesn't support it).

Ironically, this calculation shows that the draw for this world cup does give England a slight boost. The top 8 teams make it through to the quarter finals as you would expect. But when it comes to the semis, 4th ranked South Africa miss out, while 5th ranked England manage to sneak in. That's a side effect of the pools being drawn years before the event. On the other hand, it probably shows that the draw process works fairly well, given that all of the top 8 make it into the quarters (or at least shows that the rankings have been comparatively static).
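
The knockout path can be sketched in a few lines. This is an illustration only: it uses world rank as a stand-in for points order, takes the quarter-final pairings from the predictions listed further down, and assumes (based on the semi-finals listed there) that winners of adjacent ties meet in the next round. The names `winner` and `nextRound` are hypothetical.

```javascript
// Quarter-final pairings and ranks as listed in this article's predictions.
const quarterFinals = [
    [{ name: "ENG", rank: 5 }, { name: "AUS", rank: 6 }],
    [{ name: "NZL", rank: 2 }, { name: "SCO", rank: 8 }],
    [{ name: "WAL", rank: 1 }, { name: "FRA", rank: 7 }],
    [{ name: "IRE", rank: 3 }, { name: "RSA", rank: 4 }],
];

// Under this model the better-ranked (lower number) team always wins.
const winner = ([a, b]) => (a.rank < b.rank ? a : b);

// Assumption: winners of adjacent ties meet in the next round.
const nextRound = (ties) => {
    const winners = ties.map(winner);
    const next = [];
    for (let i = 0; i < winners.length; i += 2) {
        next.push([winners[i], winners[i + 1]]);
    }
    return next;
};

const semiFinals = nextRound(quarterFinals);
const final = nextRound(semiFinals);
const champion = winner(final[0]);

console.log(semiFinals.map((tie) => tie.map((t) => t.name).join(" v ")));
// ENG v NZL, WAL v IRE
console.log(champion.name); // WAL
```

South Africa (#4) go out to Ireland (#3) in the quarters, while England (#5) only need to beat Australia (#6), which is how the lower-ranked side ends up in the semis.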

I'm not expecting these predictions to come true - there's a lot more to success in rugby than simple rankings. But I do find this kind of objective analysis useful for setting expectations. Looking at these predictions, I'll make more of an effort to see matches I might otherwise have passed on. Tonga vs. USA, for instance, looks like it'll be a close one. As do Scotland vs. Japan and New Zealand vs. South Africa (although after this year's Championship you don't need an algorithm to tell you that'll be a real grudge match!).

Pool A matches

#9 JAP 70.59% - 29.41% RUS #20
#3 IRE 64.06% - 35.94% SCO #8
#16 SAM 57.91% - 42.09% RUS #20
#3 IRE 66.67% - 33.33% JAP #9
#8 SCO 66.18% - 33.82% SAM #16
#3 IRE 82.76% - 17.24% RUS #20
#9 JAP 63.56% - 36.44% SAM #16
#8 SCO 72.92% - 27.08% RUS #20
#3 IRE 77.72% - 22.28% SAM #16
#8 SCO 52.88% - 47.12% JAP #9

Pool A results

  1. Ireland: 4 wins
  2. Scotland: 3 wins
  3. Japan: 2 wins
  4. Samoa: 1 win
  5. Russia: 0 wins
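
Because the stronger team always wins under this model, a pool table like the one above follows directly from the ordering of the teams. Here is a minimal sketch for Pool A, assuming world rank as a stand-in for points order and counting wins only (no bonus points or draws). The function name `poolStandings` is hypothetical.

```javascript
// Pool A teams with their world rank (from the match list above).
const poolA = [
    { name: "Japan", rank: 9 },
    { name: "Russia", rank: 20 },
    { name: "Ireland", rank: 3 },
    { name: "Scotland", rank: 8 },
    { name: "Samoa", rank: 16 },
];

// Play every pairing once; the better-ranked (lower number) team takes the win.
const poolStandings = (teams) => {
    const wins = new Map(teams.map((team) => [team.name, 0]));
    for (let i = 0; i < teams.length; i++) {
        for (let j = i + 1; j < teams.length; j++) {
            const winner = teams[i].rank < teams[j].rank ? teams[i] : teams[j];
            wins.set(winner.name, wins.get(winner.name) + 1);
        }
    }
    // Sort by number of wins, most first
    return [...wins.entries()].sort((a, b) => b[1] - a[1]);
};

console.log(poolStandings(poolA));
// Ireland 4, Scotland 3, Japan 2, Samoa 1, Russia 0
```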

Pool B matches

#2 NZL 53.84% - 46.16% RSA #4
#13 ITA 69.65% - 30.35% NAM #23
#13 ITA 69.05% - 30.95% CAN #21
#4 RSA 85.28% - 14.72% NAM #23
#2 NZL 86.78% - 13.22% CAN #21
#4 RSA 71.62% - 28.38% ITA #13
#2 NZL 87.11% - 12.89% NAM #23
#4 RSA 84.91% - 15.09% CAN #21
#2 NZL 74.64% - 25.36% ITA #13
#21 CAN 50.71% - 49.29% NAM #23

Pool B results

  1. New Zealand: 4 wins
  2. South Africa: 3 wins
  3. Italy: 2 wins
  4. Canada: 1 win
  5. Namibia: 0 wins

Pool C matches

#7 FRA 56.8% - 43.2% ARG #11
#5 ENG 72.51% - 27.49% TON #15
#5 ENG 71.89% - 28.11% USA #14
#11 ARG 58.05% - 41.95% TON #15
#7 FRA 63.83% - 36.17% USA #14
#5 ENG 65.58% - 34.42% ARG #11
#7 FRA 64.53% - 35.47% TON #15
#11 ARG 57.3% - 42.7% USA #14
#5 ENG 59.18% - 40.82% FRA #7
#14 USA 50.77% - 49.23% TON #15

Pool C results

  1. England: 4 wins
  2. France: 3 wins
  3. Argentina: 2 wins
  4. United States: 1 win
  5. Tonga: 0 wins

Pool D matches

#6 AUS 60.81% - 39.19% FIJ #10
#1 WAL 71.48% - 28.52% GEO #12
#10 FIJ 69.68% - 30.32% URA #19
#12 GEO 65.99% - 34.01% URA #19
#1 WAL 57.69% - 42.31% AUS #6
#10 FIJ 54.22% - 45.78% GEO #12
#6 AUS 78.1% - 21.9% URA #19
#1 WAL 67.91% - 32.09% FIJ #10
#6 AUS 64.76% - 35.24% GEO #12
#1 WAL 82.94% - 17.06% URA #19

Pool D results

  1. Wales: 4 wins
  2. Australia: 3 wins
  3. Fiji: 2 wins
  4. Georgia: 1 win
  5. Uruguay: 0 wins

Quarter Finals

#5 ENG 54% - 46% AUS #6
#2 NZL 64.97% - 35.03% SCO #8
#1 WAL 62.74% - 37.26% FRA #7
#3 IRE 52.85% - 47.15% RSA #4

Semi Finals

#2 NZL 53.7% - 46.3% ENG #5
#1 WAL 51.04% - 48.96% IRE #3

3rd place

#3 IRE 52.7% - 47.3% ENG #5

Final

#1 WAL 50.04% - 49.96% NZL #2