world team championships 2015 part 4


Excitement is already building about WTC 2016! With the extra data released by the WTC team, there's a great opportunity to improve predictions for next year. Before doing that, I first need to assess how well I predicted this year's results. I was very proud when Italy Michelangelo beat Poland Grunts in round 1, but I had the eventual winners, Finland Blue, rated as one of the lowest teams.

First off, I ran out of time when making predictions previously. After generating rank distributions for each team, I simply used the maximum of each distribution to order the teams. This ignores the shape of the distribution profiles. A better estimate of the typical value of a rank distribution is its first moment, the weighted mean place, which summarizes the whole of the simulation data rather than a single peak.


R Code for Summarizing Rank Profile Moment
moment <- function(x) {
    # first moment of a rank profile: places weighted by the profile values
    mean(x * seq_along(x))
}

# moment for each team's rank profile (rows of res)
mmax <- apply(X = res, MARGIN = 1L, FUN = moment)
# order teams by moment, best (lowest) first
reso <- res[order(mmax), ]
mmax <- mmax[order(mmax)]

mmax[1:4]
#
#           England Lions                USA Stars 
#                 14.7988                  14.8010 
#      Italy Michelangelo         Australia Wombat 
#                 15.3058                  15.4640 


Rank Team Score.Moment
1 England Lions 14.80
2 USA Stars 14.80
3 Italy Michelangelo 15.31
4 Australia Wombat 15.46
5 Sweden Nobel 19.94
6 Poland Grunts 21.25
7 Poland Leaders 28.15
8 Scotland Irn 28.97
9 Sweden Dynamite 31.17
10 Germany Dichter & Denker 33.49
11 Ireland Craic 33.54
12 Australia Platypus 35.61
13 Germany Bier & Brezel 38.68
14 England Roses 38.73
15 Belgium Blonde 39.92
16 Netherlands Lion 44.30
17 Canada Goose 45.94
18 Finland Blue 48.45
19 Canada Moose 49.32
20 Denmark Red 50.20
21 Finland White 52.47
22 USA Stripes 52.93
23 China 52.97
24 Belgium Brown 53.75
25 Denmark White 54.19
26 Ireland Ceol 54.46
27 Greece Prime 55.21
28 Greece Epic 55.49
29 Spain North 55.52
30 Middle East 55.87
31 Spain South 56.22
32 United Nations 56.77
33 Wales Storm 57.01
34 Wales Fire 57.15
35 Italy Leonardo 61.20
36 Ukraine 62.63
37 Scotland Bru 63.44
38 France Asterix 64.29
39 Russia Wolves 64.58
40 Russia Bears 66.00
41 France Obelix 66.84
42 Northern Ireland 1 69.35
43 Netherlands Hero 71.47
44 Portugal Prime 72.55
45 Norway Blue 72.58
46 Northern Ireland 2 73.64
47 Norway Red 73.82
48 Czech Republic 78.65
49 Switzerland Red 80.90
50 Portugal KULT 89.98


Next up is to create a statistic that summarizes the goodness of a rank prediction. I created a function to score a guessed sequence: the sum of the absolute distances between the predicted and actual positions, divided by the number of guesses, i.e. the mean displacement of each guess from the actual result. Guessing exactly right scores 0. Guessing the first three teams in reverse order scores (2+0+2)/3 = 1.33. I also added partial matching to allow guesses just by country (for example, if the team names change), in which case the average difference across that country's teams is returned.

> library(WTCTools)
> scoreSequence(guess = letters[1:3], result = letters)
[1] 0
attr(,"n")
[1] 3
> scoreSequence(guess = c("c", "b", "a"), result = letters)
[1] 1.333333
attr(,"n")
[1] 3
> scoreSequence(guess = "Australia", result = leaderboard15$Team)
[1] 2
attr(,"n")
[1] 1
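The core of the scoring logic can be sketched in a few lines (a simplified stand-in for `scoreSequence` from WTCTools, ignoring the partial country matching and the "n" attribute):

```r
# mean absolute displacement of each guessed position from the actual result
score_sequence <- function(guess, result) {
    pos <- match(guess, result)       # where each guess actually finished
    mean(abs(pos - seq_along(guess))) # average distance from the guessed place
}

score_sequence(letters[1:3], letters)     # 0
score_sequence(c("c", "b", "a"), letters) # (2 + 0 + 2) / 3 = 1.33
```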

The probability of getting a certain score given a number of guesses can be estimated by simulation.


R Code for Estimating Probability Density
# lead: the 2015 leaderboard; ind: a matrix of orderings used to
# shuffle each guess (both defined earlier)
NN <- 2e3
scor <- seq(from = 0, to = 50, by = 0.1)
rres <- matrix(0, nrow = NN, ncol = 50)
dens <- matrix(0, nrow = length(scor), ncol = 50)
for (j in 50:1) {
    for (i in seq_len(NN)) {
        # score a random guess of j teams, unguessed slots left blank
        gg <- c(sample(lead$Team, size = j), rep("", times = 50 - j))
        rres[i, j] <- scoreSequence(guess = gg[order(ind[i, ])], 
            result = lead$Team)
    }
    # sort the scores for this number of guesses
    rres[, j] <- sort(rres[, j])
    # cumulative probability:
    # percentage of scores below each score threshold
    for (qu in seq_along(scor)) {
        dens[qu, j] <- 100 * sum(rres[, j] < scor[qu]) / NN
    }
}


These probabilities can be visualized as a series of traces. When guessing just one team's rank the score could take almost any value, with a 50% chance of being in the range 7-24. As the number of guesses increases, the score concentrates around a middle value. So when making 5 guesses, there is a 50% chance of being in the range 13-20. When making 25 guesses, there is a 50% chance of being in the range 15-18.
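These 50% ranges can be reproduced with a small self-contained simulation (my own sketch, not the blog's rres matrix): a random guess of j teams amounts to pairing j randomly chosen guessed slots with j randomly chosen actual places.

```r
set.seed(42)
# simulate the score of a random guess of j teams out of n_teams
sim_scores <- function(j, n_teams = 50, n_sim = 2000) {
    replicate(n_sim, {
        guessed <- sample(n_teams, size = j) # slots the guess lands in
        actual  <- sample(n_teams, size = j) # true places of those teams
        mean(abs(guessed - actual))          # score: mean displacement
    })
}
quantile(sim_scores(1), probs = c(0.25, 0.75))  # wide 50% range
quantile(sim_scores(25), probs = c(0.25, 0.75)) # much narrower
```

With 2,000 draws the one-guess quartiles should land close to the 7-24 range quoted above, and the 25-guess range is far tighter.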

[Figure: cumulative probability of random-guess scores, one trace per number of guesses]


R Code for Plotting Cumulative Probability
library(colorspace)
cols <- diverge_hsv(n = 50)
matplot(x = scor, y = dens, 
    xlab = "Score", ylab = "Cumulative Probability (%)", 
    col = cols, type = "l", lty = 1)
legend(x = "bottomright", legend = c(1, 10, 40, 50), 
    col = cols[c(1, 10, 40, 50)], lty = 1)


The benchmark for place prediction is using the 2014 ranking to order the 2015 teams. Of the 52 teams at WTC 2014, 17 returned with the same name, and 43 returned representing the same nation. Guessing this year's result from last year's performance gives a score between 9.8 and 10, depending on which two absent teams are excluded. My score for 2015, based on my posted prediction, is 11.3: a poor showing! Using the moment rather than the maximum of the outcome distribution gives me a score of 9.8, as good as the best prediction based on past performance (and over 49 teams, rather than 42). This is a relief, but there is clearly room for improvement!

So why did my analysis rate Finland Blue so poorly? It was based solely on the 2014 results. Of the three Finland Blue players who played in 2014, Jaakko had a good record (5-1), but Tatu and Mikko had average records (3-3). With the two unrated players, Henry and Pauli, also assumed to be average, this put the Finnish team right in the middle of the pack (18/50). However, all three veteran players had good records in 2013; including this data shows the team was stronger than their 2014 performance suggested. They still would not have been the favourites, although their caster selection looks strong. With three years' WTC data, I believe my predictions for 2016 will score better!
