Excitement is already building about WTC 2016! With the extra data released by the WTC team, there’s a great opportunity to improve next year’s predictions. Before doing that, I first need to assess how well I predicted this year’s results. I was very proud when Italy Michelangelo beat Poland Grunts in round 1, but I had the eventual winners, Finland Blue, rated as one of the weakest teams.

First off, I ran out of time when making my previous predictions. After generating rank distributions for each team, I simply used the maximum of each distribution to set the team order. This ignores the shape of the distribution profiles. A better estimate of the typical value of a rank distribution is its first moment (the mean rank), which gives a better summary of the simulation data.

## R Code for Summarizing Rank Profile Moment

```r
# first moment of a rank profile: weighted average of the rank positions
moment <- function(x) { mean(x * seq_along(x)) }
# one moment per team (rows of the simulated rank profile matrix 'res')
mmax <- apply(X = res, MARGIN = 1L, FUN = moment)
reso <- res[order(mmax), ]
mmax <- mmax[order(mmax)]
mmax[1:4]
#      England Lions          USA Stars
#            14.7988            14.8010
# Italy Michelangelo   Australia Wombat
#            15.3058            15.4640
```

Rank | Team | Score.Moment |
---|---|---|
1 | England Lions | 14.80 |
2 | USA Stars | 14.80 |
3 | Italy Michelangelo | 15.31 |
4 | Australia Wombat | 15.46 |
5 | Sweden Nobel | 19.94 |
6 | Poland Grunts | 21.25 |
7 | Poland Leaders | 28.15 |
8 | Scotland Irn | 28.97 |
9 | Sweden Dynamite | 31.17 |
10 | Germany Dichter & Denker | 33.49 |
11 | Ireland Craic | 33.54 |
12 | Australia Platypus | 35.61 |
13 | Germany Bier & Brezel | 38.68 |
14 | England Roses | 38.73 |
15 | Belgium Blonde | 39.92 |
16 | Netherlands Lion | 44.30 |
17 | Canada Goose | 45.94 |
18 | Finland Blue | 48.45 |
19 | Canada Moose | 49.32 |
20 | Denmark Red | 50.20 |
21 | Finland White | 52.47 |
22 | USA Stripes | 52.93 |
23 | China | 52.97 |
24 | Belgium Brown | 53.75 |
25 | Denmark White | 54.19 |
26 | Ireland Ceol | 54.46 |
27 | Greece Prime | 55.21 |
28 | Greece Epic | 55.49 |
29 | Spain North | 55.52 |
30 | Middle East | 55.87 |
31 | Spain South | 56.22 |
32 | United Nations | 56.77 |
33 | Wales Storm | 57.01 |
34 | Wales Fire | 57.15 |
35 | Italy Leonardo | 61.20 |
36 | Ukraine | 62.63 |
37 | Scotland Bru | 63.44 |
38 | France Asterix | 64.29 |
39 | Russia Wolves | 64.58 |
40 | Russia Bears | 66.00 |
41 | France Obelix | 66.84 |
42 | Northern Ireland 1 | 69.35 |
43 | Netherlands Hero | 71.47 |
44 | Portugal Prime | 72.55 |
45 | Norway Blue | 72.58 |
46 | Northern Ireland 2 | 73.64 |
47 | Norway Red | 73.82 |
48 | Czech Republic | 78.65 |
49 | Switzerland Red | 80.90 |
50 | Portugal KULT | 89.98 |

Next up is a statistic that summarizes how good a rank prediction is. I created a function to **score a guessed sequence**: the mean distance between each guessed position and the actual result, i.e. the sum of the absolute position differences divided by the number of guesses. Guessing exactly right scores 0. Guessing the first three teams in reverse order scores (2+0+2)/3 = 1.33. I also added partial matching so that guesses can be made for countries alone (for example, if the team names change), in which case the average difference for the matching teams is returned.

```r
> library(WTCTools)
> scoreSequence(guess = letters[1:3], result = letters)
[1] 0
attr(,"n")
[1] 3
> scoreSequence(guess = c("c", "b", "a"), result = letters)
[1] 1.333333
attr(,"n")
[1] 3
> scoreSequence(guess = "Australia", result = leaderboard15$Team)
[1] 2
attr(,"n")
[1] 1
```
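The scoring rule described above can be sketched in a few lines of base R. This is a simplified illustration, not the WTCTools implementation; the real `scoreSequence` also handles partial country matching and returns the `"n"` attribute.

```r
# Simplified sketch of the scoring rule (not the WTCTools code):
# mean absolute distance between guessed and actual positions.
score_sketch <- function(guess, result) {
    actual <- match(guess, result)  # where each guessed team really finished
    mean(abs(seq_along(guess) - actual))
}

score_sketch(guess = letters[1:3], result = letters)      # 0
score_sketch(guess = c("c", "b", "a"), result = letters)  # (2+0+2)/3 = 1.33
```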

The probability of getting a certain score given a number of guesses can be estimated by simulation.

## R Code for Estimating Probability Density

```r
# estimate the score distribution for every number of guesses (1 to 50);
# 'lead' is the leaderboard and 'ind' a pre-generated matrix of random orderings
NN <- 2e3
scor <- seq(from = 0, to = 50, by = 0.1)
rres <- matrix(0, nrow = NN, ncol = 50)
dens <- matrix(0, nrow = length(scor), ncol = 50)
for (j in 50:1) {
    for (i in seq_len(NN)) {
        # score a random guess of j teams
        gg <- c(sample(lead$Team, size = j), rep("", times = 50 - j))
        rres[i, j] <- scoreSequence(guess = gg[order(ind[i, ])],
            result = lead$Team)
    }
    # sequence of scores
    rres[, j] <- sort(rres[, j])
    # cumulative probability:
    # percentage of scores that are less than each score threshold
    for (qu in seq_along(scor)) {
        dens[qu, j] <- 100 * sum(rres[, j] < scor[qu]) / NN
    }
}
```

These probabilities can be visualized as a series of traces. When guessing just one team’s rank, the score could be almost any value between 1 and 39, with a 50% chance of landing in the range 7-24. As the number of guesses increases, the score is more likely to take a middle value. When making 5 guesses, there is a 50% chance of being in the range 13-20; when making 25 guesses, a 50% chance of being in the range 15-18.
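These central 50% intervals can be read straight off the `dens` matrix by finding where each trace crosses 25% and 75%. A sketch, assuming `scor` and `dens` from the simulation above:

```r
# Central 50% interval of scores for j guesses: the score values at which
# the cumulative probability trace first crosses 25% and 75%.
central50 <- function(j) {
    c(lower = scor[min(which(dens[, j] >= 25))],
      upper = scor[min(which(dens[, j] >= 75))])
}
central50(5)   # roughly 13-20 in the run above
central50(25)  # roughly 15-18
```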

## R Code for Plotting Cumulative Probability

```r
library(colorspace)
cols <- diverge_hsv(n = 50)
matplot(x = scor, y = dens,
    xlab = "Score", ylab = "Cumulative Probability (%)",
    col = cols, type = "l", lty = 1)
legend(x = "bottomright", legend = c(1, 10, 40, 50),
    col = cols[c(1, 10, 40, 50)], lty = 1)
```

The benchmark for place prediction is to use the 2014 ranking to order the 2015 teams. Of the 52 teams at WTC 2014, 17 returned with the same name, and 43 nations returned. Guessing this year’s result from last year’s performance gives a score between 9.8 and 10, depending on which two non-returning teams are excluded. My score for 2015, based on my posted prediction, is 11.3: a poor showing! Using the moment rather than the maximum of the outcome distribution gives me a score of 9.8, as good as the best prediction based on past performance (and for 49 teams, rather than 42). This is a relief, but there is clearly room for improvement!
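The benchmark score can be computed by feeding the 2014 finishing order into `scoreSequence`. A sketch, where `leaderboard14` is an assumed data frame of 2014 results with the same `Team` column as `leaderboard15`:

```r
# Benchmark: order the returning teams by their 2014 finish and score that
# ordering against the 2015 result ('leaderboard14' is hypothetical here).
returning <- leaderboard14$Team[leaderboard14$Team %in% leaderboard15$Team]
scoreSequence(guess = returning, result = leaderboard15$Team)
```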

So why did my analysis rate Finland Blue so poorly? It was based solely on the 2014 results. Of the Finland Blue players who played in 2014, Jaakko had a good record (5-1), while Tatu and Mikko had average records (3-3). With the two unrated players, Henry and Pauli, also assumed to be average, this put the Finnish team right in the middle of the pack (18/50). However, all three veterans also had good records in 2013; including that data shows the team was stronger than their 2014 performance suggested. They were still not the favourites, although their caster selection looks strong. With three years’ WTC data, I believe my 2016 predictions will score better!
