Google’s Cloud Platform has been predicting the winners of the World Cup since the round of 16, and so far it is 11/12 after going a perfect 8/8 in the round of 16. Google’s Cloud Platform combines Cloud Dataflow, BigQuery, and Compute Engine to arrive at its final results.
Their predictions have been almost perfect, the lone exception being the France v Germany game (I predicted Germany). Google gave France a 69% chance of winning that game, but Germany won 1-0, ending Google’s perfect record.
So why did Google get it wrong? They attribute the miss to a lack of data: without enough data points, they couldn’t accurately adjust each team’s win probability. Google’s predictions are based on an inductive model derived from game-play data, rather than a model based on changes in betting behavior or certain pre-existing models. The FiveThirtyEight blog has a running World Cup prediction table that uses a different model, based on ESPN’s SPI (Soccer Power Index), which has some flaws of its own, including not accounting for player injuries and suspensions. They have, however, posted a blog addressing Brazil’s player losses ahead of the Semi Final and Final.
Google says they got the France v Germany game wrong because, in the first four games of the World Cup, France simply took more shots and had more shots on target than Germany. Additionally, France’s shots came from locations closer to the goal than Germany’s, suggesting they were likely to create similar chances against Germany as well.
Google also noted that in their first four games, Germany allowed their opponents to take more dangerous shots, raising the probability that France would be able to score against them. Germany also allowed their opponents to pass more effectively in Germany’s defensive third of the field, which is a major factor in many games’ final scores. The statistics in the game itself continued to support Google’s pick, with the exception of the final score: France had 13 shots with 9 on target, while Germany had only 8 shots with 6 on target, indicating that France had plenty more chances and simply squandered them (or was merely unlucky).
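To make the reasoning above concrete, here is a minimal sketch of how shot-based statistics like these could feed a win-probability estimate. This is not Google’s actual model; the feature names, weights, and per-game numbers below are all hypothetical illustrations of a simple logistic model over feature differences between two teams.

```python
import math

def win_probability(features_a, features_b, weights):
    """Estimate the probability that team A beats team B from
    per-game summary statistics, via a simple logistic model
    over the differences between the two teams' features."""
    score = sum(w * (features_a[k] - features_b[k]) for k, w in weights.items())
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical weights -- illustrative only, not Google's model.
weights = {
    "shots_on_target": 0.25,          # more shots on target helps
    "avg_shot_distance_m": -0.05,     # shots from closer in (smaller distance) help
    "opp_final_third_passes": -0.01,  # conceding passes in your own third hurts
}

# Stylized per-game averages loosely echoing the article's narrative:
# France shot more, from closer range; Germany conceded more passing
# in their defensive third.
france = {"shots_on_target": 6.0, "avg_shot_distance_m": 14.0,
          "opp_final_third_passes": 50.0}
germany = {"shots_on_target": 4.5, "avg_shot_distance_m": 18.0,
           "opp_final_third_passes": 65.0}

p = win_probability(france, germany, weights)
print(f"P(France beats Germany) ~ {p:.2f}")
```

With these made-up inputs the model favors France, much as Google’s did, and by construction the probabilities for the two teams sum to one. Of course, a probability under 100% means the favorite still loses sometimes, which is exactly what happened.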
Even so, it’s interesting to see that even with all this data and all these great models, nothing can make up for good old human error (in this case, France’s inability to put the ball in the back of the net). Google also has some interesting predictions for the next round, the Semi Finals. They predict that Germany has a 59% chance of beating Brazil, and that does not take into account the loss of Neymar to a fractured vertebra. Brazil has also lost Thiago Silva to a suspension after collecting his second yellow card of the tournament in his last game, which Brazil is obviously appealing.
For Netherlands vs Argentina, Google gives Argentina a 61% chance of winning, but I believe that does not take into account the loss of Ángel Di María. Then again, as Nate Silver argued on the FiveThirtyEight blog mentioned above, very few players outside of Lionel Messi and Cristiano Ronaldo have anywhere near a significant enough impact on a team’s ultimate performance.
Overall, it will be interesting to see how accurate Google’s predictions ultimately are and whether they can improve on their record using their own cloud platform to do the work. Obviously, this is a promotional effort by Google to showcase how much data crunching their platform can handle and how it delivers clear, accurate results (even though their model is mostly responsible for that).