At the Olympics, the US is underwhelming, Russia still overperforms, and what's wrong with Southern Europe (except Italy)?
Russia is doing very well. The US and China, for all their dominance of the raw medal tables, are actually doing just as well as you'd expect.
Portugal, Spain, and Greece should all be upset with themselves, while the fourth little piggy, Italy, is doing quite all right.
What determines medal counts?
I decided to play a data game with Olympic Gold medals and ask not just "Which countries get the most medals?" but a couple of more interesting questions.
My first guess of what determines medal counts was total GDP. After all, large countries should get more medals, but economic development should also matter. Populous African countries do not get that many medals, after all, while small, rich EU states still do.
Indeed, GDP (at market value) does correlate quite well with the weighted medal count (an artificial index where gold counts 5 points, silver 3, and bronze just 1).
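To make the index concrete, here is a minimal sketch of the weighting described above. The function name and argument names are my own for illustration and are not taken from the notebook.

```python
# Weights as stated in the text: gold 5 points, silver 3, bronze 1.
MEDAL_WEIGHTS = {"gold": 5, "silver": 3, "bronze": 1}


def weighted_medal_score(gold, silver, bronze):
    """Return the weighted medal count for one country (hypothetical helper)."""
    return (
        MEDAL_WEIGHTS["gold"] * gold
        + MEDAL_WEIGHTS["silver"] * silver
        + MEDAL_WEIGHTS["bronze"] * bronze
    )


# Made-up tally: 2 golds, 1 silver, 3 bronzes.
print(weighted_medal_score(2, 1, 3))
```

With these weights a single gold outweighs a silver and a bronze combined, so the index rewards wins over podium depth.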
Much of the fit is driven by the two left-most outliers (the US and China), but the fit explains 64% of the variance, while population explains none.
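For readers unfamiliar with "variance explained", the sketch below computes R² for a one-predictor least-squares line in plain Python. It is only an illustration: the notebook's actual model may differ (for example, it may transform the variables), and the numbers in the example are toy values, not the Olympic data.

```python
def r_squared(x, y):
    """R^2 of the ordinary least-squares line y ~ intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: what the line fails to explain.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Total sum of squares: spread of y around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Toy numbers only (hypothetical, not the post's data).
toy_gdp = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_score = [2.0, 1.0, 4.0, 3.0, 5.0]
print(f"R^2 = {r_squared(toy_gdp, toy_score):.2f}")
```

An R² of 0.64 means the fitted line accounts for 64% of the spread in weighted medal counts around their mean.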
Adding a few more predictors improves the fit only slightly. I expect that as the Games progress, the model fits will become tighter as the sample size (number of medals) increases. In fact, the model is already performing better today than it was yesterday.
Who is over/under performing?
The US and China are right on the fit above. While they have more medals than anybody else, it's not surprising. Big and rich countries get more medals.
The more interesting question is: which are the countries that are getting more medals than their GDP would account for?
Top 10 over performers
These are the 10 countries with the highest ratio of actual total medals to predicted medals:
Country        delta      got   predicted   ratio
Russia         6.952551   10    3.047449    3.281433
Italy          5.407997    9    3.592003    2.505566
Australia      3.849574    7    3.150426    2.221921
Thailand       1.762069    4    2.237931    1.787366
Japan          4.071770   10    5.928230    1.686844
South Korea    1.750025    5    3.249975    1.538473
Hungary        1.021350    3    1.978650    1.516185
Kazakhstan     0.953454    3    2.046546    1.465884
Canada         0.538501    4    3.461499    1.155569
Uzbekistan     0.043668    2    1.956332    1.022322
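The delta and ratio columns follow directly from the other two: delta is medals obtained minus medals predicted, and ratio is obtained divided by predicted. A small sketch, using the first three rows of the table as input (the function name is mine, not the notebook's):

```python
def performance(got, predicted):
    """Return (delta, ratio) for one country (hypothetical helper)."""
    return got - predicted, got / predicted


# (country, medals obtained, medals predicted), copied from the table above.
rows = [
    ("Russia", 10, 3.047449),
    ("Italy", 9, 3.592003),
    ("Australia", 7, 3.150426),
]

# Rank by ratio, highest first, as the table does.
ranked = sorted(rows, key=lambda row: performance(row[1], row[2])[1], reverse=True)
for country, got, predicted in ranked:
    delta, ratio = performance(got, predicted)
    print(f"{country:<10} delta={delta:.6f} ratio={ratio:.6f}")
```

Ranking by ratio rather than delta favors countries with small predictions, which is why a country with only a handful of medals can still appear on the list.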
Now, neither the US nor China is anywhere to be seen. Russia's performance validates its state-funded sports program: the model predicts around 3 medals, but Russia already has 10.
Italy is similarly doing very well, which surprised me a bit. As you'll see, all the other little piggies perform poorly.
Australia is less surprising: they're a small country which is very much into sports.
After that, no country seems to get more than twice as many medals as its GDP would predict, although I'll note how Japan, Thailand, and South Korea form a little East Asian cluster of overperformance.
Top 10 under performers
This brings up the reverse question: who is underperforming? Southern Europe, it seems: Spain, Portugal, and Greece are all there with 1 medal against predictions of 9, 6, and 6.
France is the country which is missing the most medals (about 13 predicted vs 3 obtained)! Sometimes France does behave like a Southern European country after all.
Country        delta       got   predicted   ratio
Spain          -8.268615   1      9.268615   0.107891
Poland         -6.157081   1      7.157081   0.139722
Portugal       -5.353673   1      6.353673   0.157389
Greece         -5.342835   1      6.342835   0.157658
Georgia        -4.814463   1      5.814463   0.171985
France         -9.816560   3     12.816560   0.234072
Uzbekistan     -3.933072   2      5.933072   0.337093
Denmark        -3.566784   3      6.566784   0.456845
Philippines    -3.557424   3      6.557424   0.457497
Azerbaijan     -2.857668   3      5.857668   0.512149
The Caucasus and Central Asia (Georgia, Azerbaijan, Uzbekistan) may show up because their wealth is mostly due to natural resources and not development per se (oil and natural gas do not win medals, while human capital development does).
§
I expect that these lists will change as the Games go on; maybe Spain is just not as good at the events that come early in the schedule. Expect an updated post in a week.
Technical details
The whole analysis was done as a Jupyter notebook, available on GitHub. You can use mybinder to explore the data. There, you will even find several little widgets to play around with.
Data for medal counts comes from the medalbot.com API, while GDP/population data comes from the World Bank through the wbdata package.