Chess tournament games and Elo ratings
Posted on May 24, 2014 by Randy Olson
Chess is by far one of my favorite games. Ever since Seth Kadish shared one of his visualizations of square utilization by chess masters, I’ve been wanting to follow up to see what else we can visualize about chess. A few months ago, Daniel Freeman from ChessGames.com generously opened his chess data set to me for analysis. It contains a huge collection of 675,000+ chess tournament games ranging all the way back to the 15th century. This will be the first in a series of blog posts exploring this data set. To begin, I was interested in Elo ratings and how well they predict the outcome of chess games.
Chess match: Bobby Fischer vs. Mikhail Tal (1960)
The goal of the Elo rating system is to assign a numeric value that represents a player’s chess skill. It’s a fairly straightforward yet elegant rating system:
All new players start at a relatively low Elo rating.
If you beat someone, your Elo rating goes up and their rating goes down by the same amount, and vice versa if you lose.
The number of points your rating changes by is determined by the difference between your rating and your opponent’s rating.
For example, if you have an Elo rating of 1600 and beat a 2200-rated player, both of your ratings are going to change a lot. But if you beat a 1000-rated player (as a 1600-rated player), your ratings won’t change much.
Therefore, it’s in your best interest to play against others around or above your current Elo rating. After dozens of games, you’ll eventually arrive at an Elo rating that’s representative of your chess skill.
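To make the 1600 vs. 2200 example concrete, here is a minimal Python sketch of the textbook Elo update. The post itself doesn't spell out the formula, so treat this as an illustration under assumptions: the function names are made up for this example, and the K-factor of 32 is an assumed value (federations and websites use different K-factors, often depending on the player's rating or experience).

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) for player A against player B.

    Standard logistic Elo curve: a 400-point advantage corresponds to
    roughly 10-to-1 odds in favor of the stronger player.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float,
                   k: float = 32.0) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is the K-factor (assumed to be 32 here); it caps how many points
    can change hands in a single game.
    """
    change = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total number of points is conserved.
    return rating_a + change, rating_b - change


if __name__ == "__main__":
    # Upset: a 1600 player beats a 2200 player.
    print(update_ratings(1600, 2200, 1.0))
    # Expected result: a 1600 player beats a 1000 player.
    print(update_ratings(1600, 1000, 1.0))
```

With these assumptions, the upset win over the 2200 player moves each rating by roughly 31 points, while the win over the 1000 player moves them by only about 1 point, which matches the intuition described above.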
Let’s start by jumping into the diagnostics. Since this is a data set of chess tournament games, most of the rated players have pretty high Elo ratings. The majority of the games I’ll be analyzing were played by experts with a 2000+ Elo rating, many in the 2500 range. To give you a sense of what these ratings mean: Bobby Fischer’s peak rating was 2785, and Garry Kasparov’s was 2851. So we’re analyzing games by some pretty talented chess players.
Another important factor to look at is the difference in Elo ratings between the two players in each game. Following the wise advice above, most of the chess tournament games were played between competitors with fairly close Elo ratings. This will be important to keep in mind later as I look at the effect of differences in Elo ratings.
Enough diagnostics. Let’s get into the meat of the data.
Full article here.
This is exciting. Maybe you can work on this for your PhD!
Exciting stuff to do data analytics on. Maybe a subject for your PhD!