NÖTTV has now been using Ratings Central as its official ranking system for two years, so it is high time to bring its creator into the limelight. He gladly obliged. Read on for a very interesting and informative interview with David Marcus.
NÖTTV: David, you are the man behind Ratings Central, the rating system currently used by NÖTTV. Can you tell us a bit about yourself?
David Marcus: There are actually two people behind Ratings Central: Sean O'Neill and I are partners in this endeavor, and have been from the start.
I was born in New York City on the upper west side of Manhattan. When I was growing up, I often played table tennis in Marty Reisman's club which was half a dozen blocks from where I lived. After junior high school, my family moved to Long Island. I attended Stony Brook University and received the B.S. in Mathematics (I was also valedictorian of my class). I then moved to Massachusetts to attend graduate school at MIT and received the Ph.D. in Mathematics (Probability).
After MIT, I went to work as an applied mathematician at TASC (The Analytic Sciences Corporation), a systems engineering company with particular expertise in guidance and navigation systems. TASC became part of first Primark, then Litton, and is now part of Northrop Grumman. I worked there for twenty-two years.
After that, for three and a half years, I was Director of Operations for The Table Tennis Pioneers, a North American table tennis retailer, and Banda Sports, Inc., a table tennis and sporting goods wholesaler in Canada.
Currently, I am retired, but still do some work as a self-employed business/IT consultant.
NÖTTV: Our members are certainly interested in the key facts of your rating system. Can you tell us which associations besides NÖTTV are currently using it? And how many players, events, and matches have been recorded and processed to date?
David Marcus: The only organizations that I know of that are using it officially are Table Tennis Queensland (http://www.tabletennisqld.org) in Australia and the U.S. National Collegiate Table Tennis Association (http://www.nctta.org).
ITTF Para Table Tennis (http://www.ipttc.org) uses it to help with their selection for wildcards at World and Paralympic events, but not for their official ratings.
The U.S. Hardbat folks (http://www.hardbat.com/hardbat.html) use it, but they don't use Ratings Central. They use an offline app I let them use long ago, before we had Ratings Central.
I developed the system while I was a member of the USATT Ratings Committee. The committee recommended that USATT adopt my system, but by the time we were ready, our friends on the board of directors had been replaced by people who didn't like us. So, USATT still uses their own system.
Of course, many clubs and leagues in the U.S. use it for their own ratings.
You can see the number of players, events, and matches under the list of recent events on the home page at http://www.ratingscentral.com.
NÖTTV: When did you start thinking about developing your own rating system, and what was your motivation for this step? Furthermore, how did you start this project?
David Marcus: When I first got involved with organized table tennis in the U.S., I found the U.S. rating system to be ad hoc and, on close inspection, puzzling. The core of the system was the ratings chart that said how many points a player should gain or lose as a function of the difference in the players' ratings. Other parts of the system were the adjustments and the method for assigning ratings to new players. The ratings chart seemed illogical to me. For example, it said that if you beat a player with the same rating, you would gain eight points. Why eight? I couldn't see any possible reason for eight rather than some other value.
There were other things that seemed illogical. For example, the ratio of the points in the two columns of the ratings chart (one column for upsets, the other for the higher rated player winning) had been adjusted using historical data so that chance upsets wouldn't change the players' ratings (on average). This always seemed backwards to me: It was saying that if we start with correct ratings, then the ratings chart won't mess them up. My feeling was that if we started with correct ratings, why change them at all? And, what if the ratings weren't right to begin with? Plus, the argument said nothing about the values in a given column, only about the ratio of the two columns. In fact, the adjustment formulas (which many players either didn't understand or didn't realize existed) always made more sense to me than did the ratings chart.
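To make the point about the two chart columns concrete, here is a minimal sketch in Python. It is not the USATT chart: the upset probability, the point values, and the function names are all hypothetical, chosen only to show that requiring a zero average change fixes the ratio of the two columns and nothing else.

```python
def expected_change(p_upset: float, expected_pts: float, upset_pts: float) -> float:
    """Average rating change of the higher-rated player in one match.

    With probability 1 - p_upset the favorite wins and gains expected_pts;
    with probability p_upset the favorite loses and gives up upset_pts.
    """
    return (1.0 - p_upset) * expected_pts - p_upset * upset_pts


def balanced_upset_points(p_upset: float, expected_pts: float) -> float:
    """Upset-column value that makes the average change zero."""
    return expected_pts * (1.0 - p_upset) / p_upset


# Hypothetical chart row: assume upsets happen 20% of the time at this rating gap.
p = 0.2
for expected_pts in (2.0, 4.0, 8.0):
    upset_pts = balanced_upset_points(p, expected_pts)
    # Every pair with the same ratio balances, whatever the absolute size.
    print(expected_pts, upset_pts, expected_change(p, expected_pts, upset_pts))
```

Each row printed by the loop has a different size of exchange but the same ratio, and all of them leave correct ratings unchanged on average. The balancing condition therefore cannot say whether the chart should award 2, 4, or 8 points.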
I knew the chess people used the Elo system (I'd played some organized chess when in high school), so my first thoughts were that the Elo system must be better, and why didn't table tennis just use the Elo system rather than inventing their own?
I put some effort into finding out how the U.S. table tennis system had been developed (it turned out that someone had just sat down with a piece of paper and made it up) and finding out more about the Elo system. I eventually located a copy of Elo's out-of-print book. I was rather disappointed by what I read. Elo was very good at telling people how wonderful his system was, but while it was probably better than the table tennis system, it still struck me as ad hoc (e.g., Elo wrote down a probability model, but his update formula is just pulled from thin air rather than derived from the model). And, his claims for the accuracy of his system seemed questionable.
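For readers unfamiliar with Elo, the following is a minimal sketch of the logistic form of the system in common use today. The scale of 400 and the K factor of 32 are conventional illustrative values, not numbers taken from the interview, and Elo's book also discusses a normal-distribution model. The function names are hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Probability model: chance that player A beats player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Update rule: move each rating by k times (actual - expected) score.

    score_a is 1.0 if A won and 0.0 if A lost.
    """
    expected_a = elo_expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta


# Two equally rated players; A wins.
print(elo_update(1600.0, 1600.0, 1.0))
```

The first function is the probability model. The second is a separate rule with its own tunable constant `k`; nothing in the code derives it from the first, which is the kind of gap the answer above describes.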
Although my academic training was in probability, I hadn't taken any statistics courses in graduate school. However, I was learning a good deal of statistics and optimal estimation from my job at TASC. The ideas from Kalman filtering used in guidance and navigation systems and Bayesian statistics that I used in my job seemed to be just what was needed to develop a good rating system. A colleague of mine at TASC did a project involving tracking and estimation using discrete random variables, and some of the techniques that he used seemed like they would also be useful for a rating system. The broad outlines of the mathematical model seemed clear. The questions that would need answering to develop a working system were: What should the specific parameters in the model be? And, can I come up with a practical algorithm to get numbers out of the model?
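The general idea of Bayesian estimation with a discrete random variable can be sketched as follows. This is not the Ratings Central algorithm (that is described in the published article mentioned later in the interview); it is a toy illustration under stated assumptions: a logistic win probability with a scale of 400, a rating grid in steps of 10, an opponent whose rating is treated as known exactly, and hypothetical function names throughout.

```python
import math


def win_probability(rating: float, opponent: float, scale: float = 400.0) -> float:
    """Assumed model: chance that a player with `rating` beats `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / scale))


def make_prior(grid, mean, sd):
    """Discretized bell-shaped prior over the rating grid."""
    weights = [math.exp(-0.5 * ((r - mean) / sd) ** 2) for r in grid]
    total = sum(weights)
    return [w / total for w in weights]


def bayes_update(grid, prior, opponent, won):
    """Posterior is proportional to prior times likelihood of the result."""
    likelihood = [
        win_probability(r, opponent) if won else 1.0 - win_probability(r, opponent)
        for r in grid
    ]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def mean_and_sd(grid, dist):
    """Point estimate and spread of a discrete distribution."""
    mean = sum(r * p for r, p in zip(grid, dist))
    var = sum((r - mean) ** 2 * p for r, p in zip(grid, dist))
    return mean, math.sqrt(var)


grid = list(range(0, 3001, 10))
# Same starting estimate, different amounts of prior information.
for label, sd in (("established", 50.0), ("new", 300.0)):
    prior = make_prior(grid, 1600.0, sd)
    posterior = bayes_update(grid, prior, opponent=1800.0, won=True)
    print(label, mean_and_sd(grid, prior), mean_and_sd(grid, posterior))
```

In this sketch the spread of the distribution plays the role of "how much information the system has": the same upset win moves the estimate of the player with the wide prior much further than that of the player with the narrow prior, with no separate adjustment rule needed.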
I'd met Sean because I was involved in running various national tournaments. Sean had been five-time U.S. Men's Singles Champion. At the time, Sean was an athlete rep. on the USATT Board of Directors. He was unhappy with various problems with the current rating system (e.g., the excessive volatility, the fact that established players could lose too many points for a single loss to a less-established player, and the fact that the best way to gain points was to go into a major tournament underrated). He wanted a better rating system, one that would be fairer both for club players and for selecting players to compete to be on the national team. I told him that I thought all the problems were inherent with the current system (i.e., they couldn't be fixed by tweaking things), but that a new system, developed on a sound statistical foundation, wouldn't have these problems.
However, I wasn't too interested in spending a year of weekends (my rough estimate for the effort involved) to develop a new system if USATT wasn't going to adopt it. Obviously, I didn't expect USATT to promise to adopt the new system before it even existed, but I wanted some assurance that the board was actually interested in this. Since the work would entail more number crunching and graphics than my old computer could handle (this was in 1997, when computers were much slower than now, and my computer was a bunch of years older than that), I suggested that USATT buy me a new computer, basically as a show of good faith, since I was offering to do the work for free.
However, the board wouldn't come up with the money (possibly because the executive director didn't support the idea - he was kind of cheap), but then the executive director suggested that the U.S. Olympic Committee's Sport Science and Technology Committee would provide grants for developing new things, and they might be willing to provide a grant for this. So, I wrote a grant proposal, and USATT submitted it to the USOC, and they approved it. The grant ($5,480) only paid for equipment (hardware and software), and USATT technically still owned the equipment (although by the time the work was over, the computer was out of date, so USATT just let me keep it).
So, now I was committed. I bought a new computer and software and got USATT to provide me with five and a half years of tournament data, and went to work. By 1999, I had the new system working. (Note that it is basically Sean's fault that I developed the system!)
After that, we tried and failed to get USATT to adopt the system. Sean said that he thought some clubs might like to have their own ratings. I provided Sean with an app that he could use to calculate ratings. He and some friends set up a website so clubs could upload their results. Sean would download the results once a day, run them through my app, then upload the results. This was the first version of Ratings Central. I was rather surprised that there was a demand for club ratings, but apparently there was.
At the time, I was working on Zermelo (http://www.davidmarcus.com/Zermelo.htm), a new version of my tournament software, and so didn't have time to help Sean with the Ratings Central website. In 2003, I released the first version of Zermelo, and told Sean that I could help him develop an improved version of Ratings Central. I wrote a new version of the ratings processor (that would run continuously rather than once a day), wrote Cantor (an app event directors could use to submit events), updated Zermelo to support Ratings Central, and built a new website. We launched the new Ratings Central on April 21, 2004.
NÖTTV: What's the difference between Ratings Central and the system used by the ITTF? Would you say that yours is the better one?
David Marcus: The ITTF system was set up by the CTTA (Canadian Table Tennis Association). The CTTA has always had a different system than the USATT system. It has always been common knowledge in North America that the USATT system is much better than the CTTA system. The inadequacies of the CTTA system were a problem for U.S. tournaments that had many Canadian participants (e.g., the U.S. Open, U.S. Nationals, U.S. Open Team Championships). To deal with these Canadian participants, we needed a way to convert CTTA ratings to U.S. ratings. Over the years, there were a couple of times when I came up with such conversion charts. However, no matter how you convert an inaccurate number, the result is still inaccurate. So, U.S. tournaments that let Canadians participate and use converted CTTA ratings had to be cautious and live with the resulting problems. At one point, when I was on the USATT Ratings Committee, we suggested to the CTTA that they provide us with data for all their tournaments, and USATT would process the tournaments for USATT ratings. CTTA almost agreed, but at the last minute, their executive director realized that if they did that, it would basically kill their rating system since no one would care what the CTTA ratings were if all Canadians also had a USATT rating that included all their CTTA tournament events.
For many years, CTTA had a chronic problem with new players being underrated. They even had severe deflation, so players were reluctant to play a tournament because the more they played, the lower their rating became. Eventually, CTTA realized that they should do something, and decided to replace their broken, deflationary system with a system where it was easy to gain points. Several of us tried to explain to the CTTA ratings people that while players might like gaining points more than losing them, after a while, players would notice that the numbers simply reflected how much they played, and not how good they were, and so players would stop caring about the ratings. The CTTA ratings people insisted that we were wrong. They went ahead with their plans. From what my contacts in Canada told me, for a year or two, tournament participation picked up, but then the flaws in the system started to become apparent. While rampant inflation may be better than rampant deflation, neither is very good.
Since the CTTA set up the ITTF ratings, I wasn't expecting much. I believe that after a couple of years, the ITTF reorganized things so the CTTA folks didn't have as much authority, but the CTTA folks are still involved.
It is hard to keep politics out of these things. Back soon after I developed my system, I corresponded with the chair of the USCF (U.S. Chess Federation) Ratings Committee. He's a statistician, and he'd developed a rating system that had some similarities to what I had done. He told me how his committee had tried to get the USCF to make improvements to the USCF rating system, but had failed because those players who would be hurt by the changes always put up a lot of resistance.
Before the ITTF had ratings, I knew someone who was on the ITTF Ranking Committee. From what he told me, the rankings weren't too scientific back then. Basically, each committee member would look at the completed draw sheets, then rank the players. The committee chair would average the rankings.
On June 16, 2010, the ITTF announced that they had to modify the bonus points part of the 2010 version of their system because junior and cadet players were accumulating too many points. See http://www.ittf.com/_front_page/ittf_full_story1.asp?ID=20373. The same page also has some documents that give details on the ITTF's current system. It is a very crude system, not that different from the old CTTA system. It consists of a ratings chart, event weightings, bonus points, and some rules for new players. Bonus points are a bad idea. Event weightings aren't a bad idea, but probably aren't needed. Plus, it isn't obvious what the right weightings should be, and bad weightings could cause more harm than good. The ITTF event weightings are asymmetrical (different for winners and losers), which is illogical. As for the ratings chart, I don't know any way of determining what the "right" values in such a chart should be.
I'm not surprised that the ITTF is having problems with their ratings. I don't find their statement that a "yearly review of the Ranking System will take place every year to analyze the changes in place" to be comforting.
A good rating system should be based on sound statistical principles. Mine is; the ITTF's isn't. A good rating system should keep track of how much information it has on each player. Mine does; the ITTF's doesn't. A good rating system should have as few parameters as possible and the parameters should be physically meaningful. Mine has one to four parameters (depending on how you count) and they are all physically meaningful; the ITTF's has lots (more than twenty) and they aren't physically meaningful. A good rating system shouldn't have ad hoc rules. Mine doesn't; the ITTF's does. A good rating system should be carefully tested on historical data. Mine was; I doubt the ITTF's was.
My system is better than the ITTF system.
In fact, my system works better than I expected it to work while I was developing it. It seems the algorithm really does a very good job of capturing most of the information that is available in the data.
You can see my system's ratings of all the Pro Tour players at http://www.ratingscentral.com/ProPlayers.php. You can check for yourself whether these ratings/rankings seem closer to the truth than the ITTF ratings/rankings. After we started running the Pro Tour events through my system, we used to check the results against the ITTF rankings. The output from my system always seemed to make more sense. I haven't compared the rankings recently.
NÖTTV: Is the algorithm of your system open to the public?
David Marcus: The algorithm was published in the article "New Table-Tennis Rating System" in the Journal of the Royal Statistical Society: Series D (The Statistician), 2001, vol. 50, part 2, pp. 191-208. You can find a copy of the article at http://www.ratingscentral.com/Doc/NewTTRS.pdf. There are some small differences between my current implementation and what the article says, but they only affect the speed, not the numbers.
NÖTTV: We are often asked by our members how Ratings Central works. If your grandma asked you over Saturday afternoon coffee, how would you explain to her how your rating system works?
David Marcus: I'd say what it says at http://www.ratingscentral.com/HowItWorksGerman.php. I'd also tell her that if she had any questions, she should ask me.
The key feature that is different from most rating/ranking systems is that the system keeps track of how much information it has on each player.
NÖTTV: Let's speak about David Marcus, the table tennis player. When did you start your table tennis career and what was your greatest success?
David Marcus: I first got involved with serious table tennis when I attended MIT. There was a very active table tennis club at MIT, and we competed against other schools, attended USATT-sanctioned tournaments, and ran our own USATT-sanctioned tournaments at MIT. I was president of the MIT Table Tennis Club for more than a decade. Currently, I'm one of the organizers of the Boston Table Tennis Center (http://www.bostonttc.us).
Since 1983, I've organized and run tournaments of all sizes. I was Tournament President of the U.S. Open Team Championship from 1989 to 1997. I was co-tournament director of the 1989 and 1990 U.S. Opens, 1990 World Veterans Championships, 1989 U.S. National Championships, 2000 North American Olympic Trials, and several four-star tournaments. I have been tournament director for dozens of zero and one star tournaments. I was the Table Tennis Technical Delegate for the 1999 Special Olympics World Summer Games. I've run the Bay State Games (Massachusetts Olympics) Table Tennis Event since its inception in the Games in 1985, and I've been inducted into the Bay State Games Hall of Fame.
Starting in 1982, I competed in numerous tournaments of all sizes: local tournaments, state games, regional tournaments, U.S. Opens, U.S. Nationals, and U.S. Open Team Championships. (In the U.S., players of all levels can compete in all these tournaments. The ratings allow the tournaments to set up events for players of each level.)
I attended the 1989 (Germany), 1991 (Japan), and 1999 (Netherlands) World Championships as a spectator.
I am a USATT-Certified National Umpire and USATT-Certified Referee.
I developed the software system used for the 1989 and 1990 U.S. Opens, 1989 U.S. National Championships, and 1990 World Veterans Championships. I developed the software system used in 1996 and 1997 by the U.S. Open Team Championship. I first used software that I wrote to run a tournament in 1985. My current tournament software is fully integrated with Ratings Central.
My greatest playing success was probably the first U.S. Open Team Championship that I competed in. This was probably around 1987. My two teammates and I were all rated in the low 1600s. We moved up two divisions in the preliminary stage on Friday and beat several 1900 teams that weekend. I believe my post-tournament rating was 1912. One of my teammates came out with a similar rating, and the other was rated 1850. (My numbers may be a bit off, but that's the way I remember it.)
NÖTTV: Are you still active in the sport, or are you already enjoying retirement?
David Marcus: I don't play regularly anymore. Occasionally, I'll go to the Boston Table Tennis Center to play. The last time I played was with my girlfriend. I haven't played a tournament in many years. Of course, I have a long to-do list of improvements to make to Ratings Central. And, I still run my state games every year.
NÖTTV: So, let's come to the last question. What do you spontaneously associate with Austria?
David Marcus: Werner Schlager, Europe, beer, NÖTTV, and intelligent people who realize that numbers are most useful when they are correct.
NÖTTV sincerely thanks David Marcus for this detailed interview!