Resumes and Meta-Rankings
10 October 2007
Recent discussions at Sunday Morning Quarterback and on the Fans Collective Survey message boards suggested an article about the similarities between classes of computer rating systems and poll-voting philosophies. Reviewing those discussions suggests a useful generalization.
|Voting Philosophy||Computer Rating Class|
|"Power": the voter ranks teams based upon her perception of which teams would be "likely" to beat teams she ranks lower.||"Predictive": teams are ordered by the rating such that if team A is ranked higher than team B then A has a better than 50 percent chance of beating B.|
|"Resume": the voter ranks teams based only upon the results of games played, making no assumptions about the relative rankings of teams that haven't played.||"Retrodictive": teams are ordered based upon winners of games played, usually by a combination of winning percentage and some measure of schedule strength.|
Now, if you think that a voter who espouses the "Power" philosophy only needs information from the "Predictive" ratings or the "Resume" voter doesn't use those, that's only because I placed the definitions side-by-side.
In fact, the "predictive" ratings certainly use all the data the "retrodictive" ones use; they just emphasize the data that the author feels contribute more to the probability of winning a game, mostly a team's ability to score points compared to its opponents' ability to prevent points from being scored. Likewise, the "pure resume" voter needs some basis for determining the relative "worth" of a win, since teams with the same record have to be compared. That basis is usually strength of schedule, but how should that be defined?
The Resume() function
The general idea behind resume voting is to compare teams based upon their best wins and worst losses. How to define quality of wins and losses objectively, though, is left open. A "pure resume" approach would define win quality based upon the quality of defeated opponents' wins, and right away you're into recursion that can be hard for a human voter to comprehend, much less calculate. Computers to the rescue!
One of my favorite "tricks" involves a function that maps any rating into a retrodictive form. Use the rankings from an arbitrary rating system to assign the values of wins and losses:
- Win worth: (#ranked teams+1) − opponent's rank
- Loss worth: − opponent's rank
For example, if 120 teams are ranked, a win over #1 is worth (120+1)−1 = 120 points. A loss to the #1 team is worth −1 point, and a loss to the last-ranked team is worth −120 points. Just add up the points for wins and losses and divide by the number of games played to get a "resume score" for each team.
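The win and loss values above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `resume_scores` and `resume_ranking`, the dictionary input format, and the tie-break on the input rank are mine rather than part of the original reports, and every team that appears in a game is assumed to have a rank.

```python
from collections import defaultdict

def resume_scores(ranks, games):
    """Average win/loss points per game for each team.

    ranks: dict mapping team -> rank from the input rating (1 = best).
    games: iterable of (winner, loser) pairs.
    Both input formats are assumptions made for this sketch.
    """
    n_ranked = len(ranks)
    points = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in games:
        # Win worth: (#ranked teams + 1) - opponent's rank
        points[winner] += (n_ranked + 1) - ranks[loser]
        # Loss worth: -(opponent's rank)
        points[loser] -= ranks[winner]
        played[winner] += 1
        played[loser] += 1
    return {team: points[team] / played[team] for team in played}

def resume_ranking(ranks, games):
    """Teams ordered by resume score, best first.

    Ties are broken by the input rank, which is an assumption of this
    sketch and not something the text specifies.
    """
    scores = resume_scores(ranks, games)
    return sorted(scores, key=lambda t: (-scores[t], ranks[t]))

# Toy data: four ranked teams and four games (made up for illustration).
toy_ranks = {"A": 1, "B": 2, "C": 3, "D": 4}
toy_games = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "C")]
print(resume_scores(toy_ranks, toy_games))
print(resume_ranking(toy_ranks, toy_games))
```

Because the only inputs are the ordinal ranks and the game results, the same function can be applied to the output of any rating system.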
This "resume function" is an example of another class of computer ratings:
- Meta ratings use the output of one or more computer ratings to define a new rating.
As implied by the definition, there are two flavors of meta-ratings. Some, like the Resume function, take the output of a single rating and create a new one using it and some external factor (games won and lost in the case of the Resume function). Others, like what I call the Bucklin Majority, combine multiple ratings into a new summary rating.
For example, we can take the rankings associated with Jeff Sagarin's "Predictor" system to get resSAGP. The column definitions are:
- The index/rank for the "resume score."
- The "resume score" for the team based upon the input rank.
- Team name
- The name of the ranking used to form the resume report. This is the rank the team was assigned by the rating that is the input to Resume().
- Conf, W, and L are informational columns listing the team's conference affiliation and its wins and losses to date.
- Avg_W Ornk is the average opponents' rank for the team's wins.
- Avg_L Ornk is the average opponents' rank for the team's losses.
- SOS is the average opponents' rating by this system for all the team's games.
- ix (the second ix) is just the ordinal rank of the SOS values.
- The best rank of all the opponents defeated by the team
- The worst rank of all the opponents to whom the team lost
An interesting property of the Resume function is that if you take any two ratings Rα and Rβ, the rankings for Resume(Rα) and Resume(Rβ) are more nearly alike than Rα and Rβ themselves.
| τ || Ranking A || Ranking B || τ-Distance || N |
|0.9070 ||resSAGP || resISOV ||2260 ||†221 |
|0.9058 ||resISR || resISOV ||2290 ||221 |
|0.8775 ||ISR || resISR ||2978 ||221 |
|0.8760 ||resSAGP || resISR ||3014 ||†221 |
|0.7551 ||SAGP || resSAGP ||7142 ||242 |
|0.7535 ||ISOV || resISOV ||5992 ||221 |
|0.7525 ||ISR || ISOV ||6016 ||221 |
|0.7424 || SE ||SAGP ||7512 ||242 |
The τ-Distance is the number of team pairs whose order is reversed between the two rankings. With 221 teams, there are 24,310 total pairs. The ISR is like Sagarin's Elo-Chess in that it only takes into account who won and where the games were played, while the ISOV is like Jeff's Predictor in taking into account the strength of the victories. Notice that the difference between ISR and ISOV affects nearly a quarter of the team pairs, but their Resume functions agree on over 90 percent.
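For readers who want to reproduce the comparison, here is a minimal sketch of the pair count. The function names and the dictionary input are my own assumptions. The τ values in the table appear consistent with 1 − (distance ÷ total pairs), that is, the fraction of pairs ranked in the same order. That reading is inferred from the numbers and is not stated in the text; the conventional Kendall τ would be 1 − 2 × (distance ÷ total pairs).

```python
from itertools import combinations

def tau_distance(rank_a, rank_b):
    """Count team pairs ordered differently by two rankings.

    rank_a, rank_b: dicts mapping team -> rank (1 = best); only teams
    present in both are compared. Returns (reversed_pairs, total_pairs).
    """
    teams = sorted(set(rank_a) & set(rank_b))
    reversed_pairs = 0
    for x, y in combinations(teams, 2):
        # The pair is reversed when the rank differences have opposite signs.
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) < 0:
            reversed_pairs += 1
    total_pairs = len(teams) * (len(teams) - 1) // 2
    return reversed_pairs, total_pairs

def pair_agreement(rank_a, rank_b):
    """Fraction of pairs in the same order: 1 - distance / total pairs.

    This matches the table's tau column only under the assumption
    described in the lead-in.
    """
    reversed_pairs, total_pairs = tau_distance(rank_a, rank_b)
    return 1 - reversed_pairs / total_pairs

# Toy rankings (made up): only the top two teams are swapped.
first = {"A": 1, "B": 2, "C": 3, "D": 4}
second = {"A": 2, "B": 1, "C": 3, "D": 4}
print(tau_distance(first, second))
print(pair_agreement(first, second))
```

As a check against the table, 221 teams give 221 × 220 ÷ 2 = 24,310 pairs, and 1 − 2260 ÷ 24,310 rounds to 0.9070, the value listed for resSAGP versus resISOV.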
I will add Resume(ISOV) and Resume(ISR) reports as indexes to the Team Resume pages.