All of these reports use the information published by MasseyRatings at College Football Ranking Composite with Dr. Massey's kind permission.
The general idea is to provide the ranking information in every format that would be needed to count "votes" in a team-quality poll if the computer rating systems were the "voters" in such a poll. There are many ways to count ranked ballots to create a composite rank, and since no one way is best, I try to provide enough information to use any of them.
In general I believe that any computer ranking, regardless of how bad, is likely better than any human's opinion, no matter how 'expert' the human. The computer rating takes into account every game that contributes to each of the 8,128 team-vs-team comparisons, while the human is subjectively biased by having only a very small subset of the games played as direct influence.
Just as a human team-quality poll can result in a better ranking than any individual voter's subjective ranking, a "poll" of the computer rankings, each of which is based upon different objective measurements, can result in a measurably better list.
I only include computer ratings that rank all teams in the field, so my list will never exactly match Dr. Massey's, which includes human top 25s and a few computer "top n" lists where n is less than the number of teams in the field.
Top 25 | Truncates every computer rating's ranking at 25 and then counts the
ballots the same way media polls do, using a 25-24-23-... point assignment for teams ranked 1-2-3.... I only include this report to demonstrate how much information is left out of the usual media presentation of their poll results. In addition to the number of "points," I include the number of ballots that listed the team in the top 25 and, for each rank at which the team was voted in the top 25, the number of votes it received at that rank.
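As a rough sketch of this count, assume each rating's ballot is a dict mapping team to rank (1 is best). The function name `top_n_poll`, the parameter `n`, and the three small ballots are hypothetical, and the example truncates at 2 rather than 25 only to keep it short.

```python
from collections import Counter, defaultdict

def top_n_poll(ballots, n=25):
    """Count truncated ballots: n points for #1, n-1 for #2, ..., 1 for #n.

    ballots: list of dicts mapping team -> rank (1 = best), one per rating.
    Returns {team: (points, ballots_listing_team, {rank: votes})}.
    """
    points = Counter()
    appearances = Counter()
    votes_by_rank = defaultdict(Counter)
    for ballot in ballots:
        for team, rank in ballot.items():
            if rank <= n:  # anything ranked worse than n is dropped
                points[team] += n + 1 - rank
                appearances[team] += 1
                votes_by_rank[team][rank] += 1
    return {t: (points[t], appearances[t], dict(votes_by_rank[t])) for t in points}

# Hypothetical three-rating example, truncated at n=2.
ballots = [
    {"A": 1, "B": 2, "C": 3},
    {"B": 1, "A": 2, "C": 3},
    {"A": 1, "C": 2, "B": 3},
]
result = top_n_poll(ballots, n=2)
for team, (pts, listed, by_rank) in sorted(result.items()):
    print(team, pts, listed, by_rank)
```

The point total alone hides the other two columns: team C gets a point from a single ballot, which the usual presentation does not distinguish from broad but lukewarm support.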
Borda | The usual method of counting top 25 ballots is a variation on
the Borda Count. In its basic form, teams get one point for each team they are ranked better than. In a 128-team field,
a #1 vote is worth 127, #2 worth 126 down to #128 worth 0. When all teams are ranked the order is the same as the average
rank over all ratings.
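The equivalence with average rank can be checked with a small sketch. It assumes every ballot ranks every team with no ties; the function names and the four-team ballots are made up for illustration.

```python
def borda_points(ballots):
    """One point for each team a team is ranked better than, summed over ballots.

    ballots: list of dicts mapping team -> rank (1 = best); every ballot is
    assumed to rank the whole field without ties.
    """
    totals = {}
    for ballot in ballots:
        field = len(ballot)
        for team, rank in ballot.items():
            totals[team] = totals.get(team, 0) + (field - rank)
    return totals

def average_rank(ballots):
    """Arithmetic mean of each team's rank over all ballots."""
    return {t: sum(b[t] for b in ballots) / len(ballots) for t in ballots[0]}

# Hypothetical four-team field with three ratings.
ballots = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"B": 1, "A": 2, "D": 3, "C": 4},
    {"A": 1, "C": 2, "B": 3, "D": 4},
]
points = borda_points(ballots)
avg = average_rank(ballots)
by_borda = sorted(points, key=points.get, reverse=True)  # most points first
by_average = sorted(avg, key=avg.get)                    # best average first
print(by_borda, by_average)
```

Each ballot contributes (field size minus rank) points, so a team's total is a fixed constant minus the sum of its ranks, which is why the two orders agree.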
Majority Consensus |
My consensus rank is based upon the Bucklin vote-counting method. For each team find the best rank for which a majority
of the ratings agree the team should be ranked at least that highly. I use a strict majority, namely 50% + one rating.
When there is an odd number of ratings this is the same as the arithmetic median. For an even number of ratings it is
the worse of the two middle ranks, that is, the best rank no better than the median.
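A minimal sketch of that rule for one team, assuming its ranks from all ratings are collected in a list. The function name and the sample ranks are hypothetical.

```python
from statistics import median

def majority_consensus_rank(ranks):
    """Best rank that a strict majority of ratings agree the team deserves.

    ranks: one team's rank in each rating (1 = best). A strict majority is
    floor(k / 2) + 1 of the k ratings, so with the ranks sorted best-first
    the answer is the rank in that position.
    """
    needed = len(ranks) // 2 + 1
    return sorted(ranks)[needed - 1]

odd_case = majority_consensus_rank([3, 1, 7, 2, 5])  # five ratings
even_case = majority_consensus_rank([1, 2, 4, 7])    # four ratings

print(odd_case, median([3, 1, 7, 2, 5]))
print(even_case, median([1, 2, 4, 7]))
```

With five ratings the result coincides with the median. With four ratings, three must agree, so the result is the worse of the two middle ranks rather than their average.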
Pairwise Matrix | Even when the Majority Consensus ranks team A better than
team B, it is possible that team B is ranked better on more ballots than team A. In Condorcet voting, the ballots are
translated into pairwise comparisons between alternatives. The method suffers from a lack of transitivity: team A > team B and
team B > team C does not imply team A > team C!
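The lack of transitivity is easy to reproduce. The sketch below builds a pairwise matrix from three made-up ballots; the function name and data layout (a dict of team to rank per rating) are assumptions for illustration.

```python
from itertools import combinations

def pairwise_matrix(ballots):
    """wins[a][b] = number of ballots that rank team a better than team b.

    ballots: list of dicts mapping team -> rank (1 = best).
    """
    teams = sorted(ballots[0])
    wins = {a: {b: 0 for b in teams if b != a} for a in teams}
    for ballot in ballots:
        for a, b in combinations(teams, 2):
            if ballot[a] < ballot[b]:
                wins[a][b] += 1
            elif ballot[b] < ballot[a]:
                wins[b][a] += 1
    return wins

# Hypothetical ballots chosen to produce a cycle.
ballots = [
    {"A": 1, "B": 2, "C": 3},
    {"B": 1, "C": 2, "A": 3},
    {"C": 1, "A": 2, "B": 3},
]
wins = pairwise_matrix(ballots)
for a in wins:
    print(a, wins[a])
```

Here A beats B on two of three ballots and B beats C on two of three, yet C beats A on two of three, so no ordering satisfies every pairwise majority.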
Correlations | The basis for measuring how alike two ordinal rankings
are is the distance metric. This is the number of swaps required by a bubble sort to place one of the lists in the
same order as the other. The distance varies from zero (the lists are identical) to the total number of team-pairs (the lists
are reversals of each other; 8,128 for a 128-team field.) For each ranking I report the contribution to the distance function by each team.
The distance is the number of discordant pairs: the number of pairs where the teams' relative orders are reversed in the two rankings. When the teams are in the same relative order in both lists the pair is said to be concordant. When the teams have the same rank in either list the pair is ignored. The correlation coefficients τ and γ are calculated from these pair counts and satisfy -1 ≤ { τ, γ } ≤ 1 with |τ| ≤ |γ|. Both will be -1 if the teams are in exactly reverse order, 0 if the relationship is perfectly random (whatever that means!) and +1 if the rankings are identical. τ and γ are the same if there are no ties (but notice that ties in the Majority Consensus rank are to be expected, in which case τ will be closer to zero than γ).
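The pair counting can be sketched as follows. The text above does not give the formulas, so this is an assumption: γ is taken to be the Goodman-Kruskal gamma, (C - D) / (C + D), and τ to be Kendall's tau-b, which divides by the geometric mean of the pairs untied in each list. Both choices match the stated properties, but the reports may use a different variant. The function name and sample rankings are hypothetical.

```python
from itertools import combinations
from math import sqrt

def compare_rankings(x, y):
    """Return (discordant, concordant, tau_b, gamma) for two rankings.

    x, y: dicts mapping team -> rank over the same teams. A pair tied in
    either list counts as neither concordant nor discordant.
    Assumes at least one pair is untied in both lists (else division by zero).
    """
    concordant = discordant = tied_x = tied_y = 0
    for a, b in combinations(sorted(x), 2):
        dx = x[a] - x[b]
        dy = y[a] - y[b]
        tied_x += dx == 0
        tied_y += dy == 0
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) // 2
    gamma = (concordant - discordant) / (concordant + discordant)
    tau_b = (concordant - discordant) / sqrt((pairs - tied_x) * (pairs - tied_y))
    return discordant, concordant, tau_b, gamma

base = {"A": 1, "B": 2, "C": 3, "D": 4}
one_swap = {"A": 2, "B": 1, "C": 3, "D": 4}   # one adjacent swap
with_tie = {"A": 1, "B": 1, "C": 3, "D": 4}   # A and B tied
reverse = {"A": 4, "B": 3, "C": 2, "D": 1}

for other in (one_swap, with_tie, reverse):
    print(compare_rankings(base, other))
```

The discordant count is the distance: 1 for the single swap and 6, every pair in a four-team field, for the reversal. In the tied case γ stays at 1 while τ falls below it, as described above.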
Conference Ranks | There are more ways to aggregate team ranks by conference than there are ratings, but I have chosen these.
Weighted Violations | Roughly one in five games results in the
worse-ranked team winning, no matter which rating produces the ranking. Were it not so, sport would not be interesting. One
measure of how well a rating represents results-to-date is the count of Retrodictive Ranking Violations, the number
of games in which, after the ranking takes into account that team A beat team B, it still ranks team B better than team A.
Motivated by Potemkin's idea that instead of just counting the number of RRVs we should take into account the size of the violation (rank difference) and the importance (how highly the loser is ranked), I came up with a Weighted RRV value that combines the size of the upset (in scores and rank difference) with importance (loser's rank). The "importance" component also takes into account that violations later in the season (when the rating has more input) should count more.
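The unweighted count is simple to sketch. The weighting formula is not given here, so the code below only finds the violations and reports the rank difference for each; it does not attempt the Weighted RRV value. The function name, ranking, and game list are hypothetical.

```python
def ranking_violations(ranking, games):
    """Games whose winner is still ranked worse than the team it beat.

    ranking: dict mapping team -> rank (1 = best).
    games: list of (winner, loser) pairs.
    Returns a list of (winner, loser, rank_difference).
    """
    return [
        (winner, loser, ranking[winner] - ranking[loser])
        for winner, loser in games
        if ranking[winner] > ranking[loser]
    ]

# Hypothetical ranking and results.
ranking = {"A": 1, "B": 2, "C": 3, "D": 4}
games = [("A", "B"), ("C", "A"), ("B", "D"), ("D", "C"), ("A", "D")]

violations = ranking_violations(ranking, games)
print(len(violations), "of", len(games), "games are violations")
print(violations)
```

The rank difference and the loser's rank are the raw inputs a weighted version would combine, along with the score and the point in the season at which the game was played.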