Quantifying "Consensus"

© Copyright 2005, Paul Kislanko

The method I use to consolidate computer rankings to form a "meta-ranking" is similar to a voting system known as "Bucklin". It is less subject to extreme rankings than a simple arithmetic mean or the method typically used in sports polls (Borda counts).

The basic idea is simple: assign a team the best rank R for which a majority of voters (or computer rankings, in this case) agree that the team should have rank R or better. Votes (or computer rankings) for a better rank are counted toward the rank at which a majority first agrees the team should be ranked at least that high.

This vote-counting method is not affected by extreme rankings in either direction. If someone (or some computer) ranks Ivy-Covered-U first, but all other voters rank ICU 50th, the vote for #1 gets counted as a vote for #50. Likewise, if some mean-spirited voter (or biased computer) ranks ICU very low, that becomes irrelevant because votes below the "majority agrees" rank don't affect ICU.
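As a minimal sketch of the counting rule described above (Python is used only for illustration; the function name majority_rank and the ICU ballots are hypothetical, and "majority" is taken to mean more than half of the ballots), the rank can be read off the sorted ballots:

```python
from statistics import mean

def majority_rank(ranks):
    """Return (rank, count): the best rank R such that more than half of
    the ballots place the team at R or better, and how many ballots do."""
    needed = len(ranks) // 2 + 1        # strict majority of the ballots
    rank = sorted(ranks)[needed - 1]    # rank given by the needed-th best ballot
    count = sum(1 for r in ranks if r <= rank)
    return rank, count

# Hypothetical ballots for ICU: 16 voters, one outlier in each case.
outlier_high = [1] + [50] * 15      # one voter ranks ICU first
outlier_low = [119] + [50] * 15     # one voter ranks ICU very low

print(majority_rank(outlier_high))  # (50, 16)
print(majority_rank(outlier_low))   # (50, 15)

# The arithmetic mean, by contrast, moves with each outlier.
print(mean(outlier_high))           # 46.9375
print(mean(outlier_low))            # 54.3125
```

In both cases the majority rank stays at 50, while the mean shifts by several places in the direction of the outlier.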

There can be (and usually are) ties if all we consider is the best rank that a majority agrees a team deserves.
Rank  Maj  Cnt (of 16)  Best  Worst  Team        Conf  HOW  KEE  CLA  MAS  DWI  MOR  BIL  DES  DOK  GM   UCS  PFZ  CGV  CSL  SOL  THM
7     10   10           4     14     Georgia     SEC   6    5    11   10   13   12   6    13   10   10   10   14   9    10   12   4
8     10   9            5     32     Boise St    WAC   10   15   8    17   8    20   9    20   5    31   32   12   6    6    10   10

Here we note that there's a tie for tenth. This one's easy to break in favor of Georgia, because more computers ranked Georgia 10th or higher than ranked Boise St 10th or higher. Things get a little trickier when there's also a tie in the number of voters that contribute to the majority's ranking:

Rank  Maj  Cnt (of 16)  Best  Worst  Team        Conf  HOW  KEE  CLA  MAS  DWI  MOR  BIL  DES  DOK  GM   UCS  PFZ  CGV  CSL  SOL  THM
14    14   10           2     18     Tennessee   SEC   15   9    14   16   18   14   14   2    7    6    15   18   15   12   4    7
15    14   10           4     29     Ohio State  B10   14   11   22   14   29   10   16   9    16   4    7    23   5    19   6    5

To break this tie we consider only the contributions from sources that ranked the teams lower than the majority rank. In this case, Tennessee gets the tiebreaker because three 15ths, a 16th, and two 18ths beat two 16ths, a 19th, a 22nd, a 23rd, and a 29th. Ideally, we'd use a recursive implementation of Bucklin to resolve the tie by "voting" for 15th place using only the ballots that had the tied teams ranked lower than 14th and ignoring all other teams, but we used a variation of Borda as this tiebreaker: sum 1/(rank - majority_rank) over all ranks greater (that is, worse) than the majority_rank, and choose the team with the highest value.

Tennessee: 3/(15-14) + 1/(16-14) + 2/(18-14) = 4.00
Ohio State: 2/(16-14) + 1/(19-14) + 1/(22-14) + 1/(23-14) + 1/(29-14) = 1.50
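The full ordering (majority rank first, then the size of the majority, then the inverse-distance sum) can be sketched as a single sort key. This is an illustration under assumptions rather than the code actually used for the rankings: the function names are hypothetical, and the ballots are the four rows from the tables above.

```python
def majority_rank(ranks):
    """Best rank R that more than half the ballots meet or beat, plus the
    number of ballots at R or better."""
    needed = len(ranks) // 2 + 1
    rank = sorted(ranks)[needed - 1]
    return rank, sum(1 for r in ranks if r <= rank)

def tiebreak_score(ranks, maj):
    """Sum of 1/(rank - majority_rank) over ballots worse than the majority rank."""
    return sum(1.0 / (r - maj) for r in ranks if r > maj)

def sort_key(ranks):
    maj, count = majority_rank(ranks)
    # Lower majority rank first; then larger majority; then larger score.
    return (maj, -count, -tiebreak_score(ranks, maj))

# Ballots in the column order HOW KEE CLA MAS DWI MOR BIL DES DOK GM UCS PFZ CGV CSL SOL THM.
teams = {
    "Ohio State": [14, 11, 22, 14, 29, 10, 16, 9, 16, 4, 7, 23, 5, 19, 6, 5],
    "Boise St":   [10, 15, 8, 17, 8, 20, 9, 20, 5, 31, 32, 12, 6, 6, 10, 10],
    "Tennessee":  [15, 9, 14, 16, 18, 14, 14, 2, 7, 6, 15, 18, 15, 12, 4, 7],
    "Georgia":    [6, 5, 11, 10, 13, 12, 6, 13, 10, 10, 10, 14, 9, 10, 12, 4],
}

order = sorted(teams, key=lambda name: sort_key(teams[name]))
print(order)  # ['Georgia', 'Boise St', 'Tennessee', 'Ohio State']

for name in order:
    maj, count = majority_rank(teams[name])
    print(name, maj, count, round(tiebreak_score(teams[name], maj), 2))
```

Georgia and Boise St are separated by the majority count (10 versus 9), while Tennessee and Ohio State fall through to the inverse-distance sum (4.00 versus about 1.50).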

Ranking Rankings

One of the byproducts of this measurement is that we can rank the rankings based upon the percentage of times each ranking is a part of the majority that decides a team's ranking. This is at least interesting:

Agreement with Consensus

Ranking  Count  Fraction
DOK      80     0.672
BIL      74     0.622
DWI      73     0.613
CGV      72     0.605
HOW      71     0.597
PFZ      71     0.597
MAS      70     0.588
GM       69     0.580
THM      69     0.580
KEE      68     0.571
DES      68     0.571
SOL      68     0.571
MOR      66     0.555
CSL      65     0.546
CLA      65     0.546
UCS      64     0.538

This is not a very strong measure of correlation, because it is "one-sided": a rating that assigns a rank that is "too high" is not counted as an error, for the same reason it does not pull the composite "too high". Nonetheless, at the top of the rankings there are fewer choices for erroneously high ranks, so where it matters most this simple counting statistic suffices.
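A rough sketch of this counting statistic follows. It assumes that a ranking is "part of the majority" for a team whenever its rank for that team is at or better than the team's majority rank; that is one reading of the description, not a confirmed definition. The function names are hypothetical, and only the four teams tabulated earlier are used, so the output does not reproduce the table above.

```python
SYSTEMS = ["HOW", "KEE", "CLA", "MAS", "DWI", "MOR", "BIL", "DES",
           "DOK", "GM", "UCS", "PFZ", "CGV", "CSL", "SOL", "THM"]

def majority_rank(ranks):
    """Best rank R that more than half the ballots meet or beat."""
    needed = len(ranks) // 2 + 1
    return sorted(ranks)[needed - 1]

def agreement(ballots, system_index):
    """Count the teams for which one system is at or better than the majority rank.
    Ranks that are 'too high' still count as agreement (the one-sidedness)."""
    hits = sum(1 for ranks in ballots.values()
               if ranks[system_index] <= majority_rank(ranks))
    return hits, hits / len(ballots)

# Each list follows the column order in SYSTEMS.
ballots = {
    "Georgia":    [6, 5, 11, 10, 13, 12, 6, 13, 10, 10, 10, 14, 9, 10, 12, 4],
    "Boise St":   [10, 15, 8, 17, 8, 20, 9, 20, 5, 31, 32, 12, 6, 6, 10, 10],
    "Tennessee":  [15, 9, 14, 16, 18, 14, 14, 2, 7, 6, 15, 18, 15, 12, 4, 7],
    "Ohio State": [14, 11, 22, 14, 29, 10, 16, 9, 16, 4, 7, 23, 5, 19, 6, 5],
}

print(agreement(ballots, SYSTEMS.index("DOK")))  # (3, 0.75)
print(agreement(ballots, SYSTEMS.index("UCS")))  # (2, 0.5)
```

Run over every team instead of four, the same count divided by the number of teams would give a fraction of the kind listed under "Agreement with Consensus".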