Comparing Rankings

© Copyright 2007, Paul Kislanko

During the 2007 college baseball season I came across a "correlation test" for ordinal rankings that turns out to be particularly useful. Keeping the math-y bits simple:

The test is called the Kendall τ-test. What makes it especially useful for our purposes is an associated metric, the τ-Distance, which makes it possible to find out which teams contribute the most to the correlation or non-correlation of two rankings that include all the same teams.

The τ-Distance is the total number of swaps of adjacent positions required to transform either ranked list into the order specified by the other. What makes that useful is that while we count swaps we can also count how many times each team was involved in a swap, so we can find out which teams contribute most to the non-correlation.
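The counting described above can be sketched in a few lines. This is a minimal illustration, not the code used to produce the figures in this article: it relies on the fact that the number of adjacent swaps needed equals the number of team pairs the two lists order differently, and the function name, the dict-of-ranks input format, and the four-team example are all assumptions made for the sketch.

```python
from itertools import combinations

def tau_distance(rank_a, rank_b):
    """Return (distance, per_team) for two rankings of the same teams.

    rank_a and rank_b map team name -> ordinal rank (1 = best).
    """
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must include exactly the same teams")
    per_team = {team: 0 for team in rank_a}
    distance = 0
    for x, y in combinations(rank_a, 2):
        # The pair is out of order when the lists disagree on who is higher.
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) < 0:
            distance += 1
            per_team[x] += 1  # each team in the pair is charged one swap
            per_team[y] += 1
    return distance, per_team

# Hypothetical four-team example (not real rankings).
a = {"A": 1, "B": 2, "C": 3, "D": 4}
b = {"A": 2, "B": 1, "C": 4, "D": 3}
distance, per_team = tau_distance(a, b)
print(distance, per_team)
```

Because every out-of-order pair is charged to both of its teams, the per-team counts always add up to twice the distance.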

An example that demonstrates how the τ-Distance for a team gives more insight than the usual measures of correlation is provided by comparing the final 2006 computer rankings to the 2006 pre-season rankings:
Team    Pre    Final    τD(Pre,Final)
LSU     5      5        4

Here the comparison is between LSU's pre-season computer ranking and its final ranking. Since the Tigers were predicted to be 5th and wound up 5th, we might think the pre-season rankings were perfect for the team (the obvious measure, the square of the difference in rankings, is zero). But the τ-Distance contribution by LSU is 4, not zero.

What the τ-test does is measure the ranking of LSU against every other team. In the pre-season list, #1 Texas and #4 Virginia Tech were higher than LSU's #5, but they finished #18 and #19, behind LSU's #5, in the final list. #14 Florida and #12 Louisville were behind LSU in the pre-season but finished #1 and #3 in the final. So the 4 pairs (LSU, Texas), (LSU, Virginia Tech), (Florida, LSU) and (Louisville, LSU) are reversed in the two rankings, resulting in that contribution to the distance.

Pairs whose relative ranks are reversed in the two orderings are sometimes called discordant. The τ-Distance between two rankings is just the number of discordant pairs.
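The LSU example can be checked directly by listing the discordant pairs. The sketch below uses only the five teams and ranks quoted above (reading the #4 pre-season team as Virginia Tech, as the list of pairs implies); the function name is made up for the illustration, and the full rankings are not reproduced.

```python
from itertools import combinations

# Ranks quoted in the text for LSU and the four teams it swapped places with.
pre = {"Texas": 1, "Virginia Tech": 4, "LSU": 5, "Louisville": 12, "Florida": 14}
final = {"Florida": 1, "Louisville": 3, "LSU": 5, "Texas": 18, "Virginia Tech": 19}

def discordant_pairs(rank_a, rank_b):
    """List the pairs whose relative order differs between the two rankings."""
    return [
        (x, y)
        for x, y in combinations(sorted(rank_a), 2)
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) < 0
    ]

lsu_pairs = [pair for pair in discordant_pairs(pre, final) if "LSU" in pair]
print(len(lsu_pairs))  # 4
```

LSU's own rank never moved, yet it sits in four discordant pairs because the teams around it changed sides.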

The Kendall τ correlation for the full pre-season computer rankings and the final ones is 0.5354 with total distance 3262. So for all team pairs, the pre-season had the "correct" team ranked higher only about 53.5 percent of the time. To tell the truth, that's actually higher than I expected.

A more useful application is to compare different rankings taken at the same time. I was curious, for instance, how different the Sagarin-ELO and Massey-BCS rankings were from the Sagarin and Massey rankings that take into account Margin of Victory.

τ-Distance
       MB       SAG      Massey
SE     366      792      786
MB              814      824
SAG                      750

τ
       MB       SAG      Massey
SE     0.9479   0.8872   0.8881
MB              0.8841   0.8826
SAG                      0.8932

Notice that the τ values in these comparisons are a lot closer to one than the value found for the pre-season vs final test.
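The τ values and distances quoted so far fit a simple relation: τ = 1 - D/P, where D is the τ-Distance and P = n(n-1)/2 is the number of team pairs. The sketch below checks this with n = 119 teams. That team count is an inference from the numbers, not something stated in this article, and the function name is made up for the illustration. Under this normalization τ is simply the share of pairs that both lists rank in the same order, which is how the 53.5 percent figure above is read. The textbook Kendall coefficient is usually normalized as 1 - 2D/P, so these values should not be compared directly with τ from a statistics package.

```python
def tau_from_distance(distance, n_teams):
    """Share of team pairs that both rankings put in the same order."""
    n_pairs = n_teams * (n_teams - 1) // 2
    return 1 - distance / n_pairs

N_TEAMS = 119  # assumption: inferred from the published figures, not stated in the text

# (τ-Distance, published τ): pre-season vs final, then the six table entries.
published = [
    (3262, 0.5354),
    (366, 0.9479), (792, 0.8872), (786, 0.8881),
    (814, 0.8841), (824, 0.8826), (750, 0.8932),
]
for distance, tau in published:
    print(distance, tau, round(tau_from_distance(distance, N_TEAMS), 4))
```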

Trivia or Seriously Useful?

Many students of the BCS formula have long noted that the computer rankings most often thrown out for being too high or too low are Billingsley's. Some have called for replacing his system; others (including me) have called for elimination of the "no-MOV" ban (unless you can eliminate MOV from the human polls, too) and expansion of the list of computer rankings to the point where eliminating the highest/lowest isn't necessary.

Now, while I am not a fan of some aspects of Billingsley's method, especially the carry-over from year to year, I have also argued that too much "sameness" in the computers would mean only one is necessary. Nonetheless, when we calculate the τ-Distance for each pair of BCS computers, we find that Billingsley's system has more "discordant pairs" with every other system than any pair of systems that doesn't include his.

τ-Distance for Pairs of BCS Computer Rankings

       SE    COL   WOL   AND   BIL                      SE    AND   COL   WOL   BIL
MB     166   998   818   644   1330    Season    MB     366   624   858   712   1182
SE           996   776   646   1388              SE           682   872   682   1196
COL                512   614   1124              AND                494   520   1110
WOL                      610   1256              COL                      498   1084
AND                            1146              WOL                            1226
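The claim about Billingsley's system can be checked against the left-hand table. The sketch below reads it as an upper-triangular matrix. The column labels are not printed in the table, so treating the last column as Billingsley (abbreviated "BIL" here) is an assumption based on the surrounding text.

```python
# Left-hand table above, one list per row. "BIL" as the label of the last
# column is an assumption; the table itself does not name its columns.
systems = ["MB", "SE", "COL", "WOL", "AND", "BIL"]
rows = [
    [166, 998, 818, 644, 1330],  # MB  vs SE, COL, WOL, AND, BIL
    [996, 776, 646, 1388],       # SE  vs COL, WOL, AND, BIL
    [512, 614, 1124],            # COL vs WOL, AND, BIL
    [610, 1256],                 # WOL vs AND, BIL
    [1146],                      # AND vs BIL
]

# Map each pair of systems to its tabulated τ-Distance.
distance = {}
for i, row in enumerate(rows):
    for offset, value in enumerate(row, start=1):
        distance[(systems[i], systems[i + offset])] = value

with_bil = [d for pair, d in distance.items() if "BIL" in pair]
without_bil = [d for pair, d in distance.items() if "BIL" not in pair]
print(min(with_bil), max(without_bil))  # 1124 998
```

Even the smallest distance in the last column is larger than the largest distance between any two of the other five systems. The right-hand table shows the same pattern (1084 against 872).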

One could imagine an objective criterion for adding one of the 100+ available computer rankings to the BCS: when compared to the average BCS computer component, the candidate should have a lower τ-Distance than a rating that is already included.
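One possible reading of such a criterion is sketched below: admit a candidate when its average τ-Distance to the current components is lower than that of the most outlying component already included. This is only an illustration of the idea. The function names, the exact rule, and the four-team rankings are all made up for the sketch, not a proposal taken from the BCS or from any of the rating authors.

```python
from itertools import combinations
from statistics import mean

def ranks(order):
    """Turn a best-to-worst list of teams into a team -> rank mapping."""
    return {team: position for position, team in enumerate(order, start=1)}

def tau_distance(rank_a, rank_b):
    """Number of team pairs ordered differently by the two rankings."""
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) < 0
    )

def mean_distance(ranking, others):
    """Average τ-Distance from one ranking to each ranking in `others`."""
    return mean(tau_distance(ranking, other) for other in others)

def qualifies(candidate, components):
    """Hypothetical rule: admit the candidate if it is closer, on average, to
    the current components than the most outlying component is to the rest."""
    candidate_score = mean_distance(candidate, components.values())
    worst_included = max(
        mean_distance(r, [o for n, o in components.items() if n != name])
        for name, r in components.items()
    )
    return candidate_score < worst_included

# Made-up four-team rankings, for illustration only.
components = {
    "X": ranks(["A", "B", "C", "D"]),
    "Y": ranks(["B", "A", "D", "C"]),
    "Z": ranks(["A", "C", "B", "D"]),
}
close = ranks(["A", "B", "D", "C"])
far = ranks(["D", "C", "B", "A"])
print(qualifies(close, components), qualifies(far, components))
```

A rule like this rewards agreement with the existing components, so it would have to be weighed against the "sameness" concern raised earlier.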