Correlating Rankings

Comparing Rankings

© Copyright 2007, Paul Kislanko

During the 2007 college baseball season I came across a "correlation test" for ordinal rankings that turns out to be particularly useful. Keeping the math-y bits simple:

A correlation test results in a number between -1 and +1 that is nearer the end-points if there's a strong relationship between two sets of numbers, and nearer to zero if the numbers are measuring different things.
When the numbers are ordinal ranks (1st, 2nd, 3rd, etc.), you should use a different test than you'd use for rating values (96.18, 95.97, 94.29, ...).

It turns out that there's an ordinal ranking correlation that is quite useful. It is called the Kendall τ-test. What makes it especially useful for our purposes is that there is a metric associated with it called the τ Distance that makes it possible to find out which teams contribute the most to correlation or non-correlation of two rankings that include all the same teams.

The τ-Distance is the total number of adjacent-position swaps required to transform either ranked list into the order specified by the other. This is useful because while counting swaps we can also count how many times each team was involved in one, and so find out which teams contribute most to the non-correlation.
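The swap counting described above can be sketched as a bubble sort, which only ever exchanges neighbouring entries. This is a minimal illustration, not necessarily how the figures in this article were produced; the function name and the four placeholder teams are made up for the example.

```python
from collections import Counter

def tau_distance_by_swaps(list_a, list_b):
    """Count the adjacent swaps needed to reorder list_a into list_b's order.

    Returns (total_swaps, swaps_per_team). Both lists must contain the same
    teams, each exactly once.
    """
    target = {team: i for i, team in enumerate(list_b)}
    order = list(list_a)
    per_team = Counter({team: 0 for team in order})
    total = 0
    # Bubble sort: each adjacent swap repairs exactly one out-of-order pair.
    for end in range(len(order) - 1, 0, -1):
        for i in range(end):
            if target[order[i]] > target[order[i + 1]]:
                order[i], order[i + 1] = order[i + 1], order[i]
                per_team[order[i]] += 1
                per_team[order[i + 1]] += 1
                total += 1
    return total, dict(per_team)

# Hypothetical four-team example (A-D are placeholders, not real data).
total, per_team = tau_distance_by_swaps(["A", "B", "C", "D"],
                                        ["B", "D", "A", "C"])
print(total)     # total adjacent swaps
print(per_team)  # number of swaps each team took part in
```

Because every swap involves two teams, the per-team counts always add up to twice the total distance.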

An example that demonstrates how the τ Distance for a team gives more insight than the usual measures of correlation is provided by comparing the ranks from the final 2006 computer rankings to the 2006 pre-season rankings:

Team	Pre	Final	τD(Pre,Final)
LSU	5	5	4

Here the comparison is from the pre-season computer rankings for LSU to the final rankings. Since the Tigers were predicted to be 5th and wound up 5th, we might think the pre-season rankings were perfect for the team (the obvious measure, the square of the difference in rankings, is zero). But the τ Distance contribution by LSU is 4, not zero.
What the τ does is measure the ranking of LSU against every other team. In the pre-season list, #1 Texas and #4 Virginia Tech were higher than LSU's #5 but were #18 and #19 compared to LSU's #5 in the final list, and #14 Florida and #12 Louisville were behind LSU in the pre-season but finished #1 and #3 in the final. So the 4 pairs (LSU, Texas), (LSU, Virginia Tech), (Florida, LSU) and (Louisville, LSU) are reversed in the two rankings, resulting in that contribution to the τ Distance.
Pairs whose relative ranks are reversed in the orderings are sometimes called discordant. The τ Distance between two rankings is just the number of discordant pairs.
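The LSU figure can be reproduced by counting discordant pairs directly. The sketch below uses only the five teams whose ranks are quoted above, so LSU's count matches the article but the other teams' counts and the total do not reflect the full rankings. The function name is illustrative.

```python
from itertools import combinations

# Ranks quoted in the text: team -> (pre-season rank, final rank).
ranks = {
    "Texas": (1, 18),
    "Virginia Tech": (4, 19),
    "LSU": (5, 5),
    "Louisville": (12, 3),
    "Florida": (14, 1),
}

def discordant_counts(ranks):
    """Return (total discordant pairs, discordant pairs per team)."""
    per_team = {team: 0 for team in ranks}
    total = 0
    for a, b in combinations(ranks, 2):
        pre_diff = ranks[a][0] - ranks[b][0]
        final_diff = ranks[a][1] - ranks[b][1]
        # Opposite signs mean the two lists disagree about who is ahead.
        if pre_diff * final_diff < 0:
            per_team[a] += 1
            per_team[b] += 1
            total += 1
    return total, per_team

total, per_team = discordant_counts(ranks)
print(per_team["LSU"])  # 4
```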

The Kendall τ correlation for the full pre-season computer rankings and the final ones is 0.5354 with total distance 3262. So for all team pairs, the pre-season had the "correct" team ranked higher only about 53.5 percent of the time. To tell the truth, that's actually higher than I expected.
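The text does not say how many teams were ranked or how the distance was turned into a correlation. One reading that reproduces the reported 0.5354 is that the figure is the share of concordant pairs, computed over 119 teams. Both the team count and the formula are assumptions here. For comparison, the sketch also shows the conventional tie-free Kendall τ, which counts discordant pairs twice and is the version that spans -1 to +1.

```python
def pair_count(n_teams):
    """Number of distinct team pairs among n_teams."""
    return n_teams * (n_teams - 1) // 2

def concordant_fraction(distance, n_teams):
    """Share of team pairs that both rankings order the same way."""
    return 1 - distance / pair_count(n_teams)

def kendall_tau(distance, n_teams):
    """Conventional Kendall tau without ties: ranges from -1 to +1."""
    return 1 - 2 * distance / pair_count(n_teams)

N_TEAMS = 119  # assumption: the team count is not stated in the text

print(pair_count(N_TEAMS))                           # 7021
print(round(concordant_fraction(3262, N_TEAMS), 4))  # 0.5354
print(round(kendall_tau(3262, N_TEAMS), 4))
```

Under this reading, the "53.5 percent of the time" interpretation follows directly, since the value is itself the fraction of pairs ordered the same way in both lists.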

τD	MB	SAG	MAS
SE	366	792	786
MB		814	824
SAG			750

τ	MB	SAG	MAS
SE	0.9479	0.8872	0.8881
MB		0.8841	0.8826
SAG			0.8932

A more useful application is to compare different rankings taken at the same time. I was curious, for instance, how different the Sagarin-ELO and Massey-BCS rankings were from the Sagarin and Massey rankings that take into account Margin of Victory.

Notice that the τ values in these comparisons are a lot closer to one than the value found for the pre-season vs final test.

Trivia or Seriously Useful?

Many students of the BCS formula have long noted that the computer rankings most often thrown out for being too high or too low are Billingsley's. Some have called for its replacement; others (including me) have called for elimination of the "no-MOV" ban (unless you can eliminate that from the human polls, too) and expansion of the list of computer rankings to the point where eliminating the highest/lowest isn't necessary.

Now, while not a fan of some aspects of Billingsley's method, especially the carry-over from year to year, I also have argued that too much "sameness" in the computers would mean only one is necessary. Nonetheless, when we calculate the τ-Distance for each pair of BCS computers, we find that Billingsley's system has more "discordant pairs" with every other system than any pair of systems that doesn't include his.

τ Distance for Pairs of BCS Computer Rankings

Regular Season
	SE	COL	WOL	AND	BIL
MB	166	998	818	644	1330
SE		996	776	646	1388
COL			512	614	1124
WOL				610	1256
AND					1146

Final
	SE	AND	COL	WOL	BIL
MB	366	624	858	712	1182
SE		682	872	682	1196
AND			494	520	1110
COL				498	1084
WOL					1226

One could imagine an objective criterion for including one of the 100+ available computer rankings framed in terms of having a lower τ-Distance when compared to the average BCS computer component than a rating that is already included.
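One possible reading of such a criterion is to compute, for each included system, its mean τ-Distance to the other included systems, and use the largest of those means as the bar a candidate must beat. The sketch below does this for the Regular Season distances from the table above. The averaging rule and the function name are assumptions for illustration, not a stated BCS procedure.

```python
# Regular Season tau distances from the table above (upper triangle).
reg_season = {
    ("MB", "SE"): 166, ("MB", "COL"): 998, ("MB", "WOL"): 818,
    ("MB", "AND"): 644, ("MB", "BIL"): 1330,
    ("SE", "COL"): 996, ("SE", "WOL"): 776, ("SE", "AND"): 646,
    ("SE", "BIL"): 1388,
    ("COL", "WOL"): 512, ("COL", "AND"): 614, ("COL", "BIL"): 1124,
    ("WOL", "AND"): 610, ("WOL", "BIL"): 1256,
    ("AND", "BIL"): 1146,
}

def mean_distance_to_others(distances):
    """Average tau distance from each system to every other system."""
    totals, counts = {}, {}
    for (a, b), d in distances.items():
        for system in (a, b):
            totals[system] = totals.get(system, 0) + d
            counts[system] = counts.get(system, 0) + 1
    return {s: totals[s] / counts[s] for s in totals}

means = mean_distance_to_others(reg_season)
# List systems from most to least typical of the group.
for system, mean in sorted(means.items(), key=lambda kv: kv[1]):
    print(f"{system}: {mean:.1f}")
```

On these figures Billingsley's mean distance is the largest of the six, so under this reading a candidate ranking would only need to sit closer to the group than his does.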