Correlation to Consensus

July 27, 2014

My definition of the "consensus" team rank is the best rank for the team such that a majority of rating systems rank the team at least that highly. That is the median rank if the number of ratings is odd, or the best rank worse than the median for an even number of ratings. (You can find it by alternately eliminating the best and worst of the remaining ranks, beginning with the best, until only one rank is left.)
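The rule above can be sketched in a few lines. This is my illustration, not the site's code; the name `consensus_rank` is hypothetical. For n ranks sorted from best to worst, the definition picks the element just past the halfway point, which is the median when n is odd and the best rank worse than the median when n is even:

```python
def consensus_rank(ranks):
    """Best (smallest) rank r such that a majority of the rating
    systems rank the team at r or better.

    Odd count: this is the median. Even count: the best rank worse
    than the median, matching the alternate-elimination procedure.
    """
    ordered = sorted(ranks)
    return ordered[len(ordered) // 2]

# Five ratings rank a team 3, 1, 7, 2, 9: the median, 3, is the consensus.
assert consensus_rank([3, 1, 7, 2, 9]) == 3
# Four ratings: the median falls between 2 and 7, so the consensus is 7
# (the first rank at which a majority -- 3 of 4 -- rank the team that high or better).
assert consensus_rank([1, 2, 7, 9]) == 7
```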

The "correlation to consensus" I use is the "distance" of a ranking from the consensus ranking. This is the number of pairs in the opposite order than in the Majority ranking. I like this measurement of correlation because it has a simple interpretation: it is the number of swaps a bubble sort would require to transform the ranking into the consensus ranking.

The distance is the number of discordant pairs: pairs of teams whose relative order is reversed between the two rankings. When the teams are in the same relative order in both lists, the pair is said to be concordant. When the teams have the same rank in either list, the pair is ignored.
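A minimal sketch of that pair count (my illustration; the function name `kendall_distance` and the dict-of-ranks representation are assumptions, not the site's code):

```python
from itertools import combinations

def kendall_distance(rank_a, rank_b):
    """Number of discordant pairs between two rankings.

    rank_a and rank_b map team -> rank. A pair tied in either
    list is ignored, per the definition above.
    """
    teams = sorted(set(rank_a) & set(rank_b))
    discordant = 0
    for s, t in combinations(teams, 2):
        da = rank_a[s] - rank_a[t]
        db = rank_b[s] - rank_b[t]
        if da == 0 or db == 0:
            continue          # tie in either list: pair is ignored
        if da * db < 0:
            discordant += 1   # relative order reversed between the lists
    return discordant

a = {"W": 1, "X": 2, "Y": 3, "Z": 4}
b = {"W": 2, "X": 1, "Y": 3, "Z": 4}
# Only the (W, X) pair is reversed: one bubble-sort swap turns b into a.
assert kendall_distance(a, b) == 1
```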

These can be turned into rank correlation coefficients in several ways. The two I calculate are:

Kendall's tau:

    τ = (#Concordant pairs - #Discordant pairs) / #Total pairs

Goodman and Kruskal's gamma:

    γ = (#Concordant pairs - #Discordant pairs) / (#Concordant pairs + #Discordant pairs)
These give -1 ≤ τ ≤ γ ≤ 1. Both will be -1 if the teams are in exactly reverse order, 0 if the relationship is perfectly random (whatever that means!), and +1 if the rankings are identical. τ and γ are the same if there are no ties (but notice that ties in the Majority consensus rank are to be expected.)
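Both coefficients fall out of the same pair counts. A sketch under the same assumed dict-of-ranks representation as above (the name `tau_and_gamma` is mine); note that tied pairs count toward the total for τ but drop out of γ, which is why τ ≤ γ when ties exist:

```python
from itertools import combinations

def tau_and_gamma(rank_a, rank_b):
    """Kendall's tau and Goodman-Kruskal's gamma from pair counts."""
    teams = sorted(set(rank_a) & set(rank_b))
    concordant = discordant = total = 0
    for s, t in combinations(teams, 2):
        total += 1
        p = (rank_a[s] - rank_a[t]) * (rank_b[s] - rank_b[t])
        if p > 0:
            concordant += 1
        elif p < 0:
            discordant += 1
        # p == 0: tie in either list; counts toward total pairs only
    tau = (concordant - discordant) / total
    gamma = (concordant - discordant) / (concordant + discordant)
    return tau, gamma

# Exact reverse order: every pair is discordant, so both are -1.
assert tau_and_gamma({"W": 1, "X": 2, "Y": 3},
                     {"W": 3, "X": 2, "Y": 1}) == (-1.0, -1.0)
```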

Utility

Of course correlations can be calculated for any pair of ratings. For instance, I can categorize a new rating as "Predictive" or "Retrodictive" by the size of its distances from existing ratings known to be in those categories. I calculate the correlations between all rating pairs on an ad-hoc basis, but I haven't made a regular report page. For the 13 ratings Dr. Massey included as of Fri Jul 25 we get:
Distance(row,col)
     Maj  RWP  BRN  MAR  PAY  MAS  XWP  MOR  DII  BIL  HOW  PFZ  CSL  DOK
Maj    -  331  484  503  523  527  547  566  653  656  668  742  921 1022
RWP  331    -  668  478  731  696  619  813  621  832  891  808  798 1231
BRN  484  668    -  838  617  556  761  403  953  962  969 1048 1278  985
MAR  503  478  838    -  929  964  701  931  795  904  945  784  756 1399
PAY  523  731  617  929    -  519  778  728 1012  885  940 1077 1325  880
MAS  527  696  556  964  519    -  849  621  997  998  933 1178 1386  863
XWP  547  619  761  701  778  849    -  926  942 1037 1016  927 1145 1218
MOR  566  813  403  931  728  621  926    - 1024 1003  964 1081 1353 1014
DII  653  621  953  795 1012  997  942 1024    - 1035 1076 1021  771 1424
BIL  656  832  962  904  885  998 1037 1003 1035    -  745 1020 1112 1141
HOW  668  891  969  945  940  933 1016  964 1076  745    - 1011 1177 1112
PFZ  742  808 1048  784 1077 1178  927 1081 1021 1020 1011    -  952 1597
CSL  921  798 1278  756 1325 1386 1145 1353  771 1112 1177  952    - 1753
DOK 1022 1231  985 1399  880  863 1218 1014 1424 1141 1112 1597 1753    -
It's interesting that even rankings relatively far from the Majority consensus tend to be closer to it than to most other ratings.

Another feature of the distance metric that makes it highly useful is that it is possible to capture and report the contribution of each individual team's ranks to the size of the distance. I've added a Corr report to the Analysis: links on the home page to report those, with ratings listed in closest-to-farthest-from-consensus order.
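One natural way to attribute the distance to individual teams is to credit each team with the discordant pairs it participates in; every discordant pair involves two teams, so these contributions sum to twice the distance. This is only a plausible sketch of such an attribution (the Corr report's actual calculation may differ, and `team_contributions` is my name):

```python
from itertools import combinations

def team_contributions(rank_a, rank_b):
    """Discordant pairs involving each team. Each discordant pair is
    credited to both of its teams, so values sum to 2 * distance."""
    teams = sorted(set(rank_a) & set(rank_b))
    contrib = {t: 0 for t in teams}
    for s, t in combinations(teams, 2):
        if (rank_a[s] - rank_a[t]) * (rank_b[s] - rank_b[t]) < 0:
            contrib[s] += 1
            contrib[t] += 1
    return contrib

a = {"W": 1, "X": 2, "Y": 3, "Z": 4}
b = {"W": 2, "X": 1, "Y": 3, "Z": 4}
# The one discordant pair (W, X) is charged to both W and X.
assert team_contributions(a, b) == {"W": 1, "X": 1, "Y": 0, "Z": 0}
```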

© Copyright 2014, Paul Kislanko