Top 25 Correlations

August 27, 2017

I've always wondered what the "best" way to correlate different ratings' "top 25" is. It is a non-trivial question because there is more than one way to define the "consensus." Do you start with the consensus ranking based upon all ranks, or just use the top 25 ranks from each rating to form a "top 25 consensus" and compare to that?

I still don't know what the answer is, but I've always wondered how I would define "correlation to top 25" using the latter approach. Actually, I knew how I would do it, but was too ~~lazy~~ busy to implement it until I was prompted to compare the human top 25s to the computers.

I presume that Dr. Massey's "Top 25 Cor to Con" is a correlation of the teams in a rating's top 25 to the "consensus" ranking of all teams by all ratings, which he describes this way:

The "average" or "consensus" ranking for each team is determined using a least squares fit based on paired comparisons between teams for each of the listed ranking systems. If a team is ranked by all systems, the consensus is equal to the arithmetic average ranking. When a team is not ranked by a particular system, its consensus will be lowered accordingly.

I chose to calculate the "consensus of top 25s" by only considering ranks 1-25 for each of the computers, as reported in Computer Top 25. The notion of correlation is made more complicated by the fact that different ratings' top 25s can (and invariably do) include different teams.

For rank-based correlations I use the distance metric to characterize the correlation between two ratings. This is the number of team-pair swaps it would take to re-order one list to match the other. That doesn't work for lists that don't have the same elements, as is usually the case with the computer rankings' top 25s. No number of swaps of teams in one list can add a team to it!
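The swap count described above can be sketched in a few lines. This is only an illustration: it assumes the swaps are of adjacent teams (the Kendall tau distance), and the function name `swap_distance` is made up for this example rather than taken from the reports.

```python
from itertools import combinations


def swap_distance(list_a, list_b):
    """Count the team pairs that the two lists order differently.

    For lists holding exactly the same teams this equals the minimum
    number of adjacent swaps needed to re-order list_a into list_b.
    """
    same_teams = set(list_a) == set(list_b)
    no_repeats = len(set(list_a)) == len(list_a) == len(list_b)
    if not (same_teams and no_repeats):
        # No number of swaps can add a team to a list.
        raise ValueError("both lists must rank the same teams exactly once")
    pos_b = {team: i for i, team in enumerate(list_b)}
    # combinations() yields (x, y) with x ahead of y in list_a, so the
    # pair is out of order whenever list_b puts y ahead of x.
    return sum(1 for x, y in combinations(list_a, 2) if pos_b[x] > pos_b[y])
```

Two lists of the same four teams that differ by two neighbor swaps give a distance of 2, while lists with different teams raise an error, which is exactly the problem with top 25s.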

So we add the percentage of total team-pairs that are "concordant" as a measure of how close two rankings are. Instead of −1 ≤ x ≤ 1 we have 0 ≤ x ≤ 1.
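Here is one way the pair counting could work for two truncated lists. It is a sketch under an assumption the text does not spell out: every team missing from a list is treated as tied at a notional rank 26, so a pair that is tied in either list counts as neither concordant nor discordant. The names `count_pairs` and `UNRANKED` are hypothetical.

```python
from itertools import combinations

UNRANKED = 26  # assumption: all teams outside a top 25 share this rank


def count_pairs(top_a, top_b):
    """Return (concordant, discordant, total_pairs) for two top-25 lists.

    Each list is ordered best-first and holds at most 25 teams.
    """
    rank_a = {team: i + 1 for i, team in enumerate(top_a)}
    rank_b = {team: i + 1 for i, team in enumerate(top_b)}
    teams = sorted(set(rank_a) | set(rank_b))  # union of both lists
    conc = disc = 0
    for x, y in combinations(teams, 2):
        diff_a = rank_a.get(x, UNRANKED) - rank_a.get(y, UNRANKED)
        diff_b = rank_b.get(x, UNRANKED) - rank_b.get(y, UNRANKED)
        if diff_a * diff_b > 0:
            conc += 1  # both lists order the pair the same way
        elif diff_a * diff_b < 0:
            disc += 1  # the lists order the pair in opposite ways
        # a zero product means a tie in at least one list: count neither
    total = len(teams) * (len(teams) - 1) // 2
    return conc, disc, total
```

With this convention the concordant and discordant counts need not add up to the total number of pairs, because pairs of teams that one list leaves out entirely are ties.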

Kendall's tau:

τ = (#Concordant pairs − #Discordant pairs) ⁄ #Total pairs

Goodman and Kruskal's gamma:

γ = (#Concordant pairs − #Discordant pairs) ⁄ (#Concordant pairs + #Discordant pairs)

Percentage of Pairs that are Concordant:

%Concordant = #Concordant pairs ⁄ #Total pairs
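As a quick check on the three definitions, they can be computed directly from the pair counts. The function name `rank_stats` is invented for this sketch, and the sample counts are the ones reported for Borda and PPP in the table in this post.

```python
def rank_stats(conc, disc, pairs):
    """Return (tau, gamma, pct_concordant) from pair counts.

    conc and disc are the concordant and discordant pair counts and
    pairs is the total number of team pairs. Gamma is undefined when
    conc + disc is zero, which this sketch does not guard against.
    """
    tau = (conc - disc) / pairs
    gamma = (conc - disc) / (conc + disc)
    pct_concordant = conc / pairs
    return tau, gamma, pct_concordant


# Borda: 334 concordant, 14 discordant, 351 pairs
tau, gamma, pct = rank_stats(334, 14, 351)
print(f"{gamma:.4f} {tau:.4f} {pct:.4f}")  # 0.9195 0.9117 0.9516
```

Because ties shrink the denominator of γ but not of τ, γ is never smaller in magnitude than τ, which matches the table.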

I calculate the "top 25 consensus" by just summing (26 − the team's rank) over all ratings that rank the team. This is not a good method for counting votes when all ranks 1-130 are considered, but it has the same properties as the consensus described by Dr. Massey when applied to truncated rankings. Using the rankings from the Massey Ratings College Football Ranking Composite as of Sun Aug 27 05:28:22, we have 64 teams in at least one of 40 ratings' top 25.
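The point count behind the consensus might look like the sketch below. The input layout (a dict from rating name to an ordered list of teams) and the name `top25_points` are assumptions for illustration, and the alphabetical tie-break is there only to make the output deterministic; it is not something the post specifies.

```python
from collections import defaultdict


def top25_points(ratings):
    """Sum (26 - rank) for every team over the ratings that rank it.

    ratings maps a rating name to its teams ordered best-first; only
    the first 25 entries of each list earn points.
    """
    points = defaultdict(int)
    for top25 in ratings.values():
        for rank, team in enumerate(top25[:25], start=1):
            points[team] += 26 - rank  # 25 for #1, 24 for #2, ... 1 for #25
    # Most points first; equal totals are ordered by name (an assumption).
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))
```

For example, a team ranked #1 by one rating and #2 by another collects 25 + 24 = 49 points, and a team that appears in no top 25 never enters the list at all.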

Top 25 Correlation to Top 25 Consensus

| Rating | #Common | #Teams | Conc | Disc | #Pairs | γ | τ | %Conc |
|---|---|---|---|---|---|---|---|---|
| Borda | 23 | 27 | 334 | 14 | 351 | 0.9195 | 0.9117 | 0.9516 |
| Mix | 23 | 27 | 331 | 17 | 351 | 0.9023 | 0.8946 | 0.9430 |
| DES | 23 | 27 | 322 | 26 | 351 | 0.8506 | 0.8433 | 0.9174 |
| PIR | 23 | 27 | 313 | 35 | 351 | 0.7989 | 0.7920 | 0.8917 |
| KPK | 22 | 28 | 324 | 47 | 378 | 0.7466 | 0.7328 | 0.8571 |
| PGH | 21 | 29 | 346 | 47 | 406 | 0.7608 | 0.7365 | 0.8522 |
| DII | 21 | 29 | 346 | 47 | 406 | 0.7608 | 0.7365 | 0.8522 |
| BIL | 22 | 28 | 322 | 49 | 378 | 0.7358 | 0.7222 | 0.8519 |
| PIG | 21 | 29 | 345 | 48 | 406 | 0.7557 | 0.7315 | 0.8498 |
| DOK | 22 | 28 | 319 | 52 | 378 | 0.7197 | 0.7063 | 0.8439 |
| KAM | 22 | 28 | 318 | 53 | 378 | 0.7143 | 0.7011 | 0.8413 |
| YAG | 21 | 29 | 335 | 59 | 406 | 0.7005 | 0.6798 | 0.8251 |
| KEL | 21 | 29 | 334 | 59 | 406 | 0.6997 | 0.6773 | 0.8227 |
| HOW | 21 | 29 | 334 | 60 | 406 | 0.6954 | 0.6749 | 0.8227 |
| SAG | 21 | 29 | 331 | 62 | 406 | 0.6845 | 0.6626 | 0.8153 |
| MAS | 20 | 30 | 353 | 61 | 435 | 0.7053 | 0.6713 | 0.8115 |
| BRN | 21 | 29 | 329 | 64 | 406 | 0.6743 | 0.6527 | 0.8103 |
| ARG | 21 | 29 | 329 | 64 | 406 | 0.6743 | 0.6527 | 0.8103 |
| TPR | 21 | 29 | 328 | 66 | 406 | 0.6650 | 0.6453 | 0.8079 |
| MOR | 21 | 29 | 325 | 68 | 406 | 0.6539 | 0.6330 | 0.8005 |
| FEI | 21 | 29 | 314 | 79 | 406 | 0.5980 | 0.5788 | 0.7734 |
| DWI | 20 | 30 | 333 | 81 | 435 | 0.6087 | 0.5793 | 0.7655 |
| FPI | 19 | 31 | 354 | 81 | 465 | 0.6276 | 0.5871 | 0.7613 |
| HAT | 19 | 31 | 352 | 82 | 465 | 0.6221 | 0.5806 | 0.7570 |
| MAR | 20 | 30 | 329 | 85 | 435 | 0.5894 | 0.5609 | 0.7563 |
| DCI | 20 | 30 | 329 | 85 | 435 | 0.5894 | 0.5609 | 0.7563 |
| RTP | 19 | 31 | 346 | 88 | 465 | 0.5945 | 0.5548 | 0.7441 |
| MGS | 20 | 30 | 323 | 92 | 435 | 0.5566 | 0.5310 | 0.7425 |
| DEZ | 22 | 28 | 280 | 91 | 378 | 0.5094 | 0.5000 | 0.7407 |
| BWE | 18 | 32 | 364 | 90 | 496 | 0.6035 | 0.5524 | 0.7339 |
| RUD | 19 | 31 | 341 | 94 | 465 | 0.5678 | 0.5312 | 0.7333 |
| CTW | 19 | 31 | 337 | 97 | 465 | 0.5530 | 0.5161 | 0.7247 |
| BDF | 20 | 30 | 309 | 105 | 435 | 0.4928 | 0.4690 | 0.7103 |
| RWP | 20 | 30 | 305 | 109 | 435 | 0.4734 | 0.4506 | 0.7011 |
| CGV | 19 | 31 | 321 | 114 | 465 | 0.4759 | 0.4452 | 0.6903 |
| PFZ | 18 | 32 | 328 | 126 | 496 | 0.4449 | 0.4073 | 0.6613 |
| NGS | 18 | 32 | 306 | 148 | 496 | 0.3480 | 0.3185 | 0.6169 |
| LSD | 17 | 33 | 316 | 155 | 528 | 0.3418 | 0.3049 | 0.5985 |
| ENG | 17 | 33 | 307 | 164 | 528 | 0.3036 | 0.2708 | 0.5814 |
| LOG | 17 | 33 | 303 | 168 | 528 | 0.2866 | 0.2557 | 0.5739 |
| NUT | 17 | 33 | 279 | 192 | 528 | 0.1847 | 0.1648 | 0.5284 |
| PPP | 13 | 37 | 318 | 215 | 666 | 0.1932 | 0.1547 | 0.4775 |

I've added a Top 25 Consensus page with three reports.

Top 25 Rank Correlations to Consensus: Lists the actual top-25 ranks for each rating, the consensus top 25, and what the top-25 list would be based upon average (Borda) and median (Mix) rankings using all team ranks (not just the top 25) from all ratings. The latter two are not used to determine the consensus.

For each team in any rating's top 25:
- Points is the usual sum of 25 points for each #1 ranking, 24 for each #2, and so on down to zero points for ranks worse than #25.
- ∑Dist is the sum of the number of discordant pairs over all ratings. Lower numbers indicate stronger agreement among the computers about the team's rank.
- #Votes is the number of ratings that include the team in their top 25.
- Team links to a list showing, for each rank the team receives, the ratings that assign it that rank.

For each rating:
- ∑Dist is the sum of the number of discordant pairs when the rating is compared to every other rating. The lower the number, the more representative the ranking is of all rankings.
- Dist from consensus is the number of discordant pairs when the rating's top 25 is compared to the consensus top 25.
- Common w/ Cons is the number of teams in both the rating's top 25 and the consensus top 25 (the intersection of teams ranked by each). To find the number of teams ranked by either (the union), just subtract this number from 50.
- %Concordant w/ Cons is (10000×) the ratio of concordant pairs to total pairs.

Computer Ratings Top 25 % Concordant: For any two ratings, displays the percentage of team-pairs that are concordant, to four decimal places with the leading zero and decimal point removed. The column-rating that agrees best with the row-rating is displayed in blue, and those that agree least in red. Higher numbers indicate better agreement between the ratings.

Computer Ratings Top 25 # Common Teams: The size (order) of the intersection of teams in the row-rating's top 25 and the column-rating's top 25 is displayed in row, column. If this value is c, the number of pairs used to derive the previous report is
(50 − c) × (50 − c − 1) ⁄ 2 = ( c² − 99×c + 2450 ) ⁄ 2
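The pair-count formula is easy to sanity-check in code. This sketch uses a made-up helper name, `pairs_from_common`, and confirms that both forms of the expression agree for every possible number of common teams.

```python
def pairs_from_common(c, list_size=25):
    """Number of team pairs in the union of two top-N lists sharing c teams."""
    union = 2 * list_size - c  # 50 - c teams for two top 25s
    return union * (union - 1) // 2


# Both forms of the formula agree for every possible overlap.
for c in range(26):
    assert pairs_from_common(c) == (c * c - 99 * c + 2450) // 2

print(pairs_from_common(23), pairs_from_common(13))  # 351 666
```

The two printed values match the #Pairs column for ratings that share 23 and 13 teams with the consensus.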

The "top 25" is not as significant as its popularity might suggest. It approximately selects the top quintile of the 1A field, but the main value in voting for a top 25 is to rank some smaller number of teams.
