This year's "final four" turned out to be a "no-brainer" - in the sense that the committee came up with the same field the computers (which have no brains) would have. They also arrived at the same semifinal matchups, although they got the one/two and three/four seeds backwards. That earns the committee an A−.
Twelve teams got at least one top-four ranking from the computers, and five teams were ranked fourth or better by more than half of them.
| Top-4 rankings | Team |
|---|---|
| 115 | Alabama |
| 101 | Oklahoma |
| 100 | Clemson |
| 61 | Michigan State |
| 59 | Ohio State |
| 10 | Stanford |
| 9 | Iowa |
| 5 | Baylor |
| 3 | TCU |
| 2 | Mississippi, Notre Dame |
| 1 | Utah |
The committee made an interesting point, along the lines of "we want the best four, not necessarily the most deserving four." The computers are not all of the same mind on this - although Michigan State garnered more top-four "votes", there are actually more computer ratings that rank the Buckeyes ahead of the Spartans:
[Table: how the computer ratings split between Ohio State and Michigan State]
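If you're curious how tallies like these come together, here is a minimal sketch, assuming every computer rating is available as a team-to-rank mapping. The data below is made up and the variable names are mine, not anything from the actual ratings:

```python
# Hypothetical input: each computer rating is a dict mapping team name -> rank.
ratings = [
    {"Alabama": 1, "Clemson": 2, "Ohio State": 3, "Michigan State": 4, "Oklahoma": 5},
    {"Alabama": 1, "Oklahoma": 2, "Clemson": 3, "Michigan State": 4, "Ohio State": 6},
    {"Clemson": 1, "Alabama": 2, "Oklahoma": 3, "Ohio State": 4, "Michigan State": 5},
]

# Count how many ratings put each team in the top four (the first column of the table above).
top4_counts = {}
for rating in ratings:
    for team, rank in rating.items():
        if rank <= 4:
            top4_counts[team] = top4_counts.get(team, 0) + 1

# Count how many ratings rank the Buckeyes ahead of the Spartans, and vice versa.
buckeyes_better = sum(1 for r in ratings if r["Ohio State"] < r["Michigan State"])
spartans_better = sum(1 for r in ratings if r["Michigan State"] < r["Ohio State"])
print(top4_counts, buckeyes_better, spartans_better)
```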
We can't know how a collection of humans combines its biases and criteria weights, but we do know a lot about how the computer ratings work. Specifically, I know how the three that I calculate work, and it is instructive that the most predictive of them (KLK in the list above) favors Ohio State, while the two that give more weight to results vis-à-vis opponent quality (ISR and WWP) favor Michigan State. The "out" for the computer "committee" is the same as for the human one: "Consider the body of work." That suggests we could define an objective metric that quantifies that vague notion. There are numerous ways to do it (college hockey's pairwise matrix is an elegant one); I'll present one that could be applied equally to human rankings and composite computer ranks, after a minor rant.
The Mythical "Eye Test"
I pretty much ignore punditry that includes this phrase. The problem is that anyone who uses it is unavoidably biased both by the objective data (which games they happened to see) and by valid but unstated criteria for "looks good" and "looks bad." Even if everyone who uses the phrase were to watch every game, their rankings would still be incomparable because the unstated underlying criteria may be fundamentally different. In other words, the "eye test" is by nature subjective because it depends upon whose eyes are performing the test.

I would accept an "eye test" ranking provided the person who offers it meets the following criteria:
- Has watched every game
- Has a photographic memory
- Can do 8,128 simultaneous comparisons in their head

I doubt that anyone who meets my criteria is spending their time ranking college football teams. I'd hope they're working on carbon-free energy possibilities, and fear they are working on more ways to target me with online ads or siphon fees from my retirement accounts.
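(For reference, 8,128 is the number of distinct pairs that can be formed from the 128 FBS teams: 128 × 127 / 2 = 8,128.)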
Step two is to assign the value of a win against a team with a specific rank, and the "cost" of a loss based upon rank. The simplest approach is just to use the rank values themselves, but that can be misleading when summed over teams, because a rank difference of n does not mean the same thing near the top of the list as it does in the middle: the gap between the teams ranked 1 and 10 is a lot larger than the gap between the teams ranked 51 and 60. Similarly, a loss to #1 is not all that different from a loss to #2. So I group the ranks into "buckets" that are used to form the histograms:
| Ranks | 1-3 | 4-9 | 10-21 | 22-40 | 41-89 | 90-108 | 109-120 | 121-126 | 127-NR |
|---|---|---|---|---|---|---|---|---|---|
| Grade | A+ | A | B | C+ | C | C− | D | E | F |
| Worth | 64 | 48 | 32 | 24 | 16 | 8 | 4 | 2 | 1 |
| Cost | 1 | 2 | 4 | 8 | 16 | 24 | 32 | 48 | 64 |
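To make the bucketing concrete, here is a minimal sketch of a lookup that maps an opponent's rank to its grade, win Worth, and loss Cost. The boundaries are copied from the table above; the function and its names are my own illustration:

```python
# (upper rank bound, grade, win "worth", loss "cost"), copied from the table above
BUCKETS = [
    (3,   "A+", 64, 1),
    (9,   "A",  48, 2),
    (21,  "B",  32, 4),
    (40,  "C+", 24, 8),
    (89,  "C",  16, 16),
    (108, "C-", 8,  24),
    (120, "D",  4,  32),
    (126, "E",  2,  48),
]

def bucket(rank):
    """Return (grade, worth, cost) for an opponent's rank; None means unranked."""
    if rank is not None:
        for max_rank, grade, worth, cost in BUCKETS:
            if rank <= max_rank:
                return grade, worth, cost
    return "F", 1, 64  # ranks 127 and up, including unranked teams
```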
Form the Body of Work metric by assigning every opponent its rank-based "grade", then summing the Worth values of the wins and the Cost values of the losses. Subtract the loss sum from the win sum and divide by the total number of games. Note that I don't consider this so much a "meta rating" as just a convenient sort sequence. What actually makes results versus schedule strength visible is the pair of histograms. Here's the comparison for all the teams with an A or A+ grade:
[Chart: win and loss histograms for the teams graded A or A+]
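Continuing the sketch above (and reusing its bucket() helper), the metric itself is just a weighted win/loss tally; the example schedule at the bottom is invented:

```python
def body_of_work(win_ranks, loss_ranks):
    """BoW = (sum of Worth over wins - sum of Cost over losses) / games played."""
    win_sum = sum(bucket(r)[1] for r in win_ranks)
    loss_sum = sum(bucket(r)[2] for r in loss_ranks)
    games = len(win_ranks) + len(loss_ranks)
    return (win_sum - loss_sum) / games

# Invented example: wins over #2, #15, and #45, and a loss to #8.
print(body_of_work([2, 15, 45], [8]))  # (64 + 32 + 16 - 2) / 4 = 27.5
```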
The full list provides some interesting insight into how the quality of a team's wins and losses determines its own grade. Again, the "BoW" metric is not a ranking on its own, but it looks to me like a pretty good tiebreaker when the comparison comes down to a few pairs of teams out of the 8,128 pairs that contribute to the ratings.