The Committee gets an A+

December 7, 2017

It was kind of fun to watch ESPN's selection shows. There were all sorts of discussions that included phrases like "the resume" which in turn led to different people defining the "resume" in different ways. I especially liked Nick Saban's editorial about how much easier it would've been for Alabama to schedule an FBS team instead of Mercer if all the other FBS teams that had an open date that week weren't playing 1AA teams too. Nick said "Every FBS team should play 12 other FBS teams." I agree with that, and furthermore suggest that Dabo Swinney's idea of a "pre-season" game would be a great place to "park" the 1A vs 1AA games that provide income for the 1AA teams and a scrimmage for the 1A teams, as long as the 1A team has 12 1A opponents.

A rule requiring 12 1A opponents wouldn't just make scheduling easier for teams: it would make it much easier for all of us to compare teams with no common opponents.

What's a "Resume"?

The pundits seemed to be using "records vs top n" by their favorite ranking. I even call one of my team reports "resume view" but I have a different one for each rating. How many different "resumes" can a team have? That's a trick question, because the answer is not just "as many as there are ratings that rank their opponents": you have to multiply by the number of ways you can combine opponents' ratings.

Since there can't be a round-robin, defining "resume" would be easy only if schedules were relatively symmetric. What if we form a "committee" of as many different objective rankings (i.e. computer algorithms) as we can find and let them decide?

As I'm writing there are 99 computer rankings available, and the process to determine "the best 4" could go something like this.

1 List all of the teams ranked in the top four by any member of the committee.
There are 14 of these:
43 Ohio State
18 Penn State
2 Southern California
Notre Dame
Clemson and Georgia are ranked in the top four by a majority (50 or more) of the ratings, so they are in.
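Step 1 can be sketched in a few lines of Python. This is only an illustration: the function names and the toy rankings are made up, and each "committee member" is assumed to be a simple ordered list of teams, best first.

```python
from collections import Counter

def top_four_counts(rankings):
    """Count how many rankings place each team in their top four.

    `rankings` is a list of ordered lists (best team first), one per rating.
    """
    counts = Counter()
    for ranking in rankings:
        counts.update(ranking[:4])
    return counts

def majority_picks(rankings):
    """Teams placed in the top four by a majority of the rankings."""
    needed = len(rankings) // 2 + 1  # 50 when there are 99 rankings
    counts = top_four_counts(rankings)
    return sorted(team for team, n in counts.items() if n >= needed)

# Toy committee of three made-up rankings (not real data).
toy = [
    ["A", "B", "C", "D", "E"],
    ["B", "A", "E", "C", "D"],
    ["A", "C", "B", "F", "D"],
]
print(top_four_counts(toy))
print(majority_picks(toy))
```

Every team that appears in `top_four_counts` is a candidate, and `majority_picks` returns the ones that are "in" without further discussion.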

2 Order the remaining teams by the best rank for which at least 50 of our "committee" rankings agree the team should be ranked at least that highly.

5 Ohio State
7 Penn State
10 Notre Dame
12 Southern California
Since there are two spots left and four teams ranked 5th or better by a majority of the committee, we can eliminate all of the teams ranked worse than 5th, leaving Ohio State, Oklahoma, Wisconsin, and Alabama as the four choices for the last two spots.
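The "best rank a majority agrees on" in step 2 is an order statistic of a team's ranks across the committee. A minimal sketch, assuming every ranking is an ordered list that contains every team; the names and the toy data are hypothetical:

```python
def majority_rank(team, rankings):
    """Best rank r such that a majority of rankings place `team` at r or better.

    Assumes every ranking (a list, best team first) contains `team`.
    """
    needed = len(rankings) // 2 + 1  # 50 when there are 99 rankings
    positions = sorted(r.index(team) + 1 for r in rankings)  # 1-based ranks
    # The `needed`-th smallest rank is the best one a majority agrees on.
    return positions[needed - 1]

def order_by_majority_rank(teams, rankings):
    """Sort teams from best to worst majority rank."""
    return sorted(teams, key=lambda t: majority_rank(t, rankings))

# Toy committee of three made-up rankings (not real data).
toy = [
    ["A", "B", "C", "D", "E"],
    ["B", "A", "E", "C", "D"],
    ["A", "C", "B", "E", "D"],
]
for team in order_by_majority_rank("ABCDE", toy):
    print(majority_rank(team, toy), team)
```

With 99 rankings this picks each team's 50th-best rank, so a team listed at 5 is ranked 5th or better by at least 50 of the ratings.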

3 Compare the "resumes" of the remaining teams.
My choice for this most basic resume is to compare every team to every other team based on their relative position in the directed games graph. For teams A and B, count a "second order win" for A if the path from A to B is stronger than that from B to A, a loss if that from B to A is stronger, and a tie if the paths are of equal strength. We order the candidates by "second order winning percentage":

5  Ohio State  11-2  B10  24  43  0.8372  103  16  10
Our computer committee might have chosen Alabama and Wisconsin for the last two spots.
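Here is a rough sketch of how a second-order record could be computed from the games graph. It rests on two assumptions that the text above does not spell out: a shorter chain of wins is treated as the "stronger" path (with no chain at all being weakest), and a tie counts as half a win in the winning percentage. The function names and the toy results are made up for illustration.

```python
from collections import deque

def win_distances(wins, start):
    """Breadth-first search over the directed games graph.

    `wins` maps a team to the set of teams it beat. Returns the length of
    the shortest chain of wins from `start` to each reachable team.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        team = queue.popleft()
        for beaten in wins.get(team, ()):
            if beaten not in dist:
                dist[beaten] = dist[team] + 1
                queue.append(beaten)
    return dist

def second_order_record(wins, teams):
    """Return {team: (wins, losses, ties)} against all other teams.

    ASSUMPTION: the shorter chain is the stronger path, and a missing
    chain is weaker than any existing chain.
    """
    inf = float("inf")
    dist = {t: win_distances(wins, t) for t in teams}
    record = {}
    for a in teams:
        w = l = t = 0
        for b in teams:
            if a == b:
                continue
            ab = dist[a].get(b, inf)
            ba = dist[b].get(a, inf)
            if ab < ba:
                w += 1
            elif ab > ba:
                l += 1
            else:
                t += 1
        record[a] = (w, l, t)
    return record

def sowp(w, l, t):
    """Second-order winning percentage; a tie counts as half a win (assumption)."""
    return (w + t / 2) / (w + l + t)

# Toy results (not real data): A beat B, B beat C, C beat A and D.
toy_wins = {"A": {"B"}, "B": {"C"}, "C": {"A", "D"}}
records = second_order_record(toy_wins, ["A", "B", "C", "D"])
for team, rec in records.items():
    print(team, rec, round(sowp(*rec), 4))
```

In the toy cycle each of A, B, and C beats one of the others directly and loses to the other, while all three have a chain to D and D has a chain to nobody.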

It is interesting that, all the human discussion notwithstanding, the last spot was between Oklahoma and Wisconsin, not Alabama and Ohio State. And the loss that hurt Ohio State the most was probably not the one at Iowa but the home loss to Oklahoma.

From the SOWP data for all teams we can find the pseudo Smith Set – the set of teams that have a stronger path to every team outside of the set. This year that is undefeated UCF and Memphis, whose only losses were to UCF. My "Bad Loss Index" (BLI) is the number of second-order losses or ties to teams outside of the Smith Set. (There isn't a "good loss" but there's no shame in losing to a team that nobody beat.)
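Given the second-order results and the Smith Set, the BLI is just a filtered count. A minimal sketch, assuming the results are stored as a dictionary of "W", "L", or "T" per opponent (a hypothetical layout) and that the Smith Set has already been found:

```python
def bad_loss_index(team, outcomes, smith_set):
    """Count second-order losses or ties against teams outside the Smith Set.

    `outcomes[team][opponent]` is "W", "L" or "T" from `team`'s point of
    view (hypothetical layout, for illustration only).
    """
    return sum(
        1
        for opponent, result in outcomes[team].items()
        if opponent not in smith_set and result in ("L", "T")
    )

# Toy data (not real results): X lost to S and Q, tied Z, and beat Y.
toy_outcomes = {"X": {"S": "L", "Y": "W", "Z": "T", "Q": "L"}}
print(bad_loss_index("X", toy_outcomes, smith_set={"S"}))
```

The loss to S is not counted because S is in the Smith Set, so only the tie with Z and the loss to Q contribute.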

The BLI is lower for Wisconsin than Ohio State because Wisconsin's wins over common opponents "cut off" some of the potential ?→Ohio State→Wisconsin chains. The same structure prevents LSU's bad loss to Troy from affecting Alabama, since Troy→LSU→Auburn→Alabama is trumped by Alabama→LSU. Ohio State can make up for the conference loss by beating teams that beat Iowa, but it couldn't beat any teams that beat Oklahoma.

But consider what happens if, instead of comparing the four contenders for the last two spots based upon their resumes with respect to the whole field, we consider only their relative position in the directed games graph against the other remaining teams. Taking the six pairs in turn we assign a 1 for a second order win, 0 for a loss, and ½ for a tie. Here's what we get if we use this version of step 3:

3 Compare the "resumes" of the remaining teams
Form the Condorcet pairwise matrix of the remaining teams and select the team(s) with the most pairwise wins
Pairwise              Alabama  Oklahoma  Ohio State  Wisconsin
1         Ohio State  0        0         *           1
So the committee chooses Alabama and Oklahoma instead of Alabama and Wisconsin to go with Clemson and Georgia, and there you have it. 99 computers using entirely objective criteria might have come up with the same answer as 13 subjective humans. Kudos to the humans for getting it right.
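The pairwise version of step 3 can be sketched as follows. The pairwise scores are taken as given inputs (1 for a second order win, 0 for a loss, ½ for a tie); the function names and the toy teams P, Q, R, S are made up, and summing the scores so that a tie is worth half a win is an assumption about how "most pairwise wins" is tallied.

```python
def pairwise_totals(teams, score):
    """Sum each team's pairwise scores against the other remaining teams.

    `score(a, b)` returns 1, 0 or 0.5 from a's point of view.
    """
    return {a: sum(score(a, b) for b in teams if b != a) for a in teams}

def pick(teams, score, spots):
    """Choose the `spots` teams with the highest pairwise totals.

    Ties at the cutoff are broken by input order, which is arbitrary.
    """
    totals = pairwise_totals(teams, score)
    return sorted(teams, key=lambda t: -totals[t])[:spots]

# Toy pairwise results for four hypothetical teams (not real data).
results = {
    ("P", "Q"): 1, ("P", "R"): 1, ("P", "S"): 1,
    ("Q", "R"): 1, ("Q", "S"): 1,
    ("R", "S"): 0.5,
}

def toy_score(a, b):
    """Look up a's score against b, mirroring the stored result if needed."""
    if (a, b) in results:
        return results[(a, b)]
    return 1 - results[(b, a)]

toy_teams = ["P", "Q", "R", "S"]
print(pairwise_totals(toy_teams, toy_score))
print(pick(toy_teams, toy_score, spots=2))
```

With four teams there are six pairs, so each row of the matrix has three scores and the totals always add up to 6.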

8 Dec 2017 update
I was remiss in not applying the pairwise-SOWP comparison to derive the seeding. Here the computer committee would disagree with the committee slightly, swapping the #3 and #4 giving different semifinal matchups.

© Copyright 2017, Paul Kislanko