November 6, 2015

I'm beginning this week's column with a bit of an editorial; some analysis follows.

The editorial bit is that the College Football Committee should be using a better voting method. As I pointed out last year, the committee "poll" should not be "list your top 25 teams in rank-order"; it should be "list the teams you're considering for a top four rank" (there can be more than four!). Then the presentation of the results would not be a "top 25" ranking; it would be a list of the teams that were mentioned on any voter's list, along with a count of how many voters mentioned each one.

There's no need to make the committee try to order the teams - all they have to do is list the ones they're considering for inclusion in their "top 4." When all the tallies are done, there probably won't be 25 teams on the list by this point in the season. A "top 25" doesn't really make sense, and publishing a "top 25" doesn't provide much in the way of useful information - did any voter rank the #25 team in the published results in their top 4? Probably not.
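The tally described above is simple enough to sketch in a few lines of Python. This is only an illustration under my own assumptions: the function name `approval_tally` and the ballots are invented for the example, and each ballot is treated as an unordered list of any length.

```python
from collections import Counter

def approval_tally(ballots):
    """Count how many voters listed each team.

    Each ballot is an unordered collection of team names. A team is
    counted at most once per ballot, even if a voter repeats it.
    """
    counts = Counter()
    for ballot in ballots:
        counts.update(set(ballot))
    # Most-mentioned teams first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical ballots, invented purely for illustration.
ballots = [
    ["Team A", "Team B", "Team C", "Team D"],
    ["Team A", "Team B", "Team E"],
    ["Team A", "Team C", "Team D", "Team E", "Team F"],
]

for team, votes in approval_tally(ballots):
    print(f"{votes} of {len(ballots)}  {team}")
```

Note that nothing in the tally depends on the order in which a voter lists teams, which is the whole point: the published result is a count per team, not a rank.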

To show what I think a "College Football Committee" poll should look like, I've been publishing such a Top 4 Approval report based upon the computer rankings, along with my other breakdowns of the computer rankings published at Dr. Massey's Comparison page. Here's a comparison of my report (as of 112 computer ratings through games of 31 October) to the CFC "poll" results:

# Top 4 Votes (of 112)   Team                  Rank   CFP   Team
58                       Ohio State            2      2     LSU
55                       LSU                   3      3     Ohio St
42                       Alabama               5      5     Notre Dame
36                       Michigan State        6      6     Baylor
32                       TCU                   7      7     Michigan St
26                       Notre Dame            9      9     Iowa
1                        Utah                  16     14    Oklahoma St
                         Southern California   16     15    Oklahoma
                         Stanford              16     16    Florida St
                                                      19    Texas A&M
                                                      20    Mississippi St

The comparison of the computers' "approval rating" to the committee's poll results is surprisingly close at the top (kudos to the humans) but the "# of voters with the team in a top 4" presentation gives a much better picture of the teams' relative position with respect to playoff contention. Note that I only allow the computers to pick 4 teams for their "considering top 4", whereas in the "poll" versions humans could list any number of "candidates." Using the "rank your top 25" voting, there's really no way to tell if a voter's #10 is relevant or not.

I would prefer using a lot of computer rankings (not one computer ranking) over any number of human rankings, but that is the subject of other editorials, past and future.

How Bad is a Loss?

Teams to Watch
Rank  Team                 Conf  W  L  BLI  Top 4
4     Ohio State           B10   8  0       58
7     Notre Dame           ND    7  1       26
8     Michigan State       B10   8  0       36
13    Oklahoma State       B12   8  0
19    Southern California  P12   5  3  8    1
22    Florida State        ACC   7  1  33
24    Texas A&M            SEC   6  2
25    Mississippi State    SEC   6  2
25    North Carolina       ACC   7  1  27
29    Penn State           B10   7  2

The conventional wisdom is that the "teams in the hunt" come from the set of undefeated or one-loss teams. This strikes me as unnecessarily arbitrary, though as long as the committee subscribes to the notion there's not much harm in the assumption. For tracking purposes I use a somewhat more nuanced set of criteria:
Members of the Pseudo Smith Set
In election methods the Smith Set is a collection of alternatives for which every member has a pairwise advantage over every non-member. For my purposes "advantage" is defined by the existence of a "Team 1 beat Team B beat ... beat Team 2" chain: Team 1 has the advantage if its shortest such chain is shorter than the shortest one from Team 2 to Team 1, or if the two are the same length and there are more paths of that length in Team 1's direction. As of this writing there are 20 teams that have stronger win-paths to each of the other 108 teams than the others have to them. (See Division 1 Win Path Summary.)
One-loss teams that are not in the Pseudo Smith Set
There are seven such teams at this writing.
Teams that receive at least one computer rank better than fifth
Only Southern Cal and Michigan fall into this category.
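
The win-path comparison behind the Pseudo Smith Set can be sketched with a breadth-first search that records both the length of the shortest "beat" chain and how many chains of that length exist. This is a minimal sketch, not the code behind the Win Path Summary. The names `shortest_win_paths` and `has_advantage` are mine, the results are invented, and I am assuming that a team with a chain to an opponent who has no chain back holds the advantage.

```python
from collections import deque

def shortest_win_paths(wins, start):
    """Breadth-first search over the 'beat' graph.

    wins maps each team to the set of teams it has beaten.
    Returns {team: (length, count)}: the number of games in the shortest
    win chain from start to team, and how many chains have that length.
    """
    dist = {start: 0}
    count = {start: 1}
    queue = deque([start])
    while queue:
        team = queue.popleft()
        for loser in wins.get(team, ()):
            if loser not in dist:
                dist[loser] = dist[team] + 1
                count[loser] = count[team]
                queue.append(loser)
            elif dist[loser] == dist[team] + 1:
                # Another chain of the same shortest length.
                count[loser] += count[team]
    return {t: (dist[t], count[t]) for t in dist if t != start}

def has_advantage(wins, a, b):
    """True if a's shortest chain to b is shorter than b's to a, or the
    same length but more numerous."""
    a_to_b = shortest_win_paths(wins, a).get(b)
    b_to_a = shortest_win_paths(wins, b).get(a)
    if a_to_b is None:
        return False          # a has no chain to b at all
    if b_to_a is None:
        return True           # assumption: any chain beats no chain
    if a_to_b[0] != b_to_a[0]:
        return a_to_b[0] < b_to_a[0]
    return a_to_b[1] > b_to_a[1]

# Hypothetical results: A beat B and C, B beat C, C beat D, D beat A.
wins = {"A": {"B", "C"}, "B": {"C"}, "C": {"D"}, "D": {"A"}}
print(has_advantage(wins, "A", "C"))  # A beat C directly; C needs C->D->A
print(has_advantage(wins, "B", "D"))  # one two-game chain each way: a tie
```

Storing each team's victims as a set means a rematch between the same two teams counts as a single link, which is another assumption of the sketch.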

As of now that gives us 29 teams, listed in the "Teams to Watch" table along with the team values associated with the criteria:

Rank
The "majority consensus" (Bucklin) rank assigned by the computers listed in Dr. Massey's composite. This is the best rank for which more than half of the computers rank the team at least this highly. With 112 computer rankings, this corresponds to the 57th-best rank. When there is an odd number of rankings, this is just the median rank.
Team
Team name.
Conf
Team's conference affiliation.
W
Number of division one wins.
L
Number of division one losses.
BLI
This is a "bad loss index." ∗ indicates that the team is in the pseudo Smith Set. Otherwise this is the number of teams outside of the Pseudo Smith Set that have a stronger A→B→... chain to the team than the team has to them.
Top 4
The number of computer rankings that have the team ranked 4th or better.
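
The Rank and Top 4 columns are both simple functions of the list of ranks a team receives from the computers. Here is a small sketch. The function names are mine, and the sample ranks are made up rather than taken from the composite.

```python
def majority_consensus_rank(ranks):
    """Best rank r such that more than half of the rankings place the
    team at r or better (lower numbers are better)."""
    ordered = sorted(ranks)
    # "More than half" of n rankings means floor(n/2) + 1 of them,
    # which is the entry at 0-based index n // 2 of the sorted list.
    return ordered[len(ordered) // 2]

def top4_votes(ranks):
    """Number of rankings that have the team 4th or better."""
    return sum(1 for r in ranks if r <= 4)

# With 112 rankings the majority rank is the 57th-best one.
print(majority_consensus_rank(range(1, 113)))    # 57

# With an odd number of rankings it is simply the median.
print(majority_consensus_rank([9, 1, 2, 3, 2]))  # 2

print(top4_votes([1, 4, 5, 12]))                 # 2
```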

Constructing the pseudo Smith Set provides a way to quantify the notion "bad loss." A loss to a team that is in the pseudo Smith Set will not cause a team to be removed from it, so such a loss is not as "bad" as a loss to a team that already has a "bad loss" of its own. The degree of "badness" can be quantified by the BLI - Marshall's one loss gave 79 other teams the "advantage" over Marshall.
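
For readers who want to experiment, here is one way the set and the BLI could be computed once the pairwise "advantage" relation is known. This is a sketch under my own assumptions and not necessarily how the published numbers are produced. The relation is supplied as a set of (team, opponent) pairs, the function names are invented, and the five-team example is hypothetical.

```python
def pseudo_smith_set(teams, advantage):
    """Smallest non-empty set whose members all have the advantage over
    every non-member. advantage is a set of (x, y) pairs meaning that
    x has the advantage over y."""
    teams = set(teams)
    # Seed with the team(s) holding the most pairwise advantages; such a
    # team always belongs to the set.
    wins = {t: sum((t, o) in advantage for o in teams) for t in teams}
    best = max(wins.values())
    members = {t for t in teams if wins[t] == best}
    # Pull in any outsider that some member fails to have the advantage
    # over, and repeat until nothing changes.
    grew = True
    while grew:
        grew = False
        for outsider in teams - members:
            if any((m, outsider) not in advantage for m in members):
                members.add(outsider)
                grew = True
    return members

def bad_loss_index(team, teams, advantage, members):
    """None for set members (shown as an asterisk in the table), else the
    number of non-members that have the advantage over the team."""
    if team in members:
        return None
    outsiders = set(teams) - members - {team}
    return sum((o, team) in advantage for o in outsiders)

# Hypothetical relation: A, B and C beat each other in a cycle and all
# have the advantage over D and E; D has the advantage over E.
teams = {"A", "B", "C", "D", "E"}
advantage = {("A", "B"), ("B", "C"), ("C", "A"), ("D", "E")}
advantage |= {(top, low) for top in "ABC" for low in "DE"}

members = pseudo_smith_set(teams, advantage)
print(sorted(members))
print({t: bad_loss_index(t, teams, advantage, members) for t in sorted(teams)})
```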

Clearly having a "bad loss" on a team's resume does not disqualify it - 12 teams had the "advantage" over last year's champion even after Ohio State won the playoff. But it is a pretty safe bet that the playoff will include four teams that are in the current list, so I will update it weekly from now until selection day.

