Fairness and Simplicity

© Copyright 2005, Paul Kislanko

In my last article I suggested a method for counting votes in the human poll that would provide the desired "transparency" of the polling process and still provide for anonymous ballots. In this one I'll turn my attention to how the polls (and computer rankings) might be used to avoid the problems with putting too much emphasis on one component.

The problem with human polls is that they are subject to human nature. A team that has a bye week is likely to fall in such a ranking, a team that's ranked highly early tends to remain that way even if other teams are playing better, and so on. But because human nature is somewhat universal, all of the human polls are subject to these phenomena. So, the first recommendation is

just use one poll.

Coaches and Harris Poll Comparison
Week 9

Best Team USA HAR d(CP) d(HP) d(Team)

1 Southern California 1 1 0 0 0

2 Texas 2 2 0 0 0

3 Virginia Tech 3 3 0 0 0

4 Alabama 4 4 0 0 0

5 Miami-Florida 5 5 0 0 0

6 LSU 6 7 0 1 1

6 UCLA 7 6 1 0 1

8 Florida St 8 8 0 0 0

9 Notre Dame 9 9 0 0 0

10 Georgia 10 11 0 1 1

10 Penn State 11 10 1 0 1

12 Ohio State 12 12 0 0 0

13 Oregon 13 13 0 0 0

14 Wisconsin 14 14 0 0 0

15 Florida 15 15 0 0 0

16 West Virginia 16 18 0 4 2

16 Texas Tech 17 16 1 0 1

17 Auburn 18 17 1 0 1

19 Boston College 19 20 0 1 1

19 TCU 20 19 1 0 1

21 California 21 21 0 0 0

22 Fresno St 22 23 0 1 1

22 Michigan 23 22 1 0 1

24 Colorado 24 24 0 0 0

25 Louisville 25 25 0 0 0

Variance from Best 0.49 0.57

Matches Best 19 20

Coaches and Harris Poll Comparison Week 9
Best	Team	USA	HAR	d(CP)	d(HP)	d(Team)
1	Southern California	1	1	0	0	0
2	Texas	2	2	0	0	0
3	Virginia Tech	3	3	0	0	0
4	Alabama	4	4	0	0	0
5	Miami-Florida	5	5	0	0	0
6	LSU	6	7	0	1	1
6	UCLA	7	6	1	0	1
8	Florida St	8	8	0	0	0
9	Notre Dame	9	9	0	0	0
10	Georgia	10	11	0	1	1
10	Penn State	11	10	1	0	1
12	Ohio State	12	12	0	0	0
13	Oregon	13	13	0	0	0
14	Wisconsin	14	14	0	0	0
15	Florida	15	15	0	0	0
16	West Virginia	16	18	0	4	2
16	Texas Tech	17	16	1	0	1
17	Auburn	18	17	1	0	1
19	Boston College	19	20	0	1	1
19	TCU	20	19	1	0	1
21	California	21	21	0	0	0
22	Fresno St	22	23	0	1	1
22	Michigan	23	22	1	0	1
24	Colorado	24	24	0	0	0
25	Louisville	25	25	0	0	0
	Variance from Best			0.49	0.57
	Matches Best			19	20

After the first half of the season (really well before that) there's just not enough difference between any of the human polls to tell them apart. Given that the AP pulled out because their voting is too transparent and the coaches have an obvious conflict of interest or two, we may as well use the Harris poll. It is sponsored by the BCS anyway, and if humans should have more weight (with which I disagree) that can be factored into the formula.

As you can see from this comparison of the polls after week eight, the coaches poll and Harris poll:

selected exactly the same "top 25" teams;
agreed exactly on 14 ranks;
differed by one position on 10 ranks;
and
the maximum difference in ranks for any team is two for one team

The only difference in the variance from the better rank of the two polls is exactly that one has Virginia Tech 16th and the other 18th.

The computers are not nearly as unanimous, which in general is a good thing. They all have different means of handling strength of schedule, some have different weights for later games, some (like human voters) carry over results from prior years, some factor in game location, some opponents'and opponents' opponents' records, and so forth. To the extent that these are all important to judging the quality of a team, some synthesis of different perspectives is desirable.

I have always thought that the way the human polls and computer rankings were handled should be the same, and in 2004 they changed the formula so that it superficially was. However, the normalization was to the "# of voters" level, and still resulted in different weights for each "voter". Also, the "don't include best and worst computer ranking" has no analogue in the human polls (though Harris' "trimming" in their second poll was similar in spirit).

So this brings us to my second recommendation:

use only ordinal ranks to combine all the components. Just as voters can only enter their first through 25th choices, take only the top 25 teams from each computer, and only the top 25 from the results of the Harris Poll. We have been inferring that the number of points associated with teams in the "others receiving votes" category provided a 26th, 27th, etc. team, but that's not the case. The computers rank all 119 teams 1 through 119, so 26th in a computer ranking is not the same as 26th in the Harris poll (which might be 3 voters ranking the team 18th, 22nd, and 25th).

Taking only the ordinal ranks makes sense because we do not know in general how the different computers come up with theirs. The fact that we know the humans use a flawed election method (Borda) to do so does not make that method a useful one for combining the computer rankings into one. If we list only the ordinal rankings from each source component for week 9 we get:

BCS Computer Rankings + Harris Results
(Top 25 only)

Team AND BIL COL HAR MB SE WOL

Texas 1 1 1 2 1 1 1

Southern California 2 2 4 1 3 2 3

Virginia Tech 3 3 2 3 2 3 2

Alabama 4 8 5 4 4 4 4

UCLA 5 5 6 6 6 6 6

Penn State 6 12 3 10 5 5 5

Wisconsin 7 6 8 14 8 8 8

Miami-Florida 8 9 14 5 9 11 12

Ohio State 9 13 9 12 7 7 9

Georgia 10 4 10 11 17 18 14

LSU 11 7 11 7 15 17 16

Oregon 12 7 13 11 9 7

Florida St 13 14 13 8 16 13 13

Florida 14 10 18 15 18 16 17

TCU 15 15 15 19 25 23 15

West Virginia 16 16 12 18 12 12 10

Colorado 17 18 19 24 14 15 18

Michigan 18 17 17 22 13 14 19

Boston College 19 22 20 20 20 20 20

Oklahoma 20 21 19 22 21

Texas Tech 21 11 16 16 10 10 11

Notre Dame 22 24 22 9 22 19 22

Georgia Tech 23 25 25 24 25

Minnesota 24 23 21 21 23

Northwestern 25 24 23 24 24

Boise St 19

Auburn 20 17

California 21 21

Louisville 23 25

Fresno St 23 25

Components are listed alphabetically left to right,
and teams ordered only by leftmost rank.

BCS Computer Rankings + Harris Results (Top 25 only)
Team	AND	BIL	COL	HAR	MB	SE	WOL
Texas	1	1	1	2	1	1	1
Southern California	2	2	4	1	3	2	3
Virginia Tech	3	3	2	3	2	3	2
Alabama	4	8	5	4	4	4	4
UCLA	5	5	6	6	6	6	6
Penn State	6	12	3	10	5	5	5
Wisconsin	7	6	8	14	8	8	8
Miami-Florida	8	9	14	5	9	11	12
Ohio State	9	13	9	12	7	7	9
Georgia	10	4	10	11	17	18	14
LSU	11	7	11	7	15	17	16
Oregon	12		7	13	11	9	7
Florida St	13	14	13	8	16	13	13
Florida	14	10	18	15	18	16	17
TCU	15	15	15	19	25	23	15
West Virginia	16	16	12	18	12	12	10
Colorado	17	18	19	24	14	15	18
Michigan	18	17	17	22	13	14	19
Boston College	19	22	20	20	20	20	20
Oklahoma	20		21		19	22	21
Texas Tech	21	11	16	16	10	10	11
Notre Dame	22	24	22	9	22	19	22
Georgia Tech	23	25	25		24	25
Minnesota	24		23		21	21	23
Northwestern	25		24		23	24	24
Boise St		19
Auburn		20		17
California		21		21
Louisville		23		25
Fresno St				23			25
Components are listed alphabetically left to right, and teams ordered only by leftmost rank.

The Formula?

This is fairly simple, though it is based upon the "majority rule" principle I discussed last time as the Bucklin method. Basically it says if more than half of the components (four in this case, since we have seven inputs) agree a team should be ranked N or higher, we give that team rank N. Nothing could be simpler, except there will be ties.

1 Drop the three lowest ranks (including "not ranked"): This eliminates all the teams that are not considered "top 25" by the majority of inputs.
2 Drop the three highest ranks: The remaining rank (which may often be equal to some or all of those dropped in this step) is the highest rank for which a majority of the input rankings agree the team deserves.

If all of the inputs agree on which teams are the 25 best, this is equivalent to just selecting the median ranking for each team. If not all inputs rank the team but a majority do, it is just "select the fourth highest ranking" for each team.

There usually will be ties, though, and there's a two-stage tiebreaker. For teams with the same "majority ranking":

1 Order the tied teams by the number of input rankings that contributed to the tie: A team that 5 inputs agree that team X should be ranked 5th or better is ranked higher than team Y if only 4 inputs agreed that team Y should be ranked 5th or better.
2If teams are still tied after stage 1, then order the tied teams by "how close the teams are to winning the rank": There are many ways to do this, but essentially it involves just taking the ballots that did not contribute to the selection of rank N for each of the tied teams and use those for only the tied teams to hold a new "election" for rank N+1.

An example using the inputs above illustrates the process.

Suggested Ranking						Notes
	Team	Majority	#≤Maj	Tie Brkr	Borda
1	Texas	1	6		825
2	Southern California	2	4		816
3	Virginia Tech	3	7		815
4	Alabama	4	5		800
5	Penn State	5	4		787	Using Borda would've had UCLA ahead of Penn State even though a majority of the inputs had the Nittany Lions ranked higher than the Bruins
6	UCLA	6	7		793
7	Wisconsin	8	6		774
8	Ohio State	9	5		767	5 inputs have the Buckeyes ranked 9th or better but only 4 have the Hurricanes that high. This illustrates a stage 1 tiebreaker.
9	Miami-Florida	9	4		765
10	Oregon	11	4	1.51	655	Here we have an example of a stage 2 tiebreaker. Oregon has votes for #12 and #13, Georgia for #14, #17, and #18, LSU #15, #16, #17, and Texas Tech has 2 #16s and a #21. Clearly Oregon's closer to a 5th #11 vote (by far) than any of the others. Borda would've ranked Oregon behind these teams, just because one input left them out.
11	Georgia	11	4	0.64	749
12	LSU	11	4	0.62	749
13	Texas Tech	11	4	0.50	738
14	West Virginia	12	4		737
15	Florida St	13	5		743
16	TCU	15	4		706
17	Florida	16	4		725
18	Michigan	17	4		713
19	Colorado	18	5		708
20	Boston College	20	6		692
21	Oklahoma	21	4		492
22	Notre Dame	22	6		693
23	Minnesota	23	4		483
24	Northwestern	24	4		475
25	Georgia Tech	25	5		473

26	Auburn				201	Teams that aren't ranked in the top 25 by at least four of the seven inputs are listed in Borda order. But by using a "true" Borda count we can tell that the first four of these were listed on two of the inputs and Boise State on only one of the seven. So we know that that Auburn needs to move into the top 25 in two more computers to get a ranking, and Boise needs to improve in three.
27	California				196
28	Louisville				190
28	Fresno St				190
30	Boise St				100

This approach is very simple, and likely to result in better orderings than the current one which gives too much weight to the polls. Based upon the reaction a few years ago when the computers determined the best matchup, one would expect that there would be complaints that the human polls only have 1/7th input. That is not quite correct, since the influence that any one component has depends upon where it fits with respect to all of the other components.

In any case, for years we've avoided the real issue - if computers "aren't to be trusted" then we need some unambiguous way to define what "trusted" means in terms of picking the best two teams. That will be the subject of my next essay.