I ought to be a bit embarrassed about the disclaimer I had posted on the site:

"The ratings I calculate will not be available until the field is well-enough connected for them to make sense."

It had originally said "until the field is connected," but I had constructed the field so that it became connected after week 3. With only two or three games per team, and only a fraction of all games being matchups between "like" teams, the ratings didn't "look right," so I didn't publish them. The embarrassing bit is that I was contradicting my own statement that "any computer rating no matter how bad is better than any human ranking no matter how 'expert' the human." Here I allowed one human's judgement - decidedly not an expert's - to invalidate two computer ratings' rankings.
This got me thinking again about whether there's an objective way to determine when there is enough data to begin paying attention to a particular rating's ranking, or to rankings in general. For the ratings I calculate, I suspect the criterion is the average path length between two vertices of the connected games graph, though I would have to guess at what value counts as "connected enough." For rankings in general, I wondered whether the correlation of one week's composite to the next might be a viable metric. That may yet turn out to be useful, but if the conjecture is right it isn't yet obvious. It does, however, provide an opportunity to revisit rank correlations.
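As a rough illustration of the first criterion, here is a minimal sketch in Python using networkx; the games edge list is a hypothetical stand-in for the actual schedule data, and the "connected enough" threshold is exactly the unknown in question:

```python
import networkx as nx

# Hypothetical schedule data: one edge per game played.
games = [("Texas A&M", "Auburn"), ("Auburn", "LSU"),
         ("LSU", "Ole Miss"), ("Houston", "Louisville")]

G = nx.Graph(games)

if nx.is_connected(G):
    # Average shortest-path length over all pairs of teams; the
    # conjecture is that the ratings become trustworthy once this
    # falls below some (as yet unguessed) threshold.
    print(nx.average_shortest_path_length(G))
else:
    # Early in the season, report how close the field is to connected.
    largest = max(nx.connected_components(G), key=len)
    print(f"not yet connected; largest component: {len(largest)} teams")
```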
The measure of correlation I use is the distance between two ordinal rankings:
The distance is the number of discordant pairs - the number of pairs of teams whose relative order is reversed between the two rankings. When the teams are in the same relative order in both lists, the pair is said to be concordant. When the teams have the same rank in either list, the pair is ignored. I like this measure because it's a simple counting statistic (the distance between two rankings is the total number of adjacent position swaps it would take to transform one list into the other) and because it is easy to capture each team's contribution to the distance. I mostly use rank correlations to compare different rankings (Computer Ranking Correlation to Majority Consensus), but the same principle applies to comparing successive instances of the same ranking. My conjecture was that we might be able to tell when rankings in general become useful based upon the week-to-week variation in the Computer Rankings by Bucklin Majority ranks.
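As a minimal sketch of that computation (the function name and data layout are my own, not the site's code), the distance and each team's contribution to it follow directly from the definition:

```python
from itertools import combinations

def rank_distance(ranks_a, ranks_b):
    """Count discordant pairs between two ordinal rankings and
    attribute each discordance to both teams in the pair.

    ranks_a, ranks_b: dicts mapping team -> rank.
    Returns (distance, per-team contribution)."""
    teams = sorted(set(ranks_a) & set(ranks_b))
    distance = 0
    contribution = {team: 0 for team in teams}
    for t1, t2 in combinations(teams, 2):
        d_a = ranks_a[t1] - ranks_a[t2]
        d_b = ranks_b[t1] - ranks_b[t2]
        if d_a == 0 or d_b == 0:
            continue                    # tied in either list: ignored
        if (d_a > 0) != (d_b > 0):      # order reversed: discordant
            distance += 1
            contribution[t1] += 1
            contribution[t2] += 1
    return distance, contribution
```

Note that summing the per-team contributions gives twice the distance, since each discordant pair is charged to both of its teams.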
I really should have known better. Having a different number of ratings from one week to the next is not really a problem, because the majority-consensus definition is based upon the median, which is fairly stable with respect to the number of ratings. The problem is that very quickly the week-to-week variation switches from reflecting "ratings getting better" to reflecting "all ratings taking recent results into account." The week-to-week variation is not very large to begin with (fewer than 10 per cent of the pairwise comparisons are discordant), and it would take more work than I have done to distinguish between the two effects.
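For what it's worth, here is a sketch of the median-based consensus, under the simplifying assumption that a team's Bucklin majority rank is the best rank at which a majority of the individual ratings place it - which, for an odd number of ratings, is just the median rank (the actual procedure also has to handle tie-breaking and teams missing from some ratings):

```python
import statistics

def majority_consensus(ratings):
    """ratings: a list of dicts mapping team -> rank, one dict per
    computer rating. Returns teams ordered by median rank, a
    simplified stand-in for the Bucklin majority rank."""
    teams = set().union(*ratings)
    median_rank = {t: statistics.median(r[t] for r in ratings if t in r)
                   for t in teams}
    return sorted(teams, key=lambda t: median_rank[t])
```

Because the median moves very little when ratings are added or dropped, a varying roster of ratings barely perturbs the consensus.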
Although my guess turns out not to be all that useful through five weeks, in the process of finding that out I produced a couple of reports that, taken together, provide good examples of how a team's contribution to the distance measure is more useful than just its rank differences. The chart on the left shows the distance contribution of each team between pre-season and week one, week one and week two, and so on, along with a symbol indicating the direction of the rank difference from week four to week five. The chart on the right shows the actual ranks from pre-season through week five.
Both Texas A&M and Ole Miss have the same ranking after week five as after week four, but both contribute three to the distance between those weeks' consensus rankings. In the Aggies' case, that is because teams ranked better than their 7th fell below them (Houston 6→8, Louisville 6→10, and Stanford 6→14). In Ole Miss' case, the contribution comes from two teams jumping Ole Miss (Miami 18→11, Western Michigan(!) 25→14) and one higher-ranked team falling below (Florida State 11→19). To make visible why teams moved the way they did, the second chart links to the teams' resumes based upon current Majority Consensus ranks.
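Plugging the Aggies' numbers into the rank_distance sketch above (restricted to just the four teams quoted; the real computation of course runs over the full lists) reproduces their contribution of three:

```python
week4 = {"Texas A&M": 7, "Houston": 6, "Louisville": 6, "Stanford": 6}
week5 = {"Texas A&M": 7, "Houston": 8, "Louisville": 10, "Stanford": 14}

distance, contribution = rank_distance(week4, week5)
print(contribution["Texas A&M"])   # 3: all three teams fell from above A&M to below
print(distance)                    # 3: the Houston/Louisville/Stanford pairs
                                   # were tied in week 4 and so are ignored
```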
[Charts: Majority Consensus, 6 Oct 2016 - Temporal Correlation (left); Ranking History (right)]