"Intended" Schedule Strength

© Copyright 2005, Paul Kislanko

A "strength of schedule" measurement that includes games not yet played must necessarily be based upon some ordinal ranking. Simply averaging opponents' power rankings is not very useful, because the list of opponents' ranks does not typically have a normal distribution. It's also true that the difference between two teams is not linearly related to the difference in their ranks - there's usually a greater difference between, say, rank 1 and rank 10 than between rank 51 and rank 60.

The SOS measurement used here takes those factors into account by relating the median opponent's strength to the median of the field weighted by how "top-heavy" or "bottom-heavy" the schedule is. There are two parts to the straightforward calculation:

  1. Sum the total differences of opponents' ranks from the median of the field. For instance, if there are 119 teams the median is 60, and for a typical schedule we might have:
    [table of opponents' ranks and their differences from the field median]
    The sum of the differences is -10 and the average is -0.9. We chose games against opponents from the top half of the field to contribute the negative amount because lower numbers are better in ordinal rankings.
  2. The average difference from the field's median isn't by itself useful for comparing different teams' schedules - very different schedules can produce the same value, because opponents ranked 10 and 109 contribute the same sum as opponents ranked 50 and 69. To come up with a schedule rating that can be compared, we add the average deviation from the field median to the median power rating for this particular schedule. In this case we get:
    -0.9 + 57 = 56.1
    and then order the schedules by increasing value of schedule rating. In this case, the schedule came out 58th in the list, very nearly the overall median.
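The two steps above can be sketched in Python. This is only an illustrative sketch: the function name and opponent ranks are made up, and a 119-team field (median rank 60) is assumed.

```python
def schedule_rating(opponent_ranks, field_size=119):
    """Median opponent rank plus the average deviation of opponents'
    ranks from the field median (lower rating = tougher schedule)."""
    field_median = (field_size + 1) // 2          # 60 for a 119-team field
    ranks = sorted(opponent_ranks)
    n = len(ranks)
    mid = n // 2
    # median of this schedule's opponent ranks
    sched_median = ranks[mid] if n % 2 else (ranks[mid - 1] + ranks[mid]) / 2
    # step 1: average difference of opponents' ranks from the field median
    avg_deviation = sum(r - field_median for r in ranks) / n
    # step 2: add that average deviation to the schedule's median rank
    return sched_median + avg_deviation

# made-up example: a very top-heavy schedule
print(schedule_rating([1, 2, 3, 4, 5]))
```

Schedules are then ordered by increasing value of this rating, exactly as described above.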

An example using pre-season composite computer rankings as the teams' strength rating is here.

Note that unlike a simple average, this measurement can fall outside of the range of the ordinal rankings used as input. A very top-heavy schedule (North Carolina has 10 of 11 opponents from the top half of the field) can be negative, and ten teams here have such a bottom-heavy schedule that the measurement falls "below" the worst rank in the field (119th).
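A quick calculation (again a sketch, with made-up opponent ranks for a 119-team field) shows how a bottom-heavy schedule can rate "below" the worst rank:

```python
def schedule_rating(opponent_ranks, field_size=119):
    """Median opponent rank plus average deviation from the field median."""
    field_median = (field_size + 1) // 2
    ranks = sorted(opponent_ranks)
    n = len(ranks)
    mid = n // 2
    sched_median = ranks[mid] if n % 2 else (ranks[mid - 1] + ranks[mid]) / 2
    avg_deviation = sum(r - field_median for r in ranks) / n
    return sched_median + avg_deviation

# hypothetical schedule drawn entirely from the bottom of the field:
# the rating exceeds 119, outside the range of the input rankings
print(schedule_rating([115, 116, 117, 118, 119]))
```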

The same formula can be applied in any case where an average doesn't properly characterize an aggregate of ordinal ranks. For example, it could be used as an alternative to Jeff Sagarin's "central mean" to aggregate rankings of teams by conference affiliation.
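For instance, the same formula could rate conferences by their members' ranks instead of a team's opponents. The conference names and member ranks below are hypothetical, chosen only to illustrate the idea:

```python
def group_rating(member_ranks, field_size=119):
    """Median member rank plus average deviation from the field median -
    the schedule-rating formula applied to any set of ordinal ranks."""
    field_median = (field_size + 1) // 2
    ranks = sorted(member_ranks)
    n = len(ranks)
    mid = n // 2
    group_median = ranks[mid] if n % 2 else (ranks[mid - 1] + ranks[mid]) / 2
    avg_deviation = sum(r - field_median for r in ranks) / n
    return group_median + avg_deviation

# hypothetical conferences and member ranks
conferences = {
    "Conference A": [3, 10, 22, 35, 48, 60],
    "Conference B": [15, 40, 55, 70, 90, 110],
}
# lower rating = stronger group, just as with schedules
for name in sorted(conferences, key=lambda c: group_rating(conferences[c])):
    print(name, round(group_rating(conferences[name]), 1))
```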