Deciding whether gaudy offensive numbers are due to good offense or to weak opposing defenses (or both!) is an interesting and complex problem. The short answer is that you have to adjust the statistics for the strength of the opponent (lower offensive production is expected against a top defense, and worse defensive stats are expected against a top offense). The complexity lies in deciding whether a "good defense" looks good only because it played "bad offenses", and vice versa.
The simple answer is "adjust offensive and defensive stats by SOS," but that raises the question: which SOS?
A general definition of SOS is equivalent to the first derivative of a function that characterizes a team. (Mathephobes may want to skim past the next bit.)
What we need is a numerical rating that relates to the statistic that we want to normalize. If we have a rating system that combines the metric with other statistics, we would like to be able to separate that out.
If we have a rating R(t) = ƒ(..., PS_t, PA_t, ...), then we can use ∂ƒ/∂PA to adjust a team's Points Scored, and ∂ƒ/∂PS to adjust its Points Allowed. The mentally tough part is that ƒ′(t) is a function of team t's opponents' R, and in general, if we don't know how ƒ uses the PS and PA variables, we can't find the derivative with respect to PA or PS.
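To see why the partial derivatives are the natural adjustment weights, it helps to write the idea as a first-order expansion (a sketch added for illustration; it just restates the paragraph above symbolically):

```latex
% First-order (Taylor) expansion of a rating R(t) = f(..., PS_t, PA_t, ...).
% The partials measure how much one extra point scored or allowed moves the
% rating - the sense in which they calibrate adjustments to PS and PA.
\[
  \Delta R(t) \;\approx\;
  \frac{\partial f}{\partial PS_t}\,\Delta PS_t \;+\;
  \frac{\partial f}{\partial PA_t}\,\Delta PA_t
\]
```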
If all we cared about was average MOV, we could do something like

R(t) = MOV(t) + (1/N_t) × Σ_o R(o)

where MOV(t) is team t's average margin of victory and the sum runs over t's N_t distinct opponents.
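A minimal sketch of that computation in Python (the teams and scores are made up; each game is stored as (winner, loser, winning score, losing score)):

```python
# A team's rating is its average margin of victory plus the average rating
# of its distinct opponents, found by fixed-point iteration.
games = [("A", "B", 24, 10), ("B", "C", 17, 14), ("A", "C", 28, 21)]

teams = {t for w, l, _, _ in games for t in (w, l)}
R = {t: 0.0 for t in teams}

for _ in range(200):  # fixed-point iteration
    new_R = {}
    for t in teams:
        movs, opps = [], set()
        for w, l, ws, ls in games:
            if t == w:
                movs.append(ws - ls); opps.add(l)
            elif t == l:
                movs.append(ls - ws); opps.add(w)
        new_R[t] = sum(movs) / len(movs) + sum(R[o] for o in opps) / len(opps)
    mean = sum(new_R.values()) / len(new_R)
    R = {t: r - mean for t, r in new_R.items()}  # re-center to pin the scale
```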
Suppose we take averages over games instead of averages over teams. In this case we would find R(t) by:

R(t) = (1/N_t) × Σ_g [ R(opp_g) ± G(g) ]    (+ if t won game g, - if t lost)

where the sum is over team t's N_t games, iterating until the ratings converge.
Note that if G(g) = constant for all games for all teams, then the result is a pure combination of winning percentage, opponents' winning percentage, opponents' opponents' winning percentage, and so on. This algorithm is Boyd Nation's original Iterative Strength Rating.
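The game-averaged version is straightforward to code. This sketch reuses the hypothetical `games` list above; with a constant G it behaves like Boyd Nation's ISR, where the constant's value only sets the scale:

```python
def iterate_rating(games, G, n_iter=100):
    """Game-averaged iterative rating: each game credits the winner with the
    loser's current rating plus G(g), the loser with the winner's current
    rating minus G(g), and a team's rating is the average over its games.
    With G(g) constant this reduces to Boyd Nation's Iterative Strength
    Rating (the constant's value only changes the scale)."""
    teams = {t for g in games for t in g[:2]}
    R = {t: 0.0 for t in teams}
    for _ in range(n_iter):
        contrib = {t: [] for t in teams}
        for g in games:
            winner, loser = g[0], g[1]
            contrib[winner].append(R[loser] + G(g))
            contrib[loser].append(R[winner] - G(g))
        R = {t: sum(c) / len(c) for t, c in contrib.items()}
    return R

isr = iterate_rating(games, G=lambda g: 1.0)  # constant G: the ISR case
```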
The question is what G should be if what we're concerned about is scoring. What we'd like is something that combines the ability to score points and to prevent points from being scored, and that has a different value for each game. Typically this is Margin of Victory, but that can be misleading - not all 7-point wins are equal: a team that wins 42-35 was much more in danger of losing than one that won 10-3. So I chose a measurement called Strength of Victory (which I think is commonly used to analyze the NFL):
SOV_game = (winning score - losing score) / (winning score + losing score)
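A quick check of the two 7-point wins mentioned above shows why SOV discriminates where raw margin can't:

```python
def sov(winning_score, losing_score):
    # Strength of Victory: the margin as a fraction of total points scored.
    return (winning_score - losing_score) / (winning_score + losing_score)

print(round(sov(42, 35), 3))  # 0.091 - a narrow escape in a shootout
print(round(sov(10, 3), 3))   # 0.538 - a comfortable defensive win
```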
When we set

G(g) = SOV_game × (average points per game) / (average SOV for all games)

the result is the Iterative SOV rating, or ISOV.
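Continuing the sketches above (reusing the hypothetical `games`, `sov`, and `iterate_rating`), and reading "average points per game" as the average total points per game, since the text doesn't spell that out:

```python
# Scale each game's SOV so that G(g) is expressed in points.
avg_points = sum(ws + ls for _, _, ws, ls in games) / len(games)
avg_sov = sum(sov(ws, ls) for _, _, ws, ls in games) / len(games)

def G_isov(game):
    _, _, ws, ls = game
    return sov(ws, ls) * avg_points / avg_sov

isov = iterate_rating(games, G=G_isov)
```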
It turns out that the ISOV is not a particularly good "power rating" by itself. But because of the way it is constructed, the SOS_ISOV (defined as the average of opponents' ISOV values) is essentially the derivative with respect to "the ability to score and prevent scores" - exactly the adjustment factor we need for scoring statistics.
The ISOV (and its SOS) is approximately normally distributed because of the way it is calculated, but in all sports there's a bit of asymmetry - there are more bad teams than good ones. So we can improve the approximation somewhat by comparing each team's SOS to every other team's ISOV. In the process, we can add back in the home-field advantage factor for each team. So for each of the 7140 pairs of D-1A teams (every pairing of the 120 teams), we consider three possible game locations to construct Normalized Scoring Statistics. This amounts to "imagining" how a team's scoring offense and defense would rank if every team played every other team home-and-home and on a neutral field.
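The pairwise step might be sketched like this (again reusing `games` and `isov` from above). It is only an illustration: the comparison rule and the size of the home-field bump are assumptions, since the text doesn't specify how HFA enters, and `pairwise_wins` simplifies the SOS-versus-ISOV comparison to a direct rating comparison:

```python
from itertools import combinations

HFA = 0.05  # hypothetical home-field bump on the ISOV scale (placeholder)

def sos_isov(t):
    # SOS_ISOV: the average of team t's opponents' ISOV values.
    opps = [l if w == t else w for w, l, *_ in games if t in (w, l)]
    return sum(isov[o] for o in opps) / len(opps)

def pairwise_wins(ratings, hfa=HFA):
    """'Play' every pair of teams three times - at A, at B, and on a neutral
    field - deciding each imaginary game by comparing ratings plus any home
    bump. Ranking the win counts is one way to normalize scoring stats."""
    wins = {t: 0 for t in ratings}
    for a, b in combinations(sorted(ratings), 2):  # 7140 pairs for 120 teams
        for bump_a, bump_b in ((hfa, 0.0), (0.0, hfa), (0.0, 0.0)):
            if ratings[a] + bump_a > ratings[b] + bump_b:
                wins[a] += 1
            else:
                wins[b] += 1
    return wins

ranking = pairwise_wins(isov)
```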