Input Impacts

Schedule Topology Ratings Effects

© Copyright 2011, Paul Kislanko

It should not come as a surprise to followers of this site that I have found yet another measure of how "connected" a team is to the "field." Back in August I described how to use the incidence matrix to find the number of A↔...↔B paths between any { A B } team-pair, and the new metric uses those to illustrate some problems current FBS scheduling philosophies present to analysts.

Every "advanced" rating system uses every path in the games graph to form an objective team ranking. Every team in the FBS is connected to every other team in FBS by an A↔B↔... chain of no more than 4 ↔s. But there is a qualitative difference in the way some teams are treated. Every "advanced" method will take into account A↔B₁↔B₂ with the same weight as A↔B_i↔C and A↔B₁↔C₁↔B₂↔A the same as A↔B↔C↔D↔E.
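The "no more than 4 ↔s" claim is a statement about the diameter of the games graph. A minimal sketch of how one might check it with a breadth-first search follows; the `neighbors` dictionary and the four-team schedule are made-up illustrations, not the 2011 FBS data.

```python
from collections import deque

def eccentricity(neighbors, start):
    """Longest shortest-chain length from `start`, or None if some team is unreachable."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        team = queue.popleft()
        for opp in neighbors[team]:
            if opp not in dist:
                dist[opp] = dist[team] + 1
                queue.append(opp)
    if len(dist) < len(neighbors):
        return None  # the games graph is not connected
    return max(dist.values())

def diameter(neighbors):
    """Largest eccentricity over all teams, or None for a disconnected graph."""
    eccs = [eccentricity(neighbors, team) for team in neighbors]
    if any(e is None for e in eccs):
        return None
    return max(eccs)

# Hypothetical toy schedule: A played B, B played C, C played D.
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(diameter(toy))
```

In the toy schedule the longest required chain is A↔B↔C↔D, three ↔s.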

The new metric is:

Average(# paths of length λ to teams that are the team's opponents + # λ-long paths that begin and end at the team)
÷
Average(# paths of length λ to teams that are not opponents or the team itself)
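A rough sketch of this ratio for a single team is given here, under two assumptions that are mine rather than stated in the text: "paths" are counted as walks, via entries of the λ-th power of the games-graph adjacency matrix, and the numerator average runs over the set {opponents plus the team itself}. The function names and the four-team schedule are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def walk_counts(adj, length):
    """Return adj**length; entry [i][j] counts walks of that length from i to j."""
    n = len(adj)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        result = mat_mul(result, adj)
    return result

def connectivity_ratio(adj, team, length):
    """Average walks to (opponents + self) divided by average walks to everyone else."""
    counts = walk_counts(adj, length)[team]
    n = len(adj)
    near = [counts[j] for j in range(n) if j == team or adj[team][j] > 0]
    far = [counts[j] for j in range(n) if j != team and adj[team][j] == 0]
    if not far or sum(far) == 0:
        return float("inf")  # no walks of this length reach any non-opponent
    return (sum(near) / len(near)) / (sum(far) / len(far))

# Hypothetical schedule: team 0 played 1, 1 played 2, 2 played 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(connectivity_ratio(adj, 0, 2))
```

Repeated games between the same pair could be represented by entries larger than 1; whether the original calculation did that is not stated.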

(Feel free to suggest a catchy name for it.)

The higher this number, the less-connected is the team. Since advanced ratings use at least all the paths up to the games graph diameter (most use them more than once) here's what that measure looks like for λ=4:

84 of the 120 FBS teams have values between 3 and 4, and three more teams are in that range ±0.02 units. The most-connected teams are the FBS Independents; the Academies in particular play the most geographically diverse schedules in the field. The least-connected? The 22 teams in the Pac 12 and Big 12.

The 9-game conference schedules those conferences play just do not allow enough scheduling opportunities. A consequence of the "knots" in the games graph these extreme teams form is that advanced ratings that treat each game ("edge" in the games graph) equally wind up using qualitatively different formulas for these teams than for teams with "normal" connectivity. Roughly, the opponents' winning percentage counts twice as much relative to the rest of the field.

Aside: Although I used opponents' winning percentage as an example, the same consideration applies regardless of the advanced ratings' components. As long as each edge of the games graph (game) is treated equally, the effective algorithm varies with the "local topology" for a specific team compared to the "global."

Playing an extra conference game isn't the only reason a team might be less connected to the FBS field. In 2011, 93 FBS teams played 97 games against 77 FCS teams. (76 of the 246 Division 1 teams played only opponents from the same subdivision.)

Performing the same calculation for all of D1 (for which the games graph diameter is 6) and ranking them 1 (least-connected) to 246 (most-connected) we get

Again there are about 90 teams whose values fall roughly between 3 and 4 and they are almost all FBS teams.

Looking at just the FBS teams' ranks we see they fall into two distinct groups:

The vertical gaps are FCS teams' ranks, so we see that the 22 teams that play only three non-conference games are only as connected to the D1 field as the typical FCS team and the FBS teams with measurements less than 4 O-paths to ¬O paths are more connected than any but the independent FCS teams.

Another source of variation in algorithms is the field definition itself. To form the above graph, I first ranked the entire D1 field and then ordered the FBS teams 1-120 based upon their relative position in the entire list. But this is not the same order they were in when only FBS games were used.

The groupings stay the same, but the ordering within each group is "shuffled."
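The re-ranking step described above (rank the whole field, then number the FBS teams by their relative position) can be sketched as follows. The team labels and both orderings are invented for illustration, and the helper names are hypothetical.

```python
def relative_ranks(full_order, subset):
    """Number the teams in `subset` 1..k by their relative position in `full_order`."""
    kept = [team for team in full_order if team in subset]
    return {team: rank for rank, team in enumerate(kept, start=1)}

def shuffled_teams(ranks_a, ranks_b):
    """Teams whose rank differs between two rankings of the same subset."""
    return sorted(team for team in ranks_a if ranks_a[team] != ranks_b[team])

# Hypothetical orderings: one from all D1 games, one from FBS-only games.
d1_order = ["FBS1", "FCS1", "FBS3", "FCS2", "FBS2"]
fbs_only_order = ["FBS1", "FBS2", "FBS3"]
fbs = {"FBS1", "FBS2", "FBS3"}

from_d1 = relative_ranks(d1_order, fbs)
from_fbs = relative_ranks(fbs_only_order, fbs)
print(shuffled_teams(from_d1, from_fbs))
```

In this made-up case FBS2 and FBS3 trade places depending on which field is used, while FBS1 keeps its rank.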

What this illustrates is that the same rating usually will give a different ordering of FBS teams if all D1 games are used to form it than if only games involving FBS opponents are used. This can be important - the RPI (which is not an "advanced" system) picks a different #2 if all D1 games are used than it does when only FBS games are considered.

So what?

To the extent that this measurement just gives another characterization of how connected a team is, its only immediate practical use is to give another rough measure of "applicability" to a team for rating algorithms that process all edges of the games graph. But the important bit is that we can use it to demonstrate the effects of conference scheduling philosophies and teams' non-conference choices.

Note that the problem is not just that teams are disconnected. If all teams were equally disconnected (by being very much more connected to a small subset of the field) our advanced ratings would not have much difficulty using the connectors to rate the separate "islands" as a group and then compare groups (this was the case in D1 baseball prior to the early 2000s). Or, for weakly-connected fields like FCS, D2 and D3 you could rely on a post-season tournament to compare teams from the well-connected subfields.

In short, it is bad for us analysts and the people who depend upon our rankings for anything important when:

conferences mandate more than 8 conference games
more than a quarter of the FBS field schedules non-FBS opponents

and the trend in the last few years has been more of each.

The BCS is a coalition of conferences and bowls whose rules are based upon consensus. But were there a King/Queen of BCS who could dictate requirements, I'd recommend he/she require participating conferences to:

provide a minimum of four discretionary (non-conference) scheduling opportunities for each team in their conference
and
require their members to schedule a non-FBS team no more often than once in every four-year interval (as the NCAA rule for counting wins for bowl eligibility used to read)

Nobody cares that the current trends make the ratings harder to get right because everybody likes to complain about the ratings when they don't like the ranks, and everybody likes to cite them when the errors in them benefit their team.

Those of us who produce them just wish everybody who complains or cites knew more about them than whether they "agree" with them or not!

FBS	O : ¬ O₄ for FBS
D1	O : ¬ O₆ for D1 = {FBS ∪ FCS}

31 December update

Advanced systems do not depend on only the number of paths that are games-graph-diameter long - they use all paths up to some length that depends upon their convergence criterion.

Above I chose λ=4 for FBS (6 for D1) and graphed
average # (paths to opponents + paths to self)
÷
average # paths to non-opponents
for just that path length. It is more appropriate to use the number of paths of length ≤ a given λ.
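One way to sketch the "≤ λ" version is to sum the adjacency-matrix powers for lengths 1 through λ before taking the two averages. As before, treating "paths" as walks, averaging the numerator over {opponents plus self}, and starting the sum at length 1 are my assumptions, and the schedule is a made-up four-team example.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cumulative_walk_counts(adj, max_length):
    """Return adj + adj**2 + ... + adj**max_length (walks of length 1..max_length)."""
    n = len(adj)
    power = [row[:] for row in adj]   # current power, starting at adj**1
    total = [row[:] for row in adj]   # running sum of powers
    for _ in range(max_length - 1):
        power = mat_mul(power, adj)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

def cumulative_ratio(adj, team, max_length):
    """Average walks of length <= max_length to (opponents + self) over the same for non-opponents."""
    counts = cumulative_walk_counts(adj, max_length)[team]
    n = len(adj)
    near = [counts[j] for j in range(n) if j == team or adj[team][j] > 0]
    far = [counts[j] for j in range(n) if j != team and adj[team][j] == 0]
    if not far or sum(far) == 0:
        return float("inf")  # no non-opponent is reachable within max_length
    return (sum(near) / len(near)) / (sum(far) / len(far))

# Hypothetical schedule: team 0 played 1, 1 played 2, 2 played 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
for lam in (2, 3, 4):
    print(lam, cumulative_ratio(adj, 0, lam))
```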

Here's what that looks like for λ=2, 3 and 4 for the FBS field.

See this table for the graph source.

λ=4 is the games-graph diameter, but as I noted above advanced systems' results will usually depend upon some (much) larger value of λ. The same pattern persists with higher λ values, as can be seen from this graph of the changes as λ varies from 4 to 8:

The range of (S+O)paths÷¬O paths gets smaller as λ gets larger, but the pattern stays the same: there is a qualitative difference in the rating functions for teams that are relatively less (or more, though that involves fewer teams) -connected than the teams for which the values fall within the "linear" range in the graphs.