Discardable Fun?

© Copyright 2006, Paul Kislanko

Ever-inspirational mathephobe SMQ described the Win Chain Graph as "Discardable Fun With Charts and Stuff." Well, it is fun to play around with the hypertext presentation (I described it as a "timewaster" on a message board), but there is a useful side to it.

Anyone (or any thing) that tries to rank teams that haven't played each other needs some basis of comparison that involves intermediaries. The "skeleton" for all computer-assisted ratings is the wins graph. The logical foundation of human subjective judgements is "well, they kinda sorta looked better against team X", which ignores the logical truth that, in a directed graph, "A beat C" and "B beat C" together say nothing about the relationship between A and B.

In Part 1 I described a way to decide whether there is a stronger "win chain" from team A to team B than from team B to team A. I also noted that there might be no chain at all, either A→B or B→A, or that the number of shortest chains in the A↔B relationship might be the same in both directions.

Essentially this is a "second order" definition of wins, losses, and ties. Far from "discardable", it gives some insight into objective computer ratings, which basically do what humans would do if they could consider all games at once. It also gives us less-capable humans some information that we might use in order to prove that we're better than them.

When we compare every team to every other team, we define a "win" as "A beat B", or "A beat a team that beat B", or "A beat a team that beat a team that beat B", and so on, and similarly for losses. We define a tie as the case where the number of shortest A→B paths equals the number of shortest B→A paths. For each team we can then count the number of other teams against which it has a winning path, a losing path, or tied paths, the same way we count ordinary wins, losses and ties.
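
The counting described above can be sketched in a few lines of Python. This is only an illustration, not the Part 1 procedure itself: the function names (build_graph, shortest_chains, compare, record) are made up for this example, and the rule used to decide which chain is "stronger" (a shorter chain wins, and among equally short chains the direction with more of them wins) is an assumption about how Part 1 resolves the comparison.

```python
from collections import Counter, deque

def build_graph(games):
    """Map each team to the set of teams it beat, from (winner, loser) pairs."""
    beat = {}
    for winner, loser in games:
        beat.setdefault(winner, set()).add(loser)
        beat.setdefault(loser, set())
    return beat

def shortest_chains(beat, src, dst):
    """Return (length, count) of the shortest win chains src -> dst, or None."""
    dist = {src: 0}
    count = {src: 1}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break  # every shortest chain into dst has been counted by now
        for v in beat[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                count[v] = count[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                count[v] += count[u]  # another equally short chain into v
    if dst not in dist:
        return None
    return dist[dst], count[dst]

def compare(beat, a, b):
    """Classify a against b as 'W', 'L', 'T' or '?' (no chain either way)."""
    ab = shortest_chains(beat, a, b)
    ba = shortest_chains(beat, b, a)
    if ab is None and ba is None:
        return "?"
    if ba is None:
        return "W"
    if ab is None:
        return "L"
    # Assumption: shorter chain is stronger; then more shortest chains.
    if ab[0] != ba[0]:
        return "W" if ab[0] < ba[0] else "L"
    if ab[1] != ba[1]:
        return "W" if ab[1] > ba[1] else "L"
    return "T"

def record(beat, team):
    """Count W, L, T and ? for one team against every other team."""
    return Counter(compare(beat, team, other) for other in beat if other != team)

# Toy schedule: A and B both beat C, C beat D, and D and E split two games.
beat = build_graph([("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "D")])
print(compare(beat, "A", "B"))  # ?  (both beat C, which says nothing about A vs B)
print(compare(beat, "A", "D"))  # W  (A beat C, which beat D)
print(compare(beat, "D", "E"))  # T  (one direct chain each way)
```

Calling record(beat, "A") on this toy schedule gives three Ws (C, D and E) and one "?" (B).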

By counting up the Ws, Ls, Ts, and "unknowns", we can get something like this:
WC_WP  Team                   W   L  T  ?  Unc%
0.992  Ohio State           117   0  0  1  0.009
0.983  Michigan             116   1  0  1  0.009
0.966  Wisconsin            114   2  0  2  0.017
0.966  Notre Dame           114   2  0  2  0.017
0.958  Penn State           113   4  0  1  0.009
0.958  Boise St             113   0  0  5  0.044
0.932  West Virginia        109   7  2  0  0.017
0.919  Louisville           108   9  1  0  0.008
0.915  Southern California  108  10  0  0  0.000
0.907  Arkansas             107  11  0  0  0.000

WC_WP is just (W + T/2) / (W + L + T + ?), and is the fraction of other teams against which this team has a stronger win chain, with ties counting half. Unc% is the useful bit. It is (T + ?) / (W + L + T + ?), and is the share of "uncertainty" associated with the team for computer rankings based only upon wins and losses. The entire list is not quite a useful ranking on its own, but changes in such computer rankings can be modeled by changes in this list.
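
As a quick check of the arithmetic, here is a small Python sketch that applies the two formulas exactly as written above to one row of the table. The function name win_chain_summary is hypothetical, chosen only for this example.

```python
def win_chain_summary(w, l, t, unknown):
    """Return (WC_WP, Unc%) from win chain counts, per the formulas in the text."""
    total = w + l + t + unknown
    if total == 0:
        raise ValueError("no comparisons to summarize")
    wc_wp = (w + t / 2) / total      # ties count as half a win
    unc = (t + unknown) / total      # share of ties and unknowns
    return wc_wp, unc

# West Virginia's row: W=109, L=7, T=2, ?=0
wc_wp, unc = win_chain_summary(109, 7, 2, 0)
print(f"{wc_wp:.3f} {unc:.3f}")  # 0.932 0.017
```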

Note that unlike normal wins and losses, these aren't static. What is a W today can turn into an L (or vice versa) based upon results of teams forming the intermediate links.
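
To see how a W can flip without the two teams ever meeting, here is a toy Python example. It is deliberately simplified: it compares only the length of the shortest chain in each direction and leaves out the count of shortest chains, and the helper names chain_length and outcome are invented for this sketch.

```python
from collections import deque

def chain_length(games, src, dst):
    """Length of the shortest win chain src -> dst, or None if there is none."""
    beat = {}
    for winner, loser in games:
        beat.setdefault(winner, set()).add(loser)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for v in beat.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

def outcome(games, a, b):
    """'W', 'L', 'T' or '?' for a against b, using chain length only."""
    ab = chain_length(games, a, b)
    ba = chain_length(games, b, a)
    if ab is None and ba is None:
        return "?"
    if ba is None or (ab is not None and ab < ba):
        return "W"
    if ab is None or ba < ab:
        return "L"
    return "T"  # equal lengths; the chain-count comparison is omitted here

# Early in the season: A beat X, X beat V, V beat B.
games = [("A", "X"), ("X", "V"), ("V", "B")]
print(outcome(games, "A", "B"))  # W

# Later: B beat Y, and Y beat A. A and B still have not played each other.
games += [("B", "Y"), ("Y", "A")]
print(outcome(games, "A", "B"))  # L
```

The second result changes because the new B→Y→A chain has two links, shorter than the three-link A→X→V→B chain.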