In Ratings Scope I introduced my new ISR and ISOV report format, which compared to prior years adds the "Norm" column and eliminates the "ASOS" and "PASOS" meta-ratings. I mentioned that I'd found histograms to be a better tool for this kind of analysis.
We're familiar with the tables that show "records vs top 25, 50, …" but I do not find those very useful, because the rank cutoffs are arbitrary and the differences between the extremes of the ranges are so variable. There's a much larger difference between #25 and #1 than there is between #50 and #26.
Instead, for "buckets" I divide the field into groups of teams each ½ standard deviation (σ) wide, then apply a basic "bonus for better wins, penalty for worse losses" formula. This is "vs opponents' ratings" as opposed to "vs opponents' ranks".
When applied to the 2014 season (with the field modified as described in the cited article), for the ISR this results in records vs teams ranked 1-4, 5-17, 18-35, 36-130, 131-162, 163-180, 181-196, and 197+. The widest rank range corresponds to opponents within ½ standard deviation of average, and both wins over and losses to teams in this range are assigned value 1.
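A minimal sketch of the bucketing step, assuming ratings are available as a list of numbers. The sample ratings and the `half_widths` clamp are illustrative assumptions, not the actual ISR values or implementation:

```python
import statistics

def sigma_bucket(rating, mu, sigma, half_widths=4):
    """Index of the half-sigma-wide bucket containing `rating`.

    Bucket 0 is the center bucket (within 1/2 sigma of the mean, the
    widest rank range above); positive indices mean better-than-average
    opponents, negative worse. Indices beyond +/-half_widths are clamped
    into the open-ended tail buckets (the "1-4" and "197+" ranges).
    """
    z = (rating - mu) / sigma            # distance from the mean in sigmas
    bucket = int(z / 0.5)                # whole half-sigma steps, truncated toward 0
    return max(-half_widths, min(half_widths, bucket))

# Illustrative ratings, not real 2014 ISR values
ratings = [0.95, 0.80, 0.62, 0.50, 0.41, 0.30, 0.12]
mu = statistics.mean(ratings)
sigma = statistics.pstdev(ratings)       # population standard deviation, per the report
buckets = [sigma_bucket(r, mu, sigma) for r in ratings]
```

Because the buckets are defined on ratings rather than ranks, how many teams land in each one depends entirely on the shape of the rating distribution.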
The weightings are arbitrary; the only requirements are that they be symmetrical around the center bucket and differ enough at the tails to distinguish "good/bad" from "even better/worse" in the formula's result. Again using 2014 results with the modified field, for the ISR the report looks like this.
|Sort||The value of the formula applied to the teams' record is not by itself very useful, but is included to make it somewhat easier to analyze a pair of schedules/results.|
|Rank||The relative rank of each team according to this better-win/worse-loss formula. Teams with the same sort value share a rank one greater than that of the team(s) with the next higher sort value.|
|ISR||The team's relative rank according to the base rating. The column heading is the rating's name (ISR in this example).|
The degree to which the two rankings agree is a measure of how retrodictively self-consistent the rating is. When a pair of teams' relative positions are opposite in the two rankings, it should be easy to recognize an "upset" in one of the teams' rows in the table.
|Team||The team's name. By the time I publish my first rankings this will very likely be a link to a more detailed analysis of the team's results.|
|Rec||Total wins and losses against teams in the field. Games that were not used to calculate rating are treated as if they did not occur.|
|Conf||The team's conference affiliation. Mainly included to provide a visual break between the overall record and the histogram table.|
|Histogram||The team's record vs teams whose rating values are better than the average plus the number of standard deviations indicated by the column headings. >3σ/2 should be read as "opponents' rating greater than μ+1.5×σ but less than μ+2×σ", where μ is the rating average and σ its (population) standard deviation.|
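The Sort column's "bonus for better wins, penalty for worse losses" formula can be sketched as below. The weight values are placeholders I made up for illustration; the article only requires that they be symmetrical around the center bucket and spread out at the tails:

```python
def sort_value(record, weights=(1, 2, 4, 8, 16)):
    """Sketch of the better-win-bonus / worse-loss-penalty score.

    `record` maps a bucket index to a (wins, losses) pair, where bucket 0
    is the center bucket (within 1/2 sigma of average) and positive
    indices are better opponents. weights[abs(b)] is the bonus for a win
    in bucket b >= 0 and the penalty for a loss in bucket b <= 0, so the
    weighting is symmetrical around the center bucket. Wins over worse
    teams and losses to better teams count 1, matching the center bucket.
    """
    score = 0
    for b, (wins, losses) in record.items():
        win_weight = weights[b] if b > 0 else 1    # bonus grows with opponent quality
        loss_weight = weights[-b] if b < 0 else 1  # penalty grows as opponents get worse
        score += win_weight * wins - loss_weight * losses
    return score

# Beat a team two buckets above average, split with average teams,
# and lost to a team one bucket below average:
record = {2: (1, 0), 0: (1, 1), -1: (0, 1)}
```

As the article notes, the raw value is mainly useful for comparing a pair of schedules/results; only its relative rank matters in the report.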
So instead of records vs 1-25, 26-50, etc. for the 128-team 1A field, there is more value in using 1-3, 4-9, 10-21, 22-40, 41-89, 90-108, 109-120, 121-126 and 127+. In fact, if we had a perfect computer rating (call it the Greatest Of Deciders) and still wanted human involvement in selecting the 4-team tournament field, it would make a lot of sense to let the G.O.D. pick the first three teams and only ask the humans to pick which of the other six teams ranked better than 10th should complete the tournament.
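To see how those uneven rank ranges fall out of the half-sigma buckets, here is a sketch that derives them from a list of ratings. The six-team field is made up purely for illustration:

```python
import statistics

def rank_ranges(ratings, half_widths=4):
    """Derive "vs teams ranked X-Y" spans from half-sigma buckets.

    Returns (first_rank, last_rank) pairs, best bucket first, showing
    why the ranges are uneven: they follow the rating distribution
    rather than fixed rank counts like top 25 / top 50.
    """
    mu = statistics.mean(ratings)
    sigma = statistics.pstdev(ratings)   # population standard deviation
    spans = {}
    for rank, r in enumerate(sorted(ratings, reverse=True), start=1):
        b = int((r - mu) / sigma / 0.5)  # half-sigma bucket index
        b = max(-half_widths, min(half_widths, b))
        lo, hi = spans.get(b, (rank, rank))
        spans[b] = (min(lo, rank), max(hi, rank))
    return [spans[b] for b in sorted(spans, reverse=True)]

# Made-up six-team field for illustration
ranges = rank_ranges([3.0, 1.2, 0.1, -0.1, -1.2, -3.0])
```

With a larger, roughly bell-shaped field, the center bucket swallows the most ranks, exactly as in the 41-89 range above.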
© Copyright 2015, Paul Kislanko