Monday, February 05, 2007

In future years: ACAORS for short.

In the beginning, the Pairwise Rankings used to select and seed the NCAA hockey tournament were hopelessly complicated and nearly arbitrary. Back when the NCAA had but twelve teams in the tourney, the system was arbitrary enough to exclude the WCHA regular season champion in 1994, creating a strange interstitial period where automatic bids were awarded for both regular-season and playoff championships. Teams were included or excluded from "consideration" based on an arbitrary RPI cliff. Since one of the factors in the PWR was a team's record against "teams under consideration" (TUC), the meaningless season-ending death rattles of Ferris State or Ohio State or whatever mediocre teams happened to be thrashing around the .500 RPI Mendoza line became hugely important to the fortunes of teams actually good enough to make the tournament. The CCHA's stupidly extensive conference season combined with the PWR's stupidly sparse "common opponents" category to make single games hugely important, comparison-flipping monstrosities. The "Last 16" category favored teams with front-loaded schedules. It was all very, very stupid.

So the committee set out to change said stupidities: they dumped the L16 category entirely. They fiddled with RPI. They introduced a "good wins" RPI bonus, then scaled it back to only apply to road games. They changed the TUC boundary from the arbitrary "above .500" to the equally arbitrary "top 25." The end result? A system that's just about as dumb as it was before, but one that hews closer to RPI -- something any basketball fan can tell you is state-of-the-art in the same way an Apple IIe is. Since RPI 1) is one of the four factors in each pairwise comparison, 2) is the tiebreaker whenever any individual comparison stalemates, and 3) is furthermore the tiebreaker whenever two teams come out with the same number of points, in general the thing looks like a slightly corrected(?) version of the RPI. At the moment the 1-2-3-4-5 teams in PWR are also the 1-2-3-4-5 teams in RPI. Next are 7 and 9. Where are 6 (Michigan) and 8 (Michigan State)? Languishing far below those spots for reasons that can only be described as... yes, completely arbitrary.
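For the code-inclined, here is a rough Python sketch of a single comparison as described above: four categories, with RPI breaking a stalemate. It assumes each category is worth exactly one point, which is a simplification (the real head-to-head scoring may differ). The `Team` class and `compare` function are made-up names. The RPI and TUC figures are invented, chosen only so that Michigan leads by .035 in RPI.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    rpi: float      # hypothetical value, for illustration only
    tuc_pct: float  # win pct against teams under consideration (hypothetical)


def compare(a, b, cop_a, cop_b, h2h_a=0, h2h_b=0):
    """Return the winner of one pairwise comparison.

    Assumption: one point per category (RPI, TUC, common opponents,
    head-to-head). A level category awards nothing to either side.
    """
    categories = [
        (a.rpi, b.rpi),
        (a.tuc_pct, b.tuc_pct),
        (cop_a, cop_b),
        (h2h_a, h2h_b),
    ]
    points_a = sum(1 for x, y in categories if x > y)
    points_b = sum(1 for x, y in categories if y > x)
    if points_a != points_b:
        return a if points_a > points_b else b
    # Stalemate on points: RPI decides.
    return a if a.rpi >= b.rpi else b


michigan = Team("Michigan", rpi=0.560, tuc_pct=0.500)
bc = Team("BC", rpi=0.525, tuc_pct=0.510)

# Common-opponent percentages: 3-3 for Michigan, 4-2-1 for BC.
winner = compare(michigan, bc, cop_a=0.500, cop_b=0.643)
print(winner.name)  # BC
```

Two slim category wins outvote the RPI gap, no matter how wide that gap is.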

In particular, an early-season loss to Hockey East punching bag Northeastern -- a game for which Michigan has already been punished in the RPI -- is a millstone around Michigan's neck. It's the basis for PWR's highly scientific decision to give BC and BU the Common Opponents category and thus those comparisons. BU's record against fellow Michigan opponents is a stirring 1-0-1; Michigan's is but 1-1. Two games amongst nearly 40. Meanwhile, the BC-Michigan common opponents comparison is more robust -- BC is 4-2-1, Michigan 3-3 -- but terribly close, just like the TUC category. BC edges Michigan by a tiny fraction in both categories and takes the comparison despite being fully .035 behind Michigan in the PWR-declared all-important RPI, a huge gap comparable to that between Michigan and 1-seed locks. As a result, the team with the 6th best RPI in the country languishes in 10th. Flip those two comparisons and Michigan is in 8th, not coincidentally where mathematically-robust-but-hard-to-understand KRACH* has them.

Stupidities like these are the heart of all PWR complaints. The committee has made a few changes that will help cut down on the system's notorious volatility, but the route they've chosen only serves to obscure the arbitrary nature of the process instead of actually fixing it. I'm thinking of the TUC cutoff change here, which is equally random but more controlled. Instead of three or four teams potentially flipping their TUC status with fundamentally irrelevant games, it's maybe one or two these days, but teams will still see their fortunes wax or wane because the mediocre teams they beat were slightly more or less mediocre than some other mediocre teams other teams beat.

There are two issues at heart here:

  • Non-RPI categories are stupid and flawed. Common Opponents and TUC record are not weighted for schedule strength. TUC record counts games against #1 as heavily as games against #25. A hypothetical Common Opponents comparison could feature four games against #60 and one against #1 for one team and the inverse for another.
  • Non-RPI categories are given equal weight no matter the circumstances. Michigan's loss instead of a tie against Northeastern flips an entire comparison that is decisively Michigan's on RPI. BC's narrow advantages in stupid and flawed non-RPI categories obliterate Michigan's huge lead in the RPI, also flawed but vastly more comprehensive as it takes every game into account and attempts to weight it by difficulty.
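
To put numbers on the common-opponent records cited earlier, here is a small Python sketch. It assumes a tie counts as half a win, which is my guess at how the percentage is figured, and `win_pct` is a made-up helper name.

```python
def win_pct(wins, losses, ties=0):
    """Winning percentage, assuming a tie is worth half a win."""
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games


bu = win_pct(1, 0, 1)               # BU vs. opponents shared with Michigan
michigan_vs_bu = win_pct(1, 1)      # Michigan, same comparison
michigan_if_tie = win_pct(1, 0, 1)  # the Northeastern loss as a tie instead

print(f"{bu:.3f}")               # 0.750
print(f"{michigan_vs_bu:.3f}")   # 0.500
print(f"{michigan_if_tie:.3f}")  # 0.750

bc = win_pct(4, 2, 1)
michigan_vs_bc = win_pct(3, 3)
print(f"{bc:.3f}")               # 0.643
print(f"{michigan_vs_bc:.3f}")   # 0.500
```

Turn the one loss into a tie and the BU category is dead level, which leaves the category awarding nothing under the scoring described earlier.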

If the selection committee is married to the idea of the Pairwise, it should be reconceived as the RPI correction scheme it is, and a team shouldn't lose a major advantage in the most comprehensive available category because of a weighting issue. In this hypothetical system, each team's RPI would be the input of the first parameter (all numbers plucked from the ether):


Whatever categories the committee would like to include would then serve to modify that RPI. Say .500 is your TUC Mendoza line. Higher than that, you get a bonus; lower, a penalty. You can have a special TUC RPI if you desire to include some schedule strength into the categories -- which we do. Schedule adjusted or no, we add and subtract bits from the RPI based on what the committee wants to emphasize, then scale them based on game count to downplay anomalies like the one flipping the BU-Michigan comparison:

TUC    .45 * x    .52 * x
COP    .60 * y    .66 * y

Where x and y are appropriate weighting factors. Do the same for head-to-head, weighting it somewhat more heavily (in our hypothetical world, let's give BC a win over Michigan). Then take your modifying factors and add them to your RPI starting point to come up with a final modified RPI:

TUC    -0.0025    +0.005
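
The arithmetic behind this scheme can be sketched in Python. Everything here is an assumption layered on the description above: the function names, the linear distance from the .500 line, the 0.05 weight (picked so a .450 TUC record gives the -0.0025 penalty shown), and the 10-game threshold at which a category reaches full strength.

```python
def adjustment(pct, games, weight, baseline=0.500, full_weight_games=10):
    """Bonus (positive) or penalty (negative) for one category.

    The distance from the baseline is multiplied by the category weight,
    then scaled down when the sample is smaller than full_weight_games.
    """
    scale = min(games / full_weight_games, 1.0)
    return (pct - baseline) * weight * scale


def modified_rpi(rpi, categories):
    """Add every category adjustment to the starting RPI.

    categories is a list of (pct, games, weight) tuples.
    """
    return rpi + sum(adjustment(p, g, w) for p, g, w in categories)


# A .450 TUC record over a full sample of games: a small penalty.
print(round(adjustment(0.45, games=20, weight=0.05), 4))  # -0.0025

# A .750 record counts for far less over 2 games than over 20.
print(round(adjustment(0.75, games=2, weight=0.05), 4))   # 0.0025
print(round(adjustment(0.75, games=20, weight=0.05), 4))  # 0.0125

# Hypothetical team: .525 RPI, .520 TUC record over 20 games,
# .660 common-opponent record over 7 games.
final = modified_rpi(0.525, [(0.52, 20, 0.05), (0.66, 7, 0.05)])
```

Scaling by game count is what defuses a two-game common-opponents sample: the bonus shrinks instead of flipping a whole comparison.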

In this hypothetical world, BC's slight advantages in the PWR categories help them run up the RPI ditch they've dug for themselves but not all the way. This system is still fairly intelligible but has enough fine grain to kill 99% of the silly comparison-flipping and volatility that math-inclined college hockey fans know and loathe so well.

(Least. Popular. Post. Everrrrrr.)

*(KRACH has real-world issues: it doesn't chase shiny records enough to prevent aesthetically ugly things like having nine WCHA teams in the top 20 and featuring WCHA also-rans Michigan Tech, Wisconsin, and Minnesota State as hypothetical bubble teams. Over the last half-decade or so it's been relentless in its WCHA boosterism -- almost as relentless as real life. But the masses are unlikely to go for a system that has the potential to put more than half of a conference into a supposedly national tournament, deserving or not.)