A league's B-team higher than A-team? What're (insert team here) doing so far down/up? That ranking makes no sense!
What the heck is that Min-V ranking doing?
Well, like both rankings written about last week, European Roller Derby Rankings and Derby Chart, the Min-V system doesn't consider A and B teams as related. One's ranking doesn't affect the other's.
As well, the Min-V doesn't consider scores. It only considers who won and lost.
The system works like this:
If the Qarth Rollers defeat the Vaes Dothrak Rolling Horde, then QR should be ranked above VDRH in all future rankings. That is, QR should have more ranking points than VDRH. If that's not the case, the bout is a violation.
The computer takes a table of all the bouts and results, as well as a list of all teams, and uses an efficient trial-and-error method to compute the minimum number of violations possible, hence Min-V.
[Details and theoretical basis thanks to Dr. Coleman can be found in this pdf file.]
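To make the idea of a violation concrete, here's a minimal sketch in Python. It is not Dr. Coleman's algorithm: the brute-force search simply tries every ordering, which only works for a handful of teams, whereas the real method is described as an efficient trial-and-error search. The third team ("AS"), the point values, and the function names are all made up for illustration, and treating a tie in ranking points as a violation is my literal reading of "more ranking points", so take that as an assumption.

```python
from itertools import permutations
from typing import Dict, List, Tuple

Bout = Tuple[str, str]  # (winner, loser); score is ignored, as in Min-V


def count_violations(points: Dict[str, float], bouts: List[Bout]) -> int:
    """Count bouts where the winner does NOT have more points than the loser."""
    # Assumption: equal points also count as a violation.
    return sum(1 for winner, loser in bouts if points[winner] <= points[loser])


def min_violations_bruteforce(teams: List[str], bouts: List[Bout]) -> int:
    """Naive search: try every ordering and keep the lowest violation count."""
    best = len(bouts)
    for order in permutations(teams):
        # First team in the ordering gets the most points.
        points = {team: float(len(order) - i) for i, team in enumerate(order)}
        best = min(best, count_violations(points, bouts))
    return best


# Hypothetical three-team cycle: each team has beaten one of the others.
teams = ["QR", "VDRH", "AS"]
bouts = [("QR", "VDRH"), ("VDRH", "AS"), ("AS", "QR")]
points = {"QR": 3.0, "VDRH": 2.0, "AS": 1.0}

print(count_violations(points, bouts))          # 1 (AS beat QR but sits below)
print(min_violations_bruteforce(teams, bouts))  # 1 (a cycle can't reach zero)
```

The cycle is the interesting case: no ordering can satisfy all three results, so the best any ranking can do is one violation.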
The computer then outputs a table with each team and its ranking points. This is only one possible solution; there is an infinite set of ranking point tables which produce the same number of violations. As well, there are several orderings of teams which will not change that number.
Due to this, the ranking can be optimized. Optimization is the process of adjusting the ranking point values for each team to more closely match what other ranking systems produce. For example, it doesn't matter mathematically whether LRG[A] or Gent[B] is ranked #1. Since neither played the other, changing that order will not cause a violation.
LRG[A] will be optimized to #1, and Gent[B] lower in the table. However, Gent[B] cannot be moved below Paris without causing a violation. They, in turn, cannot go below Bear City, etc. The knock-on effects in a Min-V system can be massive, so optimization must be done carefully.
In this case, optimization was done attempting to match the Derby Chart ranking. It could be done to approach any ranking scheme using the same Min-V basis.
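The check behind each optimization step can be sketched roughly like this: change one team's ranking points, recount the violations, and only accept the move if the count doesn't go up. The point values come from the table in the post below; the bout list is my assumption, inferred from the example (Gent[B] over Paris, Paris over Bear City), and `adjustment_is_safe` is a hypothetical helper, not part of any published Min-V code.

```python
from typing import Dict, List, Tuple

Bout = Tuple[str, str]  # (winner, loser)


def count_violations(points: Dict[str, float], bouts: List[Bout]) -> int:
    """Count bouts where the winner does not have more points than the loser."""
    return sum(1 for winner, loser in bouts if points[winner] <= points[loser])


def adjustment_is_safe(points: Dict[str, float], bouts: List[Bout],
                       team: str, new_value: float) -> bool:
    """True if giving `team` the value `new_value` adds no new violations."""
    trial = dict(points)  # copy, so the original table is left untouched
    trial[team] = new_value
    return count_violations(trial, bouts) <= count_violations(points, bouts)


points = {"LRG[A]": 20.911, "Gent[B]": 20.713, "Paris": 20.703, "Bear City": 20.697}
# Assumed results, inferred from the example in the text.
bouts = [("Gent[B]", "Paris"), ("Paris", "Bear City")]

# Gent[B] and LRG[A] never met, so Gent[B] can sit above or below LRG[A].
print(adjustment_is_safe(points, bouts, "Gent[B]", 21.0))   # True
# Dropping Gent[B] below Paris breaks the Gent[B]-over-Paris result.
print(adjustment_is_safe(points, bouts, "Gent[B]", 20.70))  # False
```

The knock-on effect shows up as soon as you try to fix that second case by lowering Paris too: Paris then has to stay above Bear City, and so on down the chain.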
Min-V has been used effectively in the US College Football system for years, with great predictive and retrodictive results. As was shown last week, retrodictivity in other derby rankings is spotty at best. ERDR rankings say that 1 in 5 bouts were upsets, Derby Chart's say 1 in 4. The Min-V ranking table, as odd as it looks, only has 1 in 30 bouts as upsets.
It's the most retrodictively correct ranking by more than a factor of 6.
But is it the best ranking?
The answer to that question is a question itself: "How is best measured?"
And to set those two questions into perspective, consider this one: "Why care about rankings?"
As there's no trophy awarded on rankings, and no Champions League in European derby (yet???), the choice is yours. Min-V is presented here to show just how impossible it is to conclusively put all European leagues in a ranked order.
If they're properly used, rankings can be a helpful source of information. ERDR has an archive of past rankings, and Derby Chart has bout records for each team. But don't become too dependent on them. Even the most theoretically precise system is wrong 1 out of 30 times, and the most trusted are 1 in 4 or 1 in 5.
Keep rankings in perspective. They're to inform and educate, not to dictate.
I'm Stat Man, and I am a roller derby announcer and commentator. This means that I watch a hell of a lot of derby. As my name implies, I like quantitative analysis, and this blog is to explore what happens when the lessons I learn from other sports are applied to derby.
Friday, May 31, 2013
Tuesday, May 28, 2013
The Upset, Part II: All over the place
After discussing why we rank, it's time to say just how good each scheme is. For the purposes of this study, I've looked at the rankings as of 1 Jan 2013.
Retrodiction
or, how often did the team that won end up ranked lower?
This is the bit that causes the most confusion. "But we beat them, why aren't we higher?" Well, ranking algorithms generally aren't actually written to minimize this. They're written with other priorities in mind, in the hope that this comes along for the ride. A scheme has been written, called Min-V, which is primarily concerned with minimizing retrodiction error. More on its algorithm later.
How do Derby Chart, the European Roller Derby Rankings, and Min-V stack up in this category, as well as predictive ability?
Measure | DC | ERDR | Min-V | Bouts |
---|---|---|---|---|
Past upsets | 70 | 55 | 9 | 272 |
Past upset % | 25.7% | 20.2% | 3.3% | 272 |
Predict upsets | 16 | 15 | 17 | 45 |
Predict upset % | 35.6% | 33.3% | 37.8% | 45 |
Turns out, both Derby Chart and ERDR do a poor job of retrodicting bouts. Only the Min-V system, with the sole purpose of minimizing retrodiction errors, has a low upset percentage. For prediction, all three give at best a 2-in-3 chance of being correct.
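For anyone who wants to check the arithmetic, the percentages in the table are just upsets divided by bouts. Here's a short Python sketch using the counts from the table above; the variable and function names are mine.

```python
# Upset counts from the table above: 272 past bouts, 45 predicted bouts.
past_upsets = {"DC": 70, "ERDR": 55, "Min-V": 9}
past_bouts = 272
predicted_upsets = {"DC": 16, "ERDR": 15, "Min-V": 17}
predicted_bouts = 45


def upset_rate(upsets: int, bouts: int) -> float:
    """Upsets as a percentage of all bouts, rounded to one decimal place."""
    return round(100 * upsets / bouts, 1)


past_rates = {name: upset_rate(n, past_bouts) for name, n in past_upsets.items()}
predicted_rates = {name: upset_rate(n, predicted_bouts)
                   for name, n in predicted_upsets.items()}

print(past_rates)       # {'DC': 25.7, 'ERDR': 20.2, 'Min-V': 3.3}
print(predicted_rates)  # {'DC': 35.6, 'ERDR': 33.3, 'Min-V': 37.8}

# How many times more past upsets the other systems have than Min-V.
print(round(past_upsets["ERDR"] / past_upsets["Min-V"], 1))  # 6.1
print(round(past_upsets["DC"] / past_upsets["Min-V"], 1))    # 7.8
```

Note that the best predictive figure, 33.3% upsets for ERDR, is where the "2-in-3 chance of being correct" comes from.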
Conclusion
Don't trust the rankings too much, unless they're Min-V. And Min-V looks like this:
Rank | Team | Ranking points |
---|---|---|
1 | London Rollergirls | 20.911 |
2 | Gent GO-GO Roller Girls [B] | 20.713 |
3 | London Rollergirls [B] | 20.711 |
4 | Paris Roller Girls | 20.703 |
5 | Bear City Roller Derby | 20.697 |
6 | Rainy City Roller Girls | 20.672 |
7 | Stockholm Roller Derby | 20.668 |
8 | Hellfire Harlots | 20.612 |
9 | Middlesbrough Milk Rollers | 20.602 |
10 | Glasgow Roller Derby [B] | 20.6 |
11 | Brighton Rockers | 20.592 |
12 | Glasgow Roller Derby | 20.587 |
13 | London Rollergirls [C] | 20.58 |
14 | Helsinki Roller Derby | 20.572 |
15 | Gent GO-GO Roller Girls | 20.568 |
16 | Leeds Roller Dolls | 20.558 |
16 | Tiger Bay Brawlers | 20.558 |
18 | Auld Reekie Roller Girls | 20.548 |
18 | Crime City Rollers | 20.548 |
20 | Central City Rollergirls | 20.537 |
21 | Dublin Roller Girls | 20.502 |
21 | Hot Wheel Roller Derby | 20.502 |
23 | Newcastle Roller Girls [B] | 20.5 |
23 | Leeds Roller Dolls [B] | 20.5 |
23 | Manchester Roller Derby | 20.5 |
26 | Bear City Roller Derby [B] | 20.498 |
27 | Copenhagen Roller Derby | 20.432 |
28 | Kallio Rolling Rainbow | 20.427 |
29 | Sheffield Steel Roller Girls [B] | 20.4 |
30 | Royal Windsor Rollergirls | 20.232 |
31 | Southern Discomfort | 20.2 |
32 | Lincolnshire Rolling Thunder | 20.1 |
32 | Quad Guards | 20.1 |
32 | Ruhrpott Roller Girls | 20.1 |
35 | Tyne & Fear | 20 |
36 | Expendables | 19.9 |
37 | Roller Derby Bordeaux Club | 10.6 |
38 | Roller Girls of the Apocalypse | 10.5 |
38 | Herault Derby Girlz | 10.5 |
40 | Cork City Firebirds | 10.488 |
41 | Bristol Roller Derby | 10.442 |
42 | Birmingham Blitz Dames | 10.437 |
43 | London Rockin Rollers [B] | 10.421 |
44 | Stuttgart Valley Rollergirlz [B] | 10.402 |
45 | Paris Roller Girls [B] | 10.4 |
45 | Roller Derby Rennes | 10.4 |
47 | One Love Roller Dolls | 10.398 |
48 | Lutèce Destroyeuses - Paris | 10.3 |
49 | Brussels Derby Pixies | 10.2 |
49 | Roller Derby Metz Club | 10.2 |
51 | MRD: New Wheeled Order | 10.1 |
52 | Newcastle Roller Girls | 1 |
53 | Crime City Rollers [B] | 0.988 |
54 | Liverpool Roller Birds | 0.9 |
54 | Seaside Sirens Roller Girls | 0.9 |
56 | Sheffield Steel Roller Girls | 0.888 |
56 | London Rockin Rollers | 0.888 |
56 | Dolly Rockit Rollers | 0.888 |
59 | Stuttgart Valley Rollergirlz | 0.878 |
59 | Romsey Town Rollerbillies | 0.878 |
61 | Croydon Roller Derby | 0.8 |
61 | Rainy City Roller Girls [B] | 0.8 |
63 | Big Bucks High Rollers | 0.798 |
64 | Lincolnshire Bombers | 0.788 |
65 | Tiger Bay Brawlers [B] | 0.7 |
66 | Vienna Roller Girls | 0.6 |
67 | Rockcity Rollers | 0.5 |
67 | Milton Keynes Concrete Cows | 0.5 |
67 | Barockcity Rollerderby | 0.5 |
67 | Amsterdam Derby Dames | 0.5 |
71 | Bristol Roller Derby [B] | 0.4 |
71 | Rebellion Roller Derby | 0.4 |
71 | Dublin Roller Girls [B] | 0.4 |
71 | Harbor Girls | 0.4 |
71 | Portsmouth Roller Wenches | 0.4 |
71 | Liverpool Roller Birds [B] | 0.4 |
71 | Rotterdam Death Row Honeys | 0.4 |
78 | Munich Rolling Rebels | 0.3 |
78 | Blackland Rockin'K-Rollers | 0.3 |
78 | Bembel Town Roller Girls | 0.3 |
78 | Oxford Roller Derby | 0.3 |
78 | Royal Windsor Rollergirls [B] | 0.3 |
78 | Birmingham Blitz Dames [B] | 0.3 |
78 | Bad Bunny Rollers | 0.3 |
85 | Inhuman League | 0.2 |
85 | South West Angels of Terror | 0.2 |
85 | Manchester Roller Derby [B] | 0.2 |
85 | Bedfordshire Roller Girls | 0.2 |
85 | Mean Valley Roller Girls | 0.2 |
85 | Copenhagen Roller Derby [B] | 0.2 |
85 | Belfast Roller Derby | 0.2 |
85 | Hereford Roller Girls | 0.2 |
85 | Roller Derby Karlsruhe | 0.2 |
85 | Roller Derby Lyon | 0.2 |
85 | Dirty River Roller Grrrls | 0.2 |
85 | One Love Roller Dolls [B] | 0.2 |
85 | Gothenburg Roller Derby | 0.2 |
85 | Namur Roller Girls | 0.2 |
99 | Swansea City Roller Derby | 0.1 |
99 | Roller Derby Belfort | 0.1 |
99 | Crash Test Brummies | 0.1 |
99 | Evolution Rollergirls | 0.1 |
99 | Kent Roller Girls | 0.1 |
99 | Frankfurt Roller Derby | 0.1 |
99 | Dundee Roller Girls | 0.1 |
99 | Plymouth City Roller Girls | 0.1 |
99 | Dorset Roller Girls | 0.1 |
99 | Roller Derby Toulouse [B] | 0.1 |
99 | Norfolk Brawds | 0.1 |
99 | Helsinki Roller Derby [B] | 0.1 |
99 | Bruising Banditas | 0.1 |
99 | Kallio Rolling Rainbow [B] | 0.1 |
99 | Nantes Derby Girls | 0.1 |
99 | Roller Derby Metz Club [B] | 0.1 |
99 | Lincolnshire Bombers [B] | 0.1 |
99 | Dom City Dolls | 0.1 |
99 | Fierce Valley Roller Girls | 0.1 |
99 | Tampere Roller Derby | 0.1 |
99 | Central City Rollergirls [C] | 0.1 |
99 | Roller Derby Calaisis | 0.1 |
99 | Furness Firecrackers | 0.1 |
99 | Roller Derby Grenoble | 0.1 |
99 | Hell's Belles | 0.1 |
99 | Nidaros Roller Derby | 0.1 |
99 | Porto Roller Derby | 0.1 |
126 | Seaside Sirens Roller Girls [B] | 0 |
126 | Severn Roller Torrent | 0 |
126 | Shoetown Slayers | 0 |
126 | Barcelona Roller Derby | 0 |
126 | Les Quads de Paris | 0 |
126 | Imposters Roller Girls | 0 |
126 | Roller Derby Arras | 0 |
126 | Hell's Ass Derbygirls | 0 |
126 | Fair City Rollers | 0 |
126 | Brighton Rockers [B] | 0 |
126 | Velvet Sluts | 0 |
126 | Wolverhampton Honour Rollers | 0 |
126 | Nottingham Roller Girls | 0 |
126 | Kernow Rollers | 0 |
126 | Wirral Whipiteres | 0 |
126 | Wakey Wheeled Cats | 0 |
126 | Wiltshire Roller Derby | 0 |
126 | Vendetta Vixens | 0 |
126 | Roller Derby Angoulême | 0 |
126 | Eastside RocknRollers | 0 |
126 | Tenerife Roller Derby | 0 |
126 | Aarhus Derby Dames | 0 |
126 | Dolly Rockit Rollers [B] | 0 |
126 | Cardiff Roller Collective | 0 |
126 | Cornwall Roller Derby | 0 |
126 | Cherry Blood | 0 |
126 | Marseille Roller Derby Club | 0 |
126 | Spiders Black Widows | 0 |
126 | Oslo Roller Derby | 0 |
126 | Central City Rollergirls [B] | 0 |
126 | Roller Girls of the Apocalypse [B] | 0 |
126 | Amsterdam Derby Dames [B] | 0 |
126 | Auld Reekie Roller Girls [B] | 0 |
126 | Jakey Bites | 0 |
126 | Lahti Roller Derby | 0 |
126 | Royal Swedish Roller Derby | 0 |
126 | Luleå Roller Derby | 0 |
126 | Preston Roller Girls | 0 |
126 | Graveyard Queens Cologne | 0 |
126 | Lahti Roller Derby [B] | 0 |
126 | Roller Derby Lorient | 0 |
126 | Roller Derby Lille | 0 |
126 | Porvoo | 0 |
126 | Dresden Pioneers | 0 |
126 | Bairn City Rollers | 0 |
126 | Voodoo Vixens Besançon | 0 |
126 | Prague City Roller Derby | 0 |
126 | Plymouth City Roller Girls [B] | 0 |
126 | Grin n Barum | 0 |
126 | Kouvola Rock n Rollers | 0 |
126 | Southern Discomfort [B] | 0 |
126 | The Switchblade RollerGrrrls | 0 |
126 | Tester | 0 |
126 | Stockholm Roller Derby [B] | 0 |
126 | Dock City Rollers | 0 |
126 | Nantes Derby Girls [B] | 0 |
126 | Valencia Roller Derby | 0 |
126 | South Wales Silures | 0 |
126 | Panam Squad | 0 |
126 | Red Lion Roller Derby | 0 |
126 | Nought | 0 |
126 | Zurich City Rollergirls | 0 |
126 | Montpellier Derby Club | 0 |
126 | Limerick Roller Derby | 0 |
126 | Hulls Angels Roller Dames | 0 |
126 | Middlesbrough Milk Rollers [B] | 0 |
126 | B.M.O Roller Derby Girls | 0 |
126 | Roller Derby Toulouse | 0 |
126 | Big Bucks High Rollers [B] | 0 |
126 | Tyne & Fear [B] | 0 |
126 | Surrey Roller Girls | 0 |
126 | Kamiquadz | 0 |
126 | Milton Keynes Quads of War | 0 |
126 | Roller Derby Avingon | 0 |
126 | Bourne Bombshells | 0 |
126 | Kent Roller Girls [B] | 0 |
126 | Granite City Roller Girls | 0 |
Saturday, May 4, 2013
The Upset, Part I: Why do we Rank?
In derby, as in most other sports, there are multiple ranking schemes. US College Football, or NCAA Football as it's commonly known, has 3 official rankings and nearly 150 unofficial ones. European derby, with its 4, is tame by comparison.
Why so many? One word: upsets.
Upsets, in the American usage, are games in which the expected winner loses to the expected loser. They're games where the "underdog" wins and, to many sports fans, one of the joys of watching sports.
We would expect Arsenal to win, but every so often Bradford City walk away with the victory. It's a major source of excitement in any sport!
But what does that have to do with rankings? Well, upsets mean that no ranking system can be perfect. There will always be upsets, so there will always be errors in the ranking. Thus, ranking schemes need to be designed with priorities in mind.
That is, a ranking scheme needs a purpose, a question to answer. There are three such questions:
- Who did the best? Who deserves the crown for best performance over the previous x time?
- Who will do the best? Who will be expected to win in the coming games?
- Who is good competition? Who will most likely give an exciting bout to a given team with minimal risk of a blow-out?
1. A ranking for the purpose of awarding a crown has some of the more rigid rules. If the crown is for best performance in a premier league season, for example, that ranking can only consider that season. All teams start the season on 0 points, and the ranking shifts from there.
A good ranking for this purpose is highly retrodictive. A retrodictive ranking is one that, over the course of the past period, has a minimum number of upsets.
In the European rankings, Derby Chart is entirely retrodictive with a limit of 12 months. EuroDerby is entirely retrodictive within its divisions for a 12-month limit, with divisional placement based on the previous year's retrodictive ranking. Thus, both seem designed to produce "the best performance for derby year xxxx."
2. A ranking for the purpose of prediction is much more free in its structure. As the goal is only to forecast the future, rather than award for a given period, a predictive ranking can use scores from any previous period.
In fact, a predictive ranking can use any factor, as long as the predictions do well. Some baseball predictive rankings take transfers, market size, stadium size, team value, and any number of other things into account. If a scheme's predictions do well, then it's a good ranking. Simple enough.
In the European rankings, the European Roller Derby Rankings and Flat Track Stats are predictive in nature. Both consider all bouts since a team's debut, and the latter is explicitly designed with an algorithm based on prediction.
3. A ranking for the purpose of finding similarly-competitive teams is as free a structure as a predictive ranking, and often uses similar math.
In fact, the only difference between 2 and 3 is how the teams reading the rankings use them. As an impartial observer reading algorithms, it is often difficult to determine whether a ranking is designed for predictivity or competitivity.
In the European rankings, the European Roller Derby Rankings' stated purpose is to allow teams to find opponents of similar skill. EuroDerby can be easily used for this purpose as well, with its divisional system.
Back to upsets. Were it not for upsets, the three rankings would be identical. If there were no improvement, all expectations of victory or defeat would be met. This would be boring.
Instead, rankings have to deal with upsets. An upset for a retrodictive ranking system is not always a problem; however, a retrodictive system should seek to minimize past upsets. For a competitiveness ranking, it may not be a problem either; if the ranking predicted a close bout and it was, the ranking has done its job even if the winner was not correct.
A predictive system has the biggest problem with upsets, as they indicate that the original ranking was wrong. Thus, a predictive system must react to upsets with some sort of correction to the ordering of teams.
So, how good are the various systems at being predictive and retrodictive? How accurate are they? Stay tuned for a detailed analysis of their performance, followed by a possible way of minimizing the number of upsets and maximizing the "correctness" of the ranking scheme.