Friday, May 31, 2013

The Upset, Part III: Min-V

A league's B-team higher than A-team?  What're (insert team here) doing so far down/up?  That ranking makes no sense!

What the heck is that Min-V ranking doing?

Well, like both rankings written about last week, the European Roller Derby Rankings and Derby Chart, the Min-V system doesn't consider A and B teams as related.  One's ranking doesn't affect the other's.

As well, the Min-V doesn't consider scores.  It only considers who won and lost.

The system works like this:

If the Qarth Rollers defeat the Vaes Dothrak Rolling Horde, then QR should be ranked above VDRH in all future rankings.  That is, QR should have more ranking points than VDRH.  If that's not the case, the bout is a violation.
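As a minimal sketch of that rule (the function name, the third team, and all point values below are made up for illustration, not taken from the real table), counting violations only needs the winner and loser of each bout:

```python
def count_violations(points, bouts):
    """Count bouts whose winner does not have more ranking points than the loser."""
    return sum(1 for winner, loser in bouts if points[winner] <= points[loser])


# Hypothetical ranking points and results, for illustration only.
points = {"QR": 3.0, "VDRH": 2.0, "Astapor": 1.0}
bouts = [
    ("QR", "VDRH"),       # fine: QR has more points than VDRH
    ("VDRH", "Astapor"),  # fine: VDRH has more points than Astapor
    ("Astapor", "QR"),    # violation: Astapor won but has fewer points than QR
]

print(count_violations(points, bouts))
```

Note that scores never appear: each bout is just a (winner, loser) pair, and a tie on ranking points counts as a violation because the winner does not have more points.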

The computer takes a table of all the bouts and results, as well as a list of all teams, and uses an efficient trial-and-error method to compute the minimum number of violations possible, hence Min-V.
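To make the idea of a "minimum number of violations" concrete, here is a toy sketch that simply tries every possible ordering.  This is not the efficient method the real system uses; it is an exhaustive search that is only practical for a handful of teams, and the team names and results are invented:

```python
from itertools import permutations


def min_violations(teams, bouts):
    """Try every ordering; return the fewest violations and one ordering that achieves it."""
    best_count, best_order = None, None
    for order in permutations(teams):
        position = {team: i for i, team in enumerate(order)}  # 0 = top of the table
        # A bout is a violation when the winner sits below the loser.
        count = sum(1 for winner, loser in bouts if position[winner] > position[loser])
        if best_count is None or count < best_count:
            best_count, best_order = count, order
    return best_count, best_order


# Hypothetical league: A beat B, B beat C, C beat A (a cycle), and A beat D.
teams = ["A", "B", "C", "D"]
bouts = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]

best_count, best_order = min_violations(teams, bouts)
print(best_count, best_order)
```

Because A, B, and C beat each other in a circle, no ordering can satisfy all three results, so at least one violation is unavoidable here.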

[Details and theoretical basis thanks to Dr. Coleman can be found in this pdf file.]

The computer then outputs a table with each team and its ranking points.  This is only one possible solution; there is an infinite set of ranking point tables which produce the same number of violations.  As well, there are several orderings of teams which will not change that number.

Due to this, the ranking can be optimized.  Optimization is the process of adjusting the ranking point values for each team to more closely match what other ranking systems produce.  For example, it doesn't matter mathematically whether LRG[A] or Gent[B] is ranked #1.  Since neither played the other, changing that order will not cause a violation.

LRG[A] will be optimized to #1, and Gent[B] lower in the table.  However, Gent[B] cannot be moved below Paris without causing a violation.  They, in turn, cannot go below Bear City, etc.  The knock-on effects in a Min-V system can be massive, so optimization must be done carefully.

In this case, optimization was done attempting to match the Derby Chart ranking.  It could be done to approach any ranking scheme using the same Min-V basis.
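A rough sketch of the check behind each optimization step: a team may be moved only if the move leaves the violation count unchanged.  The function names are hypothetical, and the bout list is a toy assumption containing only the two results implied above (Gent[B] over Paris, Paris over Bear City), not the real bout table:

```python
def violations(order, bouts):
    """Number of bouts whose winner sits below the loser in `order` (index 0 = top)."""
    position = {team: i for i, team in enumerate(order)}
    return sum(1 for winner, loser in bouts if position[winner] > position[loser])


def move_is_free(order, team, new_index, bouts):
    """True if moving `team` to `new_index` does not change the violation count."""
    moved = [t for t in order if t != team]
    moved.insert(new_index, team)
    return violations(moved, bouts) == violations(order, bouts)


# Toy data (assumption): LRG[A] has no bouts against the other three teams.
bouts = [("Gent[B]", "Paris"), ("Paris", "Bear City")]
order = ["Gent[B]", "LRG[A]", "Paris", "Bear City"]

print(move_is_free(order, "LRG[A]", 0, bouts))   # LRG[A] up to #1
print(move_is_free(order, "Gent[B]", 2, bouts))  # Gent[B] down below Paris
```

In this toy data the first move is free and the second is not, which is the knock-on effect in miniature: every team's floor is set by the teams it has beaten.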

Min-V has been used effectively in the US College Football system for years, with great predictive and retrodictive results.  As was shown last week, retrodictivity in other derby rankings is spotty at best.  ERDR rankings say that 1 in 5 bouts were upsets, Derby Chart's say 1 in 4.  The Min-V ranking table, as odd as it looks, only has 1 in 30 bouts as upsets.

It's the most retrodictively correct ranking by more than a factor of 6.
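For anyone who wants to check that factor, here is a quick sketch using the past-upset counts from Part II (70 for Derby Chart, 55 for ERDR, and 9 for Min-V, all out of the same 272 bouts).  Since the bout total is shared, the ratio of counts is the same as the ratio of percentages:

```python
# Past-upset counts from the Part II table; all three cover the same 272 bouts.
past_upsets = {"Derby Chart": 70, "ERDR": 55, "Min-V": 9}

ratios = {
    name: past_upsets[name] / past_upsets["Min-V"]
    for name in ("Derby Chart", "ERDR")
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} times as many past upsets as Min-V")
```

The smaller of the two ratios, ERDR's, is the one behind "more than a factor of 6."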

But is it the best ranking?

The answer to that question is a question itself: "How is best measured?"

And to set those two questions into perspective, consider this one: "Why care about rankings?"

As there's no trophy awarded on rankings, and no Champions League in European derby (yet???), the choice is yours.  Min-V is presented here to show just how impossible it is to conclusively put all European leagues in a ranked order.

If they're properly used, rankings can be a helpful source of information.  ERDR has an archive of past rankings, and Derby Chart has bout records for each team.  But don't become too dependent on them.  Even the most theoretically precise system is wrong 1 out of 30 times, and the most trusted are 1 in 4 or 1 in 5.

Keep rankings in perspective.  They're to inform and educate, not to dictate.

Tuesday, May 28, 2013

The Upset, Part II: All over the place

After discussing why we rank, it's time to say just how good each scheme is.  For the purposes of this study, I've looked at the rankings as of 1 Jan 2013.

or, how often did the team that won end up ranked lower?

This is the bit that causes the most confusion.  "But we beat them, why aren't we higher?"  Well, ranking algorithms generally aren't actually written to minimize this.  They're written to be more concerned about other things, and hope this comes along for the ride.  A scheme has been written, called Min-V, which is primarily concerned about minimizing retrodiction error.  More on its algorithm later.

How do Derby Chart, the European Roller Derby Rankings, and Min-V stack up in this category, as well as predictive ability?

                   Derby Chart    ERDR    Min-V    Bouts
Past upsets             70         55        9      272
Past upset %          25.7%      20.2%     3.3%     272
Predict upsets          16         15       17       45
Predict upset %       35.6%      33.3%    37.8%      45

Turns out, both rankings do a poor job of retrodicting bouts.  Only the Min-V system, with the sole purpose of minimizing retrodiction errors, has a low upset percentage.  For prediction, all three give at best a 2-in-3 chance of being correct.
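The percentages are just upsets divided by bouts.  A short sketch for reproducing them, assuming the table's columns read Derby Chart, ERDR, Min-V, with 272 past bouts and 45 predicted bouts (the variable and function names are mine):

```python
PAST_BOUTS = 272
PREDICTED_BOUTS = 45

# Upset counts per ranking, read from the table above (column order is an assumption).
upsets = {
    "Derby Chart": {"past": 70, "predicted": 16},
    "ERDR": {"past": 55, "predicted": 15},
    "Min-V": {"past": 9, "predicted": 17},
}


def upset_rate(count, bouts):
    """Upsets as a percentage of bouts."""
    return 100.0 * count / bouts


for name, counts in upsets.items():
    past = upset_rate(counts["past"], PAST_BOUTS)
    predicted = upset_rate(counts["predicted"], PREDICTED_BOUTS)
    print(f"{name}: {past:.1f}% past upsets, {predicted:.1f}% predicted upsets")

# Best predictive hit rate among the three: 100% minus the lowest upset rate.
best_hit_rate = 100.0 - min(upset_rate(c["predicted"], PREDICTED_BOUTS) for c in upsets.values())
print(f"Best prediction hit rate: {best_hit_rate:.1f}%")
```

The best hit rate comes out at about two correct calls in three, which is where the "2-in-3" figure comes from.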


Don't trust the rankings too much, unless they're Min-V.  And Min-V looks like this:

Rank  Team  Ranking points
1  London Rollergirls  20.911
2  Gent GO-GO Roller Girls [B]  20.713
3  London Rollergirls [B]  20.711
4  Paris Roller Girls  20.703
5  Bear City Roller Derby  20.697
6  Rainy City Roller Girls  20.672
7  Stockholm Roller Derby  20.668
8  Hellfire Harlots  20.612
9  Middlesbrough Milk Rollers  20.602
10  Glasgow Roller Derby [B]  20.6
11  Brighton Rockers  20.592
12  Glasgow Roller Derby  20.587
13  London Rollergirls [C]  20.58
14  Helsinki Roller Derby  20.572
15  Gent GO-GO Roller Girls  20.568
16  Leeds Roller Dolls  20.558
16  Tiger Bay Brawlers  20.558
18  Auld Reekie Roller Girls  20.548
18  Crime City Rollers  20.548
20  Central City Rollergirls  20.537
21  Dublin Roller Girls  20.502
21  Hot Wheel Roller Derby  20.502
23  Newcastle Roller Girls [B]  20.5
23  Leeds Roller Dolls [B]  20.5
23  Manchester Roller Derby  20.5
26  Bear City Roller Derby [B]  20.498
27  Copenhagen Roller Derby  20.432
28  Kallio Rolling Rainbow  20.427
29  Sheffield Steel Roller Girls [B]  20.4
30  Royal Windsor Rollergirls  20.232
31  Southern Discomfort  20.2
32  Lincolnshire Rolling Thunder  20.1
32  Quad Guards  20.1
32  Ruhrpott Roller Girls  20.1
35  Tyne & Fear  20
37  Roller Derby Bordeaux Club  10.6
38  Roller Girls of the Apocalypse  10.5
38  Herault Derby Girlz  10.5
40  Cork City Firebirds  10.488
41  Bristol Roller Derby  10.442
42  Birmingham Blitz Dames  10.437
43  London Rockin Rollers [B]  10.421
44  Stuttgart Valley Rollergirlz [B]  10.402
45  Paris Roller Girls [B]  10.4
45  Roller Derby Rennes  10.4
47  One Love Roller Dolls  10.398
48  Lutèce Destroyeuses - Paris  10.3
49  Brussels Derby Pixies  10.2
49  Roller Derby Metz Club  10.2
51  MRD: New Wheeled Order  10.1
52  Newcastle Roller Girls  1
53  Crime City Rollers [B]  0.988
54  Liverpool Roller Birds  0.9
54  Seaside Sirens Roller Girls  0.9
56  Sheffield Steel Roller Girls  0.888
56  London Rockin Rollers  0.888
56  Dolly Rockit Rollers  0.888
59  Stuttgart Valley Rollergirlz  0.878
59  Romsey Town Rollerbillies  0.878
61  Croydon Roller Derby  0.8
61  Rainy City Roller Girls [B]  0.8
63  Big Bucks High Rollers  0.798
64  Lincolnshire Bombers  0.788
65  Tiger Bay Brawlers [B]  0.7
66  Vienna Roller Girls  0.6
67  Rockcity Rollers  0.5
67  Milton Keynes Concrete Cows  0.5
67  Barockcity Rollerderby  0.5
67  Amsterdam Derby Dames  0.5
71  Bristol Roller Derby [B]  0.4
71  Rebellion Roller Derby  0.4
71  Dublin Roller Girls [B]  0.4
71  Harbor Girls  0.4
71  Portsmouth Roller Wenches  0.4
71  Liverpool Roller Birds [B]  0.4
71  Rotterdam Death Row Honeys  0.4
78  Munich Rolling Rebels  0.3
78  Blackland Rockin'K-Rollers  0.3
78  Bembel Town Roller Girls  0.3
78  Oxford Roller Derby  0.3
78  Royal Windsor Rollergirls [B]  0.3
78  Birmingham Blitz Dames [B]  0.3
78  Bad Bunny Rollers  0.3
85  Inhuman League  0.2
85  South West Angels of Terror  0.2
85  Manchester Roller Derby [B]  0.2
85  Bedfordshire Roller Girls  0.2
85  Mean Valley Roller Girls  0.2
85  Copenhagen Roller Derby [B]  0.2
85  Belfast Roller Derby  0.2
85  Hereford Roller Girls  0.2
85  Roller Derby Karlsruhe  0.2
85  Roller Derby Lyon  0.2
85  Dirty River Roller Grrrls  0.2
85  One Love Roller Dolls [B]  0.2
85  Gothenburg Roller Derby  0.2
85  Namur Roller Girls  0.2
99  Swansea City Roller Derby  0.1
99  Roller Derby Belfort  0.1
99  Crash Test Brummies  0.1
99  Evolution Rollergirls  0.1
99  Kent Roller Girls  0.1
99  Frankfurt Roller Derby  0.1
99  Dundee Roller Girls  0.1
99  Plymouth City Roller Girls  0.1
99  Dorset Roller Girls  0.1
99  Roller Derby Toulouse [B]  0.1
99  Norfolk Brawds  0.1
99  Helsinki Roller Derby [B]  0.1
99  Bruising Banditas  0.1
99  Kallio Rolling Rainbow [B]  0.1
99  Nantes Derby Girls  0.1
99  Roller Derby Metz Club [B]  0.1
99  Lincolnshire Bombers [B]  0.1
99  Dom City Dolls  0.1
99  Fierce Valley Roller Girls  0.1
99  Tampere Roller Derby  0.1
99  Central City Rollergirls [C]  0.1
99  Roller Derby Calaisis  0.1
99  Furness Firecrackers  0.1
99  Roller Derby Grenoble  0.1
99  Hell's Belles  0.1
99  Nidaros Roller Derby  0.1
99  Porto Roller Derby  0.1
126  Seaside Sirens Roller Girls [B]  0
126  Severn Roller Torrent  0
126  Shoetown Slayers  0
126  Barcelona Roller Derby  0
126  Les Quads de Paris  0
126  Imposters Roller Girls  0
126  Roller Derby Arras  0
126  Hell's Ass Derbygirls  0
126  Fair City Rollers  0
126  Brighton Rockers [B]  0
126  Velvet Sluts  0
126  Wolverhampton Honour Rollers  0
126  Nottingham Roller Girls  0
126  Kernow Rollers  0
126  Wirral Whipiteres  0
126  Wakey Wheeled Cats  0
126  Wiltshire Roller Derby  0
126  Vendetta Vixens  0
126  Roller Derby Angoulême  0
126  Eastside RocknRollers  0
126  Tenerife Roller Derby  0
126  Aarhus Derby Dames  0
126  Dolly Rockit Rollers [B]  0
126  Cardiff Roller Collective  0
126  Cornwall Roller Derby  0
126  Cherry Blood  0
126  Marseille Roller Derby Club  0
126  Spiders Black Widows  0
126  Oslo Roller Derby  0
126  Central City Rollergirls [B]  0
126  Roller Girls of the Apocalypse [B]  0
126  Amsterdam Derby Dames [B]  0
126  Auld Reekie Roller Girls [B]  0
126  Jakey Bites  0
126  Lahti Roller Derby  0
126  Royal Swedish Roller Derby  0
126  Luleå Roller Derby  0
126  Preston Roller Girls  0
126  Graveyard Queens Cologne  0
126  Lahti Roller Derby [B]  0
126  Roller Derby Lorient  0
126  Roller Derby Lille  0
126  Dresden Pioneers  0
126  Bairn City Rollers  0
126  Voodoo Vixens Besançon  0
126  Prague City Roller Derby  0
126  Plymouth City Roller Girls [B]  0
126  Grin n Barum  0
126  Kouvola Rock n Rollers  0
126  Southern Discomfort [B]  0
126  The Switchblade RollerGrrrls  0
126  Stockholm Roller Derby [B]  0
126  Dock City Rollers  0
126  Nantes Derby Girls [B]  0
126  Valencia Roller Derby  0
126  South Wales Silures  0
126  Panam Squad  0
126  Red Lion Roller Derby  0
126  Zurich City Rollergirls  0
126  Montpellier Derby Club  0
126  Limerick Roller Derby  0
126  Hulls Angels Roller Dames  0
126  Middlesbrough Milk Rollers [B]  0
126  B.M.O Roller Derby Girls  0
126  Roller Derby Toulouse  0
126  Big Bucks High Rollers [B]  0
126  Tyne & Fear [B]  0
126  Surrey Roller Girls  0
126  Milton Keynes Quads of War  0
126  Roller Derby Avingon  0
126  Bourne Bombshells  0
126  Kent Roller Girls [B]  0
126  Granite City Roller Girls  0

Saturday, May 4, 2013

The Upset, Part I: Why do we Rank?

In derby, as in most other sports, there are multiple ranking schemes.  US College Football, or NCAA Football as it's commonly known, has 3 official rankings and nearly 150 unofficial ones.  European derby, with its 4, is tame by comparison.

Why so many?  One word: upsets.

Upsets, in the American usage, are games in which the expected winner loses to the expected loser.  They're games where the "underdog" wins, and to many sports fans they're one of the joys of watching sports.

We would expect Arsenal to win, but every so often Bradford City walk away with the victory.  It's a major source of excitement in any sport!

But what does that have to do with rankings?  Well, upsets mean that no ranking system can be perfect.  There will always be upsets, thus there will always be errors in the ranking.  Thus, ranking schemes need to be designed with priorities in mind.

That is, a ranking scheme needs a purpose, a question to answer.  There are three such questions:
  1. Who did the best?  Who deserves the crown for best performance over the previous x time?
  2. Who will do the best?  Who will be expected to win in the coming games?
  3. Who is good competition?  Who will most likely give an exciting bout to a given team with minimal risk of a blow-out?
These must be different questions only because of upsets.  Each has different ways of dealing with that problem, because each has different rules defining how rankings may or may not be calculated.

1. A ranking for the purpose of awarding a crown has some of the more rigid rules.  If the crown is for best performance in a premier league season, for example, that ranking can only consider that season.  All teams start the season on 0 points, and the ranking shifts from there.

A good ranking for this purpose is highly retrodictive.  A retrodictive ranking is one that, over the course of the past period, has a minimum number of upsets.

In the European rankings, Derby Chart is entirely retrodictive with a limit of 12 months.  EuroDerby is entirely retrodictive within its divisions for a 12-month limit, with divisional placement based on the previous year's retrodictive ranking.  Thus, both seem designed to produce "the best performance for derby year xxxx."

2. A ranking for the purpose of prediction is much more free in its structure.  As the goal is only to forecast the future, rather than award for a given period, a predictive ranking can use scores from any previous period.  

In fact, a predictive ranking can use any factor, as long as the predictions do well.  Some baseball predictive rankings take transfers, market size, stadium size, team value, and all number of things into account.  If a scheme's predictions do well, then it's a good ranking.  Simple enough.

In the European rankings, the European Roller Derby Rankings and Flat Track Stats are predictive in nature.  Both consider all bouts since a team's debut, and the latter is explicitly designed with an algorithm based on prediction.

3. A ranking for the purpose of finding similarly-competitive teams is as free a structure as a predictive ranking, and often uses similar math.

In fact, the only difference between 2 and 3 is how the teams reading the rankings use them.  As an impartial observer reading algorithms, it is often difficult to determine whether a ranking is designed for predictivity or competitivity.

In the European rankings, the European Roller Derby Rankings' stated purpose is to allow teams to find opponents of similar skill.  EuroDerby can be easily used for this purpose as well, with its divisional system.

Back to upsets.  Were it not for upsets, the three rankings would be identical.  If there were no improvement, all expectations of victory or defeat would be met.  This would be boring.

Instead, rankings have to deal with upsets.  An upset for a retrodictive ranking system is not always a problem; however, a retrodictive system should seek to minimize past upsets.  For a competitiveness ranking, it may not be a problem either; if the ranking predicted a close bout and it was, the ranking has done its job even if the winner was not correct.

A predictive system has the biggest problem with upsets, as they indicate that the original ranking was wrong.  Thus, a predictive system must react to upsets with some sort of correction to the ordering of teams.
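One common way to build in that kind of correction is an Elo-style rating update, where an unexpected win moves the ratings more than an expected one.  The sketch below is only an illustration of the idea; it is not claimed to be the method used by any of the rankings discussed here, and the ratings, the 400-point scale, and the K value of 32 are conventional placeholder choices:

```python
def expected_score(rating_a, rating_b):
    """Elo-style chance (0 to 1) that the team rated `rating_a` beats the team rated `rating_b`."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(winner_rating, loser_rating, k=32.0):
    """Return new (winner, loser) ratings; the less expected the win, the bigger the shift."""
    gain = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + gain, loser_rating - gain


favourite, underdog = 1600.0, 1400.0  # hypothetical ratings

# Expected result: the favourite wins, and the ratings barely move.
fav_after_win, dog_after_loss = update(favourite, underdog)

# Upset: the underdog wins, and the correction is much larger.
dog_after_upset, fav_after_upset = update(underdog, favourite)

print(f"Expected result shifts ratings by {fav_after_win - favourite:.1f}")
print(f"Upset shifts ratings by {dog_after_upset - underdog:.1f}")
```

The design choice worth noticing is that the size of the correction depends on how surprising the result was, so a run of upsets reorders the table quickly while expected results leave it nearly untouched.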

So, how good are the various systems at being predictive and retrodictive?  How accurate are they?  Stay tuned for a detailed analysis of their performance, followed by a possible way of minimizing the number of upsets and maximizing the "correctness" of the ranking scheme.