Derby and Logic, by Stat Man: May 2013

Friday, May 31, 2013

The Upset, Part III: Min-V

A league's B-team higher than A-team? What're (insert team here) doing so far down/up? That ranking makes no sense!

What the heck is that Min-V ranking doing?

Well, like both rankings written about last week, the European Roller Derby Rankings and Derby Chart, the Min-V system doesn't consider A and B teams as related. One's ranking doesn't affect the other's.

Min-V also doesn't consider scores. It only considers who won and who lost.

The system works like this:

If the Qarth Rollers defeat the Vaes Dothrak Rolling Horde, then QR should be ranked above VDRH in all future rankings. That is, QR should have more ranking points than VDRH. If that's not the case, the bout is a violation.
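To make the rule concrete, here is a minimal sketch of counting violations. The function name, the third team, and the point values are all made up for illustration. Following the wording above, a winner with the same points as the loser also counts as a violation, which is my reading rather than a confirmed detail of the real system.

```python
def count_violations(points, bouts):
    """Count bouts whose winner does not have strictly more ranking points."""
    return sum(1 for winner, loser in bouts if points[winner] <= points[loser])


# Hypothetical ranking points, for illustration only.
points = {"Qarth": 3.0, "Vaes Dothrak": 2.0, "Astapor": 1.0}

# Each bout is (winner, loser). The last result is an upset against this table.
bouts = [
    ("Qarth", "Vaes Dothrak"),
    ("Vaes Dothrak", "Astapor"),
    ("Astapor", "Qarth"),
]

violations = count_violations(points, bouts)
print(violations)  # 1
```

Only the third bout is a violation here: Astapor won but sits below Qarth in the table.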

The computer takes a table of all the bouts and results, as well as a list of all teams, and uses an efficient trial-and-error method to compute the minimum number of violations possible, hence Min-V.
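For a feel of what "minimum number of violations" means, here is a naive sketch that simply tries every possible ordering of a tiny league. This is not the efficient method the real system uses; brute force like this only works for a handful of teams. The team names and results are hypothetical.

```python
from itertools import permutations


def count_violations(order, bouts):
    """Count bouts where the winner is placed below the loser in `order`."""
    position = {team: i for i, team in enumerate(order)}  # 0 is the top spot
    return sum(1 for winner, loser in bouts if position[winner] > position[loser])


def min_violation_orders(teams, bouts):
    """Try every ordering; return the fewest violations and all orderings achieving it."""
    fewest, best_orders = None, []
    for order in permutations(teams):
        v = count_violations(order, bouts)
        if fewest is None or v < fewest:
            fewest, best_orders = v, [order]
        elif v == fewest:
            best_orders.append(order)
    return fewest, best_orders


# Hypothetical results: the first three bouts form a cycle, so no ordering
# can satisfy all of them at once.
teams = ["Qarth", "Vaes Dothrak", "Astapor", "Meereen"]
bouts = [
    ("Qarth", "Vaes Dothrak"),
    ("Vaes Dothrak", "Astapor"),
    ("Astapor", "Qarth"),
    ("Qarth", "Meereen"),
]

fewest, orders = min_violation_orders(teams, bouts)
print(fewest)       # 1
print(len(orders))  # more than one ordering reaches the minimum
```

The cycle forces at least one violation, and several different orderings reach that minimum.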

[Details and theoretical basis thanks to Dr. Coleman can be found in this pdf file.]

The computer then outputs a table with each team and its ranking points. This is only one possible solution; there is an infinite set of ranking-point tables that produce the same number of violations. There are also several orderings of teams that will not change that number.
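A quick way to see why there are infinitely many equivalent tables: only the order of the points matters, so any rescaling that keeps the order keeps the violation count. The sketch below uses made-up teams and points, and the same strict "winner must have more points" reading of the rule.

```python
def count_violations(points, bouts):
    """Count bouts whose winner does not have strictly more ranking points."""
    return sum(1 for winner, loser in bouts if points[winner] <= points[loser])


# Hypothetical points and results, for illustration only.
points = {"A": 20.9, "B": 20.7, "C": 10.4}
bouts = [("A", "B"), ("C", "B"), ("A", "C")]

# Multiply and shift every value: the order of teams is unchanged.
rescaled = {team: 100 * p + 5 for team, p in points.items()}

print(count_violations(points, bouts))    # 1
print(count_violations(rescaled, bouts))  # 1
```

Any other order-preserving change to the numbers would give the same count, which is why the raw point values in a Min-V table carry little meaning on their own.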

Because of this, the ranking can be optimized. Optimization is the process of adjusting the ranking-point values for each team to more closely match what other ranking systems produce, without adding violations. For example, it doesn't matter mathematically whether LRG[A] or Gent[B] is ranked #1. Since neither played the other, changing that order will not cause a violation.

LRG[A] will be optimized to #1, and Gent[B] lower in the table. However, Gent[B] cannot be moved below Paris without causing a violation. They, in turn, cannot go below Bear City, etc. The knock-on effects in a Min-V system can be massive, so optimization must be done carefully.
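The knock-on effect can be sketched as a chain of "must stay above" constraints. The results below are inferred from the paragraph above (Gent[B] beat Paris, Paris beat Bear City) and are an assumption, not the full bout table. The function name is made up, and the sketch only makes sense for results the ranking already respects.

```python
def must_stay_above(team, wins):
    """Teams that `team` must stay above, directly or through a chain of wins."""
    below, stack = set(), [team]
    while stack:
        current = stack.pop()
        for loser in wins.get(current, []):
            if loser not in below:
                below.add(loser)
                stack.append(loser)
    return below


# Assumed results, inferred from the text: winner -> list of teams beaten.
wins = {
    "Gent[B]": ["Paris"],
    "Paris": ["Bear City"],
    "LRG[A]": [],
}

print(sorted(must_stay_above("Gent[B]", wins)))  # ['Bear City', 'Paris']
print(sorted(must_stay_above("LRG[A]", wins)))   # []
```

With only these results, LRG[A] is free to move anywhere, while Gent[B] drags a whole chain of teams along with it. That is the reason one small adjustment can ripple down the table.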

In this case, optimization was done attempting to match the Derby Chart ranking. It could be done to approach any ranking scheme using the same Min-V basis.

Min-V has been used effectively in the US College Football system for years, with great predictive and retrodictive results. As was shown last week, retrodictivity in other derby rankings is spotty at best. ERDR rankings say that 1 in 5 bouts were upsets, Derby Chart's say 1 in 4. The Min-V ranking table, as odd as it looks, only has 1 in 30 bouts as upsets.

It's the most retrodictively correct ranking by more than a factor of 6.

But is it the best ranking?

The answer to that question is a question itself: "How is best measured?"

And to set those two questions into perspective, consider this one: "Why care about rankings?"

As there's no trophy awarded on rankings, and no Champions League in European derby (yet???), the choice is yours. Min-V is presented here to show just how impossible it is to conclusively put all European leagues in a ranked order.

If they're properly used, rankings can be a helpful source of information. ERDR has an archive of past rankings, and Derby Chart has bout records for each team. But don't become too dependent on them. Even the most theoretically precise system is wrong 1 out of 30 times, and the most trusted are 1 in 4 or 1 in 5.

Keep rankings in perspective. They're to inform and educate, not to dictate.

Tuesday, May 28, 2013

The Upset, Part II: All over the place

After discussing why we rank, it's time to say just how good each scheme is. For the purposes of this study, I've looked at the rankings as of 1 Jan 2013.

Retrodiction
or, how often did the team that won end up ranked lower?

This is the bit that causes the most confusion. "But we beat them, why aren't we higher?" Well, ranking algorithms generally aren't actually written to minimize this. They're written to be more concerned about other things, and hope this comes along for the ride. A scheme has been written, called Min-V, which is primarily concerned about minimizing retrodiction error. More on its algorithm later.

How do Derby Chart, the European Roller Derby Rankings, and Min-V stack up in this category, as well as predictive ability?

Measure	DC	ERDR	Min-V	Bouts
Past upsets	70	55	9	272
Past upset %	25.7%	20.2%	3.3%	272
Predict upsets	16	15	17	45
Predict upset %	35.6%	33.3%	37.8%	45

Turns out, both established rankings, Derby Chart and ERDR, do a poor job of retrodicting bouts. Only the Min-V system, with the sole purpose of minimizing retrodiction errors, has a low upset percentage. For prediction, all three are right at best 2 times in 3.
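The percentages in the table are just upsets divided by bouts. Here is a small sketch of the arithmetic using the counts from the table; the function and variable names are my own.

```python
def upset_pct(upsets, bouts):
    """Upsets as a percentage of bouts, rounded to one decimal place."""
    return round(100 * upsets / bouts, 1)


# Counts taken from the table above.
past_upsets = {"DC": 70, "ERDR": 55, "Min-V": 9}       # out of 272 past bouts
predict_upsets = {"DC": 16, "ERDR": 15, "Min-V": 17}   # out of 45 later bouts

past_pct = {name: upset_pct(n, 272) for name, n in past_upsets.items()}
predict_pct = {name: upset_pct(n, 45) for name, n in predict_upsets.items()}

print(past_pct)     # {'DC': 25.7, 'ERDR': 20.2, 'Min-V': 3.3}
print(predict_pct)  # {'DC': 35.6, 'ERDR': 33.3, 'Min-V': 37.8}

# How many times fewer past upsets Min-V has than the next-best ranking.
print(round(past_upsets["ERDR"] / past_upsets["Min-V"], 1))  # 6.1
```

The best predictor, ERDR, misses 33.3% of the later bouts, which is where the 2-in-3 figure comes from.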

Conclusion

Don't trust the rankings too much, unless they're Min-V. And Min-V looks like this:

1	London Rollergirls	20.911
2	Gent GO-GO Roller Girls [B]	20.713
3	London Rollergirls [B]	20.711
4	Paris Roller Girls	20.703
5	Bear City Roller Derby	20.697
6	Rainy City Roller Girls	20.672
7	Stockholm Roller Derby	20.668
8	Hellfire Harlots	20.612
9	Middlesbrough Milk Rollers	20.602
10	Glasgow Roller Derby [B]	20.6
11	Brighton Rockers	20.592
12	Glasgow Roller Derby	20.587
13	London Rollergirls [C]	20.58
14	Helsinki Roller Derby	20.572
15	Gent GO-GO Roller Girls	20.568
16	Leeds Roller Dolls	20.558
16	Tiger Bay Brawlers	20.558
18	Auld Reekie Roller Girls	20.548
18	Crime City Rollers	20.548
20	Central City Rollergirls	20.537
21	Dublin Roller Girls	20.502
21	Hot Wheel Roller Derby	20.502
23	Newcastle Roller Girls [B]	20.5
23	Leeds Roller Dolls [B]	20.5
23	Manchester Roller Derby	20.5
26	Bear City Roller Derby [B]	20.498
27	Copenhagen Roller Derby	20.432
28	Kallio Rolling Rainbow	20.427
29	Sheffield Steel Roller Girls [B]	20.4
30	Royal Windsor Rollergirls	20.232
31	Southern Discomfort	20.2
32	Lincolnshire Rolling Thunder	20.1
32	Quad Guards	20.1
32	Ruhrpott Roller Girls	20.1
35	Tyne & Fear	20
36	Expendables	19.9
37	Roller Derby Bordeaux Club	10.6
38	Roller Girls of the Apocalypse	10.5
38	Herault Derby Girlz	10.5
40	Cork City Firebirds	10.488
41	Bristol Roller Derby	10.442
42	Birmingham Blitz Dames	10.437
43	London Rockin Rollers [B]	10.421
44	Stuttgart Valley Rollergirlz [B]	10.402
45	Paris Roller Girls [B]	10.4
45	Roller Derby Rennes	10.4
47	One Love Roller Dolls	10.398
48	Lutèce Destroyeuses - Paris	10.3
49	Brussels Derby Pixies	10.2
49	Roller Derby Metz Club	10.2
51	MRD: New Wheeled Order	10.1
52	Newcastle Roller Girls	1
53	Crime City Rollers [B]	0.988
54	Liverpool Roller Birds	0.9
54	Seaside Sirens Roller Girls	0.9
56	Sheffield Steel Roller Girls	0.888
56	London Rockin Rollers	0.888
56	Dolly Rockit Rollers	0.888
59	Stuttgart Valley Rollergirlz	0.878
59	Romsey Town Rollerbillies	0.878
61	Croydon Roller Derby	0.8
61	Rainy City Roller Girls [B]	0.8
63	Big Bucks High Rollers	0.798
64	Lincolnshire Bombers	0.788
65	Tiger Bay Brawlers [B]	0.7
66	Vienna Roller Girls	0.6
67	Rockcity Rollers	0.5
67	Milton Keynes Concrete Cows	0.5
67	Barockcity Rollerderby	0.5
67	Amsterdam Derby Dames	0.5
71	Bristol Roller Derby [B]	0.4
71	Rebellion Roller Derby	0.4
71	Dublin Roller Girls [B]	0.4
71	Harbor Girls	0.4
71	Portsmouth Roller Wenches	0.4
71	Liverpool Roller Birds [B]	0.4
71	Rotterdam Death Row Honeys	0.4
78	Munich Rolling Rebels	0.3
78	Blackland Rockin'K-Rollers	0.3
78	Bembel Town Roller Girls	0.3
78	Oxford Roller Derby	0.3
78	Royal Windsor Rollergirls [B]	0.3
78	Birmingham Blitz Dames [B]	0.3
78	Bad Bunny Rollers	0.3
85	Inhuman League	0.2
85	South West Angels of Terror	0.2
85	Manchester Roller Derby [B]	0.2
85	Bedfordshire Roller Girls	0.2
85	Mean Valley Roller Girls	0.2
85	Copenhagen Roller Derby [B]	0.2
85	Belfast Roller Derby	0.2
85	Hereford Roller Girls	0.2
85	Roller Derby Karlsruhe	0.2
85	Roller Derby Lyon	0.2
85	Dirty River Roller Grrrls	0.2
85	One Love Roller Dolls [B]	0.2
85	Gothenburg Roller Derby	0.2
85	Namur Roller Girls	0.2
99	Swansea City Roller Derby	0.1
99	Roller Derby Belfort	0.1
99	Crash Test Brummies	0.1
99	Evolution Rollergirls	0.1
99	Kent Roller Girls	0.1
99	Frankfurt Roller Derby	0.1
99	Dundee Roller Girls	0.1
99	Plymouth City Roller Girls	0.1
99	Dorset Roller Girls	0.1
99	Roller Derby Toulouse [B]	0.1
99	Norfolk Brawds	0.1
99	Helsinki Roller Derby [B]	0.1
99	Bruising Banditas	0.1
99	Kallio Rolling Rainbow [B]	0.1
99	Nantes Derby Girls	0.1
99	Roller Derby Metz Club [B]	0.1
99	Lincolnshire Bombers [B]	0.1
99	Dom City Dolls	0.1
99	Fierce Valley Roller Girls	0.1
99	Tampere Roller Derby	0.1
99	Central City Rollergirls [C]	0.1
99	Roller Derby Calaisis	0.1
99	Furness Firecrackers	0.1
99	Roller Derby Grenoble	0.1
99	Hell's Belles	0.1
99	Nidaros Roller Derby	0.1
99	Porto Roller Derby	0.1
126	Seaside Sirens Roller Girls [B]	0
126	Severn Roller Torrent	0
126	Shoetown Slayers	0
126	Barcelona Roller Derby	0
126	Les Quads de Paris	0
126	Imposters Roller Girls	0
126	Roller Derby Arras	0
126	Hell's Ass Derbygirls	0
126	Fair City Rollers	0
126	Brighton Rockers [B]	0
126	Velvet Sluts	0
126	Wolverhampton Honour Rollers	0
126	Nottingham Roller Girls	0
126	Kernow Rollers	0
126	Wirral Whipiteres	0
126	Wakey Wheeled Cats	0
126	Wiltshire Roller Derby	0
126	Vendetta Vixens	0
126	Roller Derby Angoulême	0
126	Eastside RocknRollers	0
126	Tenerife Roller Derby	0
126	Aarhus Derby Dames	0
126	Dolly Rockit Rollers [B]	0
126	Cardiff Roller Collective	0
126	Cornwall Roller Derby	0
126	Cherry Blood	0
126	Marseille Roller Derby Club	0
126	Spiders Black Widows	0
126	Oslo Roller Derby	0
126	Central City Rollergirls [B]	0
126	Roller Girls of the Apocalypse [B]	0
126	Amsterdam Derby Dames [B]	0
126	Auld Reekie Roller Girls [B]	0
126	Jakey Bites	0
126	Lahti Roller Derby	0
126	Royal Swedish Roller Derby	0
126	Luleå Roller Derby	0
126	Preston Roller Girls	0
126	Graveyard Queens Cologne	0
126	Lahti Roller Derby [B]	0
126	Roller Derby Lorient	0
126	Roller Derby Lille	0
126	Porvoo	0
126	Dresden Pioneers	0
126	Bairn City Rollers	0
126	Voodoo Vixens Besançon	0
126	Prague City Roller Derby	0
126	Plymouth City Roller Girls [B]	0
126	Grin n Barum	0
126	Kouvola Rock n Rollers	0
126	Southern Discomfort [B]	0
126	The Switchblade RollerGrrrls	0
126	Tester	0
126	Stockholm Roller Derby [B]	0
126	Dock City Rollers	0
126	Nantes Derby Girls [B]	0
126	Valencia Roller Derby	0
126	South Wales Silures	0
126	Panam Squad	0
126	Red Lion Roller Derby	0
126	Nought	0
126	Zurich City Rollergirls	0
126	Montpellier Derby Club	0
126	Limerick Roller Derby	0
126	Hulls Angels Roller Dames	0
126	Middlesbrough Milk Rollers [B]	0
126	B.M.O Roller Derby Girls	0
126	Roller Derby Toulouse	0
126	Big Bucks High Rollers [B]	0
126	Tyne & Fear [B]	0
126	Surrey Roller Girls	0
126	Kamiquadz	0
126	Milton Keynes Quads of War	0
126	Roller Derby Avingon	0
126	Bourne Bombshells	0
126	Kent Roller Girls [B]	0
126	Granite City Roller Girls	0

Saturday, May 4, 2013

The Upset, Part I: Why do we Rank?

In derby, as in most other sports, there are multiple ranking schemes. US College Football, or NCAA Football as it's commonly known, has 3 official rankings and nearly 150 unofficial ones. European derby, with its 4, is tame by comparison.

Why so many? One word: upsets.

Upsets, in the American usage, are games in which the expected winner loses to the expected loser. They're games where the "underdog" wins and, to many sports fans, one of the joys of watching sports.

We would expect Arsenal to win, but every so often Bradford City walk away with the victory. It's a major source of excitement in any sport!

But what does that have to do with rankings? Well, upsets mean that no ranking system can be perfect. There will always be upsets, so there will always be errors in the ranking. Ranking schemes therefore need to be designed with priorities in mind.

That is, a ranking scheme needs a purpose, a question to answer. There are three such questions:

Who did the best? Who deserves the crown for best performance over the previous x time?
Who will do the best? Who will be expected to win in the coming games?
Who is good competition? Who will most likely give an exciting bout to a given team with minimal risk of a blow-out?

These must be different questions only because of upsets. Each has different ways of dealing with that problem, because each has different rules defining how rankings may or may not be calculated.

1. A ranking for the purpose of awarding a crown has some of the more rigid rules. If the crown is for best performance in a premier league season, for example, that ranking can only consider that season. All teams start the season on 0 points, and the ranking shifts from there.

A good ranking for this purpose is highly retrodictive. A retrodictive ranking is one that, over the course of the past period, has a minimum number of upsets.

In the European rankings, Derby Chart is entirely retrodictive with a limit of 12 months. EuroDerby is entirely retrodictive within its divisions over a 12-month limit, with divisional placement based on the previous year's retrodictive ranking. Thus, both seem designed to produce "the best performance for derby year xxxx."

2. A ranking for the purpose of prediction is much more free in its structure. As the goal is only to forecast the future, rather than award for a given period, a predictive ranking can use scores from any previous period.

In fact, a predictive ranking can use any factor, as long as the predictions do well. Some baseball predictive rankings take transfers, market size, stadium size, team value, and any number of other things into account. If a scheme's predictions do well, then it's a good ranking. Simple enough.

In the European rankings, the European Roller Derby Rankings and Flat Track Stats are predictive in nature. Both consider all bouts since a team's debut, and the latter is explicitly designed with an algorithm based on prediction.

3. A ranking for the purpose of finding similarly-competitive teams is as free a structure as a predictive ranking, and often uses similar math.

In fact, the only difference between 2 and 3 is how the teams reading the rankings use them. As an impartial observer reading algorithms, it is often difficult to determine whether a ranking is designed for predictivity or competitivity.

In the European rankings, the European Roller Derby Rankings' stated purpose is to allow teams to find opponents of similar skill. EuroDerby can easily be used for this purpose as well, with its divisional system.

Back to upsets. Were it not for upsets, the three rankings would be identical. If there were no improvement, all expectations of victory or defeat would be met. This would be boring.

Instead, rankings have to deal with upsets. An upset for a retrodictive ranking system is not always a problem; however, a retrodictive system should seek to minimize past upsets. For a competitiveness ranking, it may not be a problem either; if the ranking predicted a close bout and it was one, the ranking has done its job even if it picked the wrong winner.

A predictive system has the biggest problem with upsets, as they indicate that the original ranking was wrong. Thus, a predictive system must react to upsets with some sort of correction to the ordering of teams.

So, how good are the various systems at being predictive and retrodictive? How accurate are they? Stay tuned for a detailed analysis of their performance, followed by a possible way of minimizing the number of upsets and maximizing the "correctness" of the ranking scheme.