Monday, February 23, 2009

Why Failing to Fail is a Failure (Prediction Market edition)

The Oscars brought a new wave of predictions, reviving the discussion about prediction markets a little. There is a lot of discussion today about who got it right and who got it wrong, plus some stupid bragging of the type "we got them all correct, we are SOOOO good".

No, Mr. Right, if you got them all right, then you are a failure. If your markets do not fail, they are a failure!

Let's look at the claims from HubDub, who believe they nailed the results, in contrast to InTrade, which missed the "Best Actor" award. Here are the results for the six major categories on the two exchanges, together with the probabilities assigned to the frontrunners:


Category                   HubDub   InTrade
Best Picture                98%      90%
Best Director               76%      90%
Best Actor                  63%      70.0% (wrong)
Best Actress                87%      85%
Best Supporting Actor      100%      95%
Best Supporting Actress     64%      58.8%



So the question is: how many contracts should they have gotten right in order to claim good accuracy?

The knee-jerk answer is "all of them". Unfortunately, that answer is also wrong. If the reported probabilities are accurate and the six categories are independent, then the probability of HubDub getting all six frontrunners right is 0.98*0.76*0.63*0.87*1.0*0.64 = 0.26. For InTrade, the corresponding probability is 0.287.
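
If you want to double-check the arithmetic, the computation is just a product of the reported probabilities, under the assumption that the six markets are independent. A minimal sketch (the class name is mine; the HubDub numbers come from the table above, and the InTrade vector is the one used in the simulation code later in the post):

public class PM_sweep {

    public static void main(String[] args) {
        // Frontrunner probabilities from the table above (HubDub)
        double[] hubdub  = {0.98, 0.76, 0.63, 0.87, 1.00, 0.64};
        // Frontrunner prices as used in the simulation code below (InTrade)
        double[] intrade = {0.90, 0.90, 0.75, 0.85, 0.95, 0.588};
        System.out.println("P(HubDub sweeps all 6):  " + product(hubdub));
        System.out.println("P(InTrade sweeps all 6): " + product(intrade));
    }

    // Probability that every frontrunner wins, assuming independent markets
    static double product(double[] probabilities) {
        double p = 1.0;
        for (double d : probabilities) p *= d;
        return p;
    }
}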

Is there a more likely outcome? Yes. According to its own numbers, the single most likely outcome for HubDub was to pick exactly 5 out of the 6 frontrunners correctly, with probability 0.42. The same holds for InTrade: exactly 5 out of 6, with probability 0.43. Here is the complete distribution of the number of correct picks, estimated by simulation:


Successes   Probability (HubDub)   Probability (InTrade)
0           0                      2.00E-05
1           1.00E-04               3.00E-04
2           0.00524                0.00624
3           0.0595                 0.05285
4           0.24713                0.22217
5           0.42641                0.431
6           0.26162                0.28742



So the most likely outcome for both exchanges was to guess 5 out of 6 correctly, with probability roughly 0.43 in each case, not to guess 6 out of 6! In this respect, InTrade actually did better than HubDub, since it got exactly 5 of the 6 frontrunners right.

Just in case you want to run your own experiments, here is the short Java simulation:
public class PM_probability {

    // Simulate one awards night: each frontrunner wins independently
    // with the probability implied by its market price.
    public static int countSuccesses(double[] probabilities) {
        int successes = 0;
        for (double d : probabilities) {
            double p = Math.random();
            if (p < d) successes++;
        }
        return successes;
    }

    public static void main(String[] args) {
        // Prices for the six frontrunners (here, the InTrade numbers)
        double[] prices = {0.90, 0.90, 0.75, 0.85, 0.95, 0.588};
        int[] histogram = new int[prices.length + 1];
        int N = 100000;
        // Run the simulation N times and count how often we get 0..6 correct
        for (int i = 0; i < N; i++) {
            int s = countSuccesses(prices);
            histogram[s]++;
        }
        // Print the estimated probability of each number of successes
        for (int i = 0; i < histogram.length; i++) {
            System.out.println(i + "\t" + 1.0 * histogram[i] / N);
        }
    }
}
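
The simulation is fine, but if you prefer exact numbers, the same distribution (a Poisson binomial distribution) can be computed with a few lines of dynamic programming. Here is a sketch along the same lines; the class and method names are mine, and it uses the same price vector as the simulation above:

public class PM_exact {

    // Exact distribution of the number of correct frontrunner picks,
    // assuming independent markets with the given win probabilities
    // (the Poisson binomial distribution), via dynamic programming.
    public static double[] distribution(double[] probabilities) {
        double[] dist = new double[probabilities.length + 1];
        dist[0] = 1.0;  // before any market resolves: zero successes with certainty
        for (double p : probabilities) {
            for (int k = dist.length - 1; k >= 1; k--) {
                dist[k] = dist[k] * (1 - p) + dist[k - 1] * p;
            }
            dist[0] *= (1 - p);
        }
        return dist;
    }

    public static void main(String[] args) {
        double[] prices = {0.90, 0.90, 0.75, 0.85, 0.95, 0.588};
        double[] dist = distribution(prices);
        double expected = 0.0;
        for (int k = 0; k < dist.length; k++) {
            System.out.println(k + "\t" + dist[k]);
            expected += k * dist[k];
        }
        System.out.println("Expected number of successes: " + expected);
    }
}

Up to sampling noise, the output matches the simulated table above; the expected number of successes is simply the sum of the six prices, roughly 4.9 for each exchange, well short of a clean sweep.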

Guessing all the frontrunners correctly is something to brag about ONLY if the reported confidences are high enough to make a clean sweep the most likely outcome (for example, six independent markets each priced at 95% sweep with probability 0.95^6, about 0.74). If the confidences are lower and you still get them all right, then the prices are too low, i.e., the markets are biased and NOT accurate.
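
The more systematic way to test this is a calibration check: bucket contracts by their reported probability and compare the average price in each bucket with the fraction of frontrunners that actually won. Here is a minimal sketch of that idea (the class name and bucketing scheme are mine); with only the twelve contracts above as toy data it is purely illustrative, but with a season's worth of markets it becomes a real test:

import java.util.*;

public class CalibrationCheck {

    public static void main(String[] args) {
        // (price, did the frontrunner win?) pairs from the table above.
        // HubDub got all six frontrunners right; InTrade missed Best Actor.
        double[] prices = {0.98, 0.76, 0.63, 0.87, 1.00, 0.64,
                           0.90, 0.90, 0.70, 0.85, 0.95, 0.588};
        boolean[] won   = {true, true, true, true, true, true,
                           true, true, false, true, true, true};

        // Bucket contracts into deciles of reported probability; a price of
        // 1.00 is folded into the top bucket via Math.min.
        Map<Integer, double[]> buckets = new TreeMap<>();  // bucket -> {sum of prices, wins, count}
        for (int i = 0; i < prices.length; i++) {
            int b = Math.min(9, (int) (prices[i] * 10));
            double[] stats = buckets.computeIfAbsent(b, k -> new double[3]);
            stats[0] += prices[i];
            stats[1] += won[i] ? 1 : 0;
            stats[2] += 1;
        }
        // In a calibrated market, average price and empirical win rate
        // should agree within each bucket (given enough contracts).
        for (Map.Entry<Integer, double[]> e : buckets.entrySet()) {
            double[] s = e.getValue();
            System.out.printf("bucket %d0-%d9%%: avg price %.2f, empirical win rate %.2f (n=%d)%n",
                    e.getKey(), e.getKey(), s[0] / s[2], s[1] / s[2], (int) s[2]);
        }
    }
}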

If your markets succeed more often than they should, then your markets have failed!