I’d like to quickly review how a modeling prediction works. The prediction is below.
If we want to guess how the results will turn out, and do it in a systematic way, we can describe the past and use that description to project the future. The advancement of a semifinalist can be given in terms of a chance, a probability. We can find a number, let’s call it x, for each contestant. If x is low, the probability of advancement is low, and if x is high, the probability of advancement is high. One curve that does this nicely is the logistic curve, the S-shaped function behind logistic regression. For a large enough x, making it into the finals is all but certain. For small enough x, it’s extremely unlikely. Somewhere in between, the probability climbs smoothly from one extreme to the other. x is called the ‘logit’.
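For anyone who wants to see the curve as numbers, here is a minimal Python sketch of the standard logistic function. The function name logistic and the sample x values are just illustrative choices, not anything taken from the methodology.

```python
import math

def logistic(x: float) -> float:
    """Turn a logit x into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative x values only: very negative x gives a probability near 0,
# very positive x gives a probability near 1, and x = 0 gives exactly 0.5.
for x in (-4, -2, 0, 2, 4):
    print(f"x = {x:+d}  ->  P(advance) = {logistic(x):.3f}")
```

The probability never quite reaches 0 or 1, which is why even a very high x means "all but certain" rather than "certain".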
What does it mean for someone to have a 30% chance of making the finals? The same thing it means when there’s a 30% chance of rain: on 30% of the past days that looked like this one, it rained. 3 out of 10.
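To make the "3 out of 10" reading concrete, here is a small simulation sketch in Python. Everything in it is made up for illustration (the function name observed_rate, the seed, the number of trials); it simply draws many independent 30% events and counts how often they come true.

```python
import random

def observed_rate(p: float, n: int, seed: int = 0) -> float:
    """Simulate n independent events that each happen with probability p
    and return the fraction that actually happened."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n

# With many trials the observed fraction should land close to 0.30.
print(observed_rate(0.30, 10_000))
```

With only a handful of contestants the observed fraction can wander well away from 30%, which is part of why any single week's projection can look bad even when the model is reasonable.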
How you actually determine x is the main activity of the modeler. He has to look through the past and determine which variables matter most in determining the outcome. I won’t rehash all of these statistics here, but you can read about them in the methodology. Suffice it to say that there are certain scores around the internet which are, to varying degrees, predictive of advancement. A main score has to do with how good the performance was, collected by WhatNotToSing.com. Another variable is how much the contestant has been on screen. We can even include a yes/no variable in the equation for x, sometimes called a dummy variable, which is either 0 or 1 indicating whether the contestant’s audition was shown. Each of these things either increases the odds, and hence increases x, or decreases the odds. More screen time, higher odds of advancing. Better performance, better odds. You get the idea.
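As a rough sketch of what "the equation for x" looks like, here is a Python example that adds up the three kinds of variables mentioned above. The coefficients are invented for illustration and are not the fitted values from the methodology; the 0 to 100 scale for the WNTS score and the use of seconds for screen time are also assumptions.

```python
import math

# Hypothetical coefficients, invented for this example only.
INTERCEPT = -3.0
COEF_WNTS = 0.04            # per point of WNTS score (assumed 0-100 scale)
COEF_SCREEN_TIME = 0.01     # per second of pre-exposure (assumed unit)
COEF_AUDITION_SHOWN = 0.8   # dummy variable: 1 if the audition aired, else 0

def logit(wnts: float, screen_seconds: float, audition_shown: bool) -> float:
    """Compute x as a weighted sum of the predictor variables."""
    dummy = 1 if audition_shown else 0
    return (INTERCEPT
            + COEF_WNTS * wnts
            + COEF_SCREEN_TIME * screen_seconds
            + COEF_AUDITION_SHOWN * dummy)

def probability(wnts: float, screen_seconds: float, audition_shown: bool) -> float:
    """Pass x through the logistic curve to get a chance of advancing."""
    x = logit(wnts, screen_seconds, audition_shown)
    return 1.0 / (1.0 + math.exp(-x))

# Same made-up contestant, with and without the audition being shown.
print(probability(wnts=60, screen_seconds=90, audition_shown=True))
print(probability(wnts=60, screen_seconds=90, audition_shown=False))
```

Because every coefficient here is positive, a better performance, more screen time, or a shown audition each push x up and the probability with it. In a real fit the signs and sizes come from the historical data.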
This year, not all of the systems are working. Votefair is not holding their poll, and Dialidol, well … let’s just say it’s having an off day.
So the following projections were run using only pre-exposure, WNTS (approximate, since official numbers are not yet posted), and whether the audition was shown. I have included Dialidol, but everyone got a zero except Malcolm Allen, whose score of 50 I will arbitrarily reduce to 5. A 50 is a huge statistical outlier, possibly indicating that Malcolm Allen’s phone line is down (eek!). Not ideal, but we have to make the best guess we can.
First the girls. This is different from the tentative assignments I gave last night. For one thing, WNTS firmed up their numbers, and for another thing some watchful commenters noticed that I screwed up and missed Jena Irene’s audition (because it was listed as Jena Ascuitto). Her pre-exposure has been recomputed as well.
Names in green are quite likely to be voted in, names in red not so much. In yellow, the model is too unsure, because the ranked probability is within its margin of error. Note that the sum of the probabilities is 5, since 5 people will advance.
*Arbitrarily set to 5 because whoa my god Dialidol is being weird
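The note above that the probabilities sum to 5 can be illustrated with a short Python sketch. This is one possible way to enforce such a constraint, not necessarily how the numbers here were produced: shift every contestant's x by the same amount until the probabilities add up to the number of open spots. The function name shift_to_total and the example logits are hypothetical.

```python
import math

def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def shift_to_total(logits, total, tol=1e-9):
    """Find a common offset c, by bisection, so that the probabilities
    logistic(x + c) sum to `total`. Requires 0 < total < len(logits)."""
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The sum grows as the offset grows, so bisection applies.
        if sum(logistic(x + mid) for x in logits) < total:
            lo = mid
        else:
            hi = mid
    c = (lo + hi) / 2.0
    return [logistic(x + c) for x in logits]

# Ten made-up logits standing in for ten semifinalists, five of whom advance.
example_logits = [2.1, 1.4, 0.9, 0.3, 0.0, -0.2, -0.8, -1.1, -1.9, -2.5]
probs = shift_to_total(example_logits, total=5)
print([round(p, 3) for p in probs], round(sum(probs), 6))
```

Shifting x rather than rescaling the probabilities directly keeps every value strictly between 0 and 1 and preserves the ranking of the contestants.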
The men’s projection was pretty surprising to me, given that I didn’t think that C.J. was very strong. But WNTS scores how they score, and people evidently liked his performance. He got a ton of screen time with Casey Thrasher (who was not allowed to sing), so that makes him quite likely to make it through. Also, to be frank, Spencer’s ranking is shockingly high.
So what the hell is up with Dialidol? A quick look on their forums didn’t suggest that anything was going on with the software. Could the program really have measured a constant busy signal for Malcolm Allen and no other phone line? A busy signal that ranks with that of a finale episode? When I see the value of 50, that’s just not believable.
Anyway, try not to take the above projection too seriously. I would be amazed if it were even halfway right, given the current state of the Idolsphere. Hopefully things will normalize a bit once the finals start, Votefair starts their poll, and Dialidol gets its act together.