Semifinals projection

I’d like to quickly review how a modeling prediction works. The prediction is below.

If we want to guess what’s going to happen as far as the results, and do it in a systematic way, we can describe the past and use that description to project the future. The advancement of a semifinalist can be given in terms of a chance, a probability. We can find a number, let’s call it x, for each contestant. If x is low, the probability of advancement is low, and if x is high, the probability of advancement is high. One curve that does this nicely is the logistic curve, the S-shaped function at the heart of logistic regression. For a large enough x, making it into the finals is all but certain. For small enough x, it’s extremely unlikely. Somewhere in between, the probability climbs smoothly from near 0 to near 1, passing through 50% at x = 0. x is called the ‘logit’.
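For anyone who wants to see the curve as numbers, here is a minimal sketch of the standard logistic function. The function name `logistic` and the sample values of x are mine, chosen only for illustration.

```python
import math


def logistic(x: float) -> float:
    """Map a logit x (any real number) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# A few sample logits, from very low to very high.
for x in (-4, -1, 0, 1, 4):
    print(f"x = {x:+d}  ->  probability = {logistic(x):.3f}")
```

A logit of 0 corresponds to a coin flip, and the curve is symmetric: the probability at -x is one minus the probability at x.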

What does it mean for someone to have a 30% chance of making the finals? The same thing it means when there’s a 30% chance of rain: on 30% of the days that looked like that, it rained. 3 out of 10.
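The "3 out of 10" reading can be illustrated with a quick simulation. This is only a sketch: the 30% figure is the one from the text, while the number of trials and the random seed are arbitrary choices of mine.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

FORECAST = 0.30   # the stated chance
TRIALS = 10_000   # number of similar-looking days (or contestants), arbitrary

# Count how often the event happens when each trial has a 30% chance.
hits = sum(random.random() < FORECAST for _ in range(TRIALS))
print(f"{hits} of {TRIALS} came true ({hits / TRIALS:.1%})")
```

Over many trials the fraction settles close to 30%, even though any single trial either happens or doesn't.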

How you actually determine x is the main activity of the modeler, who has to look through the past and determine which variables matter most in determining the outcome. I won’t rehash all of these statistics here, but you can read about them in the methodology. Suffice it to say that there are certain scores around the internet which are, to varying degrees, predictive of advancement. A main score has to do with how good the performance was, collected by […]. Another variable is how much the contestant has been on screen. We can even include a yes/no variable in the equation for x, sometimes called a dummy variable, which is either 0 or 1 indicating whether the contestant’s audition was shown. Each of these things either increases the odds, and hence increases x, or decreases the odds. More screen time, higher odds of advancing. Better performance, better odds. You get the idea.
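To make the "equation for x" concrete, here is a rough sketch of how the variables might combine. Every coefficient below is made up for illustration and is not the fitted model; the real values come from fitting to past seasons, and this sketch will not reproduce the projections in this post. The function and constant names are hypothetical too.

```python
import math

# HYPOTHETICAL coefficients, invented for this example only.
INTERCEPT = -4.0
COEF_PERFORMANCE = 0.05     # per point of performance score
COEF_SCREEN_TIME = 0.001    # per unit of screen time (pre-exposure)
COEF_AUDITION_SHOWN = 1.0   # added when the dummy variable is 1


def logit(performance: float, screen_time: float, audition_shown: bool) -> float:
    """Combine the variables into a single number x."""
    dummy = 1 if audition_shown else 0  # yes/no becomes 1/0
    return (INTERCEPT
            + COEF_PERFORMANCE * performance
            + COEF_SCREEN_TIME * screen_time
            + COEF_AUDITION_SHOWN * dummy)


def probability(x: float) -> float:
    """Turn the logit x into a probability with the logistic curve."""
    return 1.0 / (1.0 + math.exp(-x))


x = logit(performance=70, screen_time=500, audition_shown=True)
print(f"x = {x:.2f}, probability = {probability(x):.3f}")
```

With positive coefficients, a better performance, more screen time, or a shown audition each pushes x up, and the probability goes up with it.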

This year, not all of the systems are working. Votefair is not holding their poll, and Dialidol, well … let’s just say it’s having an off day.

So the following projections were run using only pre-exposure, WNTS (approximate, since official numbers are not yet posted), and whether the audition was shown. I have included Dialidol, but everyone got a zero except Malcolm Allen, whose score I will arbitrarily reduce to 5. 50 is a huge statistical outlier, possibly indicating that Malcolm Allen’s phone line is down (eek!). Not ideal, but we have to make the best guess we can.

First the girls. This is different from the tentative assignments I gave last night. For one thing, WNTS firmed up their numbers, and for another thing some watchful commenters noticed that I screwed up and missed Jena Irene’s audition (because it was listed as Jena Ascuitto). Her pre-exposure has been recomputed as well.

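| Name | Pre-exposure | Audition shown | WNTS | DI | VF | Probability of Advancing |
|---|---|---|---|---|---|---|
| Malaya Watson | 433 | Yes | 75 | 0 | n/a | 0.695 |
| Emily Piriz | 79 | Yes | 80 | 0 | n/a | 0.686 |
| M.K. Nobilette | 314 | Yes | 75 | 0 | n/a | 0.680 |
| Jessica Meuse | 1048 | Yes | 60 | 0 | n/a | 0.666 |
| Kristen O’Connor | 344 | Yes | 40 | 1.364 | n/a | 0.603 |
| Jena Irene | 397 | Yes | 60 | 0 | n/a | 0.568 |
| Majesty Rose | 380 | Yes | 60 | 0 | n/a | 0.565 |
| Briana Oakley | 436 | Yes | 40 | 0 | n/a | 0.366 |
| Marialle Sellars | 320 | Yes | 20 | 0 | n/a | 0.087 |
| Bria Anai | 312 | Yes | 20 | 0 | n/a | 0.085 |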

Names in green are quite likely to be voted in, names in red not so much. In yellow, the model is too unsure, because the ranked probability is within its margin of error. Note that the sum of the probabilities is 5, since 5 people will advance.
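The sum is easy to check from the women’s probabilities listed above. A small sketch, with the numbers copied straight from the table (the variable names are mine):

```python
# Probability of advancing for each of the women, as listed in the table.
women = {
    "Malaya Watson": 0.695,
    "Emily Piriz": 0.686,
    "M.K. Nobilette": 0.680,
    "Jessica Meuse": 0.666,
    "Kristen O'Connor": 0.603,
    "Jena Irene": 0.568,
    "Majesty Rose": 0.565,
    "Briana Oakley": 0.366,
    "Marialle Sellars": 0.087,
    "Bria Anai": 0.085,
}

total = sum(women.values())
print(f"Sum of probabilities: {total:.3f}")  # 5 up to rounding in the table
```

The sum of the individual probabilities is the expected number of contestants who advance, which is why it has to come out to 5.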

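| Name | Pre-exposure | Audition shown | WNTS | DI | VF | Probability of Advancing |
|---|---|---|---|---|---|---|
| C.J. Harris | 915 | Yes | 70 | 0 | n/a | 0.806 |
| Alex Preston | 538 | Yes | 70 | 0 | n/a | 0.746 |
| Sam Woolf | 474 | Yes | 70 | 0 | n/a | 0.734 |
| Caleb Johnson | 440 | Yes | 70 | 0 | n/a | 0.728 |
| Malcolm Allen | 420 | Yes | 35 | 5* | n/a | 0.578 |
| Dexter Roberts | 615 | Yes | 50 | 0 | n/a | 0.577 |
| Ben Briley | 603 | Yes | 50 | 0 | n/a | 0.574 |
| Spencer Lloyd | 561 | Yes | 25 | 0 | n/a | 0.223 |
| Emmanuel Zidor | 641 | Yes | 10 | 0 | n/a | 0.020 |
| George Lovett | 84 | No | 35 | 0 | n/a | 0.015 |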

*Arbitrarily set to 5 because whoa my god Dialidol is being weird

The men’s projection was pretty surprising to me, given that I didn’t think C.J. was very strong. But WNTS scores how they score, and people evidently liked his performance. He also got a ton of screen time with Casey Thrasher (who was not allowed to sing), so that makes him quite likely to make it through. Also, to be frank, Spencer’s ranking is shockingly high.

So what the hell is up with Dialidol? A quick look on their forums didn’t suggest that anything was going on with the software. Could the program really have measured a constant busy signal for Malcolm Allen and no other phone line? A busy signal that ranks with that of a finale episode? When I see the value of 50, that’s just not believable.

Not to 50!

Anyway, try not to take the above projection too seriously. I would be amazed if it was even halfway right, given the current state of the Idolsphere. Hopefully things will normalize a bit once the finals start, Votefair starts their poll, and Dialidol gets its act together.
