Top 3 Tracker for the Top 6

The Top 3 tracker is a long view of the contest, trying to figure out who will end up in the Top 3 based on the current standings. It works in a similar way to the finals model, but rather than week-to-week, it looks at historical information to decide who most looks like a Top 3 contestant.

Here is the Top 3 tracker over time:

The Top 3 is looking very much like Jena, Caleb, and Alex. With her last couple of performances, Jena has put herself back above a 90% chance. Caleb got out of a rut this week and improved, and Alex is steady. Jessica, long occupying the space around 30%, still finds herself there. Previous lowest-vote-getter Sam continues his see-saw by falling to about 40%, which makes him the 4th most likely. C.J. remains the lowest, although he more than doubled his chances, from 2% to 5%.


Top 7 Projection (final)

Name | Song | WNTS | MJs | VF | Not-safe Probability
C.J. Harris | Gravity | 62 | 40.9 | 4 | 0.588
Dexter Roberts | Muckalee Creek Water | 46 | 22 | 6 | 0.579
Jessica Meuse | Gunpowder and Lead | 69 | 18.6 | 10 | 0.482
Sam Woolf | Sail Away | 52 | 14.3 | 14 | 0.433
Alex Preston | The A Team | 73 | 1.2 | 17 | 0.372
Caleb Johnson | Family Tree | 70 | 1.8 | 18 | 0.360
Jena Irene | Creep | 90 | 1.2 | 32 | 0.185

Important!

The methodology for the finals model is described here (though see below—some modifications have been made). The model is 87% accurate on ranking within a margin of error of +/- 3%. Probabilities being what they are, somebody with a not-safe probability of just 0.25 will be in the bottom 3 one out of four times. Please do not comment that the numbers are wrong. They are probabilities, not certainties or even claims. Do not gamble based on these numbers.

Names in green are most likely to be safe. Names in red are considered most at risk for being in the bottom 3. Names in yellow are undecided. The most probable bottom 3 is Dexter, C.J., and Jessica. However, anybody on the list being in the bottom 3 would not be shocking.
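As a rough illustration of how the "most probable bottom 3" falls out of the table, here is a minimal Python sketch. It simply sorts the published not-safe probabilities and takes the three largest. The function name `most_probable_bottom` is made up for this example, and the cutoffs behind the green, yellow, and red colors are not stated here, so they are not reproduced.

```python
# Not-safe probabilities copied from the Top 7 table above.
not_safe = {
    "C.J. Harris": 0.588,
    "Dexter Roberts": 0.579,
    "Jessica Meuse": 0.482,
    "Sam Woolf": 0.433,
    "Alex Preston": 0.372,
    "Caleb Johnson": 0.360,
    "Jena Irene": 0.185,
}


def most_probable_bottom(probs, size=3):
    """Return the `size` names with the highest not-safe probability."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    return ranked[:size]


bottom_three = most_probable_bottom(not_safe)
total = sum(not_safe.values())

print(bottom_three)
print(round(total, 3))
```

The probabilities add up to roughly 3, which is what you would expect when exactly three contestants land in the bottom 3.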

Final update 5:43 PM EST.

Updated 3:17 PM EST. No ranking changes, but Jessica and Sam are now too close to call.

Updated 11:15 AM EST. Caleb cedes the top spot to Jena.

As always, these numbers will change throughout the day, and I will update them. (Probabilities assume a bottom 3. If only a bottom 2 is revealed, multiply each number by 2/3.)
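The bottom-2 adjustment is simple arithmetic, sketched below in Python. The values are the not-safe probabilities from the table above, and `rescale_for_bottom_two` is a hypothetical helper name used only for this illustration.

```python
# Not-safe probabilities from the Top 7 table, which assume a bottom 3.
bottom_three_probs = {
    "C.J. Harris": 0.588,
    "Dexter Roberts": 0.579,
    "Jessica Meuse": 0.482,
    "Sam Woolf": 0.433,
    "Alex Preston": 0.372,
    "Caleb Johnson": 0.360,
    "Jena Irene": 0.185,
}


def rescale_for_bottom_two(probs):
    """Multiply each bottom-3 probability by 2/3, per the rule stated above."""
    return {name: p * 2 / 3 for name, p in probs.items()}


bottom_two_probs = rescale_for_bottom_two(bottom_three_probs)

for name, p in bottom_two_probs.items():
    print(f"{name}: {p:.3f}")
```

After rescaling, the probabilities sum to about 2 instead of about 3, matching the smaller group.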

This post content is definitely inside baseball, and some of it won’t make sense unless you follow this site or the Idolsphere in general. Sorry about that.

I’m calling it: Dialidol time of death is 4/16/2014. It was a good run, but phone voting is no longer relevant. Every week this year, the index has been either non-responsive or anti-correlated with the results. A halfway decent way to guess who was eliminated would be to pick whoever Dialidol says is best.

So far, the only possibly suitable substitute I’ve found for Dialidol is a poll conducted at MJsBigBlog, “Who WILL go home?”. There are a few considerations here. Over the entire history of this poll, it has been relatively predictive of who has gone home. If we fit a logistic regression of whether a contestant was not safe against the percentage of votes saying that contestant would go home, we find a good fit:

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.900010   0.625569   3.037  0.00239 ** 
MJWillInv   -0.032278   0.007569  -4.264 2.01e-05 ***

This says that each percentage point a contestant had in that poll increased her log odds of being in the bottom group by 0.03, and it’s quite statistically significant. (For more information, please see the model methodology post, which explains what these numbers mean.)
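To make the coefficients concrete, here is a minimal Python sketch that turns the fitted intercept and slope into a probability with the standard logistic function. One assumption to flag: how MJWillInv is built from the poll percentage is not spelled out here. The example treats it as 100 minus the poll percentage, which is only a guess based on the "Inv" in the name and the negative sign. This is the one-variable fit on its own, not the full projection, which also uses the other indices.

```python
import math

# Coefficients copied from the fit above.
INTERCEPT = 1.900010
SLOPE = -0.032278  # coefficient on MJWillInv


def log_odds(mj_will_inv):
    """Log odds of being not safe for a given value of the predictor."""
    return INTERCEPT + SLOPE * mj_will_inv


def not_safe_probability(mj_will_inv):
    """Logistic function applied to the log odds."""
    return 1.0 / (1.0 + math.exp(-log_odds(mj_will_inv)))


def inverted_poll_share(poll_pct):
    """ASSUMPTION: MJWillInv = 100 - poll percentage (not confirmed by the post)."""
    return 100.0 - poll_pct


# One extra percentage point in the poll shifts the log odds by about 0.03.
shift = log_odds(inverted_poll_share(11.0)) - log_odds(inverted_poll_share(10.0))
print(round(shift, 4))

for pct in (5.0, 20.0, 40.0):
    print(pct, round(not_safe_probability(inverted_poll_share(pct)), 3))
```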

However, I’m not entirely comfortable with affirming such a high degree of effectiveness. You see, for part of the period the data covers (it starts in season 10), Dialidol was up and running. That means a fair fraction of the people who voted in the poll had likely read Dialidol first. However, we can look at how the poll has done this year, when Dialidol has never functioned:

Coefficients:
            Estimate Std. Error z value Pr(>|z|)   
(Intercept)  7.80780    2.74607   2.843  0.00447 **
MJWillInv   -0.09734    0.02962  -3.286  0.00102 **

Although less statistically significant, it still certainly seems to be an ok indicator. To be conservative, I have discounted the weight of the index in the model projection relative to what this fit would say. I do this because the poll hasn’t really predicted this season’s huge misses very well. In fact, it missed as badly as the other indices when it came to Jena’s early trip to the Bottom 3, and predicted M.K. was certain to be gone when she was in fact safe in the Top 11.
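For readers wondering where the "z value" and "Pr(>|z|)" columns come from, the sketch below recomputes them in Python from the estimates and standard errors printed above. The z value is the estimate divided by its standard error, and the p-value is the two-sided tail of a standard normal distribution. Because the inputs are the rounded figures from the tables, the results only match to about three significant digits.

```python
import math


def wald_z(estimate, std_error):
    """z value: the estimate measured in units of its standard error."""
    return estimate / std_error


def two_sided_p(z):
    """Probability that a standard normal variable exceeds |z| in absolute value."""
    return math.erfc(abs(z) / math.sqrt(2))


# MJWillInv slope and standard error from each fit above.
z_all_seasons = wald_z(-0.032278, 0.007569)
z_this_season = wald_z(-0.09734, 0.02962)

p_all_seasons = two_sided_p(z_all_seasons)
p_this_season = two_sided_p(z_this_season)

print(round(z_all_seasons, 3), p_all_seasons)
print(round(z_this_season, 3), p_this_season)
```

The smaller |z| for this season alone is what "less statistically significant" refers to: the slope is larger, but so is its standard error.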

I want to stress that this is partly guesswork. The safest thing any person can do is make his best guess, but hedge his bets. I know the full history of WNTS and of Votefair, so I know to a high degree of certainty how seriously to take them. But I do not yet have a good quantitative idea of how well MJ’s poll does when Dialidol is gone, and neither does anyone else. That’s why I’ve reduced its contribution.

Top 3 Tracker for the Top 7

The Top 3 tracker is a long view of the contest, trying to figure out who will end up in the Top 3 based on the current standings. It works in a similar way to the finals model, but rather than week-to-week, it looks at historical information to decide who most looks like a Top 3 contestant.

Here is the Top 3 tracker over time:

After finding himself the lowest vote-getter, Sam bounced back somewhat, regaining the #4 spot that he held two weeks ago, but just barely. The third spot is looking like a 3-way fight between him, Jessica, and Caleb. Both Jena and Alex remain in the top spots, albeit declining somewhat from their stratospheric highs last week. Both chose somewhat radical re-arrangements of 80s tunes, which were a bit polarizing for the voting public.


Top 8 redux projection

Name | Song | WNTS | DI | VF | Not-safe Probability
C.J. Harris | Free Fallin | 28 | 0 | 3 | 0.574
Malaya Watson | Through the Fire | 27 | 4.091 | 4 | 0.551
Dexter Roberts | Keep Your Hands to Yourself | 35 | 0 | 5 | 0.523
Sam Woolf | Time After Time | 55 | 0 | 13 | 0.347
Jessica Meuse | Call Me | 59 | 0 | 14 | 0.328
Alex Preston | Every Breath You Take | 68 | 0 | 19 | 0.246
Caleb Johnson | Faithfully | 86 | 0 | 19 | 0.241
Jena Irene | I Love Rock N’ Roll | 42 | 0 | 24 | 0.190

Important!

The methodology for the finals model is described here. The model is 87% accurate on ranking within a margin of error of +/- 3%. Probabilities being what they are, somebody with a not-safe probability of just 0.25 will be in the bottom 3 one out of four times. Please do not comment that the numbers are wrong. They are probabilities, not certainties or even claims. Do not gamble based on these numbers.

Names in green are most likely to be safe. Names in red are considered most at risk for being in the bottom 3. Names in yellow are undecided. The most probable bottom 3 is Dexter, C.J., and Malaya. However, anybody on the list being in the bottom 3 would not be shocking.

Updated 12:00 PM EST. Dexter jumps two spots, Jena and Jessica fall a bit, and Alex gains.

Update: 5:14 PM EST. Large drop for Sam Woolf relative to everyone else. No change in 3 most likely, but Dexter moves to third most likely.

Note that Votefair results have been slow to come in recently, and so these rankings are apt to change as the day goes on. In particular, Alex’s fans tend to pull him up later in the day.

So here’s the dilemma: the way I would normally model this round would be to factor in the figure from Dialidol at a moderate level. Dialidol has frequently shown so-so performance early in the year and gotten better as the weeks went by, until it is quite accurate from the Top 5 through the Top 2. This year, with rule changes that make phone dialing nearly obsolete, something quite interesting has happened:

Round | Highest scorer | Result
Top 13 | Kristen | Eliminated
Top 12 | Jena | Bottom 3
Top 11 | n/a |
Top 10 | Majesty | Bottom 3
Top 9 | n/a |
Top 8 | n/a |
Top 8 (ii) | Malaya | ???

Dialidol so far has been anti-correlated with the results. That is, the person with a high score (and this year, they have been huge) has never been safe. Now, that’s only a sample set of 3, and unless you had a prior belief that the indicator had gone wacky, it’s no reason to change your behavior. But it’s enough to be quite suspicious now since, let’s be fair, we already had good reason to suspect Dialidol may be D.O.A. in 2014. As such, I’ve severely discounted Dialidol in the above numbers.
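The "sample set of 3" can be tallied directly from the table. The short Python sketch below encodes the rows as listed, treating the n/a weeks and the still-pending Top 8 (ii) result as missing (`None`). The label "Safe" is a placeholder for any week where the high scorer avoided the bottom group; no such row exists in the table.

```python
# (round, highest Dialidol scorer, result). None marks n/a or a pending result.
rounds = [
    ("Top 13", "Kristen", "Eliminated"),
    ("Top 12", "Jena", "Bottom 3"),
    ("Top 11", None, None),
    ("Top 10", "Majesty", "Bottom 3"),
    ("Top 9", None, None),
    ("Top 8", None, None),
    ("Top 8 (ii)", "Malaya", None),
]

# Keep only weeks with both a high scorer and a known result.
decided = [row for row in rounds if row[1] is not None and row[2] is not None]
safe = [row for row in decided if row[2] == "Safe"]

print(f"{len(decided)} decided weeks, {len(safe)} with the high scorer safe")
```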

Dexter has certainly been underrated by the model so far. It’s projected him among the three most likely to be in the bottom 3 nearly every week, but he’s only been there once.

Sam had fairly good numbers last week, and we know he was the lowest vote-getter, which was mildly surprising but nowhere near shocking. His numbers today are also decent, and he’s likely safe.

The following is a bit technical:

Jena and Alex are definitely the most polarizing contestants. Though I rarely quote the statistic, WNTS releases the standard deviation of its sample in addition to the mean. The standard deviation is roughly thought of, in many contexts, as the width of a bell curve. In the normal distribution, one standard deviation on either side of the mean encompasses around 2/3 of all the results. If your results are either Yes (approve) or No (disapprove), it’s not quite as straightforward to interpret. What it means in this case is that if you polled 100 people about whether they liked Alex’s version of “Every Breath You Take”, you would get a number, such as 56, that did like it, and 44 that did not. If you do that over and over again, with different sets of people, you get a range of numbers. Sometimes only 30% liked it, other times 70% liked it. The standard deviation in this context is the width of that range of numbers, and a wide range indicates a performance about which there is a lot of disagreement. (You can read my explanation of how polling works here and here.)

Alex and Jena had the highest standard deviations tonight, 24 and 27 respectively. So, though Alex’s approval rating as measured by WNTS is nominally 68%, in reality we can only be confident that it’s between 44 and 92 (68 plus or minus 24)! This doesn’t really help with our predictions; it only tells us we should be cautious about the figure. This caution is one component of the “probability” in the above projection, though not the only one.
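The range quoted for Alex is just the rating plus or minus one standard deviation. Here is a minimal Python sketch of that arithmetic, applied to the 68 rating with each of the two standard deviations mentioned. The helper `one_sd_range` is a made-up name, and clamping to 0 through 100 is an assumption added here because the rating is a percentage.

```python
def one_sd_range(mean, sd, low=0, high=100):
    """Return (mean - sd, mean + sd), clamped to the valid rating range."""
    return max(low, mean - sd), min(high, mean + sd)


alex_rating = 68

range_sd_24 = one_sd_range(alex_rating, 24)
range_sd_27 = one_sd_range(alex_rating, 27)

print(range_sd_24)  # (44, 92)
print(range_sd_27)  # (41, 95)
```

Either way the band is around 50 points wide, which is the real takeaway: the nominal rating hides a lot of disagreement.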