Name | Song | WNTS Approval | MJs Poll (%) | Votefair (%) | Not-safe Probability |
---|---|---|---|---|---|
Sam Woolf | Sing / How to Save a Life | 34.0 | 50.4 | 17 | 0.533 |
Jessica Meuse | Human / Summertime Sadness | 77.0 | 36.3 | 16 | 0.490 |
Alex Preston | Sweater Weather / Say Something | 63.5 | 2.8 | 14 | 0.452 |
Caleb Johnson | Don’t Want to Miss a Thing / Still of the Night | 57.5 | 4.5 | 22 | 0.315 |
Jena Irene | My Body / Valerie | 60.0 | 5.9 | 30 | 0.209 |
Final update 6:01 PM EST.
Updated 11:22 AM EST. Jessica now 2nd most likely outside the margin of error.
Initial post 12:53 AM EST.
Alex and Jessica are virtually tied for second most likely to be in the bottom 2 (supposing that they actually tell us what that is tomorrow). Though Jessica had the highest approval rating of the night, she is considered second most likely to go home in MJsBigBlog’s poll (“Who WILL go home?”) and is only third most popular on Votefair.
I am not as convinced as everyone else is that Alex is totally safe. Only 3.6% of internet voters thought Alex would be eliminated tomorrow, but I don’t think his chances of being in the bottom 2 are anywhere near that low.
As I’ve said time and again, a probability as written above refers to the chance, based on historical considerations, that a person will be in the bottom group. If someone has a probability of 30%, then 30% of such people in the long run will be not safe: that is, either eliminated or just in the bottom 2. It’s fair, then, to ask how well the actual results this season have tracked the probabilities I’ve posted each week.
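To make that frequentist reading concrete, here is a minimal simulation sketch (this is not the prediction model itself; the 30% figure is just the example from above, and the pool size is an arbitrary stand-in for "the long run"):

```python
import random

random.seed(0)

p = 0.30          # the example not-safe probability from above
trials = 10_000   # hypothetical long-run pool of such contestants

# Draw each contestant's fate independently with probability p of being
# "not safe" (eliminated or bottom 2) and tally the long-run frequency.
not_safe = sum(random.random() < p for _ in range(trials))
print(f"observed not-safe rate: {not_safe / trials:.3f}")  # ~0.30
```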
Below you can see the model’s performance so far this season (the format is adapted from FiveThirtyEight’s assessment of their NCAA basketball predictions). The predicted not-safe probabilities were binned in increments of 10 percentage points, and the dot shows the proportion of contestants in each bin who were actually not safe. So, of the people predicted not-safe in the 0-10% range, 15% actually were not safe, well within the margin of error. The 95% confidence interval (based on a simple binomial model) is shown in gray: the more contestants in a bin, the narrower this interval and the more confident the assessment.

The fact that only a few people were predicted in the 40-50% range means the uncertainty there is quite high, and there’s no real reason to fret that only 16% of those few (N = 6) actually were not safe. Note that in the well-populated categories, the dot sits comfortably within the interval, in some cases right in the middle of the range. The model is possibly still too tentative about people predicted a 50-60% chance of not being safe, nearly all of whom (8 out of 9) actually have been not safe, meaning their predicted chances arguably should have been higher, but the result is not statistically significant.
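For the curious, the binning and the simple binomial interval could be reproduced with something like the sketch below. The `pred` and `actual` arrays here are made-up placeholders, not the season’s actual data, and the interval is the standard normal approximation rather than whatever exact method the chart uses:

```python
import numpy as np

# Hypothetical placeholder data; the real inputs would be each week's
# predicted not-safe probabilities and the observed 0/1 outcomes.
pred   = np.array([0.05, 0.08, 0.15, 0.22, 0.31, 0.45, 0.52, 0.55, 0.58, 0.61])
actual = np.array([0,    0,    1,    0,    0,    1,    1,    1,    0,    1])

edges = np.arange(0.0, 1.01, 0.1)   # 10-percentage-point bins
for lo, hi in zip(edges[:-1], edges[1:]):
    in_bin = (pred >= lo) & (pred < hi)
    n = int(in_bin.sum())
    if n == 0:
        continue
    p_hat = actual[in_bin].mean()
    # Simple binomial (normal-approximation) 95% confidence interval;
    # wider when n is small, matching the gray bands in the chart.
    half = 1.96 * np.sqrt(p_hat * (1 - p_hat) / n)
    print(f"{lo:.0%}-{hi:.0%}: {p_hat:.0%} not safe "
          f"(n={n}, 95% CI {max(0, p_hat - half):.0%}-{min(1, p_hat + half):.0%})")
```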