Twitter reacts to the Top 8 Men


Adam may have gone through tonight, but the Twittersphere is still not a fan: almost 30% of the tweets about Adam are negative. “I’m sorry, but Adam is no CalebJohnson Caleb, James Durbin, man… those guys could BLOW” and “Adam sounds he is just screaming I’m not impressed at all,” said tweeps. Qaasim fell a bit, as did Clark.

Rayvon gained significantly, bringing his positive sentiment up about 6 points higher than last week. And Nick is by far the most popular, rising from 91% last week to nearly 95% this week.

Top 24 forecast (final)

NB: I forgot to say earlier, though regular readers know: green means predicted safe (outside of the margin of error). Red means predicted eliminated (outside the margin of error). Yellow means too close to call. I will clarify how this margin of error is computed in a later post.

Men first:

Name    | Approval Rating | Popularity | Order | Chance of advancement
Quentin | 84              | 22.5       | 10    | 98.2%
Clark   | 73              | 22.4       | 6     | 98.1%
Nick    | 75              | 15.5       | 11    | 98.3%
Qaasim  | 65              | 9.0        | 12    | 97.6%
Rayvon  | 62              | 7.0        | 7     | 80.8%
Daniel  | 30              | 5.4        | 8     | 63.3%
Riley   | 33              | 3.7        | 9     | 59.5%
Mark    | 41              | 3.2        | 4     | 43.9%
Savion  | 41              | 3.0        | 3     | 40.2%
Trevor  | 12              | 3.9        | 5     | 37.6%
Adam    | 33              | 2.6        | 1     | 29.1%
Michael | 20              | 1.9        | 2     | 24.4%

Then women:

Name      | Approval Rating | Popularity | Order | Chance of advancement
JAX       | 82              | 27.6       | 11    | 98.7%
Sarina    | 85              | 27.3       | 10    | 98.6%
Tyanna    | 66              | 14.6       | 12    | 98.6%
Joey      | 53              | 8.2        | 4     | 89.4%
Katherine | 45              | 5.5        | 5     | 73.7%
Maddie    | 39              | 2.4        | 9     | 67.1%
Loren     | 43              | 1.6        | 7     | 56.4%
Shannon   | 14              | 3.9        | 6     | 50.3%
Shi       | 11              | 2.3        | 8     | 47.1%
Alexis    | 41              | 1.8        | 3     | 41.6%
Adanna    | 42              | 2.0        | 2     | 39.7%
Lovey     | 39              | 2.8        | 1     | 39.0%

We’re in uncharted territory, so I’m just making the best guess I can with the data I have to work with. Several facts to note, some of which I’ve said elsewhere:

WhatNotToSing has not published approval ratings for the Top 24 performances. I don’t know why. In place of those numbers, I’ve used numbers based on my sampling of 18 blogs for approval of performances. While it would be better to have a larger sample, I have some confidence that these numbers shouldn’t change too much. The numbers did change by more than I thought, but it didn’t affect the rankings very much at all. Shannon moved one spot up, and Alexis dropped two places, which I am skeptical of. Mark and Riley switched places, and Adam fell one spot as well.

The number of respondents to the Votefair poll is pretty low. To correct for this, I’ve averaged the results of three internet polls: Votefair, TVLine, and MJsBigBlog. These together, in equal measure, account for the “Popularity” column. One complaint you might have is that this isn’t the right way to do this: the polls should be weighted according to the number of respondents. Well, I can’t. TVLine and MJs don’t seem to indicate how many people voted. So I’ve just assumed that they all have roughly the same number of respondents.
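To make the difference between the two averaging schemes concrete, here is a minimal Python sketch. The poll shares and respondent counts are made up for illustration only (the real respondent counts for TVLine and MJs are not available), and the function names are my own.

```python
def equal_weight_average(shares):
    """Plain mean of one contestant's share across polls."""
    return sum(shares) / len(shares)


def respondent_weighted_average(shares, respondents):
    """Mean of the shares, weighted by each poll's respondent count."""
    total = sum(respondents)
    return sum(s * n for s, n in zip(shares, respondents)) / total


# Hypothetical numbers, NOT real poll data: one contestant's share (in %)
# in three polls, and an invented respondent count for each poll.
shares = [10.0, 20.0, 30.0]
respondents = [100, 1000, 1000]

equal = equal_weight_average(shares)
weighted = respondent_weighted_average(shares, respondents)
print(equal, round(weighted, 2))

# If every poll had the same number of respondents, the two agree.
same_size = respondent_weighted_average(shares, [500, 500, 500])
```

When the respondent counts are equal, the weighted average collapses to the plain mean, which is the assumption the Popularity column rests on.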

The model here accounts for the order that the contestants sang in, which has been significant in the past. However, it’s not clear that the same significance is still there, given that voting is available as soon as the singing starts. Nevertheless, I’ve left it in, since to remove it is just as much an unfounded assumption as anything.

To be honest, nobody here would be a shocking inclusion. But there would be some shocking omissions. I would be very surprised if we lost JAX, Sarina-Joi, or Tyanna, and the same goes for Quentin, Clark, Nick, and Qaasim.

Note that Daniel F***ing Seavey is above 50%.

How to calculate these yourself:

The parameter estimates given the data linked to in this article lead to a probability for a given contestant (WNTS here is the Approval Rating column in the tables above):

P =1/(1+EXP(-1*(-2.184291+0.1216*Order+0.017125*WNTS+0.184374*Popularity)))
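If you would rather not use Excel, the same formula can be sketched in Python. The coefficients are copied from the formula above; the function and constant names are my own, and I am assuming that WNTS is the Approval Rating column of the tables. Note that this gives the raw probability, before the adjustment that forces the probabilities to sum to 8.

```python
import math

# Coefficients copied from the formula above.
INTERCEPT = -2.184291
B_ORDER = 0.1216
B_WNTS = 0.017125
B_POPULARITY = 0.184374


def raw_probability(order, wnts, popularity):
    """Logistic probability of advancing, before any rescaling."""
    z = INTERCEPT + B_ORDER * order + B_WNTS * wnts + B_POPULARITY * popularity
    return 1.0 / (1.0 + math.exp(-z))


# Lovey's row from the women's table: sang 1st, approval 39, popularity 2.8.
p_lovey = raw_probability(order=1, wnts=39, popularity=2.8)
print(round(p_lovey, 3))  # about 0.29 before rescaling
```

The raw value is lower than the 39.0% shown in the table because the table reports the probability after rescaling.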

You can use Excel to calculate this if you wish. The only adjustment made to these probabilities is that the sum of them must be 8, since 8 people will advance. Normally I would just take the sum of all probabilities and multiply each value by 8 divided by this sum, but there’s a problem: some of the contestants will then have a probability greater than 1, which is not allowed. Therefore, you must run a procedure (in whatever language you want)

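% p is the vector of raw probabilities, one per contestant
p = p*8/sum(p);                    % rescale so the probabilities sum to 8
while sum(p > .99) > 0             % repeat while any value is above .99
    p(p > .99) = p(p > .99) - .01; % shave .01 off each offender
    p = p*8/sum(p);                % and rescale again
end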

or similar. Doing this will get you the above probabilities. If you want to adjust the numbers, have at it.
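For anyone without MATLAB, here is a rough Python sketch of the same procedure, applied to the women's table. The function names and the iteration guard are my own additions, and I am again assuming that WNTS is the Approval Rating column. It should land within about a point of the published numbers, not necessarily on them exactly.

```python
import math


def raw_probability(order, wnts, popularity):
    """Logistic formula from the post, before rescaling."""
    z = -2.184291 + 0.1216 * order + 0.017125 * wnts + 0.184374 * popularity
    return 1.0 / (1.0 + math.exp(-z))


def scale_to(probs, quota):
    """Multiply every value by quota / sum so the values sum to quota."""
    total = sum(probs)
    return [p * quota / total for p in probs]


def rescale_to_quota(probs, quota=8, cap=0.99, step=0.01, max_iter=10000):
    """Rescale to sum to quota, shaving values above cap until none remain."""
    probs = scale_to(probs, quota)
    for _ in range(max_iter):  # guard against a runaway loop (my addition)
        if not any(p > cap for p in probs):
            return probs
        probs = [p - step if p > cap else p for p in probs]
        probs = scale_to(probs, quota)
    raise RuntimeError("did not converge")


# (name, approval, popularity, order) rows from the women's table.
women = [
    ("JAX", 82, 27.6, 11), ("Sarina", 85, 27.3, 10), ("Tyanna", 66, 14.6, 12),
    ("Joey", 53, 8.2, 4), ("Katherine", 45, 5.5, 5), ("Maddie", 39, 2.4, 9),
    ("Loren", 43, 1.6, 7), ("Shannon", 14, 3.9, 6), ("Shi", 11, 2.3, 8),
    ("Alexis", 41, 1.8, 3), ("Adanna", 42, 2.0, 2), ("Lovey", 39, 2.8, 1),
]

raw = [raw_probability(order, wnts, pop) for _, wnts, pop, order in women]
final = rescale_to_quota(raw)
for (name, *_), p in zip(women, final):
    print(f"{name:10s} {100 * p:5.1f}%")
```

Lowering `step` makes the capped contestants settle closer to the cap, at the cost of more iterations.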

Uh, what the hell is up with those approval ratings?

As I stated in an earlier post, I polled 24 websites for approval ratings on the Top 24 performances. Here are those results versus the official WhatNotToSing numbers:

[Image: WNTSVsMine]

Adam, as an example, was rated much lower in the official numbers than in my sample group. The same goes for Mark, Savion, Alexis, and Joey. (Data are here)

There are only a couple of ways that you can compile an approval rating. You could ask a bunch of people whether or not they approve, and tally the numbers. That’s basically what I did, except I put in an “eh, it was ok” designation. The other way you can do it is to force people to rate the performance on a scale (like 1 to 10). But if you’re sampling blogs … most of them just don’t do that. So I read the reviews and I tried to guess what they would say. WNTS says “We collect data after every episode from online news outlets, professional music critics, blogs, forums, boards, chat rooms, and polls.” But I find that very weird: how can you interpret data from such disparate sources? If poll x says Adam was 10th most popular, and blog y said they liked Adam … what do you do to combine those two facts? Moreover, does this account for why WNTS scores are so seemingly unpredictive of the results by themselves? Why is it that I need to account for popularity in order to generate anything like an accurate prediction? Shouldn’t the approval rating itself do that job?

But mostly: why is WNTS so secretive about their methodology? Is there some reason why somebody would decide to keep it under wraps? “We sample dozens of sites each week and tally opinions statistically.” How many dozens? 2 dozen? 5 dozen? What exactly does the standard deviation on the site mean? For an approval rating, the standard deviation is quite interpretable: it’s the square root of the proportion times its complement, sqrt(p(1-p)). But when applying a mapping from “dozens” of disparate sources to a number from 0 to 100 … just what the hell does that mean? If the distribution of scores is normal, how did that normality come about? That would imply a sample mean is being calculated, but I’m honestly puzzled as to how this is happening.
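To show what I mean by interpretable, here is a small Python sketch for a plain approve/disapprove rating with approval proportion p: the standard deviation of a single response is the square root of p times (1 - p), and dividing by the square root of the number of sources gives the standard error of the rating. The function names are mine, and the inputs (p = 0.5 from 24 sources) are just an illustration, not anyone's actual rating.

```python
import math


def proportion_sd(p):
    """Standard deviation of a single yes/no response with approval rate p."""
    return math.sqrt(p * (1.0 - p))


def proportion_standard_error(p, n):
    """Standard error of an approval rating estimated from n sources."""
    return proportion_sd(p) / math.sqrt(n)


# Illustrative inputs only: a 50% approval rating sampled from 24 sources.
sd = proportion_sd(0.5)
se = proportion_standard_error(0.5, 24)
print(sd, round(se, 3))
```

With only a couple dozen sources, the standard error on a mid-range rating is around 10 points, which is why knowing how many sites are sampled matters.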