Uh, what the hell is up with those approval ratings?

As I stated in an earlier post, I polled 24 websites for approval ratings on the Top 24 performances. Here are those results versus the official WhatNotToSing numbers:

[Chart: WNTS approval ratings vs. my sample]

Adam, as an example, was rated much lower in the official numbers than in my sample group. The same goes for Mark, Savion, Alexis, and Joey. (Data are here)

There are only a couple of ways you can compile an approval rating. You could ask a bunch of people whether or not they approve, and tally the numbers. That's basically what I did, except I added an "eh, it was OK" designation. The other way is to force people to rate the performance on a scale (like 1 to 10). But if you're sampling blogs … most of them just don't do that. So I read the reviews and assigned each one the verdict it seemed to be giving.

WNTS says: "We collect data after every episode from online news outlets, professional music critics, blogs, forums, boards, chat rooms, and polls." I find that very strange: how do you interpret data from such disparate sources? If poll x says Adam was the 10th most popular, and blog y said they liked Adam … what do you do to combine those two facts? Moreover, does this account for why WNTS scores are so seemingly unpredictive of the results on their own? Why do I need to account for popularity in order to generate anything like an accurate prediction? Shouldn't the approval rating itself do that job?
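For concreteness, here's roughly how my tally works in code (a minimal sketch; I count an "eh" as half an approval, which is where the fractional ratings below come from):

```python
# Minimal sketch of the tally described above: each review gets a verdict of
# "approve", "eh", or "disapprove"; the rating is the approving share of all
# reviews, with an "eh" counted as half an approval.
from collections import Counter

def approval_rating(verdicts):
    """Turn a list of per-review verdicts into a 0-100 approval rating."""
    counts = Counter(verdicts)
    score = counts["approve"] + 0.5 * counts["eh"]
    return 100 * score / len(verdicts)

# Example: 13 approvals, 3 "eh"s, and 2 disapprovals across 18 reviews
reviews = ["approve"] * 13 + ["eh"] * 3 + ["disapprove"] * 2
print(round(approval_rating(reviews), 1))  # 80.6
```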

But mostly: why is WNTS so secretive about their methodology? Is there some reason why somebody would decide to keep it under wraps? "We sample dozens of sites each week and tally opinions statistically." How many dozens? Two dozen? Five dozen? What exactly does the standard deviation on the site mean? For an approval rating, the standard deviation is quite interpretable: it's the square root of the proportion times its complement, divided by the sample size, i.e. √(p(1−p)/n). But when applying some mapping from "dozens" of disparate sources to a number from 0 to 100 … just what the hell does that mean? If the distribution of scores is normal, how did that normality come about? That would imply a sampling mean is being calculated, but I'm honestly puzzled as to how that would work here.
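To put numbers on that, here's the calculation for an approval rating treated as an estimated proportion (a sketch of the statistic I'd know how to interpret, not whatever WNTS actually computes):

```python
# Standard error of an approval rating treated as an estimated proportion:
# sqrt(p * (1 - p) / n). With only 18 reviews per performance, my own
# estimates carry an uncertainty of roughly 10-12 points.
from math import sqrt

def approval_se(p, n):
    """Standard error of a proportion p estimated from n observations."""
    return sqrt(p * (1 - p) / n)

print(round(100 * approval_se(0.722, 18), 1))  # 10.6 points for a 72.2 rating
print(round(100 * approval_se(0.5, 18), 1))    # 11.8 points in the worst case
```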

Top 24 forecast

Note: These numbers are preliminary and are likely to change. In particular, they are based on my estimates of the approval ratings. See below.

Men first:

| Name | Approval Rating | Popularity | Order | Chance of advancement |
| --- | --- | --- | --- | --- |
| Quentin | 91.7 | 22.5 | 10 | 99.1% |
| Clark | 69.4 | 22.4 | 6 | 97.9% |
| Nick | 83.3 | 15.5 | 11 | 96.9% |
| Qaasim | 72.2 | 9.0 | 12 | 92.8% |
| Rayvon | 61.1 | 7.0 | 7 | 76.1% |
| Daniel | 25.0 | 5.4 | 8 | 58.4% |
| Mark | 72.2 | 3.2 | 4 | 56.1% |
| Riley | 30.6 | 3.7 | 9 | 56.1% |
| Savion | 63.9 | 3.0 | 3 | 49.0% |
| Adam | 72.2 | 2.6 | 1 | 44.4% |
| Trevor | 27.8 | 3.9 | 5 | 43.5% |
| Michael | 33.3 | 1.9 | 2 | 29.4% |

Then women:

| Name | Approval Rating | Popularity | Order | Chance of advancement |
| --- | --- | --- | --- | --- |
| Jax | 91.7 | 27.6 | 11 | 99.7% |
| Sarina | 94.4 | 27.3 | 10 | 99.7% |
| Tyanna | 77.8 | 14.6 | 12 | 96.4% |
| Joey | 75.0 | 8.2 | 4 | 85.0% |
| Katherine | 50.0 | 5.5 | 5 | 67.3% |
| Maddie | 38.9 | 2.4 | 9 | 60.1% |
| Loren | 47.2 | 1.6 | 7 | 54.1% |
| Alexis | 69.4 | 1.8 | 3 | 52.2% |
| Shannon | 19.4 | 3.9 | 6 | 49.8% |
| Shi | 19.4 | 2.3 | 8 | 48.7% |
| Adanna | 63.9 | 2.0 | 2 | 48.0% |
| Lovey | 38.9 | 2.8 | 1 | 39.1% |

We're in uncharted territory, so I'm just making the best guesses I can with the data I have to work with. Several things to note, some of which I've said elsewhere:

WhatNotToSing has not published approval ratings for the Top 24 performances. I don't know why. In place of those numbers, I've used estimates based on my own sampling of 18 blogs' verdicts on the performances. A larger sample would be better, but I'm reasonably confident these numbers wouldn't change too much.

The number of respondents to the Votefair poll is pretty small. To correct for this, I've averaged the results of three internet polls: Votefair, TVLine, and MJsBigBlog. Together, in equal measure, they make up the "Popularity" column. One complaint you might have is that this isn't the right way to do it: the polls should be weighted by their number of respondents. Well, I can't do that. TVLine and MJsBigBlog don't seem to indicate how many people voted, so I've simply assumed the three polls are roughly the same size.
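In code, the Popularity column is nothing fancier than this (the vote shares below are made up for illustration, not the actual poll results):

```python
# Minimal sketch of the Popularity column: an equal-weight average of a
# contestant's vote share across the three polls. Weighting by respondent
# count would be better, but TVLine and MJsBigBlog don't publish theirs.
def popularity(poll_shares):
    """Average a contestant's vote share (percent) across several polls."""
    return sum(poll_shares) / len(poll_shares)

# Hypothetical shares from Votefair, TVLine, and MJsBigBlog respectively:
print(round(popularity([24.1, 21.8, 21.5]), 1))  # 22.5
```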

The model here accounts for the order in which the contestants sang, which has been significant in the past. It's not clear that the effect still holds, given that voting now opens as soon as the singing starts. Nevertheless, I've left it in, since removing it would be just as much an unfounded assumption as keeping it.
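For a sense of the general shape of the forecast, assuming a logistic form over those three inputs (the coefficients here are placeholders to show the structure, not my fitted values):

```python
# Sketch of the forecast's structure, assuming a logistic form: chance of
# advancement as a function of approval rating, popularity, and performance
# order. Coefficients are illustrative placeholders, not fitted values.
from math import exp

def advancement_chance(approval, popularity, order,
                       b0=-4.0, b_app=0.03, b_pop=0.15, b_ord=0.08):
    """Probability of advancing under a logistic model of three inputs."""
    z = b0 + b_app * approval + b_pop * popularity + b_ord * order
    return 1 / (1 + exp(-z))

# e.g. a 72-approval contestant with 9% popularity singing 12th:
print(round(100 * advancement_chance(72, 9.0, 12), 1))  # ~61.5 with these toys
```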

To be honest, nobody here would be a shocking inclusion. But there would be some shocking omissions. If we lose Jax, Sarina-Joi, or Tyanna, I would be very surprised; the same goes for Quentin, Clark, Nick, and Qaasim.

Note that Daniel F***ing Seavey is above 50%.

Let’s predict approval ratings for the Top 24 (updated)

As WhatNotToSing hasn’t seen fit to publish numbers, here are some estimates for the Top 24. First the men:

| Contestant | Song | Approval Rating (est.) |
| --- | --- | --- |
| Adam | I Wanna Rock | 65 |
| Clark | When a Man Loves a Woman | 72 |
| Daniel | I'm Yours | 28 |
| Mark | The Weight | 63 |
| Michael | How Am I Supposed to Live Without You | 30 |
| Nick | Thinking Out Loud | 85 |
| Qaasim | Uptown Funk | 72 |
| Quentin | I Put a Spell on You | 87 |
| Rayvon | Jealous | 59 |
| Riley | Homeboy | 28 |
| Savion | Hey Soul Sister | 63 |
| Trevor | The Best I Ever Had | 26 |

Then the women:

| Contestant | Song | Approval Rating (est.) |
| --- | --- | --- |
| Adanna | Rather Be | 60 |
| Alexis | Gunpowder and Lead | 65 |
| Jax | Bang Bang | 85 |
| Joey | Somebody Like You | 75 |
| Katherine | Safe & Sound | 46 |
| Loren | Note to God | 48 |
| Lovey | Love Runs Out | 33 |
| Maddie | Love Gets Me Every Time | 40 |
| Sarina | Mamma Knows Best | 96 |
| Tyanna | Lips Are Movin | 75 |
| Shannon | Who Knew | 19 |
| Shi | Umbrella | 19 |

These were calculated from a sample of 24 write-ups (up from the 18 in the first version of this post). No doubt better numbers would come from a larger sample, and I have a few more to add. The data are here as a tab-delimited file.
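If you want to poke at the data yourself, the file loads with nothing more than the standard library (the filename here is a stand-in; use whatever you save the download as):

```python
# Load the tab-delimited data; "top24_approval.tsv" is a stand-in filename
# for wherever you save the linked file.
import csv

with open("top24_approval.tsv", newline="") as f:
    rows = list(csv.DictReader(f, delimiter="\t"))

for row in rows[:3]:  # peek at the first few contestants
    print(row)
```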

A forecast based on these preliminary numbers will be forthcoming.

Twitter reaction to Top 12 women

[Chart: Twitter sentiment for the Top 12 women, 2015-02-27]

People tore themselves away from tweeting about what color that dress is (it's white and gold, you weirdos) to weigh in on the Top 12 women, and man: people hate Shi. "song choice was terrible. I kno she is much better than that", "shi can't sing for shit", and "Shi really shouldn't have been put through and she's bombing so bad on this song" were typical comments. Shi had the lowest Twitter sentiment I've seen all year.

People also did not like Shannon (the young girl who sang the Pink song). "And now Shannon. The Kristen Stewart of #idol #atonal #awful #affected", "I'm a Shannon fan, but that was a weak performance. #idol: Shannon lacks stage presence. #SeeYa #AmericanIdol #Idol", "Shannon… fell flat. Shouty, pitchy, breath issues, wobbly notes. Thats all. Grade: D". Not going to disagree with them.

Maddie Walker still had a lot of negative sentiment from last week, when she was chosen over another contestant after a sing-off. Some examples: “still mad about the rachel/maddie thing? so i hope she goes home”, “Maddie looks like a child from Toddlers & Tiaras.”, “Maddie with [deer] in the headlights eyes. School pageant performance.”

Everyone else was above 80%. People had a lot of love for Joey Cook, who sang a Keith Urban song with a quirky accordion arrangement. “Love Joey’s entire shtick & personna.”, “Well Joey Cook definitely gets the most creativity points.”, “I like Joey Cook. Wish I could say the same about that performance.”

Tyanna also got a lot of positives. "Tyanna Jones was absolutely amazing tonight", "Tyanna Jones has the vocals from heaven, but her movements are jerky and stiff. She needs to feel the music, and move with more sass.", "Tyanna gone gone take it all this year they matter as well give her the loot and the deal now". Ok then.

I should have a quantitative forecast sometime tomorrow. By all accounts, the results won’t be announced until next week, so be patient.