Top 16: the pundits versus WhatNotToSing

[Chart: Top 16 approval]

I noted last week that the pundits (a sample of recap and review articles on prominent websites) had graded many of the contestants much higher than WhatNotToSing did. This week is more of the same. The sample covers 24 sites, and for each one I recorded whether or not the reviewer liked the performance. The error bars on both sets are 1 standard deviation.

Adanna, Tyanna, Adam, and Clark are clearly being rated significantly lower by whoever WNTS samples than by the pundits. Now, obviously that’s partly a sampling thing: if you take any different subset of people, especially a small subset like I did, you’re going to get a different answer. However, I wonder if WNTS is oversampling younger people who are not necessarily representative of the voting public.

I’m thinking, of course, of Caleb Johnson, last year’s winner. WNTS had him way behind Jena and Alex. But he went all the way to the top. I can see Caleb being more likable to the audience than the awkward Alex Preston, but Jena Irene? I have to think the audience actually preferred him. And that didn’t show up very much in the numbers. Caleb overall had a 52% on WNTS from the Top 5 on, fully 10 points below Jena. Maybe that’s not a huge margin, but it has me wondering. White guys in general seem to do a bit worse on WNTS than the voting results would indicate.

I’m going to keep track of this separately and see if there’s any story there. If nothing else it’s just a different perspective.

Twitter reacts to the Top 8 women

[Chart: Twitter sentiment, 2015-03-06]

Alexis got through to the Top 16, but her performance made a few people sorry she did. “It’s just repulsive that people actually had the audacity to vote trash box Alexis,” opened one genteel person. Others followed: “Oh my heavens. That was so incredibly bad. Alexis… No girl. No. Country isn’t just something you put a fake twang on… Yikes.”, “Alexis… ugh gave me a headache Grade: D”, “Alexis isn’t in any appreciable or identifiable key.” Ok then. At 21% negative comments, she is the most unpopular on Twitter, falling about 5 points since last week. Joey also fell significantly, ceding last week’s top spot and dropping almost 8 points to 87%.

Maddie greatly improved her standing, after having been maligned as the winner of a sing-off earlier, up to 84% positive from 77% last week. Sarina-Joi likewise improved a huge amount, and now registers above 90% positive. However, it’s Tyanna who grabs the top spot, which Joey held last week.

Jax was steady at about 19% negative, while Loren and Adanna gained a few points each.

Twitter reacts to the Top 8 men


Adam may have gone through tonight, but the Twittersphere is still not a fan. Adam has almost 30% negative tweets. “I’m sorry, but Adam is no CalebJohnson Caleb, James Durbin, man… those guys could BLOW”, “Adam sounds he is just screaming  I’m not impressed at all.” said tweeps. Qaasim fell a bit, as did Clark.

Rayvon gained significantly, bringing his positive sentiment up about 6 points higher than last week. And Nick is by far the most popular, rising from 91% last week to nearly 95% this week.

Uh, what the hell is up with those approval ratings?

As I stated in an earlier post, I polled 24 websites for approval ratings on the Top 24 performances. Here are those results versus the official WhatNotToSing numbers:

[Chart: WNTS vs. mine]

Adam, as an example, was rated much lower in the official numbers than in my sample group. The same goes for Mark, Savion, Alexis, and Joey. (Data are here)

There are only a couple of ways that you can compile an approval rating. You could ask a bunch of people whether or not they approve, and tally the numbers. That’s basically what I did, except I put in an “eh, it was ok” designation. The other way you can do it is to force people to rate the performance on a scale (like 1 to 10). But if you’re sampling blogs … most of them just don’t do that. So I read the reviews and I tried to guess what they would say.

WNTS says “We collect data after every episode from online news outlets, professional music critics, blogs, forums, boards, chat rooms, and polls.” But I find that very weird: how can you interpret data from such disparate sources? If poll x says Adam was 10th most popular, and blog y said they liked Adam … what do you do to combine those two facts? Moreover, does this account for why WNTS scores by themselves seem so unpredictive of the results? Why is it that I need to account for popularity in order to generate anything like an accurate prediction? Shouldn’t the approval rating itself do that job?
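A tally with three designations (like, “eh, it was ok”, dislike) could be sketched roughly as follows. How the middle category is scored is not stated above, so the half-credit `meh_weight` is purely an assumption, as are the function name and the made-up sample counts.

```python
from collections import Counter

VALID = {"like", "meh", "dislike"}

def approval_rating(verdicts, meh_weight=0.5):
    """Return an approval rating on a 0-100 scale.

    verdicts   -- one of "like", "meh", "dislike" per review read
    meh_weight -- credit given to an "eh, it was ok" review
                  (0.5 is an assumption, not a documented choice)
    """
    counts = Counter(verdicts)
    unknown = set(counts) - VALID
    if unknown:
        raise ValueError(f"unexpected verdicts: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one review")
    score = counts["like"] + meh_weight * counts["meh"]
    return 100.0 * score / total

# Hypothetical sample of 24 write-ups for a single performance.
sample = ["like"] * 14 + ["meh"] * 4 + ["dislike"] * 6
print(round(approval_rating(sample), 1))
```

Setting `meh_weight=0.0` treats a lukewarm review as a thumbs-down, which shows how much the final number depends on that one scoring choice.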

But mostly: why is WNTS so secretive about their methodology? Is there some reason why somebody would decide to keep it under wraps? “We sample dozens of sites each week and tally opinions statistically.” How many dozens? 2 dozen? 5 dozen? What exactly does the standard deviation on the site mean? For an approval rating, the standard deviation is quite interpretable: it’s the square root of the proportion times its complement, sqrt(p(1 − p)), and dividing that by the square root of the sample size gives the standard error of the estimate. But when applying a mapping from “dozens” of disparate sources to a number from 0 to 100 … just what the hell does that mean? If the distribution of scores is normal, how did that normality come about? That would imply a sample mean is being calculated, but I’m honestly puzzled as to how this is happening.
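To make the approval-rating standard deviation concrete, here is a minimal sketch for a simple like/dislike tally. The function name and the example of 18 likes out of 24 reviews are hypothetical, chosen only for illustration; this is the textbook binomial calculation, not a reconstruction of how WNTS computes its figure.

```python
import math

def approval_with_error(likes, n):
    """Return (p, sd, se) for a like/dislike tally.

    p  -- proportion of reviews that liked the performance
    sd -- standard deviation of a single yes/no verdict, sqrt(p(1 - p))
    se -- standard error of the estimated proportion, sd / sqrt(n)
    """
    if n <= 0 or not 0 <= likes <= n:
        raise ValueError("need n > 0 and 0 <= likes <= n")
    p = likes / n
    sd = math.sqrt(p * (1 - p))
    se = sd / math.sqrt(n)
    return p, sd, se

# Hypothetical example: 18 of 24 sampled reviews liked the performance.
p, sd, se = approval_with_error(likes=18, n=24)
print(f"approval {p:.0%}, sd {sd:.3f}, standard error {se:.3f}")
```

With only 24 reviews, the standard error comes out at several percentage points, so small gaps between two contestants are well within the noise.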

Let’s predict approval ratings for the Top 24 (updated)

As WhatNotToSing hasn’t seen fit to publish numbers, here are some estimates for the Top 24. First the men:

Contestant  Song                                   Approval Rating (est)
Adam        I Wanna Rock                           65
Clark       When a Man Loves a Woman               72
Daniel      I’m Yours                              28
Mark        The Weight                             63
Michael     How Am I Supposed to Live Without You  30
Nick        Thinking Out Loud                      85
Qaasim      Uptown Funk                            72
Quentin     I Put a Spell on You                   87
Rayvon      Jealous                                59
Riley       Homeboy                                28
Savion      Hey Soul Sister                        63
Trevor      The Best I Ever Had                    26

Then the women:

Contestant  Song                     Approval Rating (est)
Adanna      Rather Be                60
Alexis      Gunpowder and Lead       65
Jax         Bang Bang                85
Joey        Somebody Like You        75
Katherine   Safe & Sound             46
Loren       Note to God              48
Lovey       Love Runs Out            33
Maddie      Love Gets Me Every Time  40
Sarina      Mamma Knows Best         96
Tyanna      Lips Are Movin           75
Shannon     Who Knew                 19
Shi         Umbrella                 19

These were calculated based on a sample of 24 write-ups (updated from 18). No doubt better numbers could come from a larger sample, and I have a few more to add. The data are here as a tab-delimited file.

A forecast based on these preliminary numbers will be forthcoming.