As I stated in an earlier post, I polled 24 websites for approval ratings on the Top 24 performances. Here are those results versus the official WhatNotToSing numbers:
Adam, as an example, was rated much lower in the official numbers than in my sample group. The same goes for Mark, Savion, Alexis, and Joey. (Data are here)
There are only a couple of ways to compile an approval rating. You could ask a bunch of people whether or not they approve, and tally the numbers. That’s basically what I did, except I added an “eh, it was ok” designation. The other way is to have people rate the performance on a scale (like 1 to 10). But if you’re sampling blogs … most of them just don’t do that. So I read the reviews and tried to guess what each reviewer would say.
WNTS says “We collect data after every episode from online news outlets, professional music critics, blogs, forums, boards, chat rooms, and polls.” But I find that very weird: how can you interpret data from such disparate sources? If poll x says Adam was the 10th most popular, and blog y says they liked Adam … how do you combine those two facts? Moreover, does this explain why WNTS scores, taken by themselves, seem so unpredictive of the results? Why do I need to account for popularity in order to generate anything like an accurate prediction? Shouldn’t the approval rating itself do that job?
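To make the tallying approach concrete, here is a minimal sketch of how a three-way tally (approve, “eh, it was ok”, disapprove) could be turned into a single number. The function name, the labels, and especially the `ok_weight` of 0.5 are assumptions for illustration only; the post does not say how the “ok” votes were scored.

```python
from collections import Counter

# Hypothetical labels for the three designations; not taken from the post.
LABELS = ("approve", "ok", "disapprove")

def approval_rating(votes, ok_weight=0.5):
    """Return an approval rating from 0 to 100 for a list of vote labels.

    ok_weight is an ASSUMPTION: it treats an "ok" vote as a fraction of
    an approval. Set it to 0.0 to count only outright approvals.
    """
    counts = Counter(votes)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown vote labels: {sorted(unknown)}")
    n = sum(counts[label] for label in LABELS)
    if n == 0:
        raise ValueError("no votes to tally")
    score = counts["approve"] + ok_weight * counts["ok"]
    return 100.0 * score / n

# Example: 24 sites, of which 12 approve, 6 say "ok", and 6 disapprove.
sample = ["approve"] * 12 + ["ok"] * 6 + ["disapprove"] * 6
print(approval_rating(sample))
```

Whatever weight is chosen, the point is that every source contributes the same kind of observation (one vote), which is what makes the tally easy to interpret.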
But mostly: why is WNTS so secretive about their methodology? Is there some reason why somebody would decide to keep it under wraps? “We sample dozens of sites each week and tally opinions statistically.” How many dozens? 2 dozen? 5 dozen? What exactly does the standard deviation on the site mean? For an approval rating, the standard deviation is quite interpretable: it’s the square root of the proportion times its complement, √(p(1 − p)). But when applying a mapping from “dozens” of disparate sources to a number from 0 to 100 … just what the hell does that mean? If the distribution of scores is normal, how did that normality come about? That would imply a sample mean is being calculated, but I’m honestly puzzled as to how this is happening.
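For reference, here is a short sketch of the standard deviation for a plain yes/no approval rating, where each source either approves or doesn't. With approval proportion p, the standard deviation of a single vote is √(p(1 − p)), and the standard error of the estimated proportion from n sources is √(p(1 − p)/n). The function names are my own, and nothing here is meant to describe how WNTS computes its figures.

```python
import math

def proportion_sd(p):
    """Standard deviation of a single yes/no vote with approval proportion p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return math.sqrt(p * (1.0 - p))

def proportion_se(p, n):
    """Standard error of the estimated proportion from n independent votes."""
    if n <= 0:
        raise ValueError("n must be positive")
    return proportion_sd(p) / math.sqrt(n)

# Example: 60% approval measured across 24 sites.
p, n = 0.60, 24
print(f"sd of one vote: {proportion_sd(p):.3f}")
print(f"standard error: {proportion_se(p, n):.3f}")
```

The spread is largest at p = 0.5 and shrinks toward zero as opinion becomes unanimous, which is why the number is easy to interpret for a simple approval tally.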