May 15 2013
Just as before, here is some context for the prediction that will be published later tonight. Here’s how the model rated previous finales.
May 08 2013
Tomorrow night I’ll post a forecast on the Top 3, and you may want to know how the model behind that forecast “saw” the past Top 3 rounds. Indeed, the model was potentially misleading in this year’s Top 4 (though the Top 3 Tracker was pretty good all year) by rating Candice as the most likely to be eliminated. Of course, for all we know she finished second-to-last, but it is peculiar that the person it saw as most at risk was safe while the person it saw as least at risk was eliminated. In any case, take this historical information and make of the model’s prediction what you like.
Apr 29 2013
Is American Idol losing its cultural cachet? The signs seem to point to yes. What was once something that commanded national news coverage, its own wrap-up show, and even a weekly write-up on internet juggernaut Slate now has none of those things. One scarcely has to try to avoid spoilers, and other shows like The Voice have moved in to compete in a serious way. Recently, Idol stopped winning the night for its performance shows, and its results show is regularly crushed by The Big Bang Theory. The ratings are even lower than in Season 1, which aired during the summer months, when less TV is typically watched.
Nielsen ratings for a show are a somewhat murky issue. They measure the percentage of all TV viewers (roughly 100 million in total) watching a show. The figure most people pay attention to is for the ages 18-49 demographic, the most important to advertisers. However, the share of viewers watching broadcast (rather than cable) shows has been falling for many years. In 1984, fully 45% of all viewers were watching the major prime-time broadcast networks, as opposed to cable, local, or public broadcasting. By 2002, when Idol premiered, that share had already fallen to 29.6%, and it was 25.6% in 2009 (the most recent year published by TVByTheNumbers).
If we look at the ratings for Idol season by season, plotted with the x-axis representing the number of remaining contestants in the finals, we can see how things have progressed:
In Season 1 the viewers caught on around the Top 5, and Idol jumped a huge amount (points are omitted where data was unavailable). Season 5, with Taylor Hicks and Katharine McPhee, was clearly the biggest year by any account. That’s a bit strange to me, since I always thought Season 7 (David Cook) was really the zenith in terms of cultural awareness of the show.
Season 12 is the lowest-rated season by a comfortable margin. But how bad is it when you consider the shrinking overall audience? To answer this, we can weight the ratings data in proportion to the remaining broadcast audience. The data provided by TVByTheNumbers in the above link runs only through 2009, so we have to do a bit of guesswork. We can fairly safely assume that the overall broadcast share has been falling by about 1 point per year, which has been the case since 2000 (with the exception of 2005, when there was a rise, probably due to DVR viewing being added to the ratings).
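The weighting described above can be sketched in a few lines. This is an illustrative calculation, not the exact one used for the chart: it extrapolates broadcast share linearly from the last published figure (25.6% in 2009) at 1 point per year, then rescales a raw rating to the 2009 audience so seasons are comparable.

```python
# Hypothetical sketch: adjust a raw 18-49 rating for the shrinking
# broadcast audience, extrapolating the ~1 point/year decline in
# broadcast share from the last published figure (25.6% in 2009).

def broadcast_share(year, base_year=2009, base_share=25.6, decline_per_year=1.0):
    """Estimated share (%) of viewers watching broadcast networks."""
    return base_share - decline_per_year * (year - base_year)

def adjusted_rating(raw_rating, year):
    """Scale a raw rating to the 2009 audience so seasons are comparable."""
    return raw_rating * broadcast_share(2009) / broadcast_share(year)

# A 4.0 rating in 2013 counts for more than a 4.0 in 2009, because
# fewer people are watching broadcast TV at all.
print(round(adjusted_rating(4.0, 2013), 2))
```

The exact decline rate and base share are assumptions taken from the text; swapping in different values changes the scale but not the rankings much.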
Viewed through this lens, Season 12 is … still pretty bad.
Season 12 at least starts a little better than Season 1, but after that plummets to last place. Seasons 4 and 5 look pretty similar near the end, accounting for audience size. Season 8 gets a boost relative to Season 1. Season 10 was a bit of an outlier, an unusually strong year that got big viewership relative to that year’s total (absent the horrible finale that year). Other than that, the rankings don’t change much. The viewership, even as a share of remaining broadcast viewers, is falling pretty fast.
One caveat is that this year’s broadcast ratings appear to have fallen further than in previous years, with Fox alone down about 19% from last year. Could that account for the low ratings? No: even if you assume another point of audience loss, Season 12 is still the nadir. Idol is sinking fast, and the producers have reason to worry.
Apr 03 2013
In my first stab at analyzing song choice, I looked at artists whose songs are commonly sung on Idol, and how safe (historically) the contestants who sang those songs had been. However, in that case I focused only on the artist, not the individual songs, so long as more than one of the artist’s songs had been sung. Today, I will focus on a rating for songs that are commonly sung on Idol, regardless of who they were by.
A total of 1016 songs have been sung on Idol, 677 of them only once. There’s not much I can say about those, since one data point doesn’t make even the slightest of trends. So, focusing on the 349 remaining songs, I want to look at how safe they appear to be and rank them in a manner similar to the artists. This means looking at how many times each song was sung, how often the contestants singing it were safe, and how many contestants improved their standing with it (that is, people who had previously been in the bottom 3). I include all performances except reprises (songs repeated for the finale) and official American Idol songs.
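The counting scheme above can be made concrete with a small sketch. The weights here are purely illustrative (the post’s actual rating formula is not reproduced); the point is that a song’s score blends how often its singers were safe with how often it lifted someone out of the bottom 3.

```python
# Hypothetical sketch of a per-song safety rating. Each record counts
# how often a song was sung, how often the singer was safe, and how
# often it improved the standing of a previously bottom-3 contestant.
from dataclasses import dataclass

@dataclass
class SongRecord:
    times_sung: int
    times_safe: int
    times_improved: int  # previously bottom-3 singers who were safe after

def safety_score(rec: SongRecord) -> float:
    """Blend safe rate and improvement rate; the 0.7/0.3 weights are illustrative."""
    safe_rate = rec.times_safe / rec.times_sung
    improve_rate = rec.times_improved / rec.times_sung
    return 0.7 * safe_rate + 0.3 * improve_rate

# A song sung 7 times, always safely, twice by a previously bottom-3 singer
print(round(safety_score(SongRecord(7, 7, 2)), 3))
```

Only songs sung at least twice would be scored this way, since a single data point tells you almost nothing.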
Mar 27 2013
How smart is it to sing a Luther Vandross song on Idol? What about a Tina Turner song?
You might say there’s no real answer to this: if you sing a great rendition of a song by one of those artists, you’re likely to be safe, right? I would mainly agree. But the artists whose songs are covered on Idol vary in popularity, and surely some artists’ songs lend themselves to the contest better than others’ do. And it ought to be at least somewhat quantifiable.
Singing a Shirley Bassey song (I Who Have Nothing, As Long as He Needs Me) is pretty safe. 7 people have done so, and none of them was ever put in the bottom 3 by doing it. Singing an Adele song is pretty unsafe. 7 people have done so, and only 1 was ever safe.
So, accepting the premise that the success of a song choice is somewhat dependent on artist, here I’m going to discuss a way of rating these artists. It’s imperfect and somewhat subjective, but it is at least consistent, and gives a convenient way to rate song choice. The result is the composite index calculated below.
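One reasonable way to build such an index (the site’s actual formula is not specified here, so this is a sketch of a common technique) is to shrink each artist’s raw safe rate toward the overall Idol safe rate, so that artists with only a handful of performances are not over-rewarded or over-punished. The `overall_rate` and `prior_weight` values below are assumptions for illustration.

```python
# Hypothetical sketch: rate an artist by the safe rate of contestants
# singing their songs, shrunk toward an assumed overall safe rate so
# that small samples pull toward the mean (Bayesian-style shrinkage).

def artist_index(times_safe, times_sung, overall_rate=0.6, prior_weight=5):
    """Small samples are pulled toward overall_rate by prior_weight pseudo-counts."""
    return (times_safe + prior_weight * overall_rate) / (times_sung + prior_weight)

# Shirley Bassey: 7 sung, 7 safe.  Adele: 7 sung, 1 safe.
print(round(artist_index(7, 7), 3))
print(round(artist_index(1, 7), 3))
```

The shrinkage keeps a perfect 7-for-7 record from scoring a full 1.0, reflecting the uncertainty in a small sample.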
Mar 15 2013
(Note: The first Top 3 tracker for season 12 should be up later today)
A new feature this year is the Top 3 tracker. Here I give the relative probabilities that the contestants will make the Top 3 given their standing at present.
The T3T works by using the projected IdolAnalytics week-to-week not-safe probability as an explanatory variable. The following chart shows the assigned not-safe probability for the first finals voting round (called Round 2 by the model) for seasons 5-11. The points represent outcomes from previous seasons. The x-axis denotes the probability of being in the bottom 3 in that round assigned by the model. The y-axis represents whether or not the contestant eventually made the Top 3, with 1 being “yes” and 0 being “no” (a small amount of vertical jitter has been applied so that overlapping points can be seen).
The blue curve is a regression line giving the estimated probability of making the Top 3 for any given assigned round-2 not-safe probability. For example, a contestant with a 0.25 chance of being not-safe this week has only about a 20% chance of making the Top 3, while one with a 0.1 chance of being in the bottom 3 has about a 50% chance. Hover your mouse cursor over the points to see which contestants they belong to.
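A curve like this is just a logistic function of the not-safe probability. The coefficients below are back-solved so the curve passes through the two figures quoted above (about 50% at 0.10 and about 20% at 0.25); they are illustrative, not the model’s fitted values.

```python
import math

# Illustrative logistic curve relating a round-2 not-safe probability
# to the chance of making the Top 3. B0 and B1 are back-solved from the
# two probabilities quoted in the text, not taken from the real model.
B0, B1 = 0.9242, -9.2420

def top3_probability(not_safe_prob: float) -> float:
    """P(Top 3) = 1 / (1 + exp(-(B0 + B1 * x))) -- standard logistic form."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * not_safe_prob)))

print(round(top3_probability(0.10), 2))  # ≈ 0.50
print(round(top3_probability(0.25), 2))  # ≈ 0.20
```

The negative slope B1 encodes the obvious direction of the relationship: the more likely you are to land in the bottom 3 early, the less likely you are to reach the Top 3.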
You can see, roughly speaking, the “yes” (1) points are clustered to the left, and the “no” (0) points are more dense toward the right. Logically, a high chance of being in the bottom 3 in the Top 12 or Top 13 round means you are less likely to make it to the Top 3.
The model is still fairly uncertain during the Top 12. However, anybody with a not-safe probability of > 0.27 is unlikely to make the Top 3. It’s happened only twice in the time frame we model (Syesha Mercado and Haley Reinhart). However, having a low not-safe probability at this point is not a guarantee of making the Top 3 (see, for instance, Chris Daughtry and Siobhan Magnus).
If we look later in the contest, the model has to a certain extent converged:
In round 5 (normally the Top 8), the not-safe probabilities of people who did not make the Top 3 are significantly higher on average than those of people who did (note that the probabilities used here are the average of rounds 4 and 5). Everyone with a not-safe probability below 0.2 has made the Top 3, and nobody with a score over about 0.43 ever has. Rarely has someone above 0.3 made it. The region from 0.27 to 0.37 represents people who are “on the bubble”: they could go either way, and it depends on whether they can turn things around.
Because the data begins to get sparse in later rounds, an averaging mechanism has been employed. This reduces some of the week-to-week noise inherent in the model and returns a better overall fit to the observations. Rounds 5-7 use a two-week average, and 8-9 use a three-week average. This methodology is consistent with a belief that the contest largely hardens as the weeks go by, as people have chosen their favorite by then.
Why only a Top 3 tracker, rather than a straight-up winner tracker? It turns out that it’s pretty hard to tell what’s going to happen in the Top 3 before it actually happens. Some races have turned on a dime as one contestant picks up a lot of new voters from an eliminated contestant. This kind of coalition building has probably happened in a few years, most notably with Kris Allen and Lee DeWyze. Thus, the fact that the tracker only attempts to ascertain who will be in the Top 3 reflects the amount of uncertainty that remains.
As with the week-to-week forecast, this represents a kind of conventional wisdom. In early rounds the model considered Pia Toscano a lock for the Top 3, which is what many people thought, and it was wrong. It also thought Syesha Mercado was not going to make it, as most people did, and was wrong. However, it is right much more often than wrong, which is all one can hope for. The model isn’t designed, nor could I design it, to be capable of seeing difficult-to-foresee events. All it can do is make a reasonable inference based on previous years, and in that it seems to largely succeed. Once a precedent happens, the model never forgets it, but it also doesn’t go nuts when a weird result happens.
Dec 29 2012
When it comes to actual week-to-week predictions of American Idol outcomes, I’ve always been interested but skeptical. Idol is a shifting and surprising thing. As a result, I did something reasonable but overly cautious, which was to compare a couple of variables to outcomes of comparable rounds. For instance, the Top 8 could be directly compared to all other Top 8 rounds. Then, using a simple logistic regression, I let R dictate the coefficients and left it at that. It was a toy model, not claimed to be worth much.
The old model ignored all of the personal observations I’ve made over the years about how the voting plays out, and was therefore not representative of the show as I see it. The data for the comparable rounds gets so sparse that it’s hard to draw much of a conclusion. And as a result, it wasn’t very good. But this year, I expended a lot of effort and a lot of time thinking about the problem in more detail.
What I recognized is that a fairly good statistical model could be built from some reasonable assumptions, sound observations, and careful methodology. I will outline the result in this post. The model is not necessarily predictive of the next year, but it is a quite good description of past years, is not overfit, and hence stands a good chance of accurately describing the next as well. I have carefully characterized the model’s confidence and ensured that its predicted probabilities correspond reasonably well to the actual observations.
May 22 2012
This is a continuation of a thorough explanation of the forecasting model I’ve used. The present article is a wrap-up of the season and an assessment of the model’s accuracy. Please see Part 1 of this series for a somewhat detailed explanation of the analytical model.
Most people’s perception of the model this season is that it was really terrible, but in fact it made a ton of very good calls early in the season. Many people did not start reading this blog until later, when its foibles became more evident. The semi-finals, Top 13, and Top 11 went swimmingly, with great accuracy. Then the wheels came off.
Of the 13 lowest vote-getters, 6 were either the bottom or next-to-bottom contestant in the model:
May 16 2012
This week I had a Twitter exchange with a site called Zabasearch, in which they directed me to their own prediction site. Their system for prediction is totally opaque, whereas I have always disclosed how mine works. However, I realized that it’s been some time since I’ve done this, and it’s past time that I did an assessment and considered improvements. This is going to get fairly technical, so read on only if you are really interested.
In Part 1, the present article, I will explore how I have approached this topic. In Part 2, coming next week, I will do an assessment of how good the model was compared to other sites on the internet, principally Dialidol, Votefair, and Zabasearch, and try to explore why the model hit and why it missed.
I first want to clarify what constitutes a “model”. In principle, a given phenomenon can be expressed as a dependent variable that is a function of independent variables, via a formula of the form

y = f(x1, x2, x3, …)
where y is the thing you are trying to predict (the dependent variable) and x1, x2, x3, etc. are variables that you think affect this outcome (independent variables). What you are starting from is a list of data that you’ve collected (sets of y for given inputs x). Discovering the form of this function may be possible, but in many cases it is quite difficult.
There are a couple of ways to attack this problem. First, one could assume a reasonable form for the function and then fit that form to the available data by adjusting the function’s free parameters. The other is to start with the underlying phenomenon and construct a rule-based simulation of it, which produces an answer without having to discover the form of f.
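The first approach can be illustrated in a few lines. Here the assumed form is a straight line, purely for illustration, and its two free parameters are fit to some made-up (x, y) data by ordinary least squares.

```python
# A minimal illustration of the first approach: assume a functional
# form (here y = a + b*x) and fit its free parameters to collected
# data by closed-form least squares.

def fit_line(xs, ys):
    """Return (a, b) minimizing the squared error of y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

xs = [0, 1, 2, 3]
ys = [1.0, 3.1, 4.9, 7.0]  # noisy samples of roughly y = 1 + 2x
a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2))
```

The logistic regressions discussed elsewhere on this site are the same idea with a different assumed form: a logistic curve instead of a line, fit to safe/not-safe outcomes instead of continuous y values.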
May 11 2012
It was apparently news today that White Guy With Guitar (WGWG) winner #2, Kris Allen of season 8, thinks that another WGWG will win this year. He’s referring to Phillip Phillips, who would indeed be the fifth such winner in 5 years. I happen to agree with Kris Allen that Phillip will probably win. That is not an endorsement, just a thing I think is true.
You know what else is true? It isn’t just American Idol voters: Americans in general like WGWGs.
Take any metric you like. Above is the demographic breakdown of the Rolling Stone Top 100 Artists. About half of them are WGWGs. The lion’s share of the remainder is black guys with no guitars (think James Brown) and black guys with guitars (think Jimi Hendrix). That leaves only 20% for women of any kind.
Maybe you think Rolling Stone isn’t indicative of American Idol (I disagree, since the judges and producers are of that ilk). Fine, then look at the Billboard charts. As I pointed out previously, about 65% of the Top 10 at any point are men. If we expand to the Top 100, the problem gets far more lopsided in favor of men. It’s not pretty, but it appears to be true.
Yes, it could be that Phil is winning because he’s got a PR blast. He could be winning because the judges are obsequious. He could be winning a pity vote due to his chronic illness. And, maybe he could be winning because of VFTW tomfoolery. But my hypothesis is that Phil is going to win because, in the end, Americans just prefer their singers male, white, and strumming.