Semifinal projection methodology

The logic that goes into the IdolAnalytics finals model is similar but not identical to that of the semi-final round(s). People say that Idol normally devolves into a popularity contest, but I tend to think that it begins as a popularity contest, then becomes somewhat more about singing, then lapses back into a popularity contest.

The popularity aspect has become especially pronounced lately, as we saw fairly concerted online efforts on behalf of contestants like Phillip Phillips, with fans hyping their contestant on Twitter. And, naturally, for that to matter the audience has to know who the person is in the first place. Thus, we would expect, and do observe, that pre-exposure (screen time allotted to a contestant before the first voting round) tends to matter. If your audition was shown, or if you had a lot of screen time, you were more likely to advance to the finals.

There are quite obviously exceptions. Kelly Clarkson, Bo Bice, Kris Allen, and Jessica Sanchez did not have their initial audition shown. However, all other finale participants in the history of the show (18 out of 22) did. Only 40% of semi-finalists overall have had their initial auditions shown, but 77% of finalists did. Now, I don’t want to put too much emphasis on this, since it’s hard to say exactly why this is, but it’s clearly an indicator of something.

What needs to be quantified is how much this pre-existing popularity is already priced into the key variables that the finals model uses: the WNTS approval rating, Dialidol, and Votefair. In fact, a correction to these scores does need to be applied to get good agreement with the observations. This doesn’t mean that these indicators aren’t still the most important factors (they are), just that they tend to be a little less accurate in the semi-finals than in the finals. As such, the variables included in the projection are, in order of importance,

  1. Votefair
  2. WNTS
  3. Dialidol
  4. Whether audition was shown
  5. Total screen time

These variables are normalized year-to-year in just the same way that the finals model is (except audition, which is a categorical variable rather than a continuous one). The first three are in agreement with the finals model, which also has Votefair and WNTS as the most important indicators initially, with Dialidol tending to fail at a much higher rate. With all the adjustments made, a probability of advancing past the first vote is calculated. The history of seasons 5-11 with these probabilities calculated is displayed below:

Model probability of advancing (due to voting) vs whether the person actually advanced past the first voting round. This plot is interactive; mouse over points to see the name of the contestant it represents.

The horizontal axis shows the assigned model probability of advancing. Points are at the top if the person advanced and at the bottom if they did not (some vertical jitter has been added to make overlapping points visible).
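The normalization and probability steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the post does not specify how scores are normalized or which functional form turns them into a probability, so the within-season z-score, the logistic function, the function names, and every weight below are hypothetical placeholders rather than the actual IdolAnalytics model.

```python
import math
from statistics import mean, pstdev


def zscores_by_season(rows, key):
    """Z-score rows[i][key] within each season (assumed normalization)."""
    by_season = {}
    for r in rows:
        by_season.setdefault(r["season"], []).append(r[key])
    stats = {s: (mean(v), pstdev(v)) for s, v in by_season.items()}
    out = []
    for r in rows:
        mu, sd = stats[r["season"]]
        # A season with no spread carries no information; map it to 0.
        out.append((r[key] - mu) / sd if sd > 0 else 0.0)
    return out


# Hypothetical weights, ordered to mirror the stated order of importance.
# They are NOT fitted values from the post.
WEIGHTS = {
    "votefair": 1.5,
    "wnts": 1.2,
    "dialidol": 0.8,
    "audition_shown": 0.6,
    "screen_time": 0.4,
}
INTERCEPT = 0.0


def advance_probability(features, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic combination of normalized features (assumed form).

    `audition_shown` is categorical, so it enters as 0 or 1 and is
    not z-scored; the other features are within-season z-scores.
    """
    score = intercept + sum(weights[k] * features[k] for k in weights)
    return 1.0 / (1.0 + math.exp(-score))
```

With all normalized features at zero and no audition shown, this sketch returns 0.5; raising any feature raises the probability, which is the only behavior it is meant to demonstrate.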

There have been some big surprises. The model was fairly skeptical of Kris Allen and Michael Sarver, giving them only a 19% and 21% chance of advancing, respectively, yet both advanced. 66 people have been assigned probabilities lower than that, and none has ever advanced by voting (some advanced as the judges’ Wild Card picks, which is why you will see the names of some finalists if you mouse over all the people who did not advance).

On the other side, the model was surprised that Tyler Grady, Rudy Cardenas, Ashley Rodriguez, Colton Berry, Patrick Hall, and Jannell Wheeler did not advance. Tyler Grady was assigned an 84% chance of advancing, but didn’t. All 35 people assigned a higher probability advanced, without exception.

In terms of ranking, this model fits past rounds quite well. Of the 72 people who advanced, there were 8 mistaken calls, which makes the model about 89% accurate (64 of 72). The errors are normally within 3 percentage points, giving a rough confidence measurement.
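One way to arrive at a ranking accuracy like the one quoted above is to take the model’s top picks for a round and count how many of the people who actually advanced were among them. The sketch below assumes that reading of “mistaken calls”; the function name and the contestant labels are made up for illustration, and only the 72 and 8 come from the text.

```python
def top_k_hit_rate(probs, advanced, k):
    """Share of actual advancers who were among the model's top-k picks.

    probs:    dict mapping contestant name -> model probability
    advanced: set of names who actually advanced by voting
    k:        number of slots available in the round
    """
    picks = sorted(probs, key=probs.get, reverse=True)[:k]
    hits = sum(1 for name in picks if name in advanced)
    return hits / len(advanced)


# Hypothetical round with three slots (labels are placeholders).
example_probs = {"A": 0.9, "B": 0.8, "C": 0.6, "D": 0.4, "E": 0.3, "F": 0.1}
example_advanced = {"A", "B", "D"}
example_rate = top_k_hit_rate(example_probs, example_advanced, k=3)

# Aggregate figure from the text: 72 advancers, 8 mistaken calls.
overall_accuracy = (72 - 8) / 72
print(f"{overall_accuracy:.1%}")
```

In the toy round the model picks A, B, and C, so it misses D and scores two out of three. The aggregate line simply reproduces the arithmetic behind the quoted figure.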

In the two most recent years, the first voting round was the entire semi-finals. Unless Idol switches back to the old method of having several semi-final rounds (which I wish they would do), this model suffices to project who will be the finalists. I’ll be tracking the pre-exposure numbers (as I did last year) as they happen for the semi-finalists, whose names will hopefully be known before the first episode airs, thanks to Twitter.

  • mark houdek

    replace all three new judges or your show is doomed! For the first time ever we are done watching this show!

    • jessica_dennis

      Note: we really can’t reiterate enough that we have absolutely nothing to do with the show other than as spectators. Just sayin’! But yeah, it’s difficult not to have some serious doubts about these judges’ ability to steer the public, if steering the public is even really possible.