For those just hitting the site (now defunct), here is a compendium of the knowledge I’ve derived from American Idol. Much of the code used in making the models is on GitHub, as is the data in the database.
First, the models:
- A new model of American Idol, the main prediction methodology for finals rounds (i.e. what 90% of people care about)
- The semi-final model (when Dialidol was functional) and the post-Dialidol semi-final model
- The reasoning behind the Top 3 Tracker, used to project the Top 3 as the year went on
I am also particularly proud of the Statistical Snapshot of American Radio (methodology here, see all posts here), a large data-mining effort to quantify which songs are predominantly played on radio, by format and time period. I built some fairly sophisticated data visualization tools for these, also on GitHub, along with all the data.
On more Idol-related issues:
- A brief description of the Twitter Sentiment methodology (code here)
- The distinct advantage of singing last, quantifying how the “pimp spot” is a very good place to be
- How the (also now-defunct) VoteForTheWorst was funny but made bad choices about who to vote for (this post ruffled some feathers)
- A quantitative look at how great Dialidol was (pre-2012)
- An expansive, three-part series about how screen time during the audition rounds was a way for the producers to affect the contest (part 1, part 2, part 3)
- The save rule, and how it was mostly used to great effect
- Also, a link to one of my most cited posts, Americans love white guys with guitars
It wouldn’t be a website in the 2010s without some listicles:
- The most totally baffling runs in Idol history, a recounting of the amazing runs of luck that awful singers had
- On the flip side, the contestants who were seriously robbed
Finally, I spent many hours compiling video retrospectives of the first three seasons. You can watch them all here.