This is part 1 of 3 of the series. Part 1 describes how the data was collected and how total screen time influences how far a given contestant gets in the contest. Part 2 looks at how the initial audition affects the contest. Part 3 looks at how pre-exposure influences voting in the semi-finals.
“You know it’s all rigged, right?”
I bristled at the comment by a co-worker. He had been to an audition for America’s Got Talent, and was sure that the producers were controlling everything, from judges’ comments to the outcome of the contest itself. Reality TV, and in particular Idol, was just a sham.
Everything I knew about American Idol told me that this couldn’t be right. The voting patterns were not only consistent with themselves, but consistent with broad consumer choice. The results did not always go the way the producers would have wanted, driving down ratings and eventual record sales for the company—a notion not very consistent with the idea that the voting was rigged. And so many people are involved in Idol that surely ballot stuffing would have been exposed by now.
Although the above objections are valid, I couldn’t help but recognize that the producers do maintain some control of the process, through the editing of the show. They cut together the segments of the contestants before songs, and decide what new policies to implement that might help certain contestants (I’m thinking especially of James Durbin’s infamous flaming piano). More importantly, though, they have complete control over what is shown in the Auditions and Hollywood rounds.
Control over who is shown before the voting starts almost certainly has some causal effect on that voting. Name recognition, an important factor in political elections too, can lead a viewer to vote for someone they already know and like rather than for whoever gave the best performance of the night. The question is whether this really happens, how we would know, and how big the effect is.
Unfortunately for me, the data needed to answer that had not already been collected. The wonderful site What Not To Sing records in its database only whether someone was shown in a particular part of the show (auditions, Hollywood, promos), not how much the contestant was shown. It’s easy to see why, for reasons I’ll explain in a moment. In any case, it was clear that the only way to settle the issue would be to manually time and record screen time for the contestants, and see what the data said.
How I collected this data
Over 10 seasons, Idol has presented the auditions phase of the competition in a series of episodes starting in January (excepting the first season). While Season 1 had only 2 hours of such material, the show quickly expanded to 8 shows, and last season 12 shows. The total run time for these audition rounds for all 10 seasons, not including commercials, is a gargantuan 76 and a half hours.
So did I actually sit down and watch 76 hours of footage, tabulating the screen time for all the semifinalists? Yes. Yes I did.
It was a terrible slog, to be honest. Although some of the initial auditions can be skipped (on account of the person not having made it through to the semifinals), the entirety of the Hollywood weeks had to be viewed carefully to see if segments that appeared to be about one person in fact cut to another person. Within VLC I could run the video at 1.2x, helping slightly. The result was a large series of notes, of which the following is representative of my eventual state of mind:
Julia Demato – Ep1 20:56-22:35 1st audition. Shown with lyrics written on her hand. HW 15:59-16:15 in a group. Again at 16:42-16:54 waiting for her tardy group. 17:31-17:49 same. 19:36-19:45 Same. 19:51-20:02 SAME. 20:32-20:40 HOLY SHIT THE SAME. 21:18-21:22 OMFG. 22:55-23:03 bitching. 24:07-25:20. Holy fuck, in this segment she sits there while her sister YELLS AT HER GROUP. 42:14-42:51 after frenchie smoothed things over. 43:16-44:03. Man she has big cans. 54:04-54:26. 364s
I’d love to say that I could just run facial recognition software on the footage and let the computer do the work, but this was impossible. For one thing, I don’t have access to such software, and my guess is that even if I did, it would take me more than 76 hours to get it working. For another, not all footage of a contestant is created equal: a clip may have a contestant in the background, clearly not the focus of the segment. So I bit the bullet and did it by hand. Although no set of rules would be perfect, or account for all cases, the guidelines for counting were that a given clip had to
- be a named segment (where either Seacrest said the name or it appeared on screen)
- be 5 seconds or more in duration
- have the contestant substantively on screen
The first two are straightforward, but the third is a bit thorny. Consider the group rounds in Hollywood (which I now loathe more than al Qaeda). If a singer is the first to go, and has her name displayed, and then another member of the group takes over, should the subsequent footage count towards the first singer’s total or not? In some cases, the audience would be cognizant of her being there, and in some cases not. There were a fair number of judgement calls like this. In one instance, a contestant was named and shown briefly, and then the show cut to some people being sent home, but the contestant they showed briefly was singing over the subsequent segment! How do you count such a thing? In this case, I didn’t count it.
Once all such scenes were recorded, I calculated the total screen time for each semifinalist (in seconds). All of this was done using the original aired footage, not replays such as Idol Rewind which aired during later years and had footage cut in from recent interviews. Please don’t ask me where I obtained the footage.
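For anyone who wants to redo the arithmetic, here is a minimal sketch of how one set of notes can be turned into a total. It is not the tool I used; the function names and the `min_seconds` parameter are made up for illustration, and it assumes every clip is written as mm:ss-mm:ss within a single episode. The time ranges are copied from the sample note above.

```python
import re

# Matches clip ranges written as mm:ss-mm:ss, e.g. "20:56-22:35".
CLIP = re.compile(r"(\d+):(\d{2})-(\d+):(\d{2})")

def clip_lengths(note):
    """Return the length in seconds of every mm:ss-mm:ss range in a note."""
    lengths = []
    for m1, s1, m2, s2 in CLIP.findall(note):
        start = int(m1) * 60 + int(s1)
        end = int(m2) * 60 + int(s2)
        lengths.append(end - start)
    return lengths

def total_screen_time(note, min_seconds=0):
    """Sum clip lengths, optionally skipping clips shorter than min_seconds."""
    return sum(n for n in clip_lengths(note) if n >= min_seconds)

# Time ranges copied from the sample note above (commentary omitted).
note = (
    "Ep1 20:56-22:35 HW 15:59-16:15 16:42-16:54 17:31-17:49 19:36-19:45 "
    "19:51-20:02 20:32-20:40 21:18-21:22 22:55-23:03 24:07-25:20 "
    "42:14-42:51 43:16-44:03 54:04-54:26"
)

print(total_screen_time(note))                 # 364
print(total_screen_time(note, min_seconds=5))  # 360
```

Summing every range gives 364 seconds, which matches the total at the end of that note. Applying a strict 5-second minimum would drop the 4-second clip at 21:18-21:22 and give 360, so the rule evidently had some give in practice.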
How do the auditions affect the overall contest?
The question I was most interested in answering was whether the entirety of the Idol voting is moved by the pre-exposure, or whether it was just the first few rounds. To answer this, I looked at the pre-exposure time versus the number of voting rounds that the contestant survived (was not eliminated), displayed below along with a linear regression:
Red dots are winners of the contest.
There are many things to point out here. You can see the red dot furthest to the left, which is Kelly Clarkson. She emerged the victor through 9 rounds of voting despite never having been shown in the auditions. Most other winners, however, had to endure a full 12-14 rounds of voting to emerge as the new American Idol. This graph therefore comes with a large disclaimer: pooling the data in this way is not apples-to-apples, since there are many significant differences between the years. As I said before, the total amount of screen time has increased hugely, as has the total number of rounds. My guess is that Kelly Clarkson would not have 0 screen time if she were to audition this year.
So, my first conclusion is phew! It’s clearly possible to do quite well in the contest with little to no exposure in the auditions round, as is amply shown by Ms Clarkson, Fantasia, Bo Bice, and several other contestants, all shown in the upper left quadrant of the graph. Nor has there been any shortage of people who failed to parlay tons of exposure into meaningful advancement in the contest (e.g. Tatiana del Toro, Jordan Dorsey, Antonella Barba), shown in the bottom right quadrant. The producers don’t steer the ship.
Nevertheless, we cannot ignore the huge pileup of points in the lower left: those who got little to no pre-exposure and were merely cannon fodder for the semi-final rounds. And, in the end, there clearly is a correlation between screen time and success in Idol. The regression line is weakly positive and, though it may not look it, statistically significant. To declare that screen time is unrelated to success in the overall contest would be wrong. But looking at this, it’s hard to say that the overall effect amounts to even 10% of the voting outcome (indeed, the R-squared value is about 8%, meaning screen time accounts for roughly 8% of the variation in how many rounds a contestant survived). Overall, through this mechanism, the producers have a finger slightly on the scale.
Not that all footage is created equal. Those contestants in the lower right were not always portrayed in a good light. Ms. del Toro was a raving lunatic who said that she was the only one who mattered, Antonella Barba commiserated with her friend who declared that God had eliminated a girl who was not a good person, and Jordan Dorsey was a fucking dick to the eventual winner, Scotty McCreery.
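If it seems odd that a line explaining only about 8% of the variation can still be statistically significant, the sketch below may help. It fits an ordinary least-squares line by hand and computes R-squared and the size of the slope's t statistic. The function names are made up, the two short lists are invented numbers and not my data, and the sample size of 200 is purely an assumption for illustration.

```python
from math import sqrt

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x, with R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

def slope_t_magnitude(r_squared, n):
    """Size of the t statistic for 'slope is zero' in a one-variable regression."""
    return sqrt(r_squared * (n - 2) / (1 - r_squared))

# Invented example values, NOT the real Idol data.
screen_seconds = [0, 0, 15, 30, 45, 60, 120, 200, 364, 500]
rounds_survived = [0, 9, 1, 0, 3, 2, 5, 1, 4, 8]
slope, intercept, r2 = linear_fit(screen_seconds, rounds_survived)
print(slope, intercept, r2)

# An R-squared of 0.08 with a hypothetical 200 contestants:
print(round(slope_t_magnitude(0.08, 200), 2))  # 4.15
```

A t statistic around 4 is well past the usual thresholds for significance, which is how a weak relationship can still be a real one once there are enough data points.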
Reason to worry
Despite the above observations, I am a little concerned about the future, given the following. Suppose I look just at the data from Season 10, last year:
All of a sudden, the effect doesn’t look so benign, does it?!
Now, proving causation is a bit more difficult than showing correlation, as you likely know. It could be that the producers merely pick the ones that they think the public will vote for, and the dog wags the tail. On the other hand, it’s tough not to get the feeling that the more time the producers have to play with, the more influential this effect has become. I’ll pick up this topic in part 3 of this series.
Next time: How the Idol contest is influenced by whether the first audition is shown