Category Archive: Post-Game

May 09 2013

Top 3 post-game

Sometimes models go against your gut because they’re wrong, but sometimes the reverse is true. As hard as it was to believe, the data pointed to Angie’s departure. Though she was hugely popular on Votefair, that index isn’t nearly as reliable as Dialidol in this round, which is why Dialidol factors heavily into the model’s predictions in the Top 3.

The model’s ranking accuracy in the Top 3 rises to 75% (6 out of 8 correct). As I said in the Top 3 Tracker methodology, it’s quite hard to predict what happens in the Top 3 early on, even though it’s pretty simple to determine who the Top 3 will be. All year the Top 3 Tracker had Angie, Candice, and Kree as the most likely Top 3. Angie always appeared on top.

The closest historical analog to this situation was Danny Gokey. Like Angie, he appeared to be headed for the finale in the early rounds. But his support suddenly flagged in the Top 3, and out he went. Angie’s Dialidol numbers had been low for weeks, indicating that while she may be popular, she didn’t draw rabid voting in the way Kree and Candice did.

However, if you want to credit one thing for sinking Angie, it was perhaps the judges and the documentary-length home visit clips of the week. Kree’s was heart-rending and Candice’s uplifting. The judges lavished praise on Candice (not that she necessarily didn’t deserve it) in a way that they did not for Angie.

I’ll have an overview of the model’s accuracy this year coming within the week, but it’s been a pretty decent run.

Apr 19 2013

Top 5 post-game

Janelle heads home, as predicted. Kree and Amber were quite close in the prediction, but it was a bit of a surprise to see Kree in the bottom 2 (though the model refused to call her safe, so there was reason to worry).

For anyone curious, at the beginning of this season I had to make a judgement call. The model’s parameters are tweaked on the assumption that a Top 12 is progressively winnowed to a winner. But this year there was only a Top 10. So the question is: do you count the Top 10 as round 3 or as round 1? There’s no way to answer this question on an empirical basis, since there hasn’t been a Top 10 before (except in season 1, but the model only considers seasons 5 and later). My reasoning was that the voting works on calcification of voting patterns, and that those patterns just take time to emerge, regardless of the number of contestants. So I ran it as if the Top 10 was round 1.

Tonight provided a test, but the answer was inconclusive. If I had run the numbers as if this was the 8th final round, Janelle would still have been last, but next would be Candice, and that isn’t correct either. That is, if Dialidol (which had Amber 1st) is given more weight, it sends Candice (who was second-lowest on Dialidol) into the bottom 2, so there was no improving the ranking accuracy by tweaking parameters.
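The two ways of numbering the rounds can be made concrete with a small sketch. It assumes exactly one elimination per week, and the function name `round_index` is made up for illustration; it is not part of the model itself.

```python
def round_index(contestants_left: int, first_round_size: int) -> int:
    """Return the 1-based round number for a given week.

    Assumes one contestant leaves per week, so the round number is just
    how far the field has shrunk from the size treated as round 1.
    """
    if not 1 <= contestants_left <= first_round_size:
        raise ValueError("contestants_left must be between 1 and first_round_size")
    return first_round_size - contestants_left + 1


# Convention A: the season is treated as if it had started from a Top 12.
print(round_index(10, first_round_size=12))  # Top 10 counted as round 3
print(round_index(5, first_round_size=12))   # Top 5 counted as round 8

# Convention B: the Top 10 itself is treated as round 1.
print(round_index(10, first_round_size=10))  # Top 10 counted as round 1
print(round_index(5, first_round_size=10))   # Top 5 counted as round 6
```

Under convention A the Top 5 is the 8th final round, which is the alternative run described above; under convention B, the one actually used, it is the 6th.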

Now, of course, we have to wonder what the show is going to do for the extra week. A lot of people (including myself, eventually) were wrong that the save was a cinch. Perhaps a filler or charity show, or maybe the finale just gets moved up (this is doubtful, since they have likely scheduled a number of famous acts already for that week).

Apr 12 2013

Top 6 post-game

| Name | Song | WNTS (avg.) | DialIdol | VoteFair | Not-safe Probability | Accurate? |
|---|---|---|---|---|---|---|
| Janelle Arthur | I’ll Never Fall in Love Again / The Dance | 45 | 0.957 | 7 | 0.570 | No (safe, 3rd-4th) |
| Lazaro Arbos | Close to You / Angels | 7.5 | 3.519 | 7 | 0.467 | Yes (eliminated) |
| Amber Holcomb | I Say A Little Prayer / Love On Top | 55 | 2.492 | 10 | 0.433 | No call (5th) |
| Kree Harrison | What the World Needs Now / Help Me Make it Through the Night | 70 | 2.951 | 20 | 0.244 | Yes (1st-2nd) |
| Angie Miller | Anyone who had a Heart / Love Came Down | 47.5 | 2.559 | 28 | 0.166 | Yes (3rd-4th) |
| Candice Glover | Don’t Make Me Over / Lovesong | 90 | 4.623 | 28 | 0.119 | Yes (1st-2nd) |

As I said last week, there was no reason to think that Lazaro being in the Top 3 meant he would not be eliminated this week. Also, as I noted a couple weeks ago, there was no reason to think that the save had to be used. The judges made the right call not to save Lazaro. There was no reason to keep him in the contest for another week, forcing a double elimination next week that might eliminate two people other than him.

I think the saddest part of this was that Lazaro was singing objectively much worse than earlier on. Demoralized and out of his depth, he didn’t even live up to the standard he had set at the beginning of the year.

My guess is that Janelle was 4th, making her out of order by 2 positions in the model’s ranking. Amber was off by one place in the ranking, as were Kree and Angie. This is the second time Janelle has overcome the odds, indicating she may have a bigger base of voters than it would appear.

Angie is in a slump. She’s fallen out of first place in the model’s standings, a place she had occupied all year, though she is still likely in the top 3 in the voting. Candice put in a couple powerhouse performances, and it could be that her position at the top is fleeting. However, I no longer consider Angie a lock for the finale.

Apr 05 2013

Top 7 post-game

There’s no good way to score tonight, since only the bottom 2, rather than the bottom 3, was revealed. We know that either Amber or Candice would have been in the bottom 3, but not which. The most probable person to be eliminated, Burnell, was. Janelle appeared in the bottom 3. Lazaro emphatically did not: he was in the top 3.

A few weeks back, when the first top 3 ever was revealed, I wondered aloud about how volatile the voting can be. Tonight we saw an example of it being pretty volatile indeed. Lazaro went from at best 6th place last week to at least 3rd place this week. That means he jumped at least 3 positions and at most 6 positions in the vote standings.
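The bounds on that jump are simple interval arithmetic. As a minimal sketch (the helper `jump_bounds` is a made-up name, and the worst case of 7th for last week is an assumption: he was in the bottom 3 of 8, and the 8th-place finisher went home):

```python
from typing import Tuple


def jump_bounds(before: Tuple[int, int], after: Tuple[int, int]) -> Tuple[int, int]:
    """Given (best, worst) possible places before and after, return the
    (smallest, largest) number of places gained."""
    best_before, worst_before = before
    best_after, worst_after = after
    smallest = best_before - worst_after   # started as high and ended as low as possible
    largest = worst_before - best_after    # started as low and ended as high as possible
    return smallest, largest


last_week = (6, 7)  # bottom 3 of the Top 8, but not the eliminated 8th place (assumption)
this_week = (1, 3)  # somewhere in the top 3 of the Top 7

print(jump_bounds(last_week, this_week))  # (3, 6)
```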

This suggests a couple things. First, if someone goes from having a maligned song (last week) to a relatively well-praised one (this week), the voting can change a lot. I would argue this is a good thing in a singing competition. Second, it means that Lazaro can absolutely be eliminated next week.

Here is a list of all Top 6 eliminations:

| Season | Contestant | Bottom group previously? | Bottom group previous week? |
|---|---|---|---|
| 1 | Christina Christian | Yes | No |
| 2 | Carmen Rasmussen | Yes | Yes |
| 3 | John Stevens | Yes | No |
| 4 | Constantine Maroulis | No | No |
| 5 | Kellie Pickler | No | No |
| 6 | Phil Stacey | Yes | No |
| 6 | Chris Richardson | Yes | No |
| 7 | Carly Smithson | Yes | No |
| 9 | Siobhan Magnus | No | No |
| 10 | Casey Abrams | Yes | No |
| 11 | Elise Testone | Yes | Yes |

Note that two contestants were eliminated in season 6’s Top 6 and season 8 had no Top 6.

A majority (6/11) were people who were safe in the Top 7 but had been in the bottom 3 before. Only 2/11 followed a bottom 3 appearance with being eliminated, and 3/11 were totally unexpected eliminations. So, if you’re pining for a Lazaro exit, there’s no reason yet to despair.
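For anyone who wants to check those fractions, here is a short tally of the list above. The data are transcribed from the table; the category labels and the `classify` helper are just illustrative names.

```python
from collections import Counter

# (season, contestant, in a bottom group previously?, in the bottom group the previous week?)
TOP6_ELIMINATIONS = [
    (1, "Christina Christian", True, False),
    (2, "Carmen Rasmussen", True, True),
    (3, "John Stevens", True, False),
    (4, "Constantine Maroulis", False, False),
    (5, "Kellie Pickler", False, False),
    (6, "Phil Stacey", True, False),
    (6, "Chris Richardson", True, False),
    (7, "Carly Smithson", True, False),
    (9, "Siobhan Magnus", False, False),
    (10, "Casey Abrams", True, False),
    (11, "Elise Testone", True, True),
]


def classify(previously: bool, previous_week: bool) -> str:
    """Sort an elimination into one of the three groups discussed in the text."""
    if previous_week:
        return "bottom group the week before"
    if previously:
        return "safe the week before, bottom group earlier"
    return "never in a bottom group"


tally = Counter(classify(prev, week) for _, _, prev, week in TOP6_ELIMINATIONS)
total = len(TOP6_ELIMINATIONS)
for label, count in tally.most_common():
    print(f"{label}: {count}/{total}")
```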

Mar 29 2013

Top 8 post-game

| Name | Song | WNTS | DialIdol | VoteFair | Not-safe Probability | Accurate? |
|---|---|---|---|---|---|---|
| Devin Velez | The Tracks of My Tears | 45 | 0.302 | 2 | 0.602 | Yes (eliminated) |
| Burnell Taylor | My Cherie Amour | 45 | 0 | 4 | 0.567 | Yes (bottom 3) |
| Lazaro Arbos | For Once In My Life | 26 | 2.813 | 4 | 0.536 | Yes (bottom 3) |
| Janelle Arthur | You Keep Me Hangin’ On | 80 | 1.677 | 10 | 0.380 | Yes (safe) |
| Amber Holcomb | Lately | 65 | 4.789 | 13 | 0.302 | Yes (safe) |
| Kree Harrison | Don’t Play That Song | 73 | 1.407 | 21 | 0.235 | Yes (safe) |
| Candice Glover | I Heard It Through the Grapevine | 76 | 5.808 | 17 | 0.228 | Yes (safe) |
| Angie Miller | Shop Around | 38 | 2.580 | 31 | 0.150 | Yes (safe) |

Not much to say. The initial (and final) conclusion of the model was spot on. 8/8 calls correct.

So far the finals model is 100% on ranking eliminated people, and 93% on safe calls, though only 67% on bottom 3 calls (after the battering it took last week). Some of this is, of course, just plain luck.

Random thoughts:

By next week we will know which of the women is the weakest of the bunch. Amber is the only one who has been in the bottom 3 so far, but that may not be the only determiner. Janelle has gotten some bad reviews, but so far has shown no sign that she is weak (my Top 3 Tracker notwithstanding).

If Blake Lewis killed the “curse” of You Keep Me Hangin’ On, Janelle put the nail in the coffin. I rewatched one of those who was eliminated from that song, and let me tell you, the performance in that case was not good. Janelle did a good job, so she was safe. Simple, no curse. Song choice isn’t everything.

Speaking of curses, let’s talk about the curse of the women on Idol. It ain’t looking so good now, is it? I admit, it’s a pretty big turnaround, especially after Season 10’s serial elimination of women at the beginning. Now, yes, these women are way better, not just than the men this year, but better than the women were that year. The fact that the men this year are kind of terrible probably is a big factor. The execrable male group number where they all blamed each other afterward last night was a symptom of the dysfunction: the men are flailing, out of their depth.

Unfortunately, we have no idea whether the WGWG phenomenon is finally dead, since all WGWGs were (perhaps deliberately) excluded this year. I have no problem with that, as the phenomenon was, to me, bad overall for the contest. In fact, Janelle playing guitar last night was the first time a contestant has played an instrument since Angie last did it, which is peculiar. I wonder if the producers aren’t perhaps dissuading the contestants from doing so. I, for one, don’t miss it much. Though Angie might benefit somewhat from playing, she hasn’t seemed to have any trouble surviving so far.

Mar 22 2013

Top 9 post-game

| Name | Song | WNTS | DialIdol | VoteFair | Not-safe Probability | Accurate? |
|---|---|---|---|---|---|---|
| Paul Jolley | Eleanor Rigby | 39 | 1.535 | 1 | 0.481 | Yes (eliminated) |
| Burnell Taylor | Let It Be | 51 | 1.776 | 2 | 0.445 | No (safe) |
| Lazaro Arbos | In My Life | 11 | 0.951 | 7 | 0.432 | No (safe) |
| Devin Velez | The Long And Winding Road | 50 | 2.732 | 3 | 0.420 | No call |
| Janelle Arthur | I Will | 69 | 0.371 | 7 | 0.363 | Yes (safe) |
| Amber Holcomb | She’s Leaving Home | 54 | 0.493 | 9 | 0.352 | No (bottom 3) |
| Kree Harrison | With A Little Help From My Friends | 77 | 1.170 | 18 | 0.217 | Yes (safe) |
| Candice Glover | Come Together | 80 | 1.869 | 19 | 0.201 | Yes (safe) |
| Angie Miller | Yesterday | 68 | 5.993 | 34 | 0.089 | Yes (safe) |

Paul Jolley was predicted eliminated, and that happened, so if that’s all you care about, great.

Otherwise, it was kind of a dismal night for predictions. Lazaro and Burnell showed they have more staying power than their numbers would suggest. Amber Holcomb was in the bottom 3, which was a bit of a surprise. Clearly Dialidol is underrating Lazaro, and Votefair is overrating Amber, relative to the voting public.

I find it disturbing that any of these women could be in the bottom 3 compared to these men.

I’ve noticed a couple weird things. Votefair seems to be polling a relatively small number of people (440 in this case), which is down markedly from last year (the top 9 last year had 707 votes). I don’t know whether that site is becoming less popular in general, or whether these contestants aren’t inspiring the kind of rabid fanbase that votes in such online polls, but it’s noteworthy. Also, Dialidol’s numbers at around midnight eastern time don’t change from then until the morning. Is Dialidol not counting any votes after 9pm PST? If that’s true, I can’t fathom why. Since I’m not active on their forum, I can’t easily see whether something has changed.

Mar 15 2013

Decent showing for the model in the Top 10

| Name | Song | WNTS | DialIdol | VoteFair | Not-safe Probability | Accurate? |
|---|---|---|---|---|---|---|
| Curtis Finch | I Believe | 24 | 0 | 2 | 0.407 | Yes (eliminated) |
| Janelle Arthur | Gone | 43 | 0.544 | 2 | 0.383 | No (6th) |
| Paul Jolley | Amazed | 47 | 0.563 | 3 | 0.367 | Yes (bottom 3, 8th) |
| Burnell Taylor | Flying Without Wings | 50 | 0 | 4 | 0.358 | No call |
| Devin Velez | Temporary Home | 46 | 0.563 | 6 | 0.336 | No call |
| Lazaro Arbos | Breakaway | 31 | 5.171 | 5 | 0.326 | Yes (4th) |
| Amber Holcomb | A Moment Like This | 71 | 0.139 | 10 | 0.276 | Yes (5th) |
| Kree Harrison | Crying | 64 | 0.515 | 14 | 0.244 | Yes (Top 3) |
| Candice Glover | I (Who Have Nothing) | 89 | 0.417 | 14 | 0.226 | Yes (Top 3) |
| Angela Miller | I Surrender | 68 | 3.266 | 41 | 0.078 | Yes (Top 3) |

When I designed the finals model, I built it to do several things. My first priority was that it be based on sound principles, and not just some overfit, ad-hoc mess. The second was that it be as accurate as possible on ranking. The natural outcome of this was to produce probabilities of being in the bottom group (typically the bottom 3), and then tweak the coefficients to produce the fewest ranking errors.

Finally, I didn’t want the model to be wishy-washy—I wanted it to take firm positions. I get kind of annoyed with sites like Dialidol that publish predictions where all the names are in yellow (meaning that Dialidol makes no official pronouncement). If you’re going to make a forecast, it should god damn well forecast something (Dialidol last night predicted that each contestant would be in the range from 1 to 10 …). The flip side of that is that your model can be totally wrong.

The model had a pretty good night. It got the person eliminated correct, it got 2/3 of the bottom 3, and it ranked 5 of the contestants as the best, and those 5 were the best. Angela, Candice, and Kree were predicted as the Top 3, and they were, though we don’t know whether it ranked them correctly relative to each other. Amber and Lazaro were out of order, but both in the correct group; I ain’t mad at it.

Devin Velez was one of the two that the model couldn’t call, and he was way out of order. Devin and Janelle should have switched places. Janelle was ranked 9th, but was actually 6th. You can’t win em all, but that is an irritating black spot in an otherwise bright record.
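To see where the "ranked 9th, but was actually 6th" figure comes from, the not-safe probabilities can be sorted into a predicted finishing order and compared with the places that were announced. This is only a sketch of that bookkeeping, not the model itself: the probabilities are copied from the table, only places stated explicitly there are used (with the eliminated contestant treated as 10th), and `predicted_places` is a made-up helper name.

```python
# Not-safe probabilities for the Top 10, copied from the table above.
NOT_SAFE = {
    "Curtis Finch": 0.407,
    "Janelle Arthur": 0.383,
    "Paul Jolley": 0.367,
    "Burnell Taylor": 0.358,
    "Devin Velez": 0.336,
    "Lazaro Arbos": 0.326,
    "Amber Holcomb": 0.276,
    "Kree Harrison": 0.244,
    "Candice Glover": 0.226,
    "Angela Miller": 0.078,
}

# Places stated explicitly in the results; the eliminated contestant is taken as 10th.
ACTUAL_PLACE = {
    "Curtis Finch": 10,
    "Janelle Arthur": 6,
    "Paul Jolley": 8,
    "Lazaro Arbos": 4,
    "Amber Holcomb": 5,
}


def predicted_places(not_safe):
    """Lowest not-safe probability is predicted 1st, highest is predicted last."""
    ordered = sorted(not_safe, key=not_safe.get)
    return {name: place for place, name in enumerate(ordered, start=1)}


places = predicted_places(NOT_SAFE)
for name, actual in ACTUAL_PLACE.items():
    print(f"{name}: predicted {places[name]}, actual {actual}, off by {abs(places[name] - actual)}")
```

Run on these numbers, Janelle comes out predicted 9th against an actual 6th, and Amber and Lazaro are each off by one place in opposite directions, which matches the discussion above.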

I’m intrigued by the amount of information we’re being given this year, but I don’t think that I can make much use of it. Without a historical perspective, it’s difficult to say what the rankings imply about the future. Can someone go from being in the top 3 vote-getters in the Top 10 to being 9th place in the Top 9? Who knows? We’ve never seen the results, so we don’t know how volatile the voting is. I do find all the new information a nice change of pace, though.

As an editorial note, I’m not sure how VoteForTheWorst is going to work this year. Lazaro was their pick, and while I may agree, is he really that bad? I don’t think he is. This Top 10 isn’t a lot of grist for their mill, I’m afraid.

Mar 10 2013

Why the model predicted the Top 10 correctly

The debut of the new semifinals model went off without a hitch. The model chose 100% correctly, somewhat better than I had hoped. I’d like to just go over how the model thought things went down.

Below is reprinted the Men’s prediction:

| Name | Pre-exposure (seconds) | Audition | WNTS | DialIdol | VoteFair | Probability of advancing |
|---|---|---|---|---|---|---|
| Lazaro Arbos | 1263 | Yes | 49 | 3.559 | 18 | 0.893 |
| Devin Velez | 524 | No | 71 | 4.693 | 17 | 0.856 |
| Burnell Taylor | 969 | Yes | 74 | 0 | 13 | 0.804 |
| Curtis Finch | 893 | Yes | 48 | 4.661 | 10 | 0.730 |
| Paul Jolley | 859 | Yes | 52 | 0.984 | 10 | 0.591 |
| Charlie Askew | 1115 | Yes | 9 | 5.641 | 12 | 0.575 |
| Vince Powell | 585 | Yes | 45 | 0 | 10 | 0.403 |
| Nick Boddington | 657 | No | 58 | 0.146 | 5 | 0.126 |
| Cortez Shaw | 569 | No | 33 | 0 | 2 | 0.012 |
| Elijah Liu | 351 | No | 42 | 0 | 4 | 0.010 |

As I wrote that night, the most probable Top 5 Men consisted of Lazaro, Devin, Burnell, Curtis, and Paul. At the beginning of the night, Charlie Askew occupied the 5th spot, with Paul in 6th, but with an update to the final Dialidol and Votefair numbers, that was reversed.

Notice that the various indices were divergent. WhatNotToSing ranked Nick Boddington as third best, but Dialidol showed him just above zero (6th place, ahead of only the four men who registered nothing at all), and he was just 8th according to Votefair. That, along with the fact that his audition was not shown, made him an unlikely finalist.

Dialidol, meanwhile, ranked Charlie Askew as being in first place. While this is often a strong indicator, it is less so in the semifinals. Last year, for instance, Dialidol thought Eben Frankewitz was a lock. But WNTS gave Charlie a 9 out of 100, dead last by a large margin, and Votefair showed him only as 4th most popular. Viewed in the aggregate, Charlie’s numbers were weak. That being said, the model, in my opinion, got lucky with this call. It could have easily gone to Charlie instead of Paul.

Lazaro was predicted most likely to advance based on decent to strong numbers on all indices. He was fourth on Dialidol, first on Votefair, and fifth on WNTS. Couple that with a high amount of pre-exposure, and the model assigned him a very high probability of advancing. That he was revealed last was possibly an indicator that he was way out ahead of the pack. At this point, he might be considered a front-runner among the men.

Finally, Burnell Taylor had a 0 on Dialidol. This was the kiss of death for the others with the same (Vince Powell, Cortez Shaw, and Elijah Liu). But Burnell had something those guys didn’t: he was third most popular according to Votefair and had the top WNTS score of the night. That, along with the pre-exposure considerations, made him favored to be included.
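The per-index standings quoted for the men can be recomputed from the table. The sketch below uses the table's numbers; the ranking rule (one plus the number of men with a strictly higher score, so ties share a place) and the helper name `rank_on` are assumptions made for illustration.

```python
# (WNTS, DialIdol, VoteFair) for each of the men, copied from the table above.
MEN = {
    "Lazaro Arbos": (49, 3.559, 18),
    "Devin Velez": (71, 4.693, 17),
    "Burnell Taylor": (74, 0.0, 13),
    "Curtis Finch": (48, 4.661, 10),
    "Paul Jolley": (52, 0.984, 10),
    "Charlie Askew": (9, 5.641, 12),
    "Vince Powell": (45, 0.0, 10),
    "Nick Boddington": (58, 0.146, 5),
    "Cortez Shaw": (33, 0.0, 2),
    "Elijah Liu": (42, 0.0, 4),
}
INDICES = ("WNTS", "DialIdol", "VoteFair")


def rank_on(index: str, name: str) -> int:
    """Place of `name` on one index: 1 + number of men with a strictly higher score."""
    col = INDICES.index(index)
    score = MEN[name][col]
    return 1 + sum(1 for other in MEN.values() if other[col] > score)


for name in ("Lazaro Arbos", "Charlie Askew", "Burnell Taylor"):
    standings = ", ".join(f"{index} #{rank_on(index, name)}" for index in INDICES)
    print(f"{name}: {standings}")
```

On these numbers Lazaro is 5th on WNTS, 4th on Dialidol, and 1st on Votefair, Charlie is last on WNTS but first on Dialidol, and Burnell tops WNTS while sitting 3rd on Votefair.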

Now the women:

| Name | Pre-exposure (seconds) | Audition | WNTS | DialIdol | Votefair | Probability of advancing |
|---|---|---|---|---|---|---|
| Angela Miller | 904 | Yes | 75 | 5.618 | 45 | 0.936 |
| Candice Glover | 912 | Yes | 88 | 3.371 | 9 | 0.830 |
| Kree Harrison | 723 | No | 80 | 2.333 | 13 | 0.650 |
| Amber Holcomb | 355 | No | 73 | 1.457 | 7 | 0.447 |
| Janelle Arthur | 923 | Yes | 49 | 0.131 | 3 | 0.446 |
| Adriana Latonio | 375 | No | 30 | 6.331 | 9 | 0.418 |
| Breanna Steer | 390 | Yes | 46 | 1.920 | 2 | 0.412 |
| Tenna Torres | 756 | Yes | 31 | 1.638 | 2 | 0.350 |
| Zoanette Johnson | 1277 | Yes | 7 | 1.899 | 8 | 0.343 |
| Aubrey Cleland | 363 | No | 52 | 0.223 | 4 | 0.169 |

If Lazaro can be considered the front-runner among men, I think it’s fair to say that Angela (Angie) Miller is the same among women. In fact, she may be, at this point, considered the front-runner of the contest. Her ranking among the indicators was 3rd on WNTS, 2nd on Dialidol, and 1st to a large degree on Votefair. This should be closely watched, because Votefair has severely overrated contestants in the past, such as Jessica Sanchez. But assuming it’s not being juked by rabid fans, Angela seems to be positioned well.

Candice Glover was another cinch for the Top 10. Third on Dialidol, tied for third on Votefair, and with the top WNTS score of the night, her advancement was not doubtful. I retain a bit of skepticism as to her staying power, as some similar singers, such as Mandisa, started strong but didn’t go the distance.

The model was not confident enough to call the contest for Amber and Janelle instead of Adriana and Breanna. Their probabilities were below the threshold where most errors occur. Adriana Latonio had a very strong Dialidol score (both of Dialidol’s picks for the top spot were wrong), but a poor WNTS score and a small amount of pre-exposure.

Janelle squeaked by according to the model, benefiting from particularly low WNTS scores for her competitors. She had weak numbers on Dialidol and Votefair, and she is at a disadvantage going into the finals. Without some kind of game changer, look for her speedy departure.

When I watched the show, I had thought Aubrey Cleland was going to skate by, but the model correctly repudiated my view. She showed no real sign of support on Dialidol and Votefair, and had a decent but unspectacular WNTS score.

Finally, we come to Zoanette. I’ve been saying throughout these rounds that I thought Zoanette had no chance with the voters. She was a goof, much like Norman Gentle, more a laugh-at-her-not-with-her contestant than a real contender. The judges may have tolerated that, even venerated it, but my feeling was that the audience would have little patience for it, and I guessed right. Other than the fact that she had a lot of screen time in the auditions, there was no reason in the numbers to think she would advance.

The three positions that the model was not confident on could, of course, have gone another way. Probabilities don’t imply any kind of certainty, and the most probable events frequently don’t occur. Thus, inasmuch as I use that as an excuse when the model is wrong, I must point out the 100% accuracy of the model this year was a bit of a fluke. It could easily have been only 70% accurate.

The model represents conventional wisdom, in my mind, and as such this has been a very conventional year, so far. There were no real surprises, or anything that makes you just shake your head and wonder how it happened. That’s good for someone trying to predict it, but it’s not necessarily good for the show.

May 10 2012

Top 4 Updated Predictions

Update: Dial Idol rankings were updated after 1am EST and now have Jessica Sanchez as their #1 vote getter. The projections have been updated accordingly.

| Contestant | WNTS Rating (avg) | Dialidol Rank | Previous Rating | Probability of Elimination (%) |
|---|---|---|---|---|
| Joshua Ledet | 69 | 4 | 72.5 | 40.67 |
| Phillip Phillips | 63.5 | 3 | 31.5 | 39.7 |
| Hollie Cavanagh | 52.5 | 2 | 79.5 | 13.37 |
| Jessica Sanchez | 74.5 | 1 | 61.5 | 6.25 |

Hollie Cavanagh has been predicted safe for a few weeks now, based on her Dialidol standings, but I’m not buying it tonight. With the lowest rated performances, bad reviews from the judges, and a near historical number of times in the Bottom 3, I cannot imagine she isn’t going home.

Why is the model so bad at figuring this out? With 11 data points (one per season), I don’t know what you can expect. Of course, if there was a systematic way to correct Dialidol’s ranking of Hollie to be in line with reality, I would do it, but you may as well just make up a number. There is no scientific way to do this (but check back with me in season 25, assuming I’m still alive for that).

Historically, Dialidol is quite accurate during the Top 4. However, the service has occasionally shown some blind spots, and this seems to be one of them. The model says Joshua has almost twice the chance of going home as Jessica. That’s nuts. It would be shocking if Hollie was not sent back to Texas/Liverpool tomorrow.

Mar 02 2012

Top 25 post-game assessment and analysis

Projection results

| Name | WNTS Approval | Dialidol | Probability of Advancing (%) | Result |
|---|---|---|---|---|
| Eben Frankewitz | 10 | 44.24 | 71.1 | |
| Chase Likens | 32 | 20.99 | 70.2 | |
| Adam Brock | 32 | 19.74 | 69.8 | |
| Jermaine Jones | 51 | 14.57 | 68.5 | |
| Joshua Ledet | 81 | 10.9 | 68.4 | |
| Jessica Sanchez | 82 | 3.08 | 57.2 | |
| Elise Testone | 84 | 0.12 | 54.5 | |
| Hollie Cavanagh | 76 | 2.52 | 54.2 | |
| Skylar Laine | 74 | 2.43 | 53.3 | |
| Shannon Magrane | 62 | 2.22 | 47 | |

Picking the top 10 from the Top 25 at random would get, on average, 40% of them right. The model instead got 70% correct, but missed badly on the top 3 men. Although the approval ratings would have suggested Eben was not going to be in the Top 10, his Dialidol score was truly humongous. Chase Likens and Adam Brock had nearly identical Dialidol scores, both quite large, with performance approval that was rather low. None of these 3 made it through.
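The 40% baseline follows from a one-line expectation. As a rough sketch, assuming the 10 picks are drawn uniformly from all 25 contestants (ignoring the separate men's and women's groups) and that exactly 10 advance, each pick is a true advancer with probability 10/25. The function name `expected_hit_rate` is made up for illustration.

```python
from fractions import Fraction


def expected_hit_rate(pool_size: int, slots: int) -> Fraction:
    """Expected share of correct picks when `slots` names are chosen uniformly
    at random from `pool_size` contestants and `slots` of them really advance.

    By linearity of expectation, each pick is correct with probability
    slots / pool_size, so the expected share correct is the same ratio.
    """
    return Fraction(slots, pool_size)


baseline = expected_hit_rate(pool_size=25, slots=10)
model = Fraction(7, 10)  # 7 of the model's 10 picks advanced

print(f"random baseline: {float(baseline):.0%}")  # 40%
print(f"model:           {float(model):.0%}")     # 70%
```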

Before you start knocking me for including Dialidol scores (which I perhaps deserve), note that going just on approval would have made Creighton Fraker and Jeremy Rosado the two men who rounded out the Top 5 guys after Phillip Phillips, Joshua Ledet, and Colton Dixon, so it’s not all bad. That would also have incorrectly knocked Jermaine Jones out of the forecast, and would have put Erika Van Pelt into the Top 5 girls, also incorrectly.

Fortunately, the model weighs these factors according to how accurate they’ve been in the past. So the top Dialidol woman (Brielle) was nonetheless still regarded (correctly) as not being in the Top 5 girls, because her WNTS rating was quite low. But no model that takes Dialidol into account is going to be robust to an errant Dialidol rating of 44.2. For comparison, Heejun registered a 0.24, which indicates that Dialidol thought that Eben did 184 times better than Heejun.

What happened to Dialidol this week? I confess I don’t know. When, last year, they kept predicting Scotty McCreery to sail through, I was incredulous. Sure enough, that proved prescient, as McCreery was a bulletproof contestant, advancing despite bad singing scores, and eventually claiming the title. So, you disregard Dialidol at your own peril. Except this time! I have no idea why Eben and Chase were so favored in the Dialidol sample. The service does more than just measure the busy signal, and actually measures votes from users, and maybe those users are disproportionately … what?! I can’t imagine.

As for Heejun Han, the evidence for his advancement to the finals is simply not there in the numbers. No number except pre-exposure indicated he would get through, and that alone is not usually enough. Indeed, Reed Grimm had a very large pre-exposure time, a nearly identical WNTS rating, and a larger Dialidol ranking, and was not in the Top 5 men. Go figure.

Giving credit where it’s due, the keeper of the Vote For the Worst Twitter feed correctly predicted all 10 of the vote winners. The power of a person going “on feeling” is in some cases very accurate, and this shouldn’t be shocking. If you had asked me whether I agreed that Eben or Chase would advance, I would have answered no. But I’m interested in predictability on a technical level, so I’m not going to juke the results until they match my gut. What would be the intellectual exercise in that?
