As we head into the finale, it’s time to look at how predictable the season has been, particularly with regard to the three prediction models on this site. The first is the semi-finals model, which predicted who would make the finals. The second is the finals model, a week-to-week guess as to who would be safe, in the bottom group, or eliminated. The third is the Top 3 Tracker, which tried to guess who the Top 3 would be.
Semi-finals
Here is the semi-finals projection as posted before the results were announced.
Name | Probability of advancing | Advanced? |
---|---|---|
Lazaro Arbos | 0.893 | Yes |
Devin Velez | 0.856 | Yes |
Burnell Taylor | 0.804 | Yes |
Curtis Finch | 0.730 | Yes |
Paul Jolley | 0.591 | Yes |
Charlie Askew | 0.575 | No |
Vince Powell | 0.403 | No |
Nick Boddington | 0.126 | No |
Cortez Shaw | 0.012 | No |
Elijah Liu | 0.010 | No |
Angela Miller | 0.936 | Yes |
Candice Glover | 0.830 | Yes |
Kree Harrison | 0.650 | Yes |
Amber Holcomb | 0.447 | Yes |
Janelle Arthur | 0.446 | Yes |
Adriana Latonio | 0.418 | No |
Breanna Steer | 0.412 | No |
Tenna Torres | 0.350 | No |
Zoanette Johnson | 0.343 | No |
Aubrey Cleland | 0.169 | No |
Green means the model predicted the person would advance to the finals. Red means the opposite. Yellow means too close to call.
The model was confident that 7 contestants would advance, and all of them did. It was confident that 7 others would not advance, and none of those did. It was unsure of 6, and half of those advanced. All 10 of the highest-ranked contestants (the top five in each group of ten) advanced, which is really just dumb luck. Because the model was so accurate, I’m led to believe that the margin of error I allowed for was perhaps too large, and more names should have been in green or red instead of yellow.
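The claim that the highest-ranked contestants all advanced can be checked directly against the table. The snippet below is a minimal sketch: the numbers are transcribed from the table, while the split into `group_a` and `group_b` and the helper name `top_n_all_advanced` are my own assumptions, inferred from the point where the ranking restarts at Angela Miller.

```python
# Semi-final projections transcribed from the table above:
# (name, probability of advancing, advanced?)
# ASSUMPTION: the two groups are inferred from where the ranking restarts.
group_a = [
    ("Lazaro Arbos", 0.893, True),
    ("Devin Velez", 0.856, True),
    ("Burnell Taylor", 0.804, True),
    ("Curtis Finch", 0.730, True),
    ("Paul Jolley", 0.591, True),
    ("Charlie Askew", 0.575, False),
    ("Vince Powell", 0.403, False),
    ("Nick Boddington", 0.126, False),
    ("Cortez Shaw", 0.012, False),
    ("Elijah Liu", 0.010, False),
]
group_b = [
    ("Angela Miller", 0.936, True),
    ("Candice Glover", 0.830, True),
    ("Kree Harrison", 0.650, True),
    ("Amber Holcomb", 0.447, True),
    ("Janelle Arthur", 0.446, True),
    ("Adriana Latonio", 0.418, False),
    ("Breanna Steer", 0.412, False),
    ("Tenna Torres", 0.350, False),
    ("Zoanette Johnson", 0.343, False),
    ("Aubrey Cleland", 0.169, False),
]


def top_n_all_advanced(group, n=5):
    """Return True if the n highest-probability contestants all advanced."""
    ranked = sorted(group, key=lambda row: row[1], reverse=True)
    return all(advanced for _, _, advanced in ranked[:n])


# Ranking within each group: the top five of each group all advanced.
per_group_ok = top_n_all_advanced(group_a) and top_n_all_advanced(group_b)

# Pooling both groups into one ranking gives a different answer, because
# Charlie Askew (0.575, did not advance) outranks two who did advance.
pooled_ok = top_n_all_advanced(group_a + group_b, n=10)

print(per_group_ok, pooled_ok)  # True False
```

The ranking only looks perfect when each group is ranked separately, which is how the "all 10" figure should be read.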
Finals
Here is a listing of all the model calls for the Top 10 through the Top 3. Green means safe, yellow means unsure. Red means one of two things: either the model was confident that the person would be in the bottom group, or it is not confident, but the person is ranked within the bottom 3 or 2 names, so I forced it to be red. I’ve noted when this happened. (I forced there to be a projected bottom 3 for obvious reasons; people come to the website to see who is the likeliest bottom 3.)
Contestant | Not-safe probability | Result | Forced? |
---|---|---|---|
Top 10 | |||
Curtis Finch | 0.407 | Eliminated | No |
Janelle Arthur | 0.383 | 6th | Yes |
Paul Jolley | 0.367 | Bottom 3 (8th) | Yes |
Burnell Taylor | 0.358 | 7th | |
Devin Velez | 0.336 | Bottom 3 (9th) | |
Lazaro Arbos | 0.326 | 4th | |
Amber Holcomb | 0.276 | 5th | |
Kree Harrison | 0.244 | Top 3 | |
Candice Glover | 0.226 | Top 3 | |
Angela Miller | 0.078 | Top 3 | |
Top 9 | |||
Paul Jolley | 0.481 | Eliminated | No |
Burnell Taylor | 0.445 | Safe | Yes |
Lazaro Arbos | 0.432 | Safe | Yes |
Devin Velez | 0.420 | Bottom 3 | |
Janelle Arthur | 0.363 | Safe | |
Amber Holcomb | 0.352 | Bottom 3 | |
Kree Harrison | 0.217 | Safe | |
Candice Glover | 0.201 | Safe | |
Angie Miller | 0.089 | Safe | |
Top 8 | |||
Devin Velez | 0.602 | Eliminated | No |
Burnell Taylor | 0.567 | Bottom 3 | No |
Lazaro Arbos | 0.536 | Bottom 3 | No |
Janelle Arthur | 0.380 | Safe | |
Amber Holcomb | 0.302 | Safe | |
Kree Harrison | 0.235 | Safe | |
Candice Glover | 0.228 | Safe | |
Angie Miller | 0.150 | Safe | |
Top 7 | |||
Burnell Taylor | 0.725 | Eliminated | No |
Lazaro Arbos | 0.658 | Top 3 | No |
Janelle Arthur | 0.491 | Bottom 2 | |
Candice Glover | 0.340 | Safe | |
Amber Holcomb | 0.335 | Safe | |
Kree Harrison | 0.313 | Top 3 | |
Angie Miller | 0.138 | Top 3 | |
Top 6 | |||
Janelle Arthur | 0.570 | Middle 2 | No |
Lazaro Arbos | 0.467 | Eliminated | Yes |
Amber Holcomb | 0.433 | Bottom 2 | |
Kree Harrison | 0.244 | Top 2 | |
Angie Miller | 0.166 | Middle 2 | |
Candice Glover | 0.119 | Top 2 | |
Top 5 | |||
Janelle Arthur | 0.633 | Eliminated | No |
Amber Holcomb | 0.401 | Safe | Yes |
Kree Harrison | 0.376 | Bottom 2 | |
Candice Glover | 0.347 | Safe | |
Angie Miller | 0.244 | Safe | |
Top 4 (ii) | |||
Candice Glover | 0.565 | Safe | No |
Kree Harrison | 0.485 | Safe | |
Angie Miller | 0.479 | Safe | |
Amber Holcomb | 0.472 | Eliminated | |
Top 3 | |||
Angie Miller | 0.382 | Eliminated | No |
Candice Glover | 0.309 | Safe | |
Kree Harrison | 0.309 | Safe | |
I’ll break the assessment off into several categories.
Safe calls
There were 27 safe calls (in green), and 25 of those contestants were in fact safe (neither in the bottom group nor eliminated). Janelle in the Top 7 and Amber in the Top 9 were the only exceptions. That is an accuracy of about 93%. Of the people called safe, none was eliminated.
Bottom group calls
Across all of the projected bottom-group calls (all names in red), the accuracy is not particularly impressive (nor was it designed to be). If I ignore the margin of error, there were 17 bottom-group calls, and only 10 of those were correct, a rate of about 59%. One reason this is a little lower than it would otherwise be is that the bottom-group reveals this year have been remarkably reduced. In addition to cutting out the Top 12 and Top 11 rounds, we were only told the bottom 2 in the Top 6 and Top 7, which is not normal.
Now, taking into account the margin of error, there were 11 people who were declared red confidently (not forced by me), and 8 of those were in the bottom group, a respectable rate of 73%. Only 3 people who were confidently red ended up being safe. Note that this is lower than in previous years (when the rate was more like 87%), so the bottom group has been less predictable than average this year.
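The two bottom-group rates can be recomputed from the table. This is a rough sketch under two assumptions of mine: a row counts as red whenever its "Forced?" cell is filled in, and a call counts as correct when the result reads "Eliminated" or "Bottom ...". The names `red_calls`, `in_bottom_group`, and `hit_rate` are illustrative only.

```python
# Every red (projected bottom-group) call from the finals table:
# (round, contestant, result, forced?)
# ASSUMPTION: a row is red exactly when its "Forced?" cell is filled in.
red_calls = [
    ("Top 10", "Curtis Finch", "Eliminated", False),
    ("Top 10", "Janelle Arthur", "6th", True),
    ("Top 10", "Paul Jolley", "Bottom 3 (8th)", True),
    ("Top 9", "Paul Jolley", "Eliminated", False),
    ("Top 9", "Burnell Taylor", "Safe", True),
    ("Top 9", "Lazaro Arbos", "Safe", True),
    ("Top 8", "Devin Velez", "Eliminated", False),
    ("Top 8", "Burnell Taylor", "Bottom 3", False),
    ("Top 8", "Lazaro Arbos", "Bottom 3", False),
    ("Top 7", "Burnell Taylor", "Eliminated", False),
    ("Top 7", "Lazaro Arbos", "Top 3", False),
    ("Top 6", "Janelle Arthur", "Middle 2", False),
    ("Top 6", "Lazaro Arbos", "Eliminated", True),
    ("Top 5", "Janelle Arthur", "Eliminated", False),
    ("Top 5", "Amber Holcomb", "Safe", True),
    ("Top 4 (ii)", "Candice Glover", "Safe", False),
    ("Top 3", "Angie Miller", "Eliminated", False),
]


def in_bottom_group(result):
    """ASSUMPTION: a call is correct if the result is an elimination or a bottom-group finish."""
    return result.startswith(("Eliminated", "Bottom"))


def hit_rate(calls):
    """Return (hits, total, rounded percentage) for a list of red calls."""
    hits = sum(in_bottom_group(result) for _, _, result, _ in calls)
    return hits, len(calls), round(100 * hits / len(calls))


all_red = hit_rate(red_calls)
confident_red = hit_rate([call for call in red_calls if not call[3]])

print(all_red)        # (10, 17, 59)
print(confident_red)  # (8, 11, 73)
```

Under that reading, the forced calls account for most of the misses: only 2 of the 6 forced reds landed in the bottom group.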
Eliminations
Out of 8 eliminations, 7 of the people eliminated were in red (88% accuracy). The person with the highest not-safe probability was eliminated 6 out of 8 times (75% accuracy). The person eliminated had either the highest or second-highest not-safe probability 7 out of 8 times (88% accuracy). The person eliminated never appeared in green.
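The rank-based elimination figures follow from the order of the rows in the finals table. Here is a small sketch, assuming each round's rows are listed in descending order of not-safe probability (which the table values bear out); first names are used for brevity, and `rounds` and `ranks` are hypothetical names of mine.

```python
# Contestants per round in table order (descending not-safe probability),
# paired with the contestant who was actually eliminated that round.
rounds = {
    "Top 10": (["Curtis", "Janelle", "Paul", "Burnell", "Devin",
                "Lazaro", "Amber", "Kree", "Candice", "Angie"], "Curtis"),
    "Top 9": (["Paul", "Burnell", "Lazaro", "Devin", "Janelle",
               "Amber", "Kree", "Candice", "Angie"], "Paul"),
    "Top 8": (["Devin", "Burnell", "Lazaro", "Janelle", "Amber",
               "Kree", "Candice", "Angie"], "Devin"),
    "Top 7": (["Burnell", "Lazaro", "Janelle", "Candice", "Amber",
               "Kree", "Angie"], "Burnell"),
    "Top 6": (["Janelle", "Lazaro", "Amber", "Kree", "Angie",
               "Candice"], "Lazaro"),
    "Top 5": (["Janelle", "Amber", "Kree", "Candice", "Angie"], "Janelle"),
    "Top 4 (ii)": (["Candice", "Kree", "Angie", "Amber"], "Amber"),
    "Top 3": (["Angie", "Candice", "Kree"], "Angie"),
}

# Rank 1 means the model gave that contestant the highest not-safe probability.
ranks = {name: order.index(gone) + 1 for name, (order, gone) in rounds.items()}

ranked_first = sum(rank == 1 for rank in ranks.values())
ranked_top_two = sum(rank <= 2 for rank in ranks.values())

print(ranked_first, ranked_top_two, len(ranks))  # 6 7 8
```

The single miss outside the top two is the Top 4 (ii) round, where the eliminated contestant had the lowest not-safe probability of the four.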
Most surprising results
As measured by the ranking error (in percentage points), the biggest misses (in descending order) for the model were the following.
- In the Top 7, the model was quite surprised that Lazaro was safe and Janelle was in the bottom 2.
- Amber being in the bottom 3 in the Top 9 was also very surprising.
- In the Top 6, the model was pretty surprised that Janelle was not in the bottom 2.
Top 3 Tracker
Here is the time series for all rounds showing the Top 3 Tracker’s assigned probability of making the Top 3.
The eventual Top 3 (Angie, Kree, and Candice) were always the three highest rated. This is obviously the most important measure.
The contestant eliminated in each round was the lowest ranked on the Top 3 Tracker (T3T) 6 of 8 times (75% accuracy). The person eliminated was either lowest or second-lowest rated 100% of the time, meaning that for elimination predictions the T3T was actually better than the finals model. For the bottom group, the accuracy of the T3T was about 69%, again better than the finals model itself. This is because the T3T uses an averaging scheme to smooth out week-to-week swings. However, it is also partly luck. In some seasons, incorporating averaging has actually hurt prediction accuracy.
Reuben is the statistician and lead writer at IdolAnalytics. Follow him on Twitter. His personal site (run all year round) is IdleAnalytics.