Friday, May 9, 2014


So I recently created a model which I thought was able to perfectly encapsulate the general fan's assessments of the majority of top ten American Idol contestants, starting from AI3 all the way to what we have now in AI13, independent of producer and judge manipulation. We start with AI3 because that's the first season where we are able to chart one of our metrics, as will be explained below. The result is the likely ranking of the contestants had all of them played on neutral ground.

I've been able to scour whatnottosing's databases for this project, utilizing four core metrics which I believe make up the general fan's opinion on the contestants. Three of these metrics are given equal weighting (each out of 100, after within-season normalization), while the fourth measures the consistency of values between these three measurements. Remember, this normalization is done WITHIN the contestant's respective season. It is not normalized across all Idol seasons, because I want to capture what the contestant was like within their season, relative to their competition. Without further ado, the metrics are:

1) Song freshness factor. This is my personal favorite metric, based off the song repeatability factor on Idol, the song age, and its freshness.
2) Contestant growth. WNTS uses a linear regression across contestant performances over the course of the season.
3) Contestant performance. Usually the standard to determine the value of the contestant, but just like you can't use points per game to determine the value of an NBA player, we have to consider other metrics. There's a huge reason why the best singers often do not win the show any given year.
4) Deviation between the first three. This acts as a further normalizer, so that one contestant is not gaining the majority of their points in one category while being very weak in other categories (ahem, Tim Urban, Kristy Lee Cook, etc).

So the first three metrics are the ones given equal weighting, and the deviation is subtracted from the sum of those three normalized scores. We come up with a consolidated value which I think captures the value of the contestant, named after the first initials of performance, growth, song freshness and consistency: the PGSC score. Due to the song freshness and growth weighting (growth is why we start at the top 10, because by then we have enough performances to measure it), it might also give some insight into the marketability of the contestant. Several trends emerge: a contestant that has a high consolidated score tends to be more marketable than one that does not. Usually, a very high score means 200 or above, and there are several contestants who have breached that mark. An exception can be the highest ranking contestant of that particular season. Obviously, no model is perfect and there are exceptions.
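
For the code-minded, here is a rough Python sketch of how a score like this could be put together. The exact normalization and deviation formulas aren't spelled out above, so min-max scaling and a population standard deviation are assumptions, and the contestant names and raw values are made up purely for illustration.

```python
from statistics import pstdev


def normalize_within_season(raw):
    """Rescale one metric to 0-100 across a single season's contestants.

    Assumption: min-max scaling; the post does not say which normalization is used.
    """
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        return {name: 50.0 for name in raw}
    return {name: 100.0 * (value - lo) / (hi - lo) for name, value in raw.items()}


def pgsc(performance, growth, freshness):
    """Sum of the three normalized metrics minus their spread.

    Assumption: the "deviation" is the population standard deviation of the three.
    """
    parts = [performance, growth, freshness]
    return sum(parts) - pstdev(parts)


# Hypothetical raw values for three made-up contestants in one season
raw_performance = {"A": 70.0, "B": 55.0, "C": 40.0}
raw_growth = {"A": 0.5, "B": 2.0, "C": -1.0}
raw_freshness = {"A": 3.0, "B": 9.0, "C": 1.0}

perf = normalize_within_season(raw_performance)
grow = normalize_within_season(raw_growth)
fresh = normalize_within_season(raw_freshness)

scores = {name: round(pgsc(perf[name], grow[name], fresh[name]), 1) for name in perf}
print(scores)
```

Under these assumptions, a contestant who is strong in one category but weak in the other two loses points to the deviation term, which is the behavior the fourth metric is meant to produce.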

But here are some food-for-thought questions that might be addressed by the model (scroll your cursor over the data points to learn who the data point corresponds to, their respective scores for the four metrics, their total score, and other extraneous data):

1) For AI12: On an even playing field, would Candice Glover have won over Kree Harrison or Angie Miller? Candice delivered great vocal performances and was majorly pimped, but she was sorely lacking in song freshness factor with respect to her season. There is an argument to be made for Amber Holcomb making it over Burnell Taylor as well. In general, TPTB did their job that season: they wanted a girl to win so badly they cast an all-around awful group of guys who would struggle in all four core facets of the algorithm, and right on cue, they all went down easily.

2) The double elimination conundrum, for AI6 and AI8: a lot was made of the idea that producers really wanted to get rid of AI6's Phil Stacey and AI8's Anoop Desai to make way for who they wanted as their true frontrunners. Both Stacey and Desai had consolidated scores well north of 200, and there's an argument to be made that had Melinda Doolittle and Jordin Sparks, or Danny Gokey and Adam Lambert, not been so pimped in those respective seasons, both those guys would have gone farther than 5th-6th place.

3) AI7 was a weird case, with Michael Johns--he wasn't exactly depimped, but he had such a high score. Which goes to show you--a very high score, as we'll see below, has repercussions.

4) In general...diva belters have been docked since Jennifer Hudson and LaToya London ranked extremely well in AI3. Largely, this is due to old, un-fresh song choices; but if you wondered why Anwar Robinson, Mandisa, LaKisha Jones, Lil Rounds, Jacob Lusk, Joshua Ledet, and faux belters like Pia Toscano and Jessica Sanchez ranked awfully (and why they were both eliminated early; until Jessica got saved)--that's your answer. Consider all parts of the algorithm here.

Another issue I wanted to explore was developing a regression curve for each gender: for a given PGSC score, how does the audience react to a particular gender? For the stat geeks, I've used the p-test to compare the correlation between the genders, and these groups were determined to be distinct. The r-squared value is about 41%, so there is a lot of noise, but that's to be expected when you are dealing with so many data points (between both genders, there are 110 total). To get the highest r-squared value, I've decided to use an 8th-degree polynomial model (8 is the highest it can go to). There are several things to note:

1) A PGSC score within 160-190 is the proverbial sweet spot for a male singer. Every male within this range has made it to at least the top five, with winners David Cook, Scotty McCreery and Phillip Phillips all scoring in that range. The regression curve reflects that as the sweet spot as well. With scores of 168 and 171, respectively, Caleb Johnson and Alex Preston aren't going to buck that trend, because they are both in the top three. Within this range, it's been a foolproof predictor of a top five finish. Females tend to have it rougher in this range, but to be fair it's mostly due to the early seasons with shock eliminations (Amy Adams, Jennifer Hudson).

2) It is VIRTUALLY IMPOSSIBLE to win with a PGSC score of 200 or above. There's a saying that you can be too good for your own good...well, in this case the contestant is TOO good for the audience's good. Males especially have it rough with a score this high, but there is ONE major exception: Kris Allen. But AI8 was a weird season: it had THREE contestants with scores of 200 or above: Adam Lambert, Anoop Desai and Allen. And Allen was third in the pecking order of those three in score, so while he was very good, there were contestants that season who were better. So he wasn't exactly "too good for his own good," relatively speaking. Among the guys, Phil Stacey and Michael Johns were also shocking eliminations at this level. The girls also produced no winners here, but had several finish between 2nd and 4th place: LaToya London, Kree Harrison, Haley Reinhart and Angie Miller. But again: too good for the audience's good.

3) It's hard to make a great argument that the audience votes for competent guys more than competent girls. According to the regression model, there are a few bumps here and there, but up to a PGSC score of about 160, the regression lines for males and females both oscillate and seemingly alternate in favoritism. The biggest issue is when we actually get to the very competent girls: at scores between 160 and 190, the girls get shafted in voting rank compared to their male counterparts. Like every model, there are key exceptions: in fact, our four female winners had PGSC scores between 140 and 170; it's just that there are many other female data points around that range that bring the regression curve downwards. In general, the regression argues that a girl's best chance to get far (but not win) is to get a score over 200. But ultimately, the best chance to win is a score in that 140-170 range for a girl. Jena Irene is right around that range, and she's in the top three. She has a good chance of winning, given that there is precedent.
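
If you want to try the curve-fitting idea yourself, here is a minimal pure-Python sketch of a least-squares polynomial fit and its r-squared. It is not the tool used for the charts in this post: the function names, the rescaling of scores to [-1, 1], and the sample (score, finishing place) pairs are all hypothetical, and a low degree is used to keep the toy example stable.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_poly(xs, ys, degree):
    """Return a predictor for the least-squares polynomial of the given degree."""
    lo, hi = min(xs), max(xs)

    def scale(x):
        # Map scores into [-1, 1] so high powers stay numerically tame
        return 2.0 * (x - lo) / (hi - lo) - 1.0

    ts = [scale(x) for x in xs]
    size = degree + 1
    # Normal equations: (V^T V) c = V^T y, where V[i][j] = t_i ** j
    gram = [[sum(t ** (i + j) for t in ts) for j in range(size)] for i in range(size)]
    rhs = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(size)]
    coeffs = solve(gram, rhs)
    return lambda x: sum(c * scale(x) ** k for k, c in enumerate(coeffs))


def r_squared(ys, preds):
    """Share of the variance in ys explained by the predictions."""
    mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical (PGSC score, finishing place) pairs for one gender
scores = [40, 60, 80, 100, 120, 140, 160, 180, 200, 220]
places = [10, 9, 8, 8, 6, 5, 2, 1, 4, 3]

line = fit_poly(scores, places, 1)
cubic = fit_poly(scores, places, 3)
r2_line = r_squared(places, [line(s) for s in scores])
r2_cubic = r_squared(places, [cubic(s) for s in scores])
print(round(r2_line, 3), round(r2_cubic, 3))
```

One thing the sketch makes visible: on the data it was fitted to, r-squared cannot go down as the degree goes up, which is why the highest available degree always gives the highest r-squared.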

This season: So yep, we have already spoiled it: Jena Irene, Alex Preston and Caleb Johnson (our top three) are locked in a dead heat. Jena's PGSC is 173, Alex's is 171, and Caleb's is 168. (Hell, Jessica Meuse's is 169, making this the most natural-entropy-guided top four in recent memory; in other words, this is the correct top four, based on PGSC scores, in spite of producer manipulation). All three of those scores sit within a bracket that can produce a winner of either gender, so this is literally anyone's game, especially given how tight the PGSC scores are compared to past seasons. In the past, I could use the scores to make key predictions on who would win with a certain accuracy, but it's hard to do exactly that this season. It's a virtual toss-up.

Also,  is the link to use, if you want to play with the larger model than what is shown below.

If you're not into visual data and you need the rankings spelled out for you, they are listed below. Below is the listing, or "camp-should-have-been", if you want to use that term, of the contestants, the way they would have ranked had TPTB or producers not interfered or done any sort of pre and within-show pimping, and let the contestant's performances speak for themselves. Also listed are the average PGSCs of the contestants within the top 10, and that can represent the ratings flow of the particular season (but not necessarily quality), based on those four factors. Without further ado, the PGSCs:

AI3:
1) LaToya London 223
2) Jennifer Hudson 180
3) Amy Adams 176
4) Fantasia Barrino 169
5) Diana DeGarmo 166
6) George Huff 151
7) Jon Peter Lewis 98
8) John Stevens 60
9) Jasmine Trias 59
10) Camile Velasco 19
AVERAGE: 130.1

AI4:
1) Carrie Underwood 170
2) Bo Bice 159
3) Nikko Smith 151
4) Constantine Maroulis 147
5) Anthony Fedorov 122
6) Nadia Turner 121
7) Jessica Sierra 118
8) Vonzell Solomon 111
9) Anwar Robinson 58
10) Scott Savol 52
AVERAGE: 120.9

AI5:
1) Chris Daughtry 200
2) Bucky Covington 157
3) Taylor Hicks 148
4) Paris Bennett 148
5) Elliott Yamin 144
6) Mandisa 121
7) Katharine McPhee 106
8) Ace Young 102
9) Kellie Pickler 57
10) Lisa Tucker 54
AVERAGE: 123.7

AI6:
1) Phil Stacey 213
2) Blake Lewis 192
3) Jordin Sparks 147
4) Melinda Doolittle 142
5) Gina Glocksen 135
6) Chris Richardson 127
7) Chris Sligh 88
8) Haley Scarnato 81
9) LaKisha Jones 76
10) Sanjaya Malakar 58
AVERAGE: 125.9

AI7:
1) Michael Johns 230
2) David Cook 187
3) Kristy Lee Cook 143
4) David Archuleta 130
5) Chikezie 129
6) Syesha Mercado 124
7) Carly Smithson 123
8) Brooke White 92
9) Jason Castro 68
10) Ramiele Malubay 28
AVERAGE: 125.4

AI8:
1) Adam Lambert 240
2) Anoop Desai 224
3) Kris Allen 216
4) Matt Giraud 166
5) Allison Iraheta 155
6) Danny Gokey 110
7) Scott MacIntyre 81
8) Lil Rounds 53
9) Megan Joy 42
10) Michael Sarver 35
AVERAGE: 132.2

AI9:
1) Lee DeWyze 190
2) Casey James 155
3) Crystal Bowersox 154
4) Michael Lynche 123
5) Tim Urban 109
6) Katie Stevens 106
7) Didi Benami 91
8) Aaron Kelly 73
9) Siobhan Magnus 73
10) Andrew Garcia 72
AVERAGE: 114.6

AI10:
1) James Durbin 214
2) Haley Reinhart 209
3) Lauren Alaina 188
4) Scotty McCreery 176
5) Casey Abrams 139
6) Pia Toscano 113
7) Stefano Langone 94
8) Paul McDonald 82
9) Jacob Lusk 46
10) Naima Adedapo 32
AVERAGE: 129.3

AI11:
1) Phillip Phillips 167
2) Hollie Cavanagh 165
3) Skylar Laine 136
4) Elise Testone 129
5) Erika Van Pelt 124
6) DeAndre Brackensick 118
7) Colton Dixon 107
8) Joshua Ledet 107
9) Jessica Sanchez 106
10) Heejun Han 52
AVERAGE: 121.1

AI12: (note: Curtis Finch had no growth rate, so his data was omitted; not that he would have rated well, anyway):
1) Kree Harrison 231
2) Angie Miller 230
3) Janelle Arthur 185
4) Candice Glover 157
5) Burnell Taylor 139
6) Amber Holcomb 111
7) Devin Velez 100
8) Paul Jolley 80
9) Lazaro Arbos 51

And where we are at now...AI13 so far:

1) Jena Irene 173
2) Alex Preston 171
3) Jessica Meuse 169
4) Caleb Johnson 168
5) Majesty Rose 124
6) Malaya Watson 112
7) Sam Woolf 106
8) Dexter Roberts 98
9) CJ Harris 74
10) MK Nobilette 40
AVERAGE: 123.5
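
As a quick sanity check on the numbers, this short Python snippet recomputes the AI13 season average and the gap across the top four from the scores listed above. The variable names are just for illustration.

```python
# PGSC scores copied from the AI13 list above
ai13 = {
    "Jena Irene": 173,
    "Alex Preston": 171,
    "Jessica Meuse": 169,
    "Caleb Johnson": 168,
    "Majesty Rose": 124,
    "Malaya Watson": 112,
    "Sam Woolf": 106,
    "Dexter Roberts": 98,
    "CJ Harris": 74,
    "MK Nobilette": 40,
}

# Average PGSC of the top 10, rounded to one decimal like the listings
season_average = round(sum(ai13.values()) / len(ai13), 1)

# Rank by score and measure how tightly the top four are bunched
ranked = sorted(ai13, key=ai13.get, reverse=True)
top_four = ranked[:4]
top_four_spread = ai13[top_four[0]] - ai13[top_four[-1]]

print(season_average)
print(top_four, top_four_spread)
```

The same few lines work for any of the other seasons if you swap in that season's scores.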

Now, below, play with the actual model!:
