THE PGSC SCORE, PART I: PGSC (PERFORMANCE, SONG FRESHNESS, GROWTH AND CONSISTENCY) VERSUS ACTUAL IDOL RANK
THE PGSC SCORE, PART II: UPDATED FEATURES (GEOGRAPHIC LOCATION, AGE AND GENRE)
*Part I contains the ranks of the Idol contestants on a season-by-season basis, based on the PGSC score. Think of this as the way the contestants should be ranked, assuming neutral ground and zero pimpage. It is also the way I would rank them.
*Part II ranks the Idol contestants based on PGSC score, according to genre.
*Part III, which is this article, uses historical rank data since AI3 to assign actual rank numbers to contestants, based on the PGSC score and the features added in Part II. Unlike Part I, this is how we predict the audience would vote based on these features, and the model becomes more informed as we gather more data points (more seasons).
As explained in the title above, I took the four features I implemented for the PGSC score in Part II of my study, fit an 8th-degree polynomial to each of those features (for greater accuracy on the specific trends within the PGSC score), and averaged the four. Remember, the smaller the number, the better!--that means you're ranked higher.

The average rank corresponds to the average placement within the top ten. An average of 2.54, as in Scotty McCreery's case, means the model is predicting an outcome between 2nd and 3rd place for that contestant--or, if that contestant has the best rank of the season, as McCreery did, then he/she is the predicted winner of that season. There are certainly errors, but the model uses prior knowledge of past seasons to update our current knowledge of how contestants will place. It's not Bayesian, but it's informed to a certain degree.
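The averaging step described above can be sketched in code. Everything here is illustrative--the feature names, the training data, and the exact fitting procedure are assumptions, not the post's actual dataset--but it shows the mechanic: fit an 8th-degree polynomial per feature mapping scores to historical top-ten ranks, then average the four predictions.

```python
# Hedged sketch of the rank-prediction step: one 8th-degree polynomial
# per PGSC feature, averaged. Data below is randomly generated for
# illustration only -- NOT the post's real season data.
import numpy as np

rng = np.random.default_rng(0)
features = ["performance", "song_freshness", "growth", "consistency"]

# Illustrative history: (feature scores, final ranks 1-10) per feature
history = {f: (rng.uniform(0, 10, 40), rng.integers(1, 11, 40).astype(float))
           for f in features}

def predict_rank(contestant_scores):
    """Average the four per-feature polynomial predictions (smaller = better)."""
    preds = []
    for f in features:
        x, y = history[f]
        coeffs = np.polyfit(x, y, deg=8)      # 8th-degree fit for this feature
        preds.append(np.polyval(coeffs, contestant_scores[f]))
    return float(np.mean(preds))

rank = predict_rank({f: 5.0 for f in features})
print(round(rank, 2))
```

A predicted average near 2.5 would read as "between 2nd and 3rd place" in the sense described above.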
There are some quibbles--we can ignore Curtis Finch's high scores, because he has no growth data, so his numbers are skewed. For all intents and purposes, he doesn't count.
Alex Preston has an enormously high (i.e., poor) rank based on his geography--the reality is, the model hasn't seen anyone from the Northeast with a PGSC score in his range, so it is uninformed there. I considered using a logarithmic adjustment, which would produce a fairer score for Preston. From this link, a quick extrapolation using the logarithmic adjustment suggests Alex's geography score is about 5.98 or thereabouts, but it is far less accurate. There is a general consensus that the order is Jena, Caleb and then Alex, and that's what my algorithm predicts. But Alex is really docked on geography and the very few data points the model has of him within that PGSC inset--otherwise, his other numbers, particularly his singer-songwriter genre, are ABSOLUTELY winning material. I think it's closer than the model thinks, but I believe the general order is correct.
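To make the logarithmic-adjustment idea concrete, here is a hedged sketch. The post doesn't specify the exact formula, so this assumes a simple fit of the form rank = a*ln(score) + b on the geography feature, with made-up data points; the appeal is that a log curve extrapolates gracefully outside the observed PGSC range, whereas a high-degree polynomial can swing wildly.

```python
# Illustrative log-fit for the geography feature. The data points are
# fabricated for demonstration -- the post's actual values are not used.
import numpy as np

pgsc_scores = np.array([2.0, 3.5, 5.0, 6.5, 8.0])   # hypothetical PGSC scores
geo_ranks   = np.array([9.0, 7.5, 6.5, 5.8, 5.2])   # hypothetical geo ranks

# Fit rank = a*ln(score) + b (a straight line in log-space)
a, b = np.polyfit(np.log(pgsc_scores), geo_ranks, deg=1)

def log_adjusted_rank(score):
    return a * np.log(score) + b

# Extrapolating past the observed range stays monotone and bounded.
print(round(log_adjusted_rank(10.0), 2))
```

Under this toy fit, a contestant scoring beyond the training range still gets a sane rank instead of a polynomial blow-up, which is the kind of behavior that would rescue Preston's geography number.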
My informed model produced the ranks below. As you can see, it:

*predicted the correct top 4 of AI10;
*had three of the top five in AI11;
*had four of the top five in AI12;
*had three of the top five in AI13;
*had three of the top five in AI3, as well as the right top two;
*had three of the top five in AI4, as well as the right top two;
*had four of the top five in AI5;
*was very nearly perfect in relative order in AI6;
*had the correct top two in AI7;
*had four of the top five in AI8;
*had four of the top five in AI9, and the correct final three.

Below are the results:
|Rank||Contestant||PGSC Score||Season|
|8||Erika Van Pelt||6.648269||AI11|
|9||Jon Peter Lewis||6.883238||AI3|
|3||Kristy Lee Cook||5.07886||AI7|
The model is presented below, in data format. Play around with it!
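If you want to play with the rows programmatically, a small helper like this one (purely illustrative) parses the pipe-delimited format above and sorts by predicted rank:

```python
# Parse and sort rows in the pipe-delimited format used above.
# The three sample rows are copied from the table; the helper itself
# is an illustration, not part of the original model.
rows = [
    "|8||Erika Van Pelt||6.648269||AI11|",
    "|9||Jon Peter Lewis||6.883238||AI3|",
    "|3||Kristy Lee Cook||5.07886||AI7|",
]

def parse(row):
    rank, name, score, season = [field for field in row.split("|") if field]
    return int(rank), name, float(score), season

# Smaller rank = better, so a plain sort puts the best-ranked first.
for rank, name, score, season in sorted(parse(r) for r in rows):
    print(rank, name, score, season)
```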