Statistical analysis has been used in the National Football League (NFL) since its beginning to attempt to predict this often unpredictable game. Ranking players, estimating win probability, and maximizing starting field position after a punt are problems currently facing NFL teams. In this paper, a multiple regression approach is used to rank players based on their average change in expected eventual points, a logistic regression model is used to estimate in-game win probabilities, and expected starting field position is used to find the optimal starting yard line for punt returners near their own goal line. The regression models are based on data from the 2006 and 2007 seasons, whereas experimental data is used for the punt returner location problem. Using the eventual points model to rank quarterbacks based on their average change in predicted eventual points better approximates a quarterback's worth than the current flawed quarterback rating system, because the model does not count all interceptions and incompletions the same and it accounts for sacks and fumbles. The win probability model is used both to aid decision making and to create win probability graphs, while results from the punt returner experiment are used to argue that the 8-yard line is the optimal yard line to place punt returners when the punter is kicking from midfield.
Hometown: Murfreesboro, Tennessee
Major: Mathematics for Teacher Licensure
Thesis Title: Statistical Analysis in the NFL
Advisor: Dr. Jeffrey Bay
Kyle Prince played organized football just once – in middle school – but thanks to his Senior Study at Maryville College, football may figure into his vocational plans.
Prince, a mathematics for teacher licensure major from Murfreesboro, Tenn., just finished an in-depth look at National Football League (NFL) statistics. After making contacts with former pro coaches and football statisticians for his study, he was asked to compete for an internship that would see him contributing to next season's Pro Football Prospectus.
"Through my email exchanges with the book's editor, Aaron Schatz, I learned of the internship. Now, I'm compiling a sample project on player injuries. It's due January 16, and I'm just about finished," he said. "To teach math in high school and also pursue work in statistics would be a dream. Professional football teams and most schools have the same off-season – summer – and that's when statisticians seem to have the time to analyze the data from the previous season. If this internship leads to a regular position where I get to work with NFL stats, I think that would be fun."
Prince's study is entitled, simply, "Statistical Analysis in the NFL," but the simplicity ends there. His 63-page study details why the rankings system currently used by the NFL is flawed and how another rankings system that incorporates more situational data – a system of his own formulation – would better assign player value and account for how the game is played today.
"In the current NFL passer rating system for quarterbacks, all incompletions and interceptions are counted the same, but quarterback sacks and fumbles are not accounted for, and neither are rushing attempts," he explained. "The current system doesn't consider situational differences."
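For readers who want to see what Prince is describing, the standard NFL passer rating formula can be written as a short Python sketch. The function name and argument names below are illustrative choices, not taken from his study. The formula uses only five inputs, so every incompletion and every interception moves the rating by the same amount regardless of game situation, and sacks, fumbles and rushing attempts never enter the calculation.

```python
def passer_rating(completions, attempts, yards, touchdowns, interceptions):
    """Standard NFL passer rating from the five box-score inputs it uses.

    Sacks, fumbles, rushing attempts and game situation are not inputs,
    which is the limitation described in the quote above.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")

    def clamp(value):
        # Each component is limited to the range 0 to 2.375.
        return max(0.0, min(value, 2.375))

    completion_part = clamp((completions / attempts - 0.3) * 5)
    yardage_part = clamp((yards / attempts - 3) * 0.25)
    touchdown_part = clamp(touchdowns / attempts * 20)
    interception_part = clamp(2.375 - interceptions / attempts * 25)

    return (completion_part + yardage_part
            + touchdown_part + interception_part) / 6 * 100


if __name__ == "__main__":
    # Made-up stat line, for illustration only.
    print(round(passer_rating(20, 30, 250, 2, 1), 1))
```

Because each component is capped, the rating tops out at about 158.3, no matter how or when the plays occurred.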
Meeting in late 2007 with Dr. Jeff Bay, associate professor of mathematics (who would eventually become his Senior Study advisor), Prince learned that by applying a statistical methodology called regression analysis, he could use the data to make predictions about players and games rather than only assess what had already happened.
"Kyle showed a lot of initiative [through the study], and we were fortunate that his contact at footballoutsiders.com, Aaron Schatz, was so helpful," Bay said.
In the summer of 2008, Schatz sent Prince two data sets, which included the outcomes of more than 100,000 plays in the NFL's 2006 and 2007 seasons.
"For the strength of the study, I would have liked to have had a third set, but I took what he sent me and ran with it," Prince said, adding that at that point, he had to clear another major hurdle – writing computer programs that would allow him to reconfigure columns so that he could run his models.
Prince's Senior Study was concerned, primarily, with three areas: player ranking, estimating win probability, and maximizing starting field position after a punt. Prince used a multiple regression approach to rank players, adapting an idea about how to rank NFL quarterbacks suggested by statisticians Chris White and Scott Berry.
He used a logistic regression approach to estimate in-game win probabilities.
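As a rough illustration of how a logistic regression turns a game situation into a win probability, consider the Python sketch below. It is not Prince's model: the article does not list his predictors or fitted coefficients, so the two predictors (score difference and its interaction with the fraction of the game elapsed) and every coefficient value here are hypothetical placeholders chosen only to show the mechanics.

```python
import math

# HYPOTHETICAL coefficients for illustration only. They are not fitted to
# any data and are not taken from the study described in this article.
INTERCEPT = 0.0
COEF_SCORE_DIFF = 0.15
COEF_SCORE_DIFF_X_ELAPSED = 0.40

GAME_SECONDS = 3600  # regulation length of an NFL game


def win_probability(score_diff, seconds_remaining):
    """Estimate the probability that the team leading by score_diff wins.

    score_diff is the team's points minus the opponent's points, so a
    negative value means the team is trailing.
    """
    elapsed = 1 - seconds_remaining / GAME_SECONDS
    # Linear predictor: a lead counts for more as the game goes on.
    z = (INTERCEPT
         + COEF_SCORE_DIFF * score_diff
         + COEF_SCORE_DIFF_X_ELAPSED * score_diff * elapsed)
    # The logistic function maps the linear predictor to a value in (0, 1).
    return 1 / (1 + math.exp(-z))


if __name__ == "__main__":
    for seconds in (3000, 1800, 60):
        print(seconds, round(win_probability(7, seconds), 3))
```

Evaluating a function like this after every play and plotting the results against game time is one way to produce the kind of win probability graph mentioned in the abstract.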
Regarding the starting field position question, Prince had to be creative because there was little to no quantitative data with which to work. Historically, punt returners have been told to put their heels on the 10-yard line when they are near their own goal. But Prince wanted to test a belief held in the football community that putting heels on the 8- or 8½-yard line yields better results for the returning team.
He interviewed Gary Zauner, a former special teams coach in the NFL, for his perspective. And for hard data, Prince decided to conduct his own experiment, using punters from Maryville College's football team.
"This [analysis of experimental data] illustrates a role of statistical science – to replace theoretical calculations with actual data," Bay said. "Instead of using physics to describe exactly how far the ball would be expected to bounce at the end of a punt, Kyle recorded numerous repetitions of how far a ball actually bounced at the end of a punt."
Prince's initiative in conducting the on-field experiment was indicative of his entire Senior Study, the professor added.
"He never looked for an easy way out. If he thought there was a better way to answer a question, or if there was an additional question that he wanted to answer, he pursued it even if it meant more time," Bay said. "Kyle modeled exactly what we hope of students in their Senior Study – he was self-motivated, disciplined and enthusiastic."
Bay said that, as a statistician, he was also impressed by the research.
"The quality of the overall statistical analysis was very good," he said. "Kyle's rating system does require more data and more advanced statistical methodology, but it is consistent with how fans, coaches and general managers understand the real value of a player."
Does Prince have a prediction for the winner of Super Bowl XLIII? A longtime Tennessee Titans fan, he was disappointed by the outcome of the Titans-Ravens contest but realizes that the games will go on.
"I like the Eagles – not just because of their stats, but because of how strong they finished the regular season. That's what [Super Bowl XLII winners] the Giants did last year."