Football analysis involves a nearly endless list of variables. Expected Points offers a lens for viewing team performance that goes beyond simple yardage gains. By accounting for critical game-state variables, it provides a robust baseline for evaluating how well a team moves (or stops) the ball.
Like many analytics services, we maintain our own Expected Points model, which takes game-state variables and quantifies the number of points a team should be expected to score on the play. When we recently dug into our original model, however, we found gaps between its predictions and the actual scoring results that we wanted to address. Before getting to those changes, let’s look at how the original model was built.
The Old Model
Our previous model used four inputs: down, yards to go, distance to the end zone, and whether the offense is the home team. The first three are usually the core of any Expected Points model; we added the binary “home offense” feature for a little more context.
Although the model was effective, we found that it was less well calibrated at the end of halves, especially at the end of games. In 4th quarters, we also found a substantial difference between actual results and our predictions depending on whether the offensive team was losing. To get to the root cause, we needed to dive deeper into the scoring environment at these times.
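As a minimal sketch, the four inputs described above could be represented as follows. The class name `OldModelInput`, its field names, and the validation ranges are hypothetical and chosen only for illustration; the article does not specify how the inputs are stored or which type of model consumes them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OldModelInput:
    """Hypothetical container for the four inputs of the previous model."""

    down: int               # 1 through 4
    yards_to_go: int        # distance needed for a first down
    yards_to_end_zone: int  # 1 through 99
    home_offense: bool      # True when the offense is the home team

    def __post_init__(self) -> None:
        # Basic sanity checks (assumed ranges, not taken from the article).
        if self.down not in (1, 2, 3, 4):
            raise ValueError("down must be between 1 and 4")
        if self.yards_to_go < 1:
            raise ValueError("yards_to_go must be at least 1")
        if not 1 <= self.yards_to_end_zone <= 99:
            raise ValueError("yards_to_end_zone must be between 1 and 99")

    def as_row(self) -> list:
        """Return the inputs as one numeric row, with the home flag as 0/1."""
        return [
            self.down,
            self.yards_to_go,
            self.yards_to_end_zone,
            int(self.home_offense),
        ]


# Example: 3rd and 7 from 42 yards out, home team on offense.
example_row = OldModelInput(3, 7, 42, True).as_row()
```

Encoding the home flag as 0/1 keeps the row fully numeric, which most modeling libraries expect.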
Time Left vs. Scoring Percentage by the End of the Half
The graph above shows that, in the NFL, the scoring rate drops as the time remaining in the half shrinks (time left = 0 is on the left side of the chart). That is intuitive, but the severity of the drop also varies with how much time is left.
In the 4th quarter, scoring begins to decline more sharply at the 15-minute mark (the start of the 4th quarter), and the decline steepens further at each of the 2-minute intervals outlined above.
The same pattern occurs in the first half and in overtime, but the decline starts much later. The shape over the last 4 minutes of the first half is broadly similar to the shape around the 8-10 minute mark in the second half. The same trend appears on the college side as well.
NFL Model Calibration – Pre Changes
The graph above shows the NFL calibration before the changes. In the 4th quarter, there is a distinct gap between average expected scoring and average actual scoring in all 3 scoring margin buckets. The model underpredicts scoring when a team is losing (those teams are often hurrying to catch up) and overpredicts scoring when a team is winning or tied (those teams are often slowing things down), though to a much lesser degree when tied. At the end of the first half, there is a slight deviation inside 4 minutes, but it is not nearly as severe as at the end of the game.
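The kind of calibration check described above can be sketched as a grouped comparison of average expected scoring and average actual scoring. The function name `calibration_table`, the bucket labels, and the sample values below are all made up for illustration; they are not our data or our production code, and "actual" stands for whatever realized scoring value the model is trained to predict.

```python
from collections import defaultdict


def calibration_table(plays):
    """Compare mean expected vs. mean actual scoring per bucket.

    `plays` is an iterable of (bucket, expected_points, actual_points).
    A positive gap means the model overpredicts scoring in that bucket;
    a negative gap means it underpredicts.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # expected sum, actual sum, count
    for bucket, expected, actual in plays:
        entry = totals[bucket]
        entry[0] += expected
        entry[1] += actual
        entry[2] += 1

    table = {}
    for bucket, (expected_sum, actual_sum, count) in totals.items():
        table[bucket] = {
            "n": count,
            "mean_expected": expected_sum / count,
            "mean_actual": actual_sum / count,
            "gap": (expected_sum - actual_sum) / count,
        }
    return table


# Invented toy rows: (quarter, lead state) buckets with arbitrary values.
sample_plays = [
    (("Q4", "losing"), 1.0, 3.0),
    (("Q4", "losing"), 2.0, 3.0),
    (("Q4", "winning"), 2.0, 0.0),
    (("Q4", "winning"), 1.0, 0.0),
]

for bucket, row in calibration_table(sample_plays).items():
    print(bucket, row)
```

Bucketing by quarter and lead state mirrors the panels in the calibration graphs; a well-calibrated model would show gaps near zero in every bucket.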
CFB Model Calibration – Pre Changes
On the college side, the model shows more deviation than in the NFL. The effect still appears in the same time ranges highlighted previously, but the gaps in the winning and losing buckets are bigger. The larger gaps in the college model might be attributable to larger gaps in team quality, which we are not addressing in this model. For the purposes of this re-work, the time and lead theories still apply here.
After reviewing this data, we concluded that end-of-half situations, combined with the lead situation at the end of the game, create a pace-of-play component that affects expected scoring, and that this pace effect kicks in at the end of the first half and in the 4th quarter.
The lead component should not be confused with the rating of the teams, which we did not want to build into the Expected Points model. This model is centered on the state of the game, factoring in average outcomes at the level at which teams are playing (NFL or college). A model that incorporates team ratings is more complex and something that we did not want to attempt at this time. The general trend of “good teams are winning more” is reversed once the lack of time to score comes into play. This specific game-state factor is what we are trying to account for.
If we did incorporate team ratings, it would likely help the college model more, given its larger gaps in the winning and losing buckets.
The New Model
To factor in pace, we created three new features for the model:
- Quarter Grouping:
  - 10 minutes and under to go in the 4th quarter
  - 2 minutes and under to go in the 2nd quarter
  - All other time situations
- Time Left in the Quarter in Minutes:
  - Counting down from 10 by 2s (10,8,6,4,2) for the 4th quarter and only a 2 for the 2 minute mark and under in the 2nd quarter
  - All other times are labeled as a 15 to be the catch all
- Offensive Team Lead Grouping:
  - Losing (<10 minutes left in the 4th quarter)
  - Winning (<10 minutes left in the 4th quarter)
  - Tied (<10 minutes left in the 4th quarter)
  - All other cases (>10 minutes left in the 4th quarter)
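The three features listed above could be derived from the game clock and score roughly as follows. This is a sketch under stated assumptions, not our production code: the names `PaceFeatures` and `pace_features` and the string labels are hypothetical, time is assumed to round up to the next even minute (7:30 left maps to 8), and exactly 10:00 left in the 4th quarter is assumed to fall inside the late-game window.

```python
from typing import NamedTuple


class PaceFeatures(NamedTuple):
    quarter_group: str     # "late_q4", "late_q2", or "other"
    time_left_bucket: int  # 10, 8, 6, 4, 2, or the catch-all 15
    lead_group: str        # "losing", "winning", "tied", or "other"


def pace_features(quarter: int, seconds_left: int, score_diff: int) -> PaceFeatures:
    """Build the three pace features for one play.

    `seconds_left` is the time remaining in the current quarter and
    `score_diff` is the offense's score minus the defense's score.
    """
    if quarter == 4 and seconds_left <= 600:
        # Assumption: round up to the next even minute, with a floor of 2.
        bucket = max(2, 2 * -(-seconds_left // 120))
        if score_diff < 0:
            lead = "losing"
        elif score_diff > 0:
            lead = "winning"
        else:
            lead = "tied"
        return PaceFeatures("late_q4", bucket, lead)

    if quarter == 2 and seconds_left <= 120:
        # The lead grouping only applies late in the 4th quarter.
        return PaceFeatures("late_q2", 2, "other")

    # Every other situation, including overtime, uses the catch-all values.
    return PaceFeatures("other", 15, "other")


# Example: 7:30 left in the 4th quarter, offense trailing by 3.
example = pace_features(quarter=4, seconds_left=450, score_diff=-3)
```

Collapsing everything outside the two windows into a single catch-all value keeps the model from learning time effects in parts of the game where pace is not being imposed by the clock.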
The time features were engineered this way to isolate the specific time periods under consideration. They act as a proxy for pace of play at the end of halves, where a team may operate differently depending on the time crunch and on whether it is winning or losing. The goal isn’t to capture differences in play at all times of the game, which is why the time groupings only cover the periods when the game context imposes a pace on a team.
NFL Model Calibration – Post Changes
CFB Model Calibration – Post Changes
Both models are now better calibrated at the end of halves and follow the pattern of actual scoring. The college model still shows larger disparities in the winning and losing buckets, over-predicting scoring when losing and under-predicting scoring when winning. However, its end-of-game situations are much better. The NFL model also adjusted smoothly to the actual results at the end of halves, especially in higher expected scoring environments when a team is losing.
We have met the goal of improving our models by incorporating pace at the end of halves given the lead situation. Calibration to actual average scoring has improved for both the NFL and college. With this improvement at the base level of evaluation, we can now assess team-level EPA metrics more accurately, as well as the Total Points metric we use to evaluate players.