class: center, middle, inverse, title-slide

# Tired of Misattribution, Modeling Player Fatigue in the NBA
### Austin Stephen, Matthew Yep, and Grace Fain <br> Advised by Maksim Horowitz
### 7/30/2021

---

### Why fatigue?

--

<span> 1.</span> Fairness and strategic advantage

<img src="./images/Motivation.PNG" alt="Not found" width="90%" />

???

The media is full of stories like this, and it makes sense why.

---

### Why fatigue?

<span> 1.</span> Fairness and strategic advantage

<img src="./images/Map.png" alt="Not found" width="90%" />

???

The literature is very clear that insufficient rest worsens athlete performance.

Edwards et al., The National Center for Biotechnology Information: "Early identification and subsequent management of fatigue may prevent detrimental physical and physiological adaptations often associated with injury and enhance athletic performance and player availability"

---

### Project takeaways

<span style='font-size: 38px'> **1.** </span> Offer evidence players achieve a high degree of recovery between games.

???

Just list these

--

<span style='font-size: 38px'> **2.** </span> Establish a subset of schedule-induced factors that impact game outcomes.

---

### What is fatigue?

- Cannot use direct measures

???

More concrete on fatigue: Edwards et al., The National Center for Biotechnology Information: "the decline in objective performance measures derived from the capacity of the nervous system and contractile properties of muscles over time"

In a perfect world we would directly measure an NBA player's internal homeostasis, either with hematological techniques or heart rate sensors. However, players cannot wear heart rate monitors or tracking equipment in games, and we certainly don't have in-game or post-game blood samples for hematological analysis, so we need proxies to measure how fatigued a player was by in-game events.

--

- Identify events that fatigue players

--

1. **In-game events**
    - ex. Distance run or speed
2. **Between-game events**
    - ex. Length of flight or rest since last game

---

### The in-game distance data

- See Appendix A for theoretical motivation

**Distance By Player Individual Games**

| Name | Date | Min | Distance (mi) | Distance Offense (mi) | Distance Defense (mi) | Speed (mph) | Speed Offense (mph) | Speed Defense (mph) |
|---|---|---|---|---|---|---|---|---|
| Al Horford | 10/16/2018 | 29.95 | 2.15 | 1.21 | 0.94 | 3.91 | 4.15 | 3.64 |
| Alex Abrines | 10/16/2018 | 23.48 | 1.88 | 0.97 | 0.91 | 4.57 | 4.96 | 4.19 |
| Alfonzo McKinnie | 10/16/2018 | 2.35 | 0.18 | 0.11 | 0.07 | 4.79 | 4.99 | 4.61 |
| Amir Johnson | 10/16/2018 | 11.18 | 0.87 | 0.47 | 0.39 | 4.49 | 4.83 | 4.13 |

Data courtesy of nba.com, Second Spectrum, Patrick Chodowski (NBAr)
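
One way rows like these could become a fatigue proxy is to compare the distance a player ran in the days before a game with what their own average would predict. The Python sketch below is only an illustration under assumptions: the function name `prior_workload_deviation`, the exact definition of the deviation, and the sample values are hypothetical and are not taken from the project's code or data.

```python
from datetime import date, timedelta

# Hypothetical game log for one player: (game date, distance in miles).
# These values are made up for illustration; they are not from the data set.
log = [
    (date(2018, 10, 16), 2.15),
    (date(2018, 10, 18), 2.40),
    (date(2018, 10, 20), 1.90),
    (date(2018, 10, 21), 2.55),
    (date(2018, 10, 24), 2.00),
]


def prior_workload_deviation(log, game_day, window_days):
    """Distance run in the `window_days` before `game_day`, minus the
    distance the player's average game would predict for those games.

    Assumed definition: positive means a heavier-than-usual recent load.
    """
    season_avg = sum(dist for _, dist in log) / len(log)
    start = game_day - timedelta(days=window_days)
    # Only games strictly before the game of interest count as prior load.
    recent = [dist for day, dist in log if start <= day < game_day]
    return sum(recent) - season_avg * len(recent)


# Same game, different look-back windows (in days).
for window in (3, 5, 7, 10):
    dev = prior_workload_deviation(log, date(2018, 10, 24), window)
    print(window, round(dev, 3))
```

A value like this, computed per player and per game, could then be compared against that game's workload or performance.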
???

3 important things:

1. Generated by computer vision
2. Game by game
3. Player level

---

### In-game distance and performance

- More distance does not correlate with better performance

???

We hypothesized that more distance meant more effort, and so ideally better performance. Explain the axes; the data is 20k observations. There is no correlation between field goal percentage and the distance a team travels.

--

- p-value 0.402

<img src="presentation_10_min_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />

---

### Post distance overload regress

- Results consistent across 3, 5, 7 and 10 day windows

<img src="presentation_10_min_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

???

Moving to the second part of our analysis of in-game distance, we look for players fatiguing over time. We hypothesized that a player whose workload deviated from their season average in previous days would exhibit a refractory period due to fatigue. However, we again found compelling evidence that this is not the case. It actually appears the opposite is true: a greater workload in the preceding days indicated that the athlete was likely to take on a higher workload again in the observed game. We hypothesize this is because players who are playing well get left in the game longer.

---

### In-game distance back to back

- Cannot draw conclusions across multiple seasons

<!-- -->

---

### Case study: load management

<!-- -->

---

### What does this mean?

- Basketball is complex
- Evidence indicating players are recovered between games.
- Intuition may be misguided

???

1. Basketball is composed of a complex interaction of skill sets, and simply working harder does not necessarily yield better results. While tracking data sounds exciting, without being more closely tied to other in-game events it offers little utility for explaining player performance.

2. One of the biggest trends in basketball commentary is that players get tired from playing too many games and, as a result, the quality of the game decreases. However, we offer compelling evidence that a player's workload over their previous games does not influence their performance in future games. Based on these results, we wouldn't be surprised if people are often searching for an explanation of why a player did worse in a game when the truth may be nothing more than random variation in performance. In other words, just because your star player had a bad night, it does not mean fatigue was necessarily to blame. That said, we aren't trying to tell you players don't ever get tired, because that's obviously not the case. Rather, we are saying it appears the complexity of basketball and the techniques teams use to ensure players are ready to play are more sophisticated than can be captured by measuring the players' work.

---

### Timezones and circadian rhythm

<!-- -->

---

### Taking a look at structural fatigue

**Travel, Schedule and Density Data**

| date | visitor | opponent | game_net_rating | distance_diff | rest_diff | shift | b2b_2nd | net_rating_diff |
|---|---|---|---|---|---|---|---|---|
| 2010-10-26 | Houston Rockets | Los Angeles Lakers | -0.9 | 1380 | 0 | -2 | TRUE | -4.0 |
| 2012-11-21 | Portland Trail Blazers | Phoenix Suns | -29.3 | 1001 | -1 | 1 | FALSE | 3.3 |
| 2014-02-08 | Miami Heat | Utah Jazz | -5.8 | -421 | 2 | 1 | FALSE | 12.7 |
| 2017-11-24 | New York Knicks | Atlanta Hawks | -12.2 | 748 | 0 | 0 | TRUE | 2.3 |

Data courtesy of Jose Fernandez (airball) and Patrick Chodowski (NBAr)
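
The columns above suggest a two-step analysis: first remove team strength from `game_net_rating`, then relate what is left over to a schedule factor such as `rest_diff`. The Python sketch below illustrates that idea on the four rows shown in the table. It is a hedged illustration, not the project's model: using `net_rating_diff` as the only team-strength predictor and a one-variable least-squares fit (the helper `fit_line` is hypothetical) are assumptions, and four rows are far too few to estimate anything meaningful.

```python
# Rows copied from the table above:
# (game_net_rating, net_rating_diff, rest_diff)
games = [
    (-0.9, -4.0, 0),
    (-29.3, 3.3, -1),
    (-5.8, 12.7, 2),
    (-12.2, 2.3, 0),
]


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


ratings = [g[0] for g in games]
strength = [g[1] for g in games]
rest = [g[2] for g in games]

# Step 1 (assumed): explain game net rating with team strength alone.
a, b = fit_line(strength, ratings)
residuals = [y - (a + b * x) for x, y in zip(strength, ratings)]

# Step 2 (assumed): relate the leftover variation to a schedule factor.
a_rest, b_rest = fit_line(rest, residuals)
print("residuals:", [round(r, 2) for r in residuals])
print("rest_diff slope on residuals:", round(b_rest, 2))
```

Fitting the schedule factors to residuals rather than to the raw net rating keeps the team-strength effect from swamping the much smaller schedule effects.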
---

### Composition of game outcome

- Game outcome is the response
- Event contributing to that outcome is a predictor
- Most important factor is team strength
- **Goal:** Extract relationships with the game outcome that are present once the team strength has been eliminated from the picture
- See Appendices B and C for details on model construction

???

Outline the approach to modeling

---

### Linear model results

#### Many of our travel and scheduling metrics had insignificant effects on game net rating...

.pull-left[
* Distance traveled
* Windowed distance traveled
* Distance traveled difference
* Flight duration
* Timezone shift by -2, -1, 1, 2, and 3 hours
]

.pull-right[
* 2nd leg of a back to back
* Tipoff time
    + West coast teams playing in the morning EST
    + East coast teams playing at night PST
]

---

### Significant schedule induced factors

<font size="5"> <i>gameNetRatingResiduals</i> = </font> <br> <font size="5"> `\(\beta_0\)` + `\(\beta_1\)`<i>restDiff</i> + `\(\beta_2\)` <i>3HrsBack</i> + `\(\beta_3\)`<i>3rdGame4Days</i> </font>

<img src="presentation_10_min_files/figure-html/Linear Model-1.png" style="display: block; margin: auto;" />

---

### Further work

* Generalize load management to more players
    + Study effectiveness in the long run
* Design a stronger proxy for team strength

---

### Relevance to stakeholders

**NBA Teams:**

- Allocation of resources
- Load management

**League Office:**

- Schedule design
- Viewership and profit maximization

---

## Thanks

**Advisor:** Maksim Horowitz, Atlanta Hawks

**Carnegie Mellon:** Ron Yurko, Rebecca Nugent, Beomjo Park, Meg Ellingwood, Nick Kissel

**Special thanks:** Tom Bliss for your insight on fatigue in the NFL

---

### Appendix A

#### Theoretical basis for distance's relationship with work

W = F \* d, `\(\Delta\)`W = F \* `\(\Delta\)`d

<font size="4.5"> From mechanics, work is defined as force times distance. The force to move an object of constant mass is also constant; therefore, if we assume a player's mass is constant over a season, the change in distance for a game captures nearly all of the change in their work. In other words, work is agnostic to the timing and direction of the force players use to move their bodies. This means that despite being a very high-level view of the game, distance captures a large amount of information about player exertion.</font>

---

### Appendix B

#### Conceptual approaches to modeling team strength

1. __Constrained:__ Limited to the information available to a team when they would have played the game.
2. __Unconstrained:__ All available information about the team not incorporated into the response.

<font size="4.5"> We chose to prioritize the second approach because it allows us to decompose the response into its causes, and we were interested in identifying what effect between-game factors had on game outcome with as much of the team strength removed as possible. We felt understanding fatigue was more valuable than predicting fatigue. </font>

---

### Appendix C

#### Investigation into modeling team strength

<img src="presentation_10_min_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

<font size="4.5"> The net rating of a game reflects how the team performed in that given game. If a team won a game with a rating of 10, their opponent had a rating of -10. The teams' season-average net rating difference served as the simplest proxy and had nearly the lowest RMSE. </font>

---

### Appendix C P2

<font size="4.5"> <b>w60_en</b> is a shrinkage method using elastic net trained on a large set of team information. <br> <b>w_40</b> is a window of the last 40 games a team has played as an additional predictor. <br> <b>w40_t</b> is a window of the last 40 games a team has played as an additional predictor with temporal weighting. <br> <b>w40_td</b> is a window of the last 40 games where an interaction with the games played and the season net rating difference allows. <br> <b>w60_tf</b> is a window of the last 30 games and 30 games into the future, weighted separately from the season average. </font>
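
The windowed predictors listed above can be pictured as a (possibly weighted) average of a team's most recent game net ratings. The Python sketch below shows one way such a window could be computed. It is an assumption-laden illustration: the function name `windowed_strength` and the exponential `decay` weighting are hypothetical stand-ins, since the deck does not specify how the temporal weighting was done.

```python
def windowed_strength(net_ratings, window=40, decay=1.0):
    """Weighted mean of the last `window` game net ratings.

    decay=1.0 gives a plain average; decay < 1 down-weights older games
    (an assumed exponential scheme, not necessarily the one used here).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = net_ratings[-window:]
    if not recent:
        return 0.0
    # The most recent game gets weight 1, the game before it `decay`,
    # the one before that decay**2, and so on.
    weights = [decay ** age for age in range(len(recent) - 1, -1, -1)]
    return sum(w * r for w, r in zip(weights, recent)) / sum(weights)


# Hypothetical net ratings for a team's last three games, oldest first.
ratings = [10.0, -10.0, 4.0]
print(windowed_strength(ratings, window=2))             # plain 2-game window
print(windowed_strength(ratings, window=2, decay=0.5))  # recency-weighted
```

A value computed this way for each game could be added as an extra predictor next to the season-average net rating difference.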