Indianapolis Colts Fan Forum

How about stat?


SteelCityColt

There has been quite a lot of conversation on the forums lately about stats, their use, and whether they are useful for assessing the level of play on the field. To be clear from the start, stats are not the sole answer to analysing football, and certainly need triangulating with other data sources (e.g. film). In fact what they are best used for is guiding analysis of other sources; you could say they give you the right questions to ask, rather than the answers.

 

Part of the issue with stats is, well, there are so many. It’s hard to judge which are the important ones vs ones that are, at best, vanity. But what if there was a way to relatively compare metrics?

 

It’s often cited that the game is all about winning, so what if we took individual metrics and compared them to a team’s winning percentage to measure the correlation? That is to say, how much a metric potentially affects a team’s chances of winning. Now it’s important to note that correlation does not always equal causation, as explained brilliantly here https://tinyurl.com/y5x3hh9m. We can however use logic, and our more holistic understanding of the game, to sanity check the correlations.

 

Now if we’re going to compare metrics based on correlation we need some quantitative way of measuring this. That’s where the coefficient of determination comes in. More commonly known as R Squared, it is the proportion of the variance in the dependent variable (in our case Winning %) that is predictable from the independent variable (the metric we’re assessing). Or more simply put, how well two sets of data “fit”, suggesting a relationship between the two. For my purposes this will be a value between 0 and 1, where 1 would be full correlation (unlikely) and 0 is no correlation. Now as football is a complicated game with many moving parts I wouldn’t expect to see many metrics with strong correlation, but we can use it as a measure to compare them relatively.
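For anyone wanting to reproduce the numbers, here is a minimal sketch of how R Squared falls out of a simple one-variable straight-line fit. The function name and the tiny sample lists are made up for illustration; they are not the data behind the charts.

```python
def r_squared(xs, ys):
    """R squared of a least-squares straight-line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # a flat series explains nothing
    # With a single predictor, R squared is the squared Pearson correlation
    return (sxy ** 2) / (sxx * syy)


# Hypothetical points per game and win fractions for five made-up teams
ppg = [17.0, 20.5, 22.0, 25.5, 29.0]
win_pct = [0.250, 0.438, 0.500, 0.625, 0.813]
print(round(r_squared(ppg, win_pct), 3))
```

Note the value is the same whether the metric rises or falls with winning, so the direction of the relationship has to be read off the slope of the line.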

 

Before we dive in, some things to make clear:

1)      This is a long post, and will be very stats heavy. If that’s not your thing, please don’t make a judgement without reading it fully.

2)      This is not a way to predict future outcomes. This is examining historical data to identify trends to inform further discussion. If you wanted to predict games, you’d take this analysis and build an analytical model, weighing metrics accordingly.

3)      While it can’t predict, it can be used to give context to stats/metrics and an idea if a team is performing well or not, and which are the more important areas to be doing well in. If the correlation is good we can use it to give values for “tiering” the metrics.

4)      This isn’t intended as gospel, or as having all the answers; it’s meant as a prompt for discussion.

5)      All data used is from Pro Football Reference (https://www.pro-football-reference.com/), it’s a great site, use it.

 

The methodology used for all the below was to take 11 seasons’ worth of data (2008 to 2018) and look at each team’s final standings, giving us 352 data points for each metric, with each point in turn made up of 16 games’ worth of data. Basically the sample is 2,816 NFL regular season games. Now without further ado let’s, as Missy Elliott said, get our geek on and look at some metrics:
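As a quick check on the sample size, the arithmetic can be sketched as below. It assumes 32 teams per season and a 16-game regular season; at 32 teams a season, 352 data points works out to 11 seasons.

```python
TEAMS_PER_SEASON = 32      # assumed: NFL teams in each season covered
GAMES_PER_TEAM = 16        # regular season games per team in that era
TEAM_SEASONS = 352         # data points quoted in the post

seasons = TEAM_SEASONS // TEAMS_PER_SEASON
team_games = TEAM_SEASONS * GAMES_PER_TEAM
# Every game involves two teams, so halve the team-game count
unique_games = team_games // 2

print(seasons, team_games, unique_games)  # 11 5632 2816
```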

 

Points

When we’re talking about winning it’s hard to get away from the obvious. You win by scoring more points than the opposition. So if you take the average points per game for each team:

 

PPGOff.png

 

You can see there is fairly decent correlation, which would imply the more points you score on average per game, the more likely you are to win said game. Now, as the correlation is fairly good, we take the line of best fit and use its equation to give us some rough values for tiering the metric. If we plug in win percentages of 25%, 50% and 75% we get:

 

25% - 14.73 PPG

50% - 22.42 PPG

75% - 30.11 PPG

 

When you look at the spread between the tiers, it’s perhaps not surprising that it’s 7.69 points, or just over a touchdown.
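The tiering step is just the line of best fit solved for the metric. A rough sketch, assuming the fit has the form win % = slope × metric + intercept; the slope and intercept here are back-calculated from the 25% and 75% tiers quoted above rather than read off the chart, so treat them as approximations.

```python
def metric_for_win_pct(target, slope, intercept):
    """Invert win_pct = slope * metric + intercept to get the metric value."""
    return (target - intercept) / slope


# Back-calculated from the quoted tiers (14.73 PPG at 25%, 30.11 PPG at 75%);
# these are approximations, not the fitted values from the chart.
ppg_per_win_fraction = (30.11 - 14.73) / (0.75 - 0.25)
slope = 1 / ppg_per_win_fraction
intercept = 0.25 - slope * 14.73

for target in (0.25, 0.50, 0.75):
    ppg = metric_for_win_pct(target, slope, intercept)
    print(f"{target:.0%} - {ppg:.2f} PPG")
```

Because the fit is a straight line, the gap between tiers is constant, which is why the same 7.69 points separates each pair.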

 

But what about the defensive side of the game? There is after all a team trying to score on you too:

 

PPGDef.png

 

As you’d expect there is some correlation, in that the more points you allow on the whole, the less likely you are to win the game. Interestingly though, the correlation is slightly less than for points scored. If we apply the same tiering:

 

25% - 29.10 PPG Allowed

50% - 22.46 PPG Allowed

75% - 15.83 PPG Allowed

 

However, what wins you a game is scoring more than your opponent on any given Sunday, so let’s look at average point differential over the course of a season:

 

PDiff.png

 

 

A very strong correlation, as perhaps to be expected, and I’d go out on a limb and say it’s the most important metric. Seems like it’s stating the obvious: score more points than your opposition and you’ll win. But it’s also about by how many you beat the opposition. Dominant teams tend to win more games. Again, not exactly surprising, and looking at the tiering:

 

25% - (-) 8.94 Point Differential

50% - (-) 0.01 Point Differential

75% - 8.92 Point Differential

 

Looking at the above, it would suggest bad teams are losing by more than a TD on average, and good teams are winning by more than one on average. Which means games decided by a TD or less are going to tend towards that 50% mark as the scores approach parity. This supports the idea that NFL games where the winning margin is a TD or less are “coin flip” games.

 

Yards

In order to get points we need to be able to move the ball on the field. I’ve often said I’m not enamoured with using season volume stats or even per game stats (unless we’re talking points) as they can be skewed by the number of plays a team runs. Instead if we look per play, we know we’re getting a true apples/apples comparison between teams.  Starting then by looking at average yards gained per offensive play:
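To show why per-play numbers can tell a different story from per-game ones, here is a small sketch with two invented offenses. The totals are hypothetical and only there to illustrate the pace effect.

```python
def yards_per_play(total_yards, plays):
    """Average yards gained per offensive play."""
    return total_yards / plays


# Hypothetical season totals for two made-up offenses
team_a = {"yards": 5600, "plays": 1120}   # up-tempo, lots of snaps
team_b = {"yards": 5200, "plays": 960}    # slower pace, fewer snaps

for name, team in (("Team A", team_a), ("Team B", team_b)):
    per_game = team["yards"] / 16          # 16-game regular season
    per_play = yards_per_play(team["yards"], team["plays"])
    print(f"{name}: {per_game:.1f} yards/game, {per_play:.2f} yards/play")
```

Team A looks better per game purely because it runs more plays, while Team B gets more out of each snap.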

 

Off-YPPComb.png

 

 

A surprisingly low correlation, but then perhaps there is some logic here. It’s no good getting yards if you’re not getting points, and there will be plenty of “junk” yards due to game situations. Because the correlation is low I won’t tier the metric as it wouldn’t really be sensible.

 

Looking on the defensive side of the ball:

 

Def-YPPComb.png

 

Even more surprising. Lends weight to the argument that you can “bend not break” on D as long as you’re not giving up points.

 

Conversions

Due to the nature of the game, it’s not just about yards; it’s about the context of those yards too. That is to say, if you’re not converting your downs then your offense isn’t going to be staying on the field.

 

Let’s start with just the % of plays on any down that resulted in a 1st Down:

 

1stDowns.png

 

A really surprising lack of correlation here. But ok, on a lot of downs the O might not be drawing up chunk plays, so there will be a fair few plays that, unless they break, will go for less than 10 yards. The key down is often held to be 3rd down, so let’s look at 3rd down conversion percentage:

 

3rd-Down.png

 

This is even more of a surprise. The correlation is so small it has to be expressed in scientific notation. I’m not really sure what to make of this, other than it would suggest being able to consistently convert on 3rd down might not be as important as you’d think. I’d caveat though that there will be a huge amount of game situation nuance that might not come through in the numbers.

For sake of completeness what about teams going for it on 4th down?

 

4th-Down.png

 

No major correlation here either.

 

Time of Possession

It’s been often held up that it’s important to win the Time of Possession battle, and while we know it’s certainly important to milk the clock or hurry up as the game situation dictates, does it overall make you more likely to win?

 

ToP.png

 

As we can see there is some correlation but not as strong as you might expect.

 

Turnovers

Another area that is put forward as being a key to winning is turnovers, so let’s look at how they affect the Win %:

 

Off-TOComb.png

 

The above graph shows offensive turnover percentage, that is the percentage of offensive plays that ended in a turnover, against winning percentage. As expected there is a fairly strong negative correlation.

 

But what about the defensive takeaways?

 

Def-TOComb.png

 

While there is some correlation between the percentage of defensive plays that end in a turnover and winning %, it’s not quite as strong as for the offensive turnovers.

 

However when we talk about turnovers we often talk about the battle, so similarly to the points discussion above, what about the turnover differential?

 

TODiff.png

 

Unsurprisingly a stronger correlation, if you give up the ball less than your opponent on the day it’s probably going to help you win.
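For clarity on how the turnover numbers can be put together, here is a rough sketch. The post doesn't state whether the differential chart uses raw counts or percentages, so the count-based version below is an assumption, and the season totals are invented.

```python
def turnover_pct(turnovers, plays):
    """Percentage of plays that ended in a turnover."""
    return 100.0 * turnovers / plays


def turnover_differential(takeaways, giveaways):
    """Positive means the team took the ball away more than it gave it up."""
    return takeaways - giveaways


# Hypothetical season totals for one made-up team
giveaways, offensive_plays = 18, 1000
takeaways, defensive_plays = 27, 1050

print(f"Offensive turnover %: {turnover_pct(giveaways, offensive_plays):.2f}")
print(f"Defensive takeaway %: {turnover_pct(takeaways, defensive_plays):.2f}")
print(f"Turnover differential: {turnover_differential(takeaways, giveaways):+d}")
```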

 

Penalties

An area that often frustrates me, as we see such inconsistent officiating week to week, but does getting penalised more affect your chances of winning?

 

Looking first by average no. penalties per game:

 

PenNPG.png

 

I was expecting there to be a stronger negative correlation here, but given the variance in yards depending on the type of penalty maybe the distance given up has more effect:

 

PenYPG.png

 

An even smaller correlation, which could suggest that there isn’t a huge benefit to being a “disciplined” team. However I would also suspect that this is another thing that might not show up in the stats as such but will have an impact situationally.

 

Passing

We previously looked at combined offensive yards but what about if we start splitting between the passing and running games? Looking first at passing yards per attempt:

 

PassOYA.png

 

We can see there’s a stronger correlation than for combined offensive yards per play, and it’s strong enough in my opinion to give some loose tiering:

 

25% - 5.03 Yards per Attempt

50% - 6.68 Yards per Attempt

75% - 8.32 Yards per Attempt

 

What about some other metrics that we look at when assessing QBs?

 

CmpPct.png

 

I was somewhat surprised that the correlation wasn’t stronger for completion percentage, given the weight given to accuracy when QBs are assessed, especially during the draft process. One thing to consider is how much completion percentage is really a measure of a QB’s accuracy.

 

Referencing back to points (TDs) and turnovers (INTs)  what effect do they have?

 

POTD.png

Pass-OInt-Pct.png

 

 

We can see there is decent correlation between the percentage of offensive plays that end in a passing touchdown and winning. There is less negative correlation to the percentage of plays that end in an interception. This could suggest it’s better to trade having a reasonably worse TD:INT ratio for a higher volume of passing touchdowns.

 

Another area that is considered to be a drive killer is sacks. So does it follow that the more you’re sacked the less likely you are to win?

 

SacksAll.png

 

Perhaps not as much as you might have thought. What about the amount of yards given up on average to sacks?

 

Sack-Yds-All.png

 

Again less negative correlation than perhaps to be expected.

 

As there is fairly decent correlation for the passing metrics, what about if we had some way of combining them into one catch all metric?

 

Passer-Rating-O.png

 

I’ve never been a huge fan of passer rating, as I felt it was slightly outdated, but the correlation would suggest it’s still viable as a way of grading QB play. Looking at the usual tiering:

 

25% - 63.50

50% - 86.87

75% - 110.23
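The post doesn't spell out how passer rating is built, so for reference here is a sketch of the standard NFL formula, which combines completion percentage, yards per attempt, touchdown rate and interception rate. The sample stat line is made up and not a real player.

```python
def passer_rating(completions, attempts, yards, touchdowns, interceptions):
    """NFL passer rating from raw passing totals (attempts must be > 0)."""
    def clamp(value):
        # Each component is capped between 0 and 2.375
        return max(0.0, min(value, 2.375))

    a = clamp((completions / attempts - 0.3) * 5)       # completion %
    b = clamp((yards / attempts - 3) * 0.25)            # yards per attempt
    c = clamp((touchdowns / attempts) * 20)             # touchdown rate
    d = clamp(2.375 - (interceptions / attempts) * 25)  # interception rate
    return (a + b + c + d) / 6 * 100


# Hypothetical season line, not a real player
print(round(passer_rating(350, 550, 4000, 25, 12), 1))
```

The caps on each component are why the scale tops out at 158.3 rather than at a round number.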

 

Another metric I’m a fan of is adjusted net yards per attempt (ANY/A), which is calculated as:

 

(PASSING YARDS – SACK YARDAGE + (20 × TOUCHDOWNS) – (45 × INTERCEPTIONS)) / (PASS ATTEMPTS + SACKS)

 

I like it as it rewards efficient QBs and gives less weighting to sheer volume of yards.
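The formula above translates directly into a few lines of code. A minimal sketch; the function and argument names are my own, and the stat line is invented rather than taken from any real QB.

```python
def any_a(pass_yards, sack_yards, touchdowns, interceptions, attempts, sacks):
    """Adjusted net yards per attempt, using the formula quoted above."""
    dropbacks = attempts + sacks
    if dropbacks == 0:
        raise ValueError("need at least one pass attempt or sack")
    adjusted_yards = pass_yards - sack_yards + 20 * touchdowns - 45 * interceptions
    return adjusted_yards / dropbacks


# Hypothetical season line, not a real player
print(round(any_a(pass_yards=4000, sack_yards=200, touchdowns=25,
                  interceptions=12, attempts=550, sacks=30), 2))
```

Because sacks are counted in the denominator and interceptions cost 45 yards each, a QB who piles up yardage while taking sacks and throwing picks gets pulled back down.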

 

Pass-OANYA.png

 

A decent correlation, suggesting it could be used as a high level, broad metric for comparing QB play. Breaking down the tiers:

25% - 4.50 ANY/A

50% - 6.61 ANY/A

75% - 8.70 ANY/A

 

Overall then it would seem having a decent passing attack is conducive to winning more games, but what about defending the pass?

 

Looking just at our two combined metrics:

 

Passer-Rating-D.png

ANYAD.png

 

 

Again there is less correlation on the defensive side of the ball, which could suggest a good passing offense is more of a factor in your chance of winning than a good passing defense. But as we’ve established it’s points that are important, let’s look at passing touchdowns allowed:

 

PTDDef.png

 

Again less correlation when compared to the offensive side of the ball.

 

Rushing

The old mantra that often gets brought up is “run the ball and stop the run”, but how true is this in the modern league of explosive offenses and gaudy passing numbers?

 

Starting with rushing yards per attempt:

 

RshDYA.png

 

Pretty shocking how low the correlation is here. It would suggest that running yardage really doesn’t have much effect on your chances of winning. Again though I’d suspect the situational value of being able to run, and the other effects such as making the defense account for it, aren’t shown here.

 

As we’ve already postulated though, it’s points that matter, so what about the percentage of offensive plays that end in a Rushing TD?

 

RshOTd.png

 

More correlation here. I’d love to further split this down by distance to goal, to see the value of being able to punch it in during goal line situations.

 

On the defensive side of the ball:

 

RshDYA.png

RshDTD.png

 

Slightly more correlation for being able to defend rushing yardage, but less for being able to stop rushing TDs.

 

 

So what does this all mean? As I said, this wasn’t intended as the answer, but an aid to frame questions and discussion. But some things that may be implied from this:

 

1) Yards, especially when considered as a total volume, are somewhat vanity. It’s points that win you games.
2) How many points you win by on average is important; the closer your average margin gets to under a TD, the more your record depends on variance in “coin flip” games.
3) The above could be seen as an indicator that a good offense gives you marginally more chance of winning than a good defense.
4) Having a good passing game gives you a better chance of winning than having a good rushing game.
5) It’s probably better to have an aggressive, explosive passing offense (high TDs, not getting tied up worrying about sustained drives) vs a conservative one (low INTs, sustained drives, but may only be getting FGs).

 

Things I want to look at further:

 

1) This is all based on regular season games, I want to see if it changes for playoff games.

2) Look at some other metrics (QB hurries for example) and maybe stuff like DVOA.

 

 


5 hours ago, SteelCityColt said:

@EastStreet as I'm going away and I've been promising this, thought I better drop it before I did. Not perfect, a bit rushed, but some things to consider certainly. 

Hope it's a minimum security short term stay. Watch your back and don't get shanked. 

:D

 

Getting ready to dive in over lunch. The pngs aren't showing up (showing broken).


27 minutes ago, EastStreet said:

I'll try in a different browser.

 

Okay they show if I'm logged into my forum account, don't if not. 

 

I've hosted them all elsewhere and changed the links. Can @EastStreet or someone let me know they can see them all ok please? 

 

You may have to click through to get them to be readable. 


It's all showing now. Thank you. My lunch got interrupted so great timing.


Finally had a chance to sit down and read through it. Absolutely nice work @SteelCityColt

 

While some of the correlations were expected, I expected higher correlation in some areas. The lack of correlation in some views really drives home what is not so important. 

 

I know this is too heavy for many, but really wish some of our posters would take the time to absorb this. Going back to the topics of pass vs run, low INTs, defense (in general), etc., I think your work here would go a long way to informing some of the realities, while killing some of the anecdotal or historical assumptions. 

 

Please send this to Ballard's stat team and coaches lol.... 

 

and...... TTDB


@SteelCityColt Thank you for this post.  It seems the Colts fit in to many categories that make us a 50/50 team, which makes sense, considering the stats about our QB.  The third down stat surprised me (for offense) but it makes sense to me, especially thinking about past games.  When moving down the field making first downs on first and second downs AND scoring points, third downs don't matter as much.  With QBs we've had with higher QB ratings where it seemed we did this more (I may totally be off base, please correct me if I'm wrong), it seemed we won more.  Again, thanks for this breakdown.   At first I was feeling unsure if I would understand your post, but you broke it down brilliantly and it was very comprehensible!!


2 hours ago, compuls1v3 said:

@SteelCityColt At first I was feeling unsure if I would understand your post, but you broke it down brilliantly and it was very comprehensible!!


Thank you for the kind words, it really is just a very low level first pass. 
 

The conversions confused me as well; it feels to me like if your 3rd down conversions don’t correlate, then 1st downs overall should correlate more, or vice versa. There is a lot of situational nuance that would get lost in this. Also worth considering is that I’ve used linear regression for calculating correlation throughout, in order to get continuity of method. For some metrics an exponential fit might have shown more correlation. 
 

Regards the passing metrics, they do seem to chime with what we’ve been seeing, an above average but below franchise/elite type guy. Obviously more than just QB play factors in, but as a yardstick it feels about right.


Man, congratulations @SteelCityColt really. This post is simply amazing in all aspects of it.

 

Like @EastStreet said, I wish that some people on the forum read it so we can stop talking some nonsense lol.

 

Now about the post. I kind of fell in love with ANY/A right now, really. It's a very nice stat to measure QB play, as stated in the post. As for the correlation of the third downs (and about downs, sacks etc), I guess that the situation and other variables play a huge part in the weight of these ones. So I ain't too surprised by that in the end.

 

You can say that about the turnovers too.

 

Well, congrats again, and now I'm eager for part 2 lol


Yup, this was a lot of hard work by @SteelCityColt and thankful to him for taking the time. It's posters like him that bring incredible value to the board. I know it's not everyone's thing, but hopefully it stimulates some critical thinking in some. 


19 minutes ago, EastStreet said:

Yup, this was a lot of hard work by @SteelCityColt and thankful to him for taking the time. It's posters like him that bring incredible value to the board. I know it's not everyone's thing, but hopefully it stimulates some critical thinking in some. 


Ahh but when you’re saying the “wrong” things...

 

Thank you anyway, I enjoy it for my own curiosity and understanding of the game, but also for the discussion on here with some posters.


7 minutes ago, SteelCityColt said:


Ahh but when you’re saying the “wrong” things...

 

Thank you anyway, I enjoy it for my own curiosity and understanding of the game, but also for the discussion on here with some posters.

I also do the digging I do for my own purposes of understanding. 

I try hard not to care about "wrong things" lol.


Great stuff, SCC.  Thanks for doing and sharing this. I wonder if you have similar numbers for last season. I would be interested to see if you get similar results.

 

Regarding the low correlations, I expected some because context, situational considerations, and other variables often cannot be captured in statistics.

 

On 11/26/2019 at 7:50 AM, SteelCityColt said:

This could suggest it’s better to trade having a reasonably worse TD:INT ratio for a higher volume of passing touchdowns.

 

Too many coaches call "safe" plays to avoid interceptions (INTs). While high numbers of INTs are not good, I believe avoiding passing the ball for fear of INTs hurts a team more. 

 

I look forward to reading more posts like this one. Thanks again. :thmup:



The dataset covers every team's season from 2008 to 2018. So each data point will be the Win-Loss % they finished with and their season total/average for the metric. 
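With the data laid out as one row per team season, comparing metrics comes down to computing R Squared for each column against Win % and sorting. A minimal sketch; the column names and the five rows are invented for illustration and are not taken from the spreadsheet.

```python
def r_squared(xs, ys):
    """R squared of a least-squares straight-line fit of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return 0.0 if sxx == 0 or syy == 0 else (sxy ** 2) / (sxx * syy)


def rank_metrics(rows, target="win_pct"):
    """Return (metric, r_squared) pairs sorted from strongest to weakest."""
    metrics = [key for key in rows[0] if key != target]
    wins = [row[target] for row in rows]
    scores = [(m, r_squared([row[m] for row in rows], wins)) for m in metrics]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical team-season rows; the column names are assumptions
rows = [
    {"win_pct": 0.250, "point_diff": -8.5, "third_down_pct": 41.0},
    {"win_pct": 0.438, "point_diff": -2.0, "third_down_pct": 36.5},
    {"win_pct": 0.500, "point_diff": 0.5, "third_down_pct": 43.0},
    {"win_pct": 0.625, "point_diff": 3.0, "third_down_pct": 35.0},
    {"win_pct": 0.813, "point_diff": 9.5, "third_down_pct": 40.0},
]

for metric, score in rank_metrics(rows):
    print(f"{metric}: R squared = {score:.3f}")
```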

 

If it helps, here's the raw data:

https://docs.google.com/spreadsheets/d/1jCZgap7Bsv7tKi7pbGn8Ky5HX7jmSUZCpcFOdmltiUg/edit?usp=sharing

 

I do think we're still seeing a slow movement towards more aggressive coaching, but without good execution it will probably have worse outcomes than more conservative play calling. If say you know your QB is good, not great, you might not try and push the ball downfield as much. 


18 hours ago, 2006Coltsbestever said:

@SteelCityColt, that is actually great stuff. I almost felt like I was back in college. Great work.

 

I wouldn't quite go that far, it's a pretty rough first pass when it comes to looking at things but it's not a bad start at trying to work out what stats are important. So for example when the commentary brings up "X is only y% for 3rd down conversion so far", that might not really be all that vital. 
