AdERA and AdAVG are alive and well!

This website is dedicated to explanation, analysis and discussion of the baseball statistics Advanced ERA (AdERA) and Advanced AVG (AdAVG). These formulas were created out of a desire to form a comprehensive blend of advanced data available from Pitch F/X and Statcast, with the primary objective being to gauge the true performance of players within a given season, and to a lesser degree, their future performance. I believe that many of the formulas commonly used for statistical analysis (FIP, xFIP, wOBA and wRC, to name a few) have been useful metrics in the past, but are now becoming relatively less useful in the setting of new, advanced data. After years of tinkering, first with Pitch F/X data, and now with a blend of both Pitch F/X and Statcast data (thanks to the help of Tanner Bell’s work on Excel spreadsheets, which you should buy), Advanced AVG (AdAVG) and Advanced ERA (AdERA) were born.

Google Sheets (updated on a daily basis) for each of the formulas are here:

Advanced AVG (AdAVG, minimum 50 plate appearances)

Advanced ERA (AdERA, minimum 10 innings pitched)

Here’s how, and perhaps just as importantly WHY, Advanced ERA (AdERA) and Advanced AVG (AdAVG) were built…

As someone who watches a ridiculous amount of baseball, I find it bothersome when I am watching a game and overhear the talking heads discussing how well a player performed when their results were born of sheer luck (e.g. Hitter X swings on a pitch outside the zone and gets 2 RBIs on a check swing blooper into shallow right field, or Pitcher Z gets three outs on three baseballs which were all struck squarely on the barrel of the bat directly at an infielder). In the case of the pitcher: sure, he sat down three guys without giving up a hit or a walk, but if we play out the same scenario 1000 times, Statcast data from 2018 tells us that those “barreled” baseballs are going to become hits 77.2% of the time. Wow. Also consider that in 2018, balls which met “line drive” criteria during the regular season became hits 62.6% of the time. In the case of the hitter, if we assign the “poor/weak” quality of contact to his hit, we learn that it had a 19.5% chance of becoming a hit. So while it is certainly clear that results are important (batters producing more runs versus pitchers giving up fewer of them, to put it simply), I think that we lose a tremendous amount of value when we base our analysis purely on results (singles, doubles, et al.) and forget about the likelihood of each respective result occurring. Since we now have better tools for measuring that likelihood, I think that we should be building new formulas which include criteria that we know to produce desired outcomes a higher percentage of the time rather than relying on formulas which engineer backwards from results that happen significantly less often. Let’s dig around a bit in a few of the older formulas…

Here’s a look at the guts of FIP:

FIP = ((13*HR)+(3*(BB+HBP))-(2*K))/IP + constant

We can see that FIP takes a simplistic approach and directly rewards one outcome (Ks multiplied by 2) while penalizing others (HRs multiplied by 13, BBs and HBPs multiplied by 3), then divides the net total by innings pitched and adds a constant. While useful to a degree, this formula makes no attempt to consider the quality of batted balls or make any contextual analysis, and because of the way it is constructed, it is also incapable of telling the difference between an elite pitcher who pounds the zone and a terrible pitcher who gets singled, doubled and tripled to death on a nightly basis but doesn’t give up the long ball. These are the kinds of things that keep me up at night, people.
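To make that blind spot concrete, here is a minimal Python sketch of the FIP formula as written above. The function name, the sample season line and the constant of 3.10 are all made up for illustration; the real constant changes from season to season.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """FIP as written above: weighted outcomes per inning, plus a constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical season line (not a real pitcher); 3.10 is a placeholder constant.
pitcher_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=180.0, constant=3.10)
print(round(pitcher_fip, 2))

# Note what is NOT an argument: singles, doubles, triples, or any measure of
# contact quality. Two pitchers with the same HR, BB, HBP, K and IP get the
# same FIP no matter how hard everything else was hit.
```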

xFIP takes things in a slightly different direction:

xFIP = ((13*(Fly balls * lgHR/FB%))+(3*(BB+HBP))-(2*K))/IP + constant

Instead of directly penalizing home runs, it penalizes fly balls, and uses the league average for HR/FB% as well as a constant (FIP Constant = lgERA – (((13*lgHR)+(3*(lgBB+lgHBP))-(2*lgK))/lgIP)) in order to examine league-wide trends. My primary issue with xFIP is that it forgets to factor in one of my favorite mantras: the idea that “all fly balls are not created equal.” After all, if we are going to make fly balls the most heavily weighted (and penalized) component of a formula, then it should be clear that pitchers who give up more fly balls than their counterparts produce inferior outcomes, right? An example of the flaw in this focus is that data from 2017 and 2018 for all qualified starters tells us that we should fairly significantly devalue names like Scherzer (56th out of 62 pitchers in GB%, 2.52 ERA) as well as Verlander (60th out of 62 pitchers in GB%, 2.94 ERA), with these two pitchers averaging a FIP which is 0.65 runs higher than their actual ERA. That’s a big difference. Chris Sale is another truly elite pitcher who is near the bottom rung of starters in terms of GB% and yet achieves consistently elite results. Even without specific examples such as Scherzer, Verlander and Sale, data from Baseball Savant tells us that the batting average on fly balls + pop ups in 2018 was .208, whereas the batting average on ground balls over the same time frame was .246. Are we sure that’s where we want to put our focus? Time to keep searching…
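Here is a rough Python sketch of the xFIP formula and the FIP constant as written above. The function names and every number in the example are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """FIP constant: league ERA minus the league-wide FIP core."""
    return lg_era - (13 * lg_hr + 3 * (lg_bb + lg_hbp) - 2 * lg_k) / lg_ip


def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant):
    """xFIP: like FIP, but home runs are replaced by fly balls * league HR/FB."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Made-up league totals, for illustration only.
constant = fip_constant(lg_era=4.00, lg_hr=5000, lg_bb=15000,
                        lg_hbp=2000, lg_k=40000, lg_ip=40000)

# Made-up pitcher: 200 fly balls at an assumed 10% league HR/FB rate.
print(round(xfip(fly_balls=200, lg_hr_per_fb=0.10, bb=50, hbp=5,
                 k=200, ip=180.0, constant=constant), 2))
```

Every fly ball is charged the same expected home run value here, whether it was a lazy pop up or a rocket to the warning track, which is exactly the complaint above.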

wOBA functions similarly to FIP in that it rewards results in a direct way:

wOBA = (0.690×uBB + 0.722×HBP + 0.888×1B + 1.271×2B + 1.616×3B + 2.101×HR) / (AB + BB – IBB + SF + HBP)

Similar to FIP, the beauty of this formula is its simplicity, but once again, we have no qualitative measures in place and context is not addressed, so we end up with a product which is useful, but one that I think is lacking due to the fact that it does not analyze batted ball quality in order to determine the likelihood that prior results were “deserved.” We have no idea, for instance, if a hitter has been going outside the zone to make contact with pitches. Why is this important, you ask? By digging deeper with Baseball Savant, we can see that when hitters take swings outside of the zone including Baseball Savant’s “shadow” surrounding the plate, the batting average on these balls is a paltry .200. Want to use the Gameday zone you see on MLB.com instead? You’re looking at a .155 average on pitches outside the zone. Baseballs within the strike zone, on the other hand, become hits 31.3% of the time, and since we understand that baseball players are human and imperfect, we can also include what Baseball Savant calls the “shadow” surrounding the plate in our calculation, and even then, we still see a significantly higher batting average at .265.
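A quick Python sketch of the wOBA formula above, using the same weights. The function name and the sample stat line are hypothetical.

```python
def woba(ubb, hbp, singles, doubles, triples, hr, ab, bb, ibb, sf):
    """wOBA with the linear weights quoted above."""
    numerator = (0.690 * ubb + 0.722 * hbp + 0.888 * singles
                 + 1.271 * doubles + 1.616 * triples + 2.101 * hr)
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator


# Made-up season line, for illustration only.
print(round(woba(ubb=50, hbp=5, singles=100, doubles=30, triples=3, hr=20,
                 ab=550, bb=55, ibb=5, sf=5), 3))

# A check swing blooper and a barreled line drive both enter as "singles"
# with the same 0.888 weight; nothing here knows how the ball was hit.
```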

wRC, which has a similar relationship to wOBA as xFIP has to FIP, includes built-in contextual factors including league averages in an effort to appropriately balance and scale the metric:

wRC = (((wOBA-League wOBA)/wOBA Scale)+(League R/PA))*PA

Similar to xFIP, wRC is a useful metric, but given that it is simply an expansion of a basic metric which at its root rewards individual results in a linear way and without respect to the likelihood of any of the results it rewards, it is limited. There has to be a better way…
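For completeness, a small Python sketch of the wRC formula above. The function name and all of the inputs (league wOBA, wOBA scale, league runs per PA) are placeholder values, not real league figures.

```python
def wrc(woba, lg_woba, woba_scale, lg_runs_per_pa, pa):
    """wRC as written above: scaled wOBA difference plus league R/PA, times PA."""
    return ((woba - lg_woba) / woba_scale + lg_runs_per_pa) * pa


# Placeholder inputs, for illustration only.
print(round(wrc(woba=0.350, lg_woba=0.320, woba_scale=1.2,
                lg_runs_per_pa=0.12, pa=600), 1))
```

Because the only player-specific inputs are wOBA and PA, whatever wOBA cannot see about contact quality, wRC cannot see either.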

So what’s different about Advanced AVG and Advanced ERA?

As I have shown above with a few examples, we now have data available to us via Statcast and Pitch F/X which gives us a much better idea of what happens to batted balls. So when we celebrate (or in the case of many old school metrics, reward) results which were unlikely to occur based on the best data we have available, I think we miss the most important parts of the overall picture. I think it is important to recognize that hard contact is not everything, though, and as the guy who made the documentary about eating too much from one particular fast food restaurant can tell you, we shouldn’t be overly reliant on one source. Line drives are another interesting statistic, but it has been suggested by multiple credible sources that it can take as much as 18 months for data on that particular statistic to stabilize.

Here’s some data (updated for the 2019 season with data from games through 6/28/19) which shows the correlation between specific statistics and ERA:

0.475 ERA vs Barrel %
0.514 ERA vs Barrel/PA
0.374 ERA vs Hard contact
0.327 ERA vs Avg exit velocity
0.220 ERA vs Zone Contact %
0.252 ERA vs Outside zone swing %
0.243 ERA vs Hard %
-0.044 ERA vs Med %
0.255 ERA vs LD %
-0.733 ERA vs LOB%
0.259 ERA vs Sw Str%
0.430 ERA vs K/BB ratio

Here’s some data (updated for the 2019 season with data from games through 6/28/19) which shows the correlation between specific statistics and batting average:

0.330 AVG vs BB/K ratio
0.386 AVG vs LD %
0.302 AVG vs Contact %
-0.137 AVG vs Med %
0.334 AVG vs Hard %
0.009 AVG vs Oswg %
0.250 AVG vs Zone Contact %
0.439 AVG vs ISO
0.019 AVG vs wSB
0.191 AVG vs Barrel percent
0.266 AVG vs Barrels/PA
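For anyone who wants to reproduce figures like these, here is a minimal Python sketch of a correlation calculation. It assumes a plain Pearson correlation across players, which is my assumption for the example; the function name and the toy numbers are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Toy data: one value per pitcher (hypothetical barrel % and ERA).
barrel_pct = [4.0, 6.5, 7.0, 9.0, 10.5]
era = [3.10, 3.40, 4.20, 3.90, 4.80]
print(round(pearson(barrel_pct, era), 3))
```

A value near 1 or -1 means the two columns move together closely, and a value near 0 (like Med % above) means the statistic tells us very little on its own.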

In the end, hard hit baseballs and line drives were certainly something I wanted to factor into both of my formulas, but I realized that the data was showing us some other important things that could also be used to build the “perfect” pitcher and hitter, and seeing the correlation figures also helped me see that focusing too much on one or two particular categories would not give me a reliable sense of a player’s overall performance. If we make the connection back to data from the beginning of this post about line drives being hits 62.6% of the time, it might then be a bit confusing to see a somewhat low correlation figure of 0.255 between line drive % and ERA. I knew that I needed to make a formula with a blended approach to get this right…

Here are the categories I settled on:

Advanced AVG:

  1. Barrels / PA (number of barrels per plate appearance)
  2. Barrel %
  3. Hard contact % (the harder the better)
  4. Medium contact % (as you can see by the correlation figures above, medium contact % has a very poor correlation with performance in AVG or ERA, and this category is essentially tuned to zero within the formula)
  5. Contact % (balls first need to be contacted in order to be hit hard, right?)
  6. Strike Zone Contact % (swings on pitches in the zone have better results, as shown)
  7. Outside Strike Zone Swing % (swings on pitches outside the zone have significantly poorer results, as shown)
  8. BB/K ratio (an excellent indicator of a player’s eye and discipline, among other factors)
  9. Line Drive % (line drives are significantly more likely to be hits, as shown)
  10. wSB (just to add an element of speed)

In the case of AdAVG, hitters who make more hard contact than their peers, have more barrels per PA, have a higher barrel %, make more contact in general, make more contact in the zone, take fewer swings outside the zone, have a better BB/K ratio and hit more line drives are going to have a higher AdAVG than their peers. wSB was included in order to reward players who create opportunities with their legs via the stolen base. On to Advanced ERA…

Advanced ERA:

  1. Hard contact (Statcast)
  2. Average exit velocity (Statcast)
  3. Barrels / PA
  4. Barrel %
  5. Hard contact %
  6. Medium contact %
  7. Zone contact %
  8. Outside zone swing %
  9. K/BB ratio
  10. Swinging Strike %
  11. Line Drive %
  12. LOB % (to balance out for those lucky pitchers who like to leave more people on base than their peers)

Advanced ERA attempts to build the perfect pitcher by rewarding pitchers who allow less hard contact, have lower average exit velocities, fewer barrels per PA, a lower barrel %, lower hard and medium contact %, a lower % of contact in the zone, induce a higher % of swings outside the zone, have a higher K to BB ratio, a higher swinging strike % and a lower % of line drives. The percentage of runners stranded is accounted for in an effort to balance against those who are more fortunate than the league average strand rate.

Here’s how the formulas were built…

The selected categories within each formula were first balanced against one another. This is necessary because, without an effort to balance the categories relative to one another, it is difficult to achieve a meaningful blend of data. After all, if we examine a statistic within which a statistical variation of 10% is quite meaningful (zone contact %, for example), it is difficult to balance directly against a category such as ISO, within which a deviation of 10% is significantly less meaningful due to the relatively higher variation in that category from top to bottom. Particularly on social media like Twitter, where posts are necessarily brief, we often see fantasy baseball information presented in tables with two to three categories side by side (e.g. Pitcher Z has a swinging strike % of x, gives up y % of hard contact and has an average exit velocity of z). When information is presented in this fashion without establishing meaningful deviation in each category, we have a hard time making a qualitative judgment as to the overall blended performance of that player, which is what I think that fantasy baseball fanatics are all ultimately trying to capture.

Once the categories are appropriately balanced by weighing them against one another, I think we begin to see more meaningful trends emerge, and we can continue on the journey to building a better baseball player. The method for balancing is simple. We first calculate the standard deviation within each statistic we have picked out for our formulas. After we have determined the standard deviation in each category, we choose one particular category as our reference and divide its standard deviation by the standard deviation of each of the other categories, which establishes how we balance each category against the next. Now that we have what we will call our “base factors” (the ratio of the standard deviation in our reference category divided by the standard deviation in each of the other categories), we use these base factors to magnify the results of player performance in each specific category.
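The balancing step can be sketched in a few lines of Python. This is only an illustration of the idea described above, with made-up category names and values; it uses the population standard deviation, though the ratio comes out the same with the sample version as long as every category covers the same set of players.

```python
from statistics import pstdev


def base_factors(samples, reference):
    """Base factor per category: reference std dev / that category's std dev."""
    std_devs = {name: pstdev(values) for name, values in samples.items()}
    return {name: std_devs[reference] / sd for name, sd in std_devs.items()}


# Hypothetical league samples: one value per player in each category.
league = {
    "hard_pct": [30.0, 35.0, 40.0, 45.0],
    "barrel_pct": [4.0, 6.0, 8.0, 10.0],
}

factors = base_factors(league, reference="hard_pct")
print(factors)
```

In this toy data, hard contact % is spread about two and a half times as wide as barrel %, so barrel % gets a base factor of about 2.5 and the reference category gets exactly 1.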

We can then make individual player ratios for all of our selected categories and build a “raw” score for each player by adding up their scores in each category and appropriately weighting them so as to ensure that they are balanced against one another. In the case of hitters, I wanted to construct a statistic that would live in a range similar to batting average but deliver a result that is most consistent with “runs created”, or something to that effect. Pitching was just the opposite; I wanted to put together a statistic that would move in tandem with ERA (dropping when pitchers were excelling and climbing when they were not).

So the end result of each of the formulas is something that looks like this:

(Category A, the reference category, multiplied by priority factor) + (Category B, multiplied by base factor X in order to balance to category A, then multiplied by priority factor) + (Category C, multiplied by base factor Y in order to balance to category A, then multiplied by priority factor) + (Category D, multiplied by base factor Z in order to balance to category A, then multiplied by priority factor), etc…
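As a rough Python sketch of that shape, the raw score is just a weighted sum. Every category name, base factor and priority factor below is a placeholder invented for the example, not one of the actual values in AdAVG or AdERA, and the final step of scaling the raw score into an AVG-like or ERA-like range is not attempted here.

```python
def raw_score(player, base, priority):
    """Sum of (category value * base factor * priority factor) over categories."""
    return sum(player[cat] * base[cat] * priority[cat] for cat in player)


# Placeholder numbers, for illustration only.
player = {"hard_pct": 40.0, "barrel_pct": 8.0}
base = {"hard_pct": 1.0, "barrel_pct": 2.5}      # reference category gets 1.0
priority = {"hard_pct": 1.0, "barrel_pct": 2.0}  # how much each category matters

print(raw_score(player, base, priority))
```

The base factor puts the categories on a comparable scale, and the priority factor then decides how much say each one gets; a category tuned to zero (like medium contact %) simply drops out of the sum.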
