In a previous post, I presented data showing that the contribution of fielding to variance in run prevention had declined about 75% in the last few decades. I now want to look at a related phenomenon.
I’ll call it rfield inflation. It looks like this:
“Rfield” is the fielding component of WAR. It represents “the value in runs of all aspects of the player’s fielding.”
In the 1950s and ’60s, one unit of rfield equated to one run saved, as it is supposed to. But thereafter, it tanked. In the 2024 season, one unit of rfield was worth about 0.5 runs saved.
This is only an indirect consequence of the increasing contribution of fielding-independent pitching or “FIP” (strikeouts, walks, home runs allowed) to run prevention. Skillful fielding, it’s true, contributes less than previously to avoiding runs. But that just means better-than-average play should be awarded less rfield credit—and worse-than-average charged less—than in previous years. If the calibration of rfield had tracked the contribution of fielding to runs saved, a unit of it would have remained worth one run avoided over time. But because that hasn’t happened, fielding play today is being appraised at a rate that overstates its consequence by as much as a factor of two.
The result, logically, is a substantial bias in evaluation of the significance of players’ fielding skills on game outcomes.
Let’s consider the case of Derek Jeter.
It’s been claimed that Jeter “could be worst short defensive short stop of all time.” Baseball Reference records his career rfield as -253, meaning 253 extra runs allowed, or about 0.11 runs per game.
But that’s a demonstrable over-estimation.
We can correct for rfield inflation with the fielding regression model used to generate the above figure, which tracks the actual impact of a unit of rfield in runs allowed across seasons.
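For readers who want to see the mechanics, here is a minimal sketch of how a per-unit run value for rfield could be estimated for one season. It is not the exact model behind the figure: the specification (ordinary least squares of team runs allowed on team FIP and team rfield), the function name `rfield_run_value`, and all of the data are assumptions made up for illustration. The synthetic season is built so that a unit of rfield is worth 0.5 runs, and the regression should recover roughly that.

```python
import numpy as np

def rfield_run_value(runs_allowed, fip, rfield):
    """Fit runs_allowed ~ intercept + FIP + rfield by least squares.

    Returns the estimated runs saved per unit of rfield, i.e. the
    negative of the rfield slope (more rfield should mean fewer runs).
    Hypothetical helper; the post's actual model may differ.
    """
    X = np.column_stack([np.ones(len(fip)), fip, rfield])
    coef, *_ = np.linalg.lstsq(X, runs_allowed, rcond=None)
    return -coef[2]

# Synthetic 30-team season (NOT real data), built with a known run value.
rng = np.random.default_rng(0)
n_teams = 30
true_value = 0.5                       # assumed runs saved per unit of rfield
fip = rng.normal(4.0, 0.4, n_teams)    # team FIP
rfield = rng.normal(0.0, 25.0, n_teams)  # team rfield totals
noise = rng.normal(0.0, 3.0, n_teams)
runs_allowed = 700 + 160 * (fip - 4.0) - true_value * rfield + noise

estimate = rfield_run_value(runs_allowed, fip, rfield)
print(f"estimated runs saved per unit of rfield: {estimate:.2f}")
```

Running the same fit season by season on real team totals would give a series of run values that can be compared with the nominal value of 1.0.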
For this purpose, we should exclude the 1990s. As I explained in my previous post, a short period of bad data renders rfield unreliable for a significant portion of that decade, as Sean Smith has chronicled in his book WAR in Pieces.
So let’s confine ourselves to the years 2000-2014, which make up 77% of Jeter’s career. During those seasons, Jeter was credited with an rfield of -224 in approximately 2109 games played.
But that assumes a unit of rfield was worth one run saved. It wasn’t.
If we perform the appropriate conversion to compensate for rfield inflation (the failure to recalibrate rfield as the relative contribution of fielding to runs saved declined) over the period of 2000-2014, we get -186.
If we assume, as seems defensible, that Jeter’s performance in the field between 1995 and 1999 was comparable to his performance thereafter, then he has been blamed for 20% more runs allowed than his play likely cost the Yankees over his career.
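The arithmetic behind those numbers can be sketched in a few lines. The function `deflate_rfield` is a hypothetical helper showing the general idea, which is to weight each season’s rfield by that season’s estimated run value; the per-season values for Jeter are not reproduced here, so the sketch works only from the two totals quoted above (-224 raw, -186 adjusted).

```python
def deflate_rfield(rfield_by_season, run_value_by_season):
    """Convert nominal rfield into runs by weighting each season's
    rfield by that season's estimated runs per unit of rfield.
    Hypothetical helper, not the author's code."""
    return sum(r * v for r, v in zip(rfield_by_season, run_value_by_season))

# Totals quoted in the post for 2000-2014.
RAW_RFIELD = -224       # nominal rfield, treated as runs at face value
ADJUSTED_RUNS = -186    # after correcting for rfield inflation

# Average run value per unit of rfield implied by the two totals.
implied_avg_value = ADJUSTED_RUNS / RAW_RFIELD

# How much the face-value figure overstates the runs actually cost.
overstatement = RAW_RFIELD / ADJUSTED_RUNS - 1

print(f"implied average run value: {implied_avg_value:.2f}")
print(f"overstatement: {overstatement:.0%}")
```

On these totals, a unit of Jeter’s rfield over 2000-2014 was worth a little over 0.8 runs on average, which is where the roughly 20% overstatement comes from.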
So how bad a shortstop was he? Should he be considered the “worst” ever?
I dunno. But I do know it makes sense to remove what rfield inflation has added to the debate. A 20% misestimation is pretty substantial, as measurement goes, and could have an even more dramatic impact in other settings (take a look at the distortion of the all-time ranking of defensive runs saved by third basemen, e.g.).
I want to conclude with one important point intended to avoid possible confusion about this analysis. Nothing about rfield inflation implies that rfield is an invalid measure of fielding proficiency.
On the contrary, my analysis in the last post strongly supports the validity of the measure, not only in the period since 2003, when Baseball Reference began to use Baseball Info Solutions digital evidence, but also before then, when rfield was based on Sean Smith’s ingenious coding of Retrosheet play-by-play summaries.
Aside from a period in the 1990s, rfield has always made a very valuable contribution to a model that, along with FIP, does an extremely impressive job in explaining variance in runs allowed at the team level. Such an analysis shows that fielding has grown progressively less important as a style of pitching that features spiraling strikeout rates has taken hold. But that is something we can discern precisely because rfield is a valid measure of what fielding adds to averting runs.
Rfield inflation is, rather, a failure to update. The model used to determine the contribution of fielding apparently hasn’t been re-tested over time to check whether the weight it assigns to fielding proficiency still corresponds to what a unit of the metric is represented as reflecting: one run avoided.
That’s easy enough to fix. But failing to make the correction can seriously distort evaluations of how differences in players’ fielding skill, positive or negative, actually affect the fates of their teams.