What about Statcast?… (An update)

This was something that for sure needed inclusion in the fielding paper: an assessment of what Statcast metrics have to say about “fielding shrinkage.” They say: “yup, unh huh, fielding doesn’t matter too much anymore.”

Beyond that, the insight gleaned from adding Statcast data is that Statcast’s fielding measure is not any better than the others. Indeed, it explains less variance in team runs allowed than either Smith’s DER (Defensive Efficiency Record; he uses DER as a prefix, and I’m calling his measure that now to distinguish it from the “Outs Above Average” label Statcast also uses) or Fielding Bible’s Defensive Runs Saved. The same is true of Ultimate Zone Rating, another digital metric examined in the paper.

So in other words it’s not how big your data are that matters; it’s what you do with them that counts.

Take a look for yourself:

Or better, read the new draft. Tell me what you think of it so that I can make it better!
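For anyone who wants to see what “explains less variance in team runs allowed” means operationally, here is a minimal sketch. It is not the paper’s analysis: the data are simulated, and every name in it (`metric_a`, `metric_b`, `incremental_r2`, the coefficients and noise levels) is a hypothetical stand-in chosen for illustration. The idea is simply to fit runs allowed on FIP alone, then on FIP plus a fielding metric, and see how much the fit improves.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least squares fit of y on X (intercept added here)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

def incremental_r2(fip, fielding_metric, runs_allowed):
    """Return (R^2 with FIP only, R^2 with FIP + metric, the gain)."""
    base = r_squared(fip.reshape(-1, 1), runs_allowed)
    full = r_squared(np.column_stack([fip, fielding_metric]), runs_allowed)
    return base, full, full - base

# Simulated team-seasons. All numbers below are assumptions, not real data.
rng = np.random.default_rng(0)
n = 1000
fip = rng.normal(4.2, 0.4, n)
true_fielding = rng.normal(0.0, 20.0, n)           # runs saved by fielders
runs_allowed = 160 * fip - true_fielding + rng.normal(0.0, 25.0, n)

# Two hypothetical metrics that measure the same thing with different error.
metric_a = true_fielding + rng.normal(0.0, 10.0, n)  # less measurement noise
metric_b = true_fielding + rng.normal(0.0, 30.0, n)  # more measurement noise

base, full_a, gain_a = incremental_r2(fip, metric_a, runs_allowed)
_, full_b, gain_b = incremental_r2(fip, metric_b, runs_allowed)

print(f"FIP only:        R^2 = {base:.3f}")
print(f"FIP + metric_a:  R^2 = {full_a:.3f} (gain {gain_a:.3f})")
print(f"FIP + metric_b:  R^2 = {full_b:.3f} (gain {gain_b:.3f})")
```

In this toy setup the noisier metric adds less explained variance once FIP is in the model, which is the kind of comparison the text above describes. Which real metric plays which role is an empirical question that a simulation cannot answer.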

2 Responses

  1. I can understand why DER will correlate very well, nearly perfectly well, with runs allowed once you combine with FIP. That does not mean it measures fielding better than Statcast or the others. DER perfectly measures the odds of a ball in play turning into an out, by definition. It would do so even if credit for the players was simply the team total divided into 9 equal parts and assigned to each position by innings played.

    DER is a combination of 1. fielding ability + 2. pitcher ability to avoid hard contact + 3. park effects + 4. random variation.

    In my metric I try to estimate 1 by adjusting for 3, but don’t do anything about 2 and 4 beyond hoping they even out with a large enough sample size.

    Statcast is the best approach to try and get at #1 independent of the other 3 factors.

  2. Hi, Sean. Thanks again for the response.

    Again, I think you are understating the significance of the performance of your own measure here!

    All the systems — DER, Statcast, DRS, UZR, TZR — present “fielding runs saved” metrics. They are all constructed in the same way: by calculating (with one source of data or another) the probability that a ball hit in play will be turned into an out by a particular fielder.

    It’s not clear a priori whether to expect any of them, including DER, to contribute anything to analyses of variance in team runs allowed after controlling for FIP. That depends on the measures’ validity, which is an empirical, not a theoretical, property of them. You say Statcast is “best” for measuring fielding proficiency. It might be, but there’s no way to know that without empirical testing.

    None of the measures, including DER, perfectly correlates with team runs allowed when combined with FIP. It would be unrealistic to expect that! But it isn’t unrealistic to expect them to add something if they are to be relied upon for insight.

    They all do. They are all valid in that sense.

    But it’s interesting to learn that DER actually does explain more than the others, despite its derivation from simple play-by-play outcomes.

    Indeed, it’s not just interesting but I think important. Systems that rely on digital evidence *should* do better than anything else. If, empirically, they don’t, then that means the models used to process the fine-grained evidence they rely on (including, in the case of Statcast, location, velocity, hit angle, etc.) need to be improved. As I’m sure they will be over time—regardless of anything I have to say!

    In any case, the paper isn’t about what metric is “best.” It is about the diminishing importance of fielding in determining outcomes in major league baseball. It wouldn’t be possible to establish that clearly but for the validity and strength of the non-digital measures that you created, since the decline in the consequence of fielding is largely in relation to the significance it had *before* the advent of digital measures. It’s reassuring to discover, too, that the pre-digital measures continue to perform so well in relation to today’s digital ones, since that helps to establish the reliability of the pre-digital measures as applied to earlier eras of baseball.

    I really am not trying to make the case that your work is better than anyone else’s, certainly!

    But I hope everyone who is curious about how baseball works, and interested in studying that empirically, recognizes how indebted they are to you for these metrics.
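The four-part decomposition in the first response (fielding ability, pitcher contact quality, park effects, random variation) can be sketched with simulated data. This is an illustration of the logic of that argument under made-up assumptions, not evidence about any real metric: the component sizes, the variable names (`composite`, `pure`), and the helper `partial_corr` are all hypothetical. It shows how a measure that mixes all four components could track runs allowed more closely than a cleaner measure of fielding alone, while tracking fielding ability itself less closely.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000  # simulated team-seasons; every number below is an assumption

# Four hypothetical components of a team's out rate on balls in play.
fielding = rng.normal(0.0, 1.0, n)   # 1. fielding ability
contact = rng.normal(0.0, 1.0, n)    # 2. pitchers avoiding hard contact
park = rng.normal(0.0, 0.5, n)       # 3. park effects
luck = rng.normal(0.0, 1.0, n)       # 4. random variation

composite = fielding + contact + park + luck   # mixes all four together
pure = fielding + rng.normal(0.0, 0.5, n)      # isolates fielding, with error

# Runs allowed respond to every out on a ball in play, whatever its cause.
fip = rng.normal(0.0, 1.0, n)
runs_allowed = 3.0 * fip - composite + rng.normal(0.0, 0.5, n)

def partial_corr(x, y, control):
    """Correlation of x and y after regressing `control` out of each."""
    def residual(v):
        slope, intercept = np.polyfit(control, v, 1)
        return v - (slope * control + intercept)
    return np.corrcoef(residual(x), residual(y))[0, 1]

# How well each measure tracks runs allowed once FIP is controlled for.
pc_composite = partial_corr(composite, runs_allowed, fip)
pc_pure = partial_corr(pure, runs_allowed, fip)

# How well each measure tracks true fielding ability.
ability_composite = np.corrcoef(composite, fielding)[0, 1]
ability_pure = np.corrcoef(pure, fielding)[0, 1]

print(f"partial corr with runs allowed: composite {pc_composite:.2f}, pure {pc_pure:.2f}")
print(f"corr with fielding ability:     composite {ability_composite:.2f}, pure {ability_pure:.2f}")
```

Whether the real measures behave like `composite` or like `pure` is exactly the empirical question raised in the second response; the simulation only shows that the two kinds of performance can come apart.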
