Are pitchers who suppress the proportion of balls hit in the air more successful?
Somewhat disconcertingly, it sort of depends on who you ask.
Digital data compiled by Baseball Info Solutions (BIS), and reported by FanGraphs, suggest no.
Indeed, if one weren’t paying close attention, one might think the opposite were true. After controlling for fielding-independent pitching (FIPr) and team fielding, an increase in the proportion of BIP that are either line drives or fly balls indicates, counterintuitively, a very tiny decrease (r = –0.06) in runs allowed per game.
But this is a statistical illusion. The zero-order correlation between the proportion of line drives and fly balls and runs allowed is very slightly positive, as you’d expect. But the relationship is so small (r = 0.05) as to be practically meaningless.
The impact of FIPr and team fielding, in contrast, is very large: r = 0.67. After taking those influences into account, non-ground-ball BIPs have a slightly negative correlation, but the reversal is just the unsurprising result of statistical churning: the regression model takes the opportunity to use that variable to nudge the regression line a tad closer to the odd-ball residuals left over after FIPr and team fielding.
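To see how a sign can flip like this, here is a minimal sketch of the first-order partial correlation formula. It treats FIPr and team fielding as a single composite control, which is a simplifying assumption, and takes the zero-order correlation as positive with magnitude 0.05, as described above. The value `r_xz` (how strongly the air-ball share tracks the control) is purely hypothetical and was picked only to show that a modest overlap is enough to push the sign the other way.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, holding z fixed."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# x = share of BIP that are line drives or fly balls
# y = runs allowed per game
# z = FIPr and team fielding, treated as ONE composite control (assumption)
r_xy = 0.05  # zero-order correlation: slightly positive, per the text
r_yz = 0.67  # control vs. runs allowed, from the text
r_xz = 0.14  # HYPOTHETICAL: air-ball share vs. the control, not from the data

r_partial = partial_corr(r_xy, r_xz, r_yz)
print(f"zero-order r = {r_xy:+.2f}, partial r = {r_partial:+.2f}")
```

With these inputs the partial correlation lands close to –0.06. The point is only the arithmetic: once a dominant predictor is partialled out, a near-zero correlation can drift to either side of zero without meaning much.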
The results display exactly the same pattern—the same sign reversal, the same signature of practical irrelevance—if we look at the ratio of ground-balls to fly outs generated by a modern-day pitcher.
So if I looked only at this, I’d conclude that being a pitcher who forces batters to hit a larger proportion of balls on the ground is not of any genuine consequence. And I wouldn’t be all that surprised, given that FIPr and team fielding account for upwards of 85% of the variance in team runs allowed in today’s game.
But the reason I don’t buy this completely is that data extracted from Retrosheet assign a modest effect to suppressing balls hit in the air to the OF.
As I mentioned a couple posts ago, I’ve now constructed a program for coding the information contained in the event fields of Retrosheet play-by-play data (that information is aggravatingly convoluted, and although there is a companion Project Scoresheet classification of BIP outcomes, it is of questionable reliability).
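For readers who haven’t worked with the event field, here is a rough sketch of one small piece of such a coding job: deciding whether a ball in play was first handled by an infielder or an outfielder. This is not my actual program. The function name `locate_bip`, the category labels, and the list of non-BIP prefixes are all illustrative assumptions, and the sketch ignores many of the complications that make the real thing so convoluted.

```python
import re

# Event codes that are not balls in play (strikeouts, walks, steals, etc.).
# ASSUMPTION: a simplified, non-exhaustive list for illustration only.
NON_BIP_PREFIXES = (
    "K", "W", "I", "HP", "SB", "CS", "PO", "BK", "PB", "NP", "DI", "OA", "FLE",
)


def locate_bip(event: str):
    """Return 'infield', 'outfield', 'unlocated', or None for a non-BIP event."""
    # Drop runner advances (after '.') and modifiers (after '/').
    basic = event.split(".")[0].split("/")[0]
    if basic == "C" or basic.startswith(NON_BIP_PREFIXES):
        return None
    # The first fielder number in the basic play: 1-6 infield, 7-9 outfield.
    match = re.search(r"[1-9]", basic)
    if match is None:
        return "unlocated"  # e.g. a home run recorded without a fielder
    return "infield" if int(match.group()) <= 6 else "outfield"


for ev in ["63/G", "8/F", "S7/L.1-3", "HR/F", "K", "SB2"]:
    print(ev, "->", locate_bip(ev))
```

Note that the first fielder alone cannot separate a ground-ball single that rolls to the left fielder from a line drive hit to him, so a real classification has to bring in more of the event string than this does.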
The coding I’ve done coheres. It reflects consistent pitcher tendencies to generate BIPs of various sorts: ground balls, infield pop-ups, balls hit in the air to the OF.
And those tendencies have a discernible, if modest, influence on the success of pitchers in limiting runs.
For the period covered by BIS’s data (2002-2024), the Retrosheet data suggest (pace the BIS data) that suppression of balls hit to the outfield does reduce runs allowed, even after taking account of FIPr and fielding. A pitcher in the 90th percentile in generating infield BIPs (either grounders or pop-ups) would be expected to yield about 0.15 fewer runs per game than an average MLB pitcher.
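To put that 90th-percentile figure on a more familiar scale, here is a back-of-the-envelope conversion to an effect per standard deviation. It assumes the infield-BIP tendency is roughly normally distributed across pitchers and that its effect on runs is linear. Neither assumption comes from the analysis itself, so treat the result as a rough gauge only.

```python
from statistics import NormalDist

# z-score of the 90th percentile under a standard normal (assumed shape).
z_90 = NormalDist().inv_cdf(0.90)

runs_saved_at_p90 = 0.15  # runs per game, from the text

# Implied runs saved per game for each 1 SD increase in infield-BIP tendency.
runs_per_sd = runs_saved_at_p90 / z_90

print(f"z at 90th percentile = {z_90:.2f}")
print(f"implied effect = {runs_per_sd:.3f} runs per game per SD")
```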
That’s not a meaningless impact.
Naturally, I worried that I might have coded the BIS data incorrectly. But I’ve ruled that out.
The BIS data are presented in a straightforward and easy-to-code way, and of course I double-checked everything.
But even more telling is the coherence of the data.
The BIS BIP metrics suggest consistency in pitcher tendencies: high Cronbach’s α values of 0.88 for the tendency to yield or avoid line drives and flies and of 0.89 for ground-ball-to-fly-ball ratio, based on three seasons of pitching. Consistency of that sort strongly implies the data, as coded, are measuring something real; it would be an astonishing coincidence if the data were miscoded, whether by me or by anyone at BIS or FanGraphs.
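For anyone who wants to check a reliability figure like this on their own extract, here is a minimal sketch of the standard Cronbach’s α calculation, treating each season as an "item" and each pitcher as a "respondent." The five rows of air-ball shares are hypothetical numbers made up for illustration, not BIS data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list per pitcher, one value per season (the 'items')."""
    k = len(rows[0])
    # Sample variance of each season's column across pitchers.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each pitcher's total across seasons.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# HYPOTHETICAL line-drive-plus-fly-ball shares over three seasons.
air_share = [
    [0.52, 0.50, 0.54],
    [0.60, 0.62, 0.58],
    [0.45, 0.47, 0.44],
    [0.55, 0.53, 0.56],
    [0.65, 0.63, 0.66],
]

alpha = cronbach_alpha(air_share)
print(f"Cronbach's alpha = {alpha:.2f}")
```

α approaches 1 when pitchers keep roughly the same rank from season to season, and falls toward 0 when season-to-season values are mostly noise.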
What’s more, there is a very high congruence between the pitchers shown to suppress fly balls in the BIS data and those shown to do so in the Retrosheet data. Fly-ball annihilator Brandon Webb has many of the best seasons in both sets of data. Derek Lowe, too. No way there’d be this sort of agreement if a coding error had occurred.
Indeed, I turned to the BIS data very much to determine whether I could be confident in the validity of my Retrosheet data.
I do feel pretty confident about that at this point.
But I’m mystified about why the BIS data seem to be missing the outfield-balls-in-air signal that comes through in Retrosheet.
Very mystified!
Maybe you can help dispel the anomaly? Take a look at the data for yourself!