Stephen Vogt: manager of the year–again (introducing w162p)

December 10, 2025

Xavier

Here’s one fun feature of the manager-value estimator tools featured in the new “Monkeys vs. Earl Weaver” paper: they use different instruments for measuring season-by-season performance, on the one hand, and settled value, on the other.

The Hybrid Estimator proper is for “settled value.” It uses season-by-season performances, but only as data that are then integrated via Bayesian updating to form an evolving but eventually stable evaluation of managerial acumen. Measured as wins added to the record of a .500 team under an average manager, this “w162” score reflects the manager’s “true” skill level, which in theory is the invariant innate ability the manager possesses and has brought to bear, not a cumulative measure of his impact on team performance.

But because it is also interesting to measure performance quality in particular seasons, the paper develops another instrument for that: the “provisional Hybrid Estimator.” The provisional Estimator too measures impact in terms of marginal wins for a projected .500 team under an average manager. That score is labeled “w162p”—for “provisional.”

The input for w162p, the paper explains, is akin to the season-by-season estimates that the fully matured Bayesian Hybrid Estimator uses, but with an important adjustment. Raw season-by-season estimates are noisy and often extreme because they are ignorant of the longer-term baselines that the mature Hybrid Estimator uses based on its knowledge of manager performance since the beginning of baseball history in (approximately) 300 BCE. That knowledge is passed on to the provisional estimator, which appropriately “shrinks” the season performances.
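For readers who want a feel for what “shrinking” means, here is a minimal Python sketch. It is not the paper’s actual procedure: it assumes a simple normal model in which a raw season estimate is pulled toward the average-manager baseline of 0 in proportion to how noisy that estimate is. The function name shrink_season_estimate and the variance numbers are hypothetical, chosen only for illustration.

```python
def shrink_season_estimate(raw_wins, noise_var, baseline_var, baseline_mean=0.0):
    """Pull a noisy single-season estimate toward a long-run baseline.

    raw_wins     -- unadjusted wins-added estimate for one season (hypothetical input)
    noise_var    -- assumed variance of single-season noise
    baseline_var -- assumed variance of true manager skill across history
    """
    # Weight on the season's own evidence: near 0 when the season is very
    # noisy relative to the spread of true skill, near 1 when it is not.
    weight = baseline_var / (baseline_var + noise_var)
    return baseline_mean + weight * (raw_wins - baseline_mean)


# Illustrative numbers only: a raw +8 win season with noise variance 9
# and skill variance 1 keeps just one tenth of its distance from average.
shrunk = shrink_season_estimate(8.0, noise_var=9.0, baseline_var=1.0)
print(round(shrunk, 2))  # 0.8
```

The point of the sketch is only the direction of the adjustment: the noisier a single season is relative to the historical spread of manager skill, the more modest the provisional score becomes.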

The provisional estimator scores definitely aren’t as good a gauge of true managerial ability as the Hybrid Estimator w162s. But the w162ps are definitely realistic (in part because they are appropriately modest) and reliable season over season (that is, consistent).

Here are the w162ps of the last two seasons:

Note that 3 of the top 6 are the same in both seasons. So are the top and bottom scorers.

That top scorer, Stephen Vogt, has really had an astonishingly successful first 2 seasons. His mature Estimator w162 is 2.1, which puts him in the top 3% or so of all managers ever.

Remember, the Hybrid Estimator aggregates data it acquires in season installments, and starts with the presumption that everyone is average. It ordinarily takes two or three seasons for a manager to move much off that prior, and usually many more to settle into a stable score. It took Casey Stengel, who ended up with a w162 of 2.5, twenty seasons to convince the Estimator he was at least a w162 of 2! But that’s an extreme case: it usually takes about 5 seasons for the Estimator to determine what a skipper’s “true” ability level is.
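The season-installment logic described above can be sketched in a few lines of Python. This is a generic textbook normal-normal Bayesian update, offered as an assumption about the general shape of the idea rather than as the Hybrid Estimator’s real machinery. The names update_belief and settle, and the default variances, are hypothetical.

```python
def update_belief(mean, var, season_estimate, season_noise_var):
    """One normal-normal Bayesian update; returns the new (mean, variance)."""
    precision = 1.0 / var + 1.0 / season_noise_var
    new_var = 1.0 / precision
    new_mean = new_var * (mean / var + season_estimate / season_noise_var)
    return new_mean, new_var


def settle(season_estimates, prior_var=1.0, season_noise_var=9.0):
    """Start every manager at average (0 wins added) and update once per season."""
    mean, var = 0.0, prior_var  # the "everyone is average" prior
    history = []
    for estimate in season_estimates:
        mean, var = update_belief(mean, var, estimate, season_noise_var)
        history.append(mean)
    return history


# Hypothetical manager who posts a raw +3 every year for five years.
for season, value in enumerate(settle([3.0] * 5), start=1):
    print(season, round(value, 2))
```

Under these made-up variances the running estimate is only 0.3 after one season and a little over 1 after five, which is the kind of slow drift away from the prior that the text describes.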

But no one has burst out of the gates as quickly as Vogt. Maybe he’ll stay stuck at 2? Maybe he’ll go up more. But it is unusual to go downwards once the mature Hybrid Estimator becomes convinced you’ve got the “right stuff.”

In contrast, according to the provisional estimator, Craig Counsell had a pretty meh year last season: w162p -0.2. But no way that datum will change the mind of daddy Hybrid Estimator. Having pegged Counsell as w162 2.3–10th highest in AL/NL history–based on 10 seasons of data, the mature Estimator will likely stand pat on Counsell’s value until the end of his career.

So, analytics tell us that analytics have not in fact reduced the value of managers today! It’s possible they’ve actually enhanced it!

A list of all the w162ps ever recorded—along with data and instructions on how to compute these scores—can be found in the data library.
