Explaining Away Regression To The Mean
Odds are you’ve read a story lately about how Russell Branyan is struggling as he reaches the summer of his first season as a full-time player. After a monstrous first half, he’s not hitting as well lately, and the explanations are pouring in. He’s tired. His back hurts. Pitchers are figuring him out. Managers have figured out how to shift against him and he hasn’t adjusted. If you’re looking for a reason for Branyan’s struggles, you have a buffet of choices to blame them on.
Of course, there’s a simpler explanation – it’s just natural regression to the mean.
In April, Branyan posted a .405 batting average on balls in play. In May, it was .391. These are outrageously high rates that nobody in history has been able to sustain, much less a first baseman whose hardest hit balls end up in the seats. There was basically no chance that he’d be able to continue getting balls in play to find a hole 39% of the time. We talked about this quite a bit, warning that regression was coming. A guy who strikes out as much as Branyan does can’t hit .300. It’s almost impossible.
Indeed, regression did come. In June, his BABIP was a more normal .286, right around where we’d expect Branyan’s true talent level to be, based on his skillset. His monthly line was still a good .265/.376/.590, but the batting average didn’t get inflated by balls avoiding gloves in record numbers. July, though, has been uglier – .180/.288/.426, giving rise to all the various theories for the cause of the slump.
Branyan’s BABIP in July? .200. His other, more stable numbers?
13.6% BB% in July, 12.8% BB% for the season
33% K% in July, 28.5% K% for the season
.246 ISO in July, .292 ISO for the season
His walks and strikeouts are barely up and his power is very slightly down. Over 70 plate appearances, we’re talking about basically no difference at all. And the extra strikeouts are actually just due to some coin flip calls by the home plate ump – his contact rate (69% in July) is higher than it was from April through June (67%). There’s literally nothing to worry about here – Branyan’s slump is just normal BABIP variation. He got some good bounces in April and May and he’s gotten some bad bounces in July. He’s the exact same player he was, and reacting to the results will simply lead to making a bad assumption about what’s going on.
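To put a rough number on "basically no difference at all," here is a minimal sketch that compares the July rates to the season rates. It assumes each plate appearance is an independent trial, that the season rate is the true rate, and that both percentages are measured per plate appearance. Those are simplifying assumptions of mine, not anything the numbers above guarantee, and the function names are made up for illustration.

```python
import math

def binomial_se(rate, trials):
    """Standard error of an observed rate over `trials` independent tries."""
    return math.sqrt(rate * (1 - rate) / trials)

def z_score(observed, expected, trials):
    """How many standard errors the observed rate sits from the expected one."""
    return (observed - expected) / binomial_se(expected, trials)

JULY_PA = 70  # plate appearances in July, from the post

# July rate vs. season rate, both taken from the post.
k_z = z_score(0.33, 0.285, JULY_PA)    # strikeout rate
bb_z = z_score(0.136, 0.128, JULY_PA)  # walk rate

print(f"K%  is {k_z:+.2f} standard errors from the season rate")
print(f"BB% is {bb_z:+.2f} standard errors from the season rate")
```

Under those assumptions, both July rates land within one standard error of the season rates, which is about as close to nothing as a 70 plate appearance sample can get.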
But this happens all the time. Not just with Branyan, but across the board. Remember Sean White’s struggles a few weeks ago? The local media decided it was because he was getting tired after being worked too hard for the first few months. White himself said he felt great, and had no problems, but that didn’t matter. He was giving up hits, and that meant he was running on fumes.
Sean White’s BABIP by month: .182, .182, .333 (he’s exhausted!), .125
White drastically overachieved the first two months of the season thanks to some good defense and good luck. The results started to match his talent level in June, and this was blamed on overwork. He’s been lucky again in July, but there’s no reason given for why he’s no longer tired. And remember, White claimed he felt great the entire time.
Players understand how this stuff works. Branyan was asked about why he’s slumping, and his response was basically “This stuff happens. The season is cyclical. Sometimes you run hot, sometimes you run cold.” (paraphrase because I can’t find the actual quote right now)
For whatever reason, though, people just can’t accept that there is not always a primary driving reason for a change in results. That’s why we get stuff like “so and so has changed his batting stance and is now hitting .500 for the last two weeks”, but you never hear about the new stance again after he goes back to hitting .260. Or, from a Mariner-centric point of view, you’ll hear a lot of talk about how the M’s need to keep their pitching rotation strong to keep the bullpen from regressing due to overwork.
Bad news – the bullpen is going to regress either way. Whether the M’s keep Bedard and Washburn or not, there are a bunch of relievers on this team with numbers that are unsustainable. The M’s bullpen has an ERA that is 0.69 runs lower than their FIP, and while the defense is a decent chunk of that, there’s a luck component in there too. Sean White and Chris Jakubauskas are running crazy low BABIPs. 1.8% of Aardsma’s fly balls are leaving the park. These numbers are going to regress. They have to.
And when they do, you’re going to hear explanations for why. White will be tired again. Jakubauskas will have lost the command of his fastball. Aardsma will be feeling the pressures of his first pennant race as a closer. We could write the stories right now. But, in the end, it’s just going to be simple regression to the mean, just like we saw with Branyan in June. He ran lucky for two months, had a normal Branyan month, and now is running unlucky. It doesn’t mean anything.
The sooner that we can get the world to embrace the concept of random variance, the better. Results fluctuate wildly in small samples due to uncontrollable factors. That’s just a fact of life, and when we’re forming our opinions, we need to realize just how powerful regression really is.
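If you want to see how wild small samples get, here is a quick simulation sketch. It takes a hitter whose true BABIP is .286 every single month and counts how often pure chance hands him a .200-or-worse month or a .400-or-better one. The 50 balls in play per month is my rough assumption for a high-strikeout hitter, not a number pulled from Branyan's game logs, and treating every ball in play as an independent coin flip is a simplification too.

```python
import random

TRUE_BABIP = 0.286     # from the post: about where Branyan's talent level sits
BALLS_IN_PLAY = 50     # assumption: rough monthly total, not an actual count
MONTHS = 20_000        # number of simulated months

def simulate_month_hits(rng, true_rate, balls_in_play):
    """Count hits in one month where every ball in play falls in at true_rate."""
    return sum(rng.random() < true_rate for _ in range(balls_in_play))

rng = random.Random(2009)  # fixed seed so the run is repeatable
hits_by_month = [
    simulate_month_hits(rng, TRUE_BABIP, BALLS_IN_PLAY) for _ in range(MONTHS)
]

# With 50 balls in play, 10 hits is a .200 BABIP and 20 hits is a .400 BABIP.
cold_share = sum(h <= 10 for h in hits_by_month) / MONTHS
hot_share = sum(h >= 20 for h in hits_by_month) / MONTHS

print(f"Months at .200 or worse:  {cold_share:.1%}")
print(f"Months at .400 or better: {hot_share:.1%}")
```

The simulated hitter never changes, never gets tired, and never gets figured out, yet under these assumptions both kinds of month turn up regularly. Any explanation you attach to one of those months is a story about a coin flip.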