Us

Email the authors
Meta
Reference Material

The Future Forty 3/19/09
Evaluating Defense
Evaluating Pitcher Talent
Evaluating Managers
Bugs Bunny, Greatest Banned Player Ever Selected for Best American Sports Writing!
The Attrition War
Disclaimer, Copyright

The U.S.S. Mariner is in no way affiliated with, condoned or given any notice by the Seattle Mariners baseball team, who have their own website. Similarly, we have no association with the ownership group or any businesses related to the Mariners. All article text is written by the authors, all pictures are taken by the authors, who retain copyright to their works. No copying or reproduction of any content here, photographic or otherwise, is authorized. Please email us if you wish to reproduce our work.

Small Sample Size Craziness

Dave · April 11, 2008 at 9:12 am · Filed Under Mariners

Note – none of these numbers mean anything. They have zero predictive value, and you shouldn’t draw any conclusions from them at all. Seriously, don’t believe that there’s any information in here that should change your opinion about anything. It’s just interesting to me. These numbers come from The Hardball Times and Fangraphs, by the way.

Edwin Jackson, in his two starts so far, has allowed 18 flyballs. 14 of those 18 flyballs have been infield flies. That’s a 78% IF/F rate. Last year, the major league leader in IF/F rate was Bronson Arroyo – 15.4% of his flyballs were infield flies.
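For anyone who wants to check the arithmetic, IF/F is just infield flies divided by total flyballs. Here is a quick sketch in Python; the function name `iff_rate` is made up for illustration and is not anything The Hardball Times or Fangraphs publish.

```python
def iff_rate(infield_flies: int, flyballs: int) -> float:
    """Share of flyballs that were infield flies (IF/F)."""
    if flyballs <= 0:
        raise ValueError("need at least one flyball")
    return infield_flies / flyballs


# Edwin Jackson through two starts, using the counts quoted above.
jackson = iff_rate(14, 18)
print(f"Jackson IF/F: {jackson:.1%}")

# How many times larger than Bronson Arroyo's league-leading 15.4% from last year.
arroyo_2007 = 0.154
print(f"Multiple of Arroyo's rate: {jackson / arroyo_2007:.1f}x")
```

With only 18 flyballs in the denominator, a single ball moving from one bucket to the other shifts the rate by more than five percentage points, which is the small-sample problem in miniature.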

It’s not just Edwin Jackson, either. The league average IF/F rate is 18% in the American League and 13% in the National League. While it’s almost certainly early season random variation, that AL infield fly rate is absurdly high, and is probably one of the main reasons offense is down across the league.

Fausto Carmona has thrown 13 innings, walked 9, and struck out 7 in his two starts so far. He has a 0.69 ERA. A 78% ground ball rate covers a multitude of sins.

The San Francisco Giants, as a team, are hitting .230/.276/.331. That’s a .607 OPS. As a team. Willie Bloomquist’s career OPS is .642, and in his worst season, it was .613. The San Francisco Giants, as a team, are hitting like a slumping Willie Bloomquist. That’s what you get for having a Molina hitting cleanup.
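The OPS comparison is simple addition: on-base percentage plus slugging percentage. A minimal sketch, using only the figures quoted above (the variable names are mine):

```python
def ops(obp: float, slg: float) -> float:
    """On-base plus slugging."""
    return obp + slg


# Giants team line of .230/.276/.331 (AVG/OBP/SLG); batting average is not part of OPS.
giants_ops = ops(0.276, 0.331)

# Willie Bloomquist's figures as quoted in the post.
bloomquist_career = 0.642
bloomquist_worst = 0.613

print(f"Giants OPS: {giants_ops:.3f}")
print(f"Behind Bloomquist's career mark by {bloomquist_career - giants_ops:.3f}")
print(f"Behind Bloomquist's worst season by {bloomquist_worst - giants_ops:.3f}")
```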

Speaking of the Giants, Jonathan Sanchez has the best strikeout rate in the majors through two starts. In fact, his season line of 10 IP, 10 H, 1 HR, 4 BB, and 18 K suggests that he’s been one of the more dominating starters in baseball so far. His ERA is 6.30. 6.30! Yet another reason why ERA is useless as any kind of predictor of things to come.
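To put Sanchez's line on the usual per-nine-innings scale, here is a rough sketch. It assumes "10 IP" means exactly ten full innings, and it backs the earned run total out of the ERA rather than taking it from a box score.

```python
def per_nine(count: float, innings: float) -> float:
    """Scale a counting stat to a per-nine-innings rate."""
    return 9 * count / innings


# Jonathan Sanchez's season line as quoted above (assumes exactly 10.0 innings).
innings, walks, strikeouts = 10, 4, 18
era = 6.30

k_per_9 = per_nine(strikeouts, innings)
bb_per_9 = per_nine(walks, innings)

# ERA = 9 * ER / IP, so the earned runs implied by the ERA are ERA * IP / 9.
implied_earned_runs = era * innings / 9

print(f"K/9:  {k_per_9:.1f}")
print(f"BB/9: {bb_per_9:.1f}")
print(f"Earned runs implied by a {era:.2f} ERA: {implied_earned_runs:.1f}")
```

The strikeout and walk rates describe what the pitcher did on each batter; the ERA describes how those events happened to bunch together over ten innings.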

Okay, one more reason ERA is pathetic. Steve Trachsel couldn’t be any less effective if he tried – 12 innings, 6 walks, 3 strikeouts, and a 36% GB rate. He’s not throwing strikes, missing bats, or getting groundballs. He’s doing exactly zero things that lead to sustained success. He has a 6.32 xFIP during the part of the season when teams aren’t scoring runs. His ERA? 3.00.

The average velocity on Barry Zito’s fastball in 2008 – 82.7 MPH. Seriously, Bill Bavasi should send chocolates to Brian Sabean every day for the rest of his life for outbidding him for that disaster of a contract. Zito is teetering on the edge of replacement level, and the Giants are on the hook for $18 million per season. This is the cost of not learning the lesson of the uselessness of ERA.

And, I’m off my soap box now. Go M’s.

Comments

71 Responses to “Small Sample Size Craziness”

HamNasty on April 11th, 2008 1:50 pm

50- How about we use Adam Dunn to make this easy. Adam Dunn is not having the best start to the year. You can actually look at BABIP and see it is .190 and realize he is just unlucky and will eventually start hitting again. There is no reason to think Adam Dunn lost his timing. Now if another 3 months into the season Adam Dunn is still hitting .172 you can start to look at other factors, eyesight, swing mechanics, whatever. But to look at 29 AB’s and think his swing must be broke isn’t right when you can see he is being unlucky with his balls in play.

Someone who has read The Book can help me with this, in a study done on “slumps” I am guessing 80-90% of them deal with luck rather than an inability to perform a skill.
b_rider on April 11th, 2008 1:55 pm

Someone who has read The Book can help me with this, in a study done on “slumps” I am guessing 80-90% of them deal with luck rather than an inability to perform a skill.

Sure, but “80-90% are not x” is a lot different than “there is no such thing as x”. That’s all I’m saying.
HamNasty on April 11th, 2008 2:03 pm

By X I am saying that skill is gone. As in Barry Bonds lost the skill to steal bases because his knees went bad and he got slow. He didn’t “slump” in stealing bases. He lost the ability to perform the skill.
smb on April 11th, 2008 2:03 pm

DMZ,

When you say high GB% covers a multitude of sins, do you mean that in the obvious sense, or did you mean in terms of Carmona specifically? If so, what are the holes in his game? Just curious…
HamNasty on April 11th, 2008 2:06 pm

Does anyone have any recommendations about billjamesonline.net? I know 3 dollars a month is very small, still would like to know any feedback on the site as I am looking into a subscription. As I am sure topics like this are covered.
joser on April 11th, 2008 2:20 pm

smb: Dave wrote this post, not DMZ.
Jeff Nye on April 11th, 2008 2:23 pm

I’m not saying that Burke has got to the point where he’s a better option than Johjima. But it seems to me to be theoretically possible.

I’d need a lot more than one home run to convince me of this.

I like Burke as a backup catcher, but that’s all his playing history leads me to believe he has the potential to be.
BurkeForPres on April 11th, 2008 2:23 pm

48- I couldn’t agree more. Players may go in a slump because they HAVE experienced a series of random events where they got the short end of the stick, and maybe had a couple of games hitless. The player starts to try to do too much to get a hit because he’s struggling, his mechanics go a bit out of whack, and bam, until the mechanics are back, that’s a tailor-made slump.

Same with pitching. You get a couple of calls that don’t go your way, maybe a couple of strikes are called balls, you walk the bases loaded, and it gets into your head. You try to overthrow, and instantly everything is belt high and above.

What I’m getting at here is the mental aspect of the game cannot be measured with statistics. Having played sports for the majority of my life, although obviously not at the level of professional athletes, I can say that often “slumps” or having a “hot hand” is very self propagated. Of course, if no one ever changed their mechanics at all, and never got it in their head that they need to do something different, which could be true for many players, then I could see the randomness, but that isn’t the case. I absolutely believe in “the zone” and I feel like level of focus has a large effect on performance. For instance last year Lopez had to deal with the death of his brother. His focus probably wasn’t on baseball as much as it should have been, and I believe he underperformed because of it.

Now I suppose it depends on your definition of “slump,” and exactly how much time has to pass for a player to be in a “slump.”

*shrugs*
BurkeForPres on April 11th, 2008 2:25 pm

Also

I just don’t get it, man. No one ever said: “When I was a kid, if we were going to cut off your leg we’d give you a shot of whiskey and a rope to bite down on, and we’d just take a dirty hacksaw and just hack away, outside, on the ground. Why do all these nerds keep talking about ‘anaesthesia’ and ‘sterilization?!'”

HAHAHAHAHA!
smb on April 11th, 2008 2:33 pm

joser,

Thanks for pointing it out. Bonehead mistake, sorry.
joser on April 11th, 2008 2:47 pm

I apologize in advance for the length of this:

People need to stop seeing patterns in every piece of data.

The trouble is, seeing patterns is what defines us as a species. It’s what we do. Birds fly, fish swim, and humans find patterns. More than opposable thumbs or bipedal locomotion, finding patterns is the defining human trait. It may have started as just a better way to find food and avoid predators, but it’s the thing we do better than anything else. Without patterns language is just noise. But patterns in noise became language, and so we piled pattern onto pattern, abstraction onto abstraction; patterns scratched in the sand, hammered into rock, stroked onto papyrus, turned that patterned noise into writing and broke our knowledge free from a single lifetime in a single place, giving us culture and institutional memory, institutions and constitutions, math and science, civilization. A baby can identify the faces of everyone it knows, can tell a sad face from a happy face, and discern one voice from another; eventually it figures out what those voices mean. That’s nothing but pattern recognition. We still have trouble programming computers to do that, and humans do it almost from birth.

And it never stops. Pattern recognition has worked so well for us, for so long, that we can’t give it up. It operates below the level of rationality, often below the level of consciousness. We see patterns in everything, and if there is no pattern we’ll find one anyway. We can find the landscape in a Monet, but we also see shapes in clouds and faces on Mars and the man in the moon. In the real world, governed by physical law, there’s very little that is truly random — and it is easy to tell when we found it, because we’ve historically given it supernatural agency. Lightning really is random, so of course there must be a god tossing it around — a god implies some intelligence, and with intelligent agency comes some pattern even if it is unknowable to us (and — who knows? — if we are devout, perhaps the pattern will be revealed).

So confronted with truly random data, we don’t just throw up our hands and admit it’s random, at least not until we’ve tried everything else. Numerology, astrology, the Da Vinci Code. The “hot hand.” Perhaps we no longer invoke Zeus or Athena, but we still look for explanations, for correlations, for patterns. And of course there are patterns in random data, sitting there for us to find. The coin really does fall heads-up ten times in a row sometimes. You really do sometimes get a series of good outcomes when you’re wearing your lucky socks. Sometimes your “system” for betting on craps really does pay off. It goes against everything we are as a species, against our very nature, to not take the next step, to not think we have an insight, to not extend correlation to causation and conclude it means something. There’s a pattern there, right? Patterns always mean something. It can’t just be noise, an empty signal, a permanently blank spot on the map. There lies monsters.
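The coin example is easy to put numbers on. Here is a small sketch, assuming a fair coin and independent flips; `prob_streak` is a hypothetical helper written for this illustration. It computes the chance of seeing at least one run of ten straight heads somewhere in a longer sequence of flips.

```python
def prob_streak(n_flips: int, streak: int, p_heads: float = 0.5) -> float:
    """Probability of at least one run of `streak` heads in `n_flips` flips."""
    # state[j] = probability that the current run of heads has length j
    # and no run of length `streak` has been completed yet.
    state = [0.0] * streak
    state[0] = 1.0
    hit = 0.0  # probability that a full streak has already occurred
    for _ in range(n_flips):
        new_state = [0.0] * streak
        for run_length, prob in enumerate(state):
            # Tails resets the current run to zero.
            new_state[0] += prob * (1 - p_heads)
            if run_length + 1 == streak:
                # Heads completes the streak.
                hit += prob * p_heads
            else:
                # Heads extends the current run by one.
                new_state[run_length + 1] += prob * p_heads
        state = new_state
    return hit


for flips in (10, 100, 1000):
    print(f"{flips:>5} flips: P(ten heads in a row) = {prob_streak(flips, 10):.4f}")
```

Ten flips almost never produce the streak, but the probability keeps climbing as the sequence gets longer. Flip long enough and the "pattern" shows up without anything causing it.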

It takes stern stuff, and a cold firm grasp of rationality (and a denial of millions of years of our primate selves) to step back, wave our hands, and say “there’s nothing there.”

I think anyone who’s played little league understands that good performance does not lead to positive outcomes, and that dumb luck can make almost anyone a hero. It baffles me how mainstream announcers and analysts – even ex-players – don’t acknowledge this randomness more often. Isn’t it amazing how “professional hitters” always seem to receive so much credit for their seeing-eye or bloop singles?

Because people don’t want to think it is dumb luck, bad or good. People want to have control over their destinies. It’s why people have lucky charms, or habits, or superstitions. It’s why people pray. When the tornado takes out one house and not another, people don’t want to chalk it up to non-linear atmospheric physics and chaos theory and simple bad luck. It’s “god’s will” (with the implication that if they had done something differently, god would’ve willed otherwise, or if they change their ways going forward, god will smile on them in the future). Baseball, it has been said, is a profoundly humbling pursuit, where even the best players fail more often than they succeed. It takes incredible ego, and faith in one’s abilities, to continue in the face of that. So when a player does succeed, he doesn’t want to ascribe any part of it to luck. It may be better to be lucky than good, but we want to be fans of a good team, not a lucky one, and players think that “good” is under their control and know that “lucky” is not. Nobody wants to believe that some portion of their success is random, that a few bad bounces are all that separate them from the next guy, the also-ran. And it’s true, over the course of a season, of a career, talent does win out. But in any given game, on any given play, there’s an element of randomness, an element of luck. But it’s against our nature to see it, and we certainly don’t want to admit it.
Jeff Nye on April 11th, 2008 2:53 pm

Good post, joser, despite being long.

We all want to think we can understand and attribute causes to everything, and it’s difficult as a species for us to get our heads around the concept of randomness.
BurkeForPres on April 11th, 2008 3:25 pm

Awesome post joser. The content definitely justified the length.
Mike Snow on April 11th, 2008 3:31 pm

Because people don’t want to think it is dumb luck, bad or good. People want to have control over their destinies. It’s why people have lucky charms, or habits, or superstitions.

And baseball players are notoriously superstitious.
JMHawkins on April 11th, 2008 3:47 pm

But let me say that what looks like randomness or probability at a distance might in reality be determined by something on the smaller scale.

Studies have been done on this. There is such a thing as “being hot”, but it’s too small to warrant any attention. A player’s performance over the past three years is a far better predictor of how he will perform over the next few games than his performance over the last few games. A hitter on a “hot streak” might expect his OBP to be 3 or 4 points higher than his “normal” average.

So, if Brad Wilkerson, based on his career, is expected to have an OBP of .363 and is on a cold streak, and Willie Bloomquist, based on his career, is expected to have an OBP of .313, then if Wilks is cold and Willie is hot, Cold Brad is still likely to be 40 points better than hot Willie.
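The arithmetic behind that, as a rough sketch. It assumes a cold streak costs a hitter the same 4 points that a hot streak adds (the top of the range above); that symmetry is an assumption made for the example, not something the studies are quoted as saying here.

```python
# Career-based OBP expectations quoted above.
wilkerson_obp = 0.363
bloomquist_obp = 0.313

# Assumed size of a streak effect: 4 points of OBP, applied symmetrically.
streak_effect = 0.004

cold_brad = wilkerson_obp - streak_effect
hot_willie = bloomquist_obp + streak_effect

# Express the remaining gap in "points" (thousandths of OBP).
gap_points = round((cold_brad - hot_willie) * 1000)

print(f"Cold Brad: {cold_brad:.3f}")
print(f"Hot Willie: {hot_willie:.3f}")
print(f"Gap: {gap_points} points")
```

Even with the streak effect pushed as far as it will go in both directions, it only eats a few points of a 50-point gap in career expectations.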

Yes, there’s predictive value in hot streaks. It still doesn’t matter. Play your best players.
rea on April 11th, 2008 5:04 pm

Every player is bound to get a hit at his next at bat, because all players are either hot or due.
Typical Idiot Fan on April 11th, 2008 6:00 pm

That Jim Armstrong article has to be an intentional troll / satirical piece. There’s no way somebody like that still exists.
Steve T on April 11th, 2008 6:44 pm

Another thing that’s hard for old-school fans to grasp is this: if it looks random, it IS random. If it is indistinguishable from what a random toss would suggest, given an underlying rate of probability and so on, IT IS RANDOM. Even if it’s accomplished with mega effort.

Or, if that’s too painful to accept, then think of it this way: it adds nothing to the understanding to think about the non-random aspect. Nor anything to prediction of the future, which is even tougher.
Steve T on April 11th, 2008 6:45 pm

@67 — I would bet that about 90% of all the fans in the ballpark on any given day would agree with him 100% — which is why that sort of thing gets published.
Graham on April 12th, 2008 3:10 am

We might have to put them in their place. Know your role, LL!

Little does Dave know of our master plan to infiltrate USSM by posing as volunteer moderators. Heh heh heh.
joser on April 12th, 2008 7:00 pm

Well, now that LL is flush with corporate money, they should be able to do more than infiltrate.

Of course, they may not bother now that they’re watching M’s games from their solid platinum hot tubs filled with champagne and endangered salmon.
