Baseball Hacks
I’d recommend Baseball Hacks to anyone who has ever hung around here (or other baseball analysis sites) and thought “I wish I could get detailed stats like those” but didn’t know where to start. If you want not just to digest baseball research but check it and tinker with it yourself, and you’re willing to get your hands dirty, this is your book. And the dirtier you’re willing to get, the more you can get out of it.
Here’s my quick-and-dirty summary of the book:
Chapter 1, Basics of Baseball.
Baseball information is on the internet! Whee!
Chapter 2, Baseball Games from Past Years
This is good stuff: getting yourself databases with all kinds of past game stats, hooking them up, querying them… and this is where we start to get into the real work, as Perl makes an appearance. Still, it’s almost all database-and-SQL stuff, and it isn’t that heavy – if you’re not scared of the word ‘database’ you’ll be fine.
Chapter 3, Stats from the Current Season
Noooow it starts to get heavy. Hack 25 is “Spider Baseball Sites for Data” for instance. Soon it’s into building and keeping current year stats updated.
Chapter 4, Visualize Baseball Statistics
This is cool stuff, and instead of being programming/technical heavy, it’s much more into statistical analysis and visualization.
Chapter 5, Formulas
How to calculate a bunch of stats.
Chapter 6, Sabermetric Thinking
This is where you’d think things get interesting, and that’s kinda true. Here it’s about how to use the data you’re getting to look for good stuff. I disagree with how he goes about some of it (Hack 64, on clutch hitting, specifically) but it is good to see what kind of things the data can offer you.
Then there’s some fantasy stuff, which I’m sure would be great if you were interested in using your newfound data to try and find some crazy advantage. I skipped it, because that’s not me at all. And really, when Baseball Prospectus has a pretty good budgeting-and-forecasting thing, it seems a little pointless.
So what can this all get you? If you’re interested in historical baseball stats, and know or are willing to learn a little bit about databases, it’s a nice walkthrough from getting a freely available database of historical stats (The Baseball Archive) to setting it up nicely so you can do cool stuff. From there, well… even I don’t get into the kind of data-scraping that’s in here: I’d rather put up with ESPN’s ads and use their splits, or build it out of Retrosheet box scores the hard way, or whatever. And I’m fairly technical and willing to tinker with this stuff. Some of the more advanced stuff seems geared towards someone with fair technical skills who wants to tinker with both baseball data and with building their own framework, rather than someone looking to get started in baseball analysis.
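To give a flavor of the "do cool stuff" part, here is a minimal sketch of the kind of query you end up writing once a stats database is loaded. It is not taken from the book: the `batting` table, its columns, the made-up player rows, and the `batting_leaders` helper are all assumptions for illustration, and it uses Python's built-in `sqlite3` with an in-memory database rather than whatever setup the book walks you through.

```python
import sqlite3

# Hypothetical stand-in for a historical batting table; a real database
# would have far more columns and rows than this toy example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batting (player TEXT, year INTEGER, ab INTEGER, h INTEGER)"
)
conn.executemany(
    "INSERT INTO batting VALUES (?, ?, ?, ?)",
    [
        ("Player A", 2004, 500, 150),  # made-up rows, not real stats
        ("Player B", 2004, 400, 100),
        ("Player C", 2004, 20, 10),    # too few at-bats to qualify
    ],
)


def batting_leaders(conn, year, min_ab=100):
    """Return (player, batting average) pairs for one season, best first."""
    query = """
        SELECT player, ROUND(CAST(h AS REAL) / ab, 3) AS ba
        FROM batting
        WHERE year = ? AND ab >= ?
        ORDER BY ba DESC
    """
    return conn.execute(query, (year, min_ab)).fetchall()


leaders = batting_leaders(conn, 2004)
for player, ba in leaders:
    print(player, ba)
```

The at-bat minimum is there so a 10-for-20 cup of coffee doesn't top the list, which is the first thing you trip over when you start pawing around raw stats.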
I will say that there’s a lot of value in having access to even a nice historical database of raw stats: I find myself pawing around it all the time, looking for interesting stuff that ends up a throwaway reference in a piece here.
So this is a book where if you’re looking to get a lot more technical and want to do a lot more research independently, you’re going to dig it. If you’d just like to be able to do baseball-reference-y things, that part’s fairly easy too, and I’ve found it quite rewarding.
However, it’s not about baseball, or really about baseball statistics, or anything. It’s about (as you’d guess from the title) using computers and freely available data to hack stuff together.
Anyway, I hope this helps determine whether it’ll be a good book for you or not. Check it out if that sounds interesting.
AJ the joking gamer
From ESPN: in a long article about AJ Pierzynski, who, it turns out, is just misunderstood:
Two hours earlier, the same man stands behind home plate during a team workout. A ball just dropped in front of outfielder Jermaine Dye, and the catcher won’t let his friend hear the end of it. “Don’t you know how to run?” he yells to Dye. “We run here.”
Dye mumbles his rebuttal, but the catcher has one of his own. “What’s that Jermaine? I’m sorry — we don’t speak Ebonics,” he says jokingly. “I can’t understand you.”
How, exactly, is it a joke to say that to a black teammate?
Wednesday Mariner update
PI: M’s beat the Cubs. Sexson wants to hit well. M’s going to make first cuts on Tuesday.
TNT: Wladimir Balentien and his strikeouts. Includes some comments from the team’s new hitting coach, Pentland:
“I’m not as worried about strikeouts as some people. A lot of young Latin players know they have to hit just to get a shot at playing professionally, so they don’t take a lot of pitches,” Mariners batting coach Jeff Pentland said.
“There are a lot of factors in striking out in the minors – including young umpires. If you’re a young hitter working on discipline and you take a pitch off the plate and it’s called strike one, that doesn’t help.
“You put too much focus on the strikeout, you make it harder for a young hitter because now he’s getting away from his natural skills. You have to let experience come, and you work on their timing, their swing, their pitch recognition.”
The TNT also put up a story about this one Mariner outfielder from Australia yesterday after the day-in-news post which is, uh, worth reading. Cough.
Times: If you’ve been dying for a Dave Burba story, well, you’re now sated.
Update: aaand we get the first sighting of the team commercials. I think the Ichiro one’s clearly pick of the litter.
Tuesday results and ramblings
So Jeff and I showed up to laze about Jonah’s Baseball Prospectus 2006 (and Baseball Between the Numbers, which I reviewed here) book reading/signing last night. I had to keep slinking off during the first part due to pre-event beverage intake, but Jonah’s a personable guy and everyone loves him so everyone loved him. He managed to plug the poker book he co-wrote, and I managed to plug Bugs Bunny, so everything works out. Jeff plugged Okinawa. He’s a big Okinawa supporter.
In most exciting news, USSM is now the only Mariners site of any kind to feature the writings of a Fulbright Scholar, our own Jeff Shaw. Clap clap clap clap. He rocks. Remember when you write in to congratulate him: only one “l” in Fulbright.
Oh yeah, the M’s.
They lost to the Angels (Times). Over in the PI, Thornton must improve his control. Either there’s increased competition for spots or the team contract wonks discovered an option year hidden somewhere. Also, Johjima threw a runner out. It’s going to be great to have a catcher that can hit some and field his position.
Which reminds me. Having hashed on some parts of this off-season at great length, I want to be clear that there are some things I am genuinely enthusiastic about: Rafael Soriano, Felix, Johjima, Lopez, Betancourt, of course I’m a huge Ichiro! fan, and the team could put together a really good, young bullpen that would be a lot of fun to watch.
USSM bracket pool
Already filled out 182 brackets today? Make it 183, and join the USSMariner blog pool.
Group Name: USSMariner
Password: wfb
No fees, no prizes, just a bunch of baseball geeks pretending like we know basketball too.
Monday Marinerathon
PI: “Soriano shows his stuff in ninth inning”. Soriano’s one of the storylines I’m really excited about this year. I’d also like to point out, for people who continue to say that he’s got only two pitches: a) no and b) we don’t know if he can throw all three for strikes against major leaguers yet. Don’t bet against it.
TNT: Felix had a nice outing. In the Times, Stone writes that both Felix and Appier were notable.
And, uh, if you missed it, “Bugs Bunny, greatest banned player ever”. Or you could page down a bit.
For those of you with nothing to do tonight, Jeff and I are going to go heckle Jonah at Third Place Books in Bothell at a Baseball Prospectus 2006 event. That’s:
Third Place Books
17171 Bothell Way NE
Lake Forest Park, WA 98155
Daniel Bard and Andrew Miller
One of the joys of living 2,500 miles from Safeco Field is the chance to see top-flight college baseball. Wake Forest University is a stone’s throw (okay, if Ichiro’s throwing it) from my house, and Wake just happened to be hosting the University of North Carolina this weekend, kicking off the ACC season by bringing the #4 team in the country to my backyard.
UNC is led by two of the best starting pitchers in the country, RHP Daniel Bard and LHP Andrew Miller. Both have been in the national spotlight since their senior year in high school, and not much has changed in the past three years. The 2006 draft is headlined by a strong group of college pitchers, and no team in the country boasts a better pair than North Carolina. Over the weekend, I had the opportunity to watch both and compose some thoughts on a pair of players who should both be rich men this summer.
Saturday was Bard’s turn in the rotation. While he’s a legitimate prospect in his own right, he has played second fiddle to Miller throughout his career. I liked the fact that I got to see Bard before Miller, giving me a better chance to evaluate him on his own merits rather than comparing him to his more heavily hyped teammate.
Bard is listed at 6’4, 202 lbs, but I wouldn’t be surprised if the height was fudged by an inch or two. He’s not a big kid, but he’s tall enough to overcome the short pitcher stigma. He throws from a 3/4 slot with solid leg drive and okay mechanics. There’s some unnecessary head movement and his release points weren’t consistent, but he’s in college, so that’s to be expected. There wasn’t anything in his delivery that isn’t fixable, and he’s got the foundation of good enough mechanics.
He came out in the first inning pumping gas. 96, 97, 95, 96, 96, 96, 97, 97. Just a steady diet of four-seam fastballs. He clearly believes in the “establish your fastball” mantra. His command was shaky, mostly due to the aforementioned issues with his release point. He missed away a lot, and he appeared to be overthrowing. After a hit batter, he settled down and started blowing the ball past hitters, including Wake’s star third baseman Matt Antonelli. He busted out a slider that had some diving movement but wasn’t located particularly well. In college, though, an 84 MPH slider with movement after a 97 MPH fastball is good enough to miss bats, and Wake’s hitters were clearly overmatched.
He stuck with fastballs and sliders in the second inning as well, and not long after I mentioned to a friend that he’d have to show a third pitch eventually to show the scouts something, he broke out the curveball. It needs work. It doesn’t spin tight, and he hung a good percentage of them up in the zone. The slider is clearly his go-to breaking ball, and the curve is to show a different look. On the plus side, he did a good job of keeping his arm slot the same on both the slider and the curve, which is a problem for many kids.
His command continued to come and go, but it didn’t really matter. Wake wasn’t going to hit him and he knew it. He fired more 96 MPH fastballs by the weak hitters in Wake’s lineup (and there are some really weak hitters there) and mixed in the slider for the punchouts. The rains started in the fourth inning and pretty much stuck around the rest of the game, but he did well pitching through it and throwing strikes for the most part. He ended up hitting 3 batters, walking 2, and throwing a wild pitch, but you can get away with that when you only give up one hit. Box Score is here, if you’re interested.
Sunday was Miller Time. Come on, you knew the joke was coming at some point. This stuff writes itself.
The reports I’d read on Miller basically made him sound like a typical raw flamethrower; 6’6, mid-90s fastball, control and secondary pitches need work. Chapel Hill’s own Matt Thornton, basically. So, going in, that’s what I was expecting to see.
Apparently, Andrew Miller is tired of hearing it, because he was pretty much the anti-Matt Thornton. He’s tall, yes, but not super lanky, and his delivery is actually a bit lower than 3/4. I’d call it 5/8, but it’s not exactly that either. He doesn’t drop down, but the arm comes out from his body, and his release is certainly in the left-handed batter’s box. He’s going to be murder on lefties with that release point.
Like Bard, he came out throwing fastballs, but unlike Bard, they were all two seamers. 91, 92, 91, 88, 92, 90, 87. His command was off as well, hitting the second batter of the game and walking Antonelli to put a couple men on. So, he busted out a top-down slider that is just pretty much unfair. Coming from his arm slot, it bores in on right handed hitters while having the bottom fall out, and ends up forcing an awful lot of fisted foul balls. He wasn’t using it as a knockout pitch, but it clearly could be.
As the game wore on, he worked in a few four seam fastballs, hitting 93 a couple times, 94 once, and 95 once, but mainly stuck to the two seam variety, getting a ton of choppers up the middle. While the box score won’t show it, he was a groundball machine. There was a lot of weak contact. The first hit off him was a slow roller (struck by the left fielder, who came into the game hitting .147 with aluminum bats. I hope he’s going to class) that went about 40 feet up the line and died for an RBI infield single.
Again like Bard, Miller clearly knew that Wake’s hitters weren’t going to be able to touch him, and he just focused on inducing contact and letting them get themselves out. While the DIPS theory has gained momentum at the major league level, it’s clearly not true in college. You watch guys like Andrew Miller knock the bat out of a kid’s hands and you know that he had everything to do with the weak ground ball.
Miller’s two seam fastball was impressive, his slider lethal, and he varied the speed on his fastballs enough to keep hitters off balance even without a change-up. His command wasn’t great, but he’s clearly not Matt Thornton, or anything like a raw fireballer just getting by on velocity. This kid can pitch.
In the end, Bard and Miller lived up to the hype, pitching 14 innings and allowing only an unearned run (seriously, this run was unearned – two errors and the aforementioned 40 foot single) while just outclassing Wake Forest’s hitters. This wasn’t a competition as much as it was a showcase of superior talent. Wake’s not a great college team, but I’m not sure it would have mattered.
Bard and Miller are vastly different animals. Bard looked like the velocity guy who lights up the radar gun, consistently hitting 97 and showing a good enough slider to miss a lot of bats. Despite the advance reports, however, Miller’s not a project getting by on arm strength; he’s got a variety of weapons at his disposal and he showed the better idea of how to pitch.
Both have a ways to go; they aren’t polished, major league ready pitchers. But they aren’t supposed to be; they are starting their junior year in college, and there is enough there to like to see why major league clubs are getting excited.
They’re going to be lumped into the same conversation quite a bit this year. You’ll hear Bard and Miller become a phrase much like Laverne and Shirley or Bert and Ernie, but in the end, they’re going to be separated by the draft. At some point, teams are going to have to decide whether they prefer the right-hander with velocity or the left-hander with movement. I liked Miller’s package quite a bit more than Bard’s, but I wouldn’t cry if the Mariners selected Daniel Bard with the number five pick in the draft, either.
Bugs Bunny, greatest banned player ever
Update 2/21: I found out today from Glenn Stout, the series editor, this has been selected for the 2007 Best American Sportswriting annual! Woo! This makes USSM the first blog and the first non-ESPN.com/Slate site to be so honored.
With the DVD release of "Looney Tunes Golden Collection," it is at last possible for us to examine in detail one of the most famous baseball games ever played, and see what lessons the contest holds for the analytical community.
"Baseball Bugs" (1946) depicts a game held at the Polo Grounds. No date is given, but artifacts shown such as public address equipment and advertisements ("Filboid Studge," "Nox, 2 for 25," "Manza Champagne") definitively place it during the 1946 season. The visiting Gas House Gorillas are playing against the home team, the Teatotallers. It is a day game and conditions are good.
The first view of the scoreboard shows the Gorillas at 94 runs (10-28-16-40) after the first four innings. This appears to be footage inserted out of order, as we’ll determine later the score then was not 94-0 but rather 54-0. While obviously neither team was a major league affiliate and it is almost certain that the game played is an exhibition, the score is already notable. The total of 54 runs was far more than the previous all-time run scoring record for a team in a game (held by the Chicago Colts, who scored 36 against Louisville in a game on June 29th, 1897), and the score of forty runs in an inning would be significantly above the most runs scored in any inning by one team (18, by the Chicago Colts in the 7th inning on April 14th, 1883).
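For anyone who wants to check the scoreboard math, here is a quick sketch in Python. The line score and the two records are the figures quoted above; the variable names are mine and purely illustrative.

```python
# Gorillas' runs by inning, as read off the scoreboard (10-28-16-40).
gorilla_innings = [10, 28, 16, 40]

# Records as cited in the text, used here only for comparison.
record_runs_in_game = 36    # single-game team record quoted above
record_runs_in_inning = 18  # single-inning team record quoted above

total_after_four = sum(gorilla_innings)       # all four innings shown
total_after_three = sum(gorilla_innings[:3])  # before the 40-run fourth
biggest_inning = max(gorilla_innings)

print(total_after_four)                          # 94
print(total_after_three)                         # 54
print(total_after_three - record_runs_in_game)   # 18
print(biggest_inning - record_runs_in_inning)    # 22
```

So the 54-0 figure is the score through three innings, and it already clears the quoted single-game record by 18 runs before the forty-run fourth is counted.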
The stadium is entirely filled, and as we know that the Polo Grounds could hold 55,000 fans in that year’s configuration, it is fair to assume that this was a game of some note, and that the players participating were extremely popular.
We open to see "a screaming liner" hit by the home team. The outcome of the hit is not defined, and the hit itself seems an indicator that the game was not official: the ball appears to be a shade of grey, and makes an almost-human screaming noise as it travels, neither of which was normal behavior for a regulation baseball in play. Since the balls used in the remainder of the game are white, and since we also see that the Teatotallers are a horrible offensive team, it is reasonable to conclude that this footage is from some kind of pre-game hitting contest, or perhaps an entirely different game.
The initial comparison of the teams’ players offers a startling contrast, as well as a further confirmation that this is not an official game. In 1946, baseball was in transition. During the first half of the decade, as the equipment and personnel needs of the war took precedence, baseball had become a slap-and-dash game, characterized by little hitting and little power, but with many stolen bases. After the war’s end, with returning players came plate discipline and power hitting, and almost all of the wartime players were quickly forced out.
This is obvious even in the first shot of the Gorillas pitcher as compared to the Teatotaller. Both wear uniforms without a team name, number, or other identifying characteristics, but they otherwise could not be more different:

Illustration 1: A visual comparison of players
I have summarized major differences in Table 1.
| Characteristic | Gorilla pitcher | Teatotaller batter |
| --- | --- | --- |
| Height | Over 6’ | Approx. 5’5" |
| Weight | Over 220 lbs | Under 125 lbs |
| Uniform colors | Dark grey and blue | Light grey and red |
| Eyeglasses | No | Yes |
| Grey hair | No | Yes |
| Illegally ragged uniform | Yes | No |
| Visible facial hair | Stubble | Sideburns |
| Slouching | Yes | No |
| Smoking cigar while playing | Yes | No |
Table 1: A comparison of characteristics of players
Weekend Mariner newsathon
A couple of nice articles in the News Tribune: Lawton’s steroid use, Sexson’s still working on his swing despite a hot start, Bryan Price is working out in Arizona, and the good news on Meche is the strain’s not so bad. I’m not sure what bad news would have been. Just kidding, Gil, just kidding.
From the Times: Mariner pitchers getting hammered. Also, Mateo’s left camp because of a death in the family.
Deja Vu All Over Again
Ah, memories.
First, there’s me remembering what it’s like to post. Second, I seem to recall doing a book signing a few months back with Jonah Keri (for a book I didn’t write, which is kind of cool, like eating nothing but dessert).
Double your pleasure, double your fun, I always say. This Monday Jonah and I are back for a second go-round, and Jonah’s doubling up on the book promotion. He’ll be signing copies of both Baseball Prospectus 2006 and Baseball Between The Numbers. I don’t know if it’s possible to have twice as much fun as I had last time — maybe if Derek shows up.
Third Place Books
Monday March 13, 7 p.m.
I’ll even have a special in-person only, non-baseball related announcement that will excite at least two of the likely attendees.
Appetite whetted? Then let’s have dessert.
