Fielding statistics and defense

DMZ · May 7, 2008 at 3:00 pm · Filed Under General baseball 

I’ve been thinking about defense with the team’s recent woes. Dave wrote a large article on evaluating defense a while back that stands up nicely, and I came across an interesting post randomly that I thought I’d pass along: “Comparison of Fielding Statistics” which compares 2006 data from six different stats and comes to some interesting conclusions about their utility.

I have some quibbles with the piece’s logic in places, specifically the comparison of stat “features” that leads to this conclusion:

So, based on that table, I would have to say that UZR and PMR have the best methodologies, with a nod to the Fans data because they can provide such unique insights into player skill.

The problem is that this doesn’t at all evaluate methodologies. If I came up with a defensive metric called Random Runs that claimed to be built on hit-location data, zones, ball type, batter handedness, ballpark-adjusted, and player skill types, and I did all of those things horribly, that’s not a better system than something that does fewer things the right way, even though you’d check off those boxes.

The particularly interesting thing is the easy-to-scan graphs of system-to-system results. It’s interesting to see that in the 2006 data, the correlation is both highly significant and not anywhere near as good as you see from offensive contribution measures.
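
For anyone who wants to try this kind of system-to-system comparison at home, here is a minimal sketch of the underlying calculation: a Pearson correlation between two systems' runs-saved ratings for the same players. The ratings and the names `system_a` and `system_b` are made up for illustration. They are not taken from the linked piece, and I'm not claiming this is exactly how that piece computed its numbers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and each sequence's sum of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical runs-saved ratings for the same six players in two systems.
system_a = [12, 5, -3, -8, 0, 7]
system_b = [8, 9, -6, -2, -4, 3]

r = pearson(system_a, system_b)
print(f"r = {r:.2f}")
```

A value near 1 means the two systems rank and space players almost identically. Values well short of that are the "significant but not great" agreement described above.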

It all goes to reinforce something I’ve been saying for years — recognize that defensive tools are still pretty rough, but looking at a couple of them you’ll be able to get a pretty good idea of how good a particular player is with the glove.
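
As a rough sketch of what "looking at a couple of them" could mean in practice, the snippet below averages several systems' runs-saved estimates for one player and attaches an error bar. The five-run default is the margin suggested in the comments below, used here as an assumption rather than something derived from data. The function name `consensus` and every number are hypothetical.

```python
def consensus(ratings, margin=5.0):
    """Average several systems' runs-saved estimates and attach a +/- margin.

    `ratings` maps a system name to its runs-saved estimate for one player.
    `margin` is an assumed error bar in runs, not a measured one.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    values = list(ratings.values())
    mean = sum(values) / len(values)
    # The spread shows how much the systems disagree with each other.
    spread = max(values) - min(values)
    return {"mean": mean, "low": mean - margin, "high": mean + margin, "spread": spread}


# Made-up numbers: one player the systems agree on, one they do not.
agree = consensus({"UZR": 9.0, "PMR": 12.0, "Plus/Minus": 6.0})
disagree = consensus({"UZR": -20.0, "PMR": 15.0, "Plus/Minus": 8.0})

print(agree)
print(disagree)
```

The spread is reported next to the mean on purpose: when the systems disagree wildly, the average lands near zero, and the spread is the warning that the average alone shouldn't be trusted much.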

Comments

28 Responses to “Fielding statistics and defense”

  1. ghug on May 7th, 2008 3:12 pm

    Interesting.

  2. lailaihei on May 7th, 2008 3:19 pm

    It all goes to reinforce something I’ve been saying for years — recognize that defensive tools are still pretty rough, but looking at a couple of them you’ll be able to get a pretty good idea of how good a particular player is with the glove.

    True, although team defense is a lot easier to evaluate with % of balls in play turned into outs, which is still better than going through each player individually and trying to figure out who is hurting the team and how much.

  3. murphy_dog on May 7th, 2008 3:28 pm

    The defensive stats would need to be detailed to the point where it indicates the speed with which the ball came off the bat; arc of the ball leaving the plate; its expected coordinate of landing (X,Y) as well as the coordinates of the defensive players involved at the time of the hit (not at the pitch, but the hit since players are moving at the time of the pitch). If you could successfully measure all of that information on every ball put into play, you’d be able to determine what range a player really has, just in getting to the ball. But that’s just the first piece. Then you’d have to be able to determine the speed of the throw, was the throw to the optimal base (did you get the lead runner or not), was there a delay while the fielder checked a runner at third before going to 1st, timing for the fielding of the ball and preparing to throw, and so on. Speed of the batter and other runners would also come into play, what was the coordinate of the runner at the time of the pitch/contact with the bat, and on and on.

    Until then, we’ll have to live with some people thinking Derek Jeter is/was a great defensive shortstop.

  4. Mike Snow on May 7th, 2008 3:35 pm

    Another good point there is the importance of using stats derived from both sources, STATS and BIS.

  5. Dave on May 7th, 2008 3:37 pm

    If you look at UZR, PMR, and Plus/Minus, average the three, and give yourself a margin of error of +/- five runs, you’ll be fine. They’re not perfect, but when viewed through a prism, they’re just fine.

  6. jinaz on May 7th, 2008 3:38 pm

    Hi David,

    Thanks for the link.

    In response to your critique, you’re right that there is an unstated assumption that all of those systems are not making substantial logical errors in their implementation of each of those criteria. In my previous piece, which is linked from the page you linked, I walked through the logic of most of those systems in some detail, so I think this is a defensible assumption. Once you accept that assumption, the argument is simply that more information will generally be better (or at least not worse)–and that shouldn’t be controversial.

    Nevertheless, my primary intention with that chart was not so much to provide a mechanism of evaluation of the various systems, per se, but rather to provide a quick reference of the key differences between the various systems I was about to compare quantitatively. I think it does that quite well.
    -j

  7. jinaz on May 7th, 2008 3:43 pm

    Another good point there is the importance of using stats derived from both sources, STATS and BIS.

    I think that’s one of the key findings of this and other studies comparing fielding studies (Michael Humphreys’ 3-part series last year found the same thing, and if anything was more convincing). And it’s something that’s not always widely appreciated.

    Another key point is that the Fans’ Scouting Report holds up quite well vs. the objective measures despite being based on a radically different sort of data. That’s another thing that isn’t always appreciated.
    -j

  8. murphy_dog on May 7th, 2008 3:45 pm

    For now, I’ll just assume that anyone who wins a Gold Glove, isn’t likely to be the best defensive player at his position; but rather had a great year at the plate, or made a few highlight reel plays (does Ichiro really win a GG as a rookie if he doesn’t make the throw in Oakland?).

    Jim Edmonds might be the best example of this: the guy was always out of position, playing too shallow, and had to make fantastic grabs running down balls that most CF’ers would have played on three steps.

  9. jinaz on May 7th, 2008 3:55 pm

    For now, I’ll just assume that anyone who wins a Gold Glove, isn’t likely to be the best defensive player at his position; but rather had a great year at the plate, or made a few highlight reel plays.

    What?! You don’t think Rafael Palmeiro deserved a Gold Glove in 1999 when he played all of 28 games at first base?! The nerve! 🙂
    -j

  10. Steve T on May 7th, 2008 4:04 pm

    I think Dave’s “give yourself a margin of error” point is the most important one. There’s a good chance there’s no such thing as a perfect defensive stat; not everything is knowable. But you can get reasonably close, and reasonably close is usually good enough. It’s the same with hitting stats or anything else, really; people pretend that a player with a VORP of 28.1 is “better” than a player with a VORP of 27.9, but it’s not really a useful distinction. 28.1 vs. 11.5 (or 73.5) is.

    I do have another rough and ready defensive guide I use — it’s not as accurate as UZR, but it’s a lot easier to figure, and you don’t even need a computer nearby. Just look at the uniform — if it says “Mariners” on the front, probably not so good, plus or minus 1.5 Beltres.

  11. RoninX on May 7th, 2008 4:10 pm

    Just look at the uniform — if it says “Mariners” on the front, probably not so good, plus or minus 1.5 Beltres.

    Even a sad panda cannot deny the truth of this statement (though Ichiro! might).

  12. DMZ on May 7th, 2008 4:10 pm

    Hi David,

    I’m Derek. He’s Dave.

  13. jinaz on May 7th, 2008 4:14 pm

    I’m Derek. He’s Dave.

    Sorry–you’d think I’d know that, given that I’ve read your book and subscribe to your blog. -j

  14. DMZ on May 7th, 2008 4:18 pm

    Woo hoo!

  15. Dave on May 7th, 2008 4:33 pm

    We can subscribe? Why didn’t I know this?

  16. Jeff Nye on May 7th, 2008 4:39 pm

    You got grandfathered in under the old subscription rate, I think.

    If not, I’ll be happy to send you a bill.

  17. galaxieboi on May 7th, 2008 4:39 pm

    We can subscribe? Why didn’t I know this?

    Derek didn’t tell you? Huh. We all got emails the last few days with a Paypal link in it. Weird.

  18. marc w on May 7th, 2008 4:43 pm

    15 – you mean you don’t charge a monthly fee to every reader? Deerrreekk!!!!

    Jinaz – thanks, this is great stuff. I caught the Humphreys piece in THT, but missed this somehow. Nice work!

    To me, Jinaz’ study reinforces some skepticism about defensive metrics, especially in certain positions (i.e. outfield; this was discussed more in Michael Humphreys’ article I think). The variance between the two data sources often results in the kind of weirdness we see with Ichiro – UZR thinks he’s the worst in the league, RZR or PMR think he’s good. It’s nice when they all line up to a degree, and then you can use a rough average (along with the error bars Dave talked about), but for a lot of players you just can’t.
    To me, that’s NOTHING like what we have w/offensive metrics. wOBA/OPS/GPA whatever – the different weights applied to different skills will result in Player A coming out on top in one metric and not in another. But there’s nothing *approaching* the situation we have here, in which the metrics, collectively, can tell us nothing about Ichiro’s defensive value.

  19. Sports on a Schtick on May 7th, 2008 4:50 pm

    Tiger Tales compiled a whole bunch of defensive stats to rank 2007 fielders. Good stuff. Jinaz has an interesting write-up regarding catching defense too.

  20. smb on May 7th, 2008 5:00 pm

    I’ve been sending checks every month to Derek’s exiled Nigerian uncle. When he gets his throne back I’ll be rich, so really it’s just a loan. But that was my second best idea, my best idea is to bring Griffey back!!!11!!!

  21. OppositeField on May 7th, 2008 5:06 pm

    I’m pretty sure I speak for everybody when I say I hope jinaz will be sticking around.

  22. Sports on a Schtick on May 7th, 2008 5:14 pm

    #20

    Dude, that’s a scam. The checks should be going to Dave’s exiled Nigerian uncle.

  23. HamNasty on May 7th, 2008 5:14 pm

    M’s have the best defense in the AL West. Steve Phillips told me at the start of the season so I know it’s true. We don’t need to bother with these fancy numbers. You just take the simple formula, sweat(dirt+grass)^2 = Defensive Grit

  24. jinaz on May 7th, 2008 5:34 pm

    @15 – bloglines, baby.

    @18 – thanks. I agree that we should be skeptical about defensive metrics, but that’s part of the reason I (and Dave, and Sean Smith) advocate taking an average of the best available defensive statistics. If there’s agreement, you’ll get a solid number +/-. If there’s disagreement, you’ll estimate league-average or so, which is a good baseline to estimate anyone’s performance.

    @19 – thanks for the link on the catcher work. It’s admittedly very incomplete, but it gives us a baseline to work from.
    -j

  25. tiger337 on May 7th, 2008 5:54 pm

    I share the skepticism on defensive stats but I think if you take averages of several measures as jinaz and I did last winter (Thanks Justin for doing it first and making my job easier.), you’ll get some good information. The players who do well on all systems are probably very good fielders and those that do poorly on all systems are probably bad fielders and the average will reflect that. Those players who do well on some systems but not others will end up in the middle of the pack which I think is probably appropriate in most cases.

    Plus, I was pleased to see that the correlation between the fan fielding survey and numerical measures (Justin’s study) was reasonably strong. That gives me a little more confidence that the measures are working.

  26. msb on May 7th, 2008 6:06 pm

    Derek didn’t tell you? Huh. We all got emails the last few days with a Paypal link in it. Weird.

    noble of you to refrain from rickrolling that Paypal link

  27. DMZ on May 7th, 2008 6:14 pm

    One of the other valuable things you can get from something like this is finding systematic biases.

    I know a couple years ago I was able to look at a couple systems and figure out that whatever one of them was doing at one position, it was just wrong — compared to all the other systems, they might as well have been randomly ordering players.

    That’s extremely useful information when you’re doing evaluation.

  28. marc w on May 8th, 2008 9:19 am

    If there’s disagreement, you’ll estimate league-average or so, which is a good baseline to estimate anyone’s performance.

    “Those players who do well on some systems but not others will end up in the middle of the pack which I think is probably appropriate in most cases.”

    I think that’s true for a great many players; the variance is fairly tight and centered around zero.
    But I’m still concerned about the huuuge variance seen in a number of OFs. I’m just not comfortable averaging a -35 and a +33 or whatever Ichiro’s actual numbers are. Same w/Grady Sizemore. One of these numbers paints both players as hideously overrated when you factor in defense, and the other paints both as MVP candidates. When we’re trying to get reasonable estimates of team defense or OF defense specifically, that’s pretty important. Again, there’s zero doubt that what jinaz and tiger337 have done here is helpful and moves the ball down the field (and since I’m lavishing praise here, I can’t believe we haven’t even talked about Justin’s fielding translations with THT data here; I use it a lot). I know Derek’s said that if a system doesn’t work for one guy, it’s not reason enough to discard it. That seems fairly reasonable.

    And yet, as M’s fans, it matters a hell of a lot if Ichiro is a lifesaver or an anchor in CF. It matters a lot if park factors or something else will make *every* RF (except Ichiro) look like crap. The team can’t keep hemorrhaging runs through poor DER, and the team can’t keep trying out new OFs only to see each one worse than his predecessor. These data should clearly help the FO make decisions about who might perform well, and yet, we’ve all been surprised. I don’t think anyone thought a healthy Jose Guillen would’ve been one of the worst RFs in the league, and I know I didn’t think Wilkerson would’ve been at least as bad.

    In summary – we need to quantify/explore the BIS/STATS differences, especially for the OF. We need to figure out how much natural variation to expect from year to year in these numbers. We’re getting closer, but we’re still at a point where we need 2 or 3 years of data from multiple sources and then we need to hope that those sources don’t contradict each other. That’s tough for teams to use to really inform their decision on extending, say, Wlad or Yuni a few years ago.
