Evaluating Defense
By Tom Tippett

We rate players based on performance, not reputation. Our outfielder throwing rating, for example, measures the fielder's ability to prevent runners from taking extra bases and to throw them out when they try to advance. A fielder earns a good rating by positioning himself well, getting to the ball quickly, making a quick release, throwing with power, throwing with accuracy, and throwing to the right base. Someone with a powerful arm might still get a subpar rating if he doesn't get into throwing position quickly, throws wildly, or throws to the wrong base.

Similarly, our running rating is not just a measure of raw speed, but also the ability to read the ball off the bat, get a good jump, and make good decisions about when to go for the extra base. And our rating for defensive range measures the ability to make plays. An infielder can earn a good rating through positioning, quickness, soft hands, and effective throwing (quick release, arm strength, accuracy). It's not always the flashiest player who makes the most of the balls that are hit his way.

Because our ratings measure the ability to succeed in a certain phase of the game, we evaluate performance by analyzing play-by-play data. This approach is not a radical one. Baseball people have been doing this for over a century to measure batting and pitching performances. They don't, after all, give the batting title to the guy with the prettiest swing; they give it to the player who hit for the highest average. They don't give the Cy Young to the pitcher with the best mechanics or the guy who throws the hardest; they give it to the one who is considered to be the most effective.

Using statistics to evaluate performance is part of the tradition of the game. But this tradition extends only to hitting and pitching. You never hear a television or radio analyst talk about meaningful measures of baserunning, throwing or defense.
Instead, they talk about their impressions of the player -- how fast he looks, his quickness, strength and athleticism. If they applied the same standard to hitters and pitchers, they'd never talk about slugging averages or walk-strikeout ratios. Our approach is to apply the time-honored tradition of using well-crafted statistics to evaluate baseball performance. The difference is that we don't stop at hitting and pitching. We design ways to measure results in all phases of the game. This approach can be controversial, because we sometimes find players whose performance is better or worse than you would guess by watching them a few times a year. And our ratings are occasionally at odds with the opinions expressed by some of baseball's most famous writers and TV personalities. But we sincerely believe that doing original research into player skills is an important part of producing an accurate baseball simulation. Suppose a player has a reputation for great defense but our analysis doesn't show a superior performance. If we gave in to public opinion and rated him higher than his performance justified, we'd have these options:
We don't think it's fair to downgrade teammates so we can give a popular player a better rating than he deserves. And we don't think you'd want us to disregard the side effects and publish a season disk with players and teams who will overperform. So we do our best to rate players based on performance, even if that means we might occasionally take a little heat for a few of our ratings.

Judging by Watching

For a couple of years now, I've wanted to write a little piece about how difficult it is to judge defensive ability, or any baseball skill for that matter, just by watching a lot of games. Then I found an essay by Bill James in his 1977 Baseball Abstract (a self-published book that predated his debut in bookstores by about five years) that says what I wanted to say far, far better than I ever could. Here are a few excerpts from this wonderful essay, starting with a comment on how differently most people tend to approach the assessment of hitters and fielders:
And he talks about the difficulty of trying to judge effectiveness simply by watching:
In that essay, Bill went on to propose a scoring system that accomplishes essentially what STATS Inc. is doing now -- recording the location of every batted ball so that we could build a record of fielding performances similar to the statistical records that we use to judge batting and pitching performances.

Measuring Defensive Range

Defensive range is one of the hardest elements of performance to measure, but we have made some good progress in recent years. Official fielding stats provide information such as games played, putouts, assists, errors, double plays, and fielding percentage. But using these numbers to assess player skills is extremely difficult, if not impossible. The list of reasons is very long, but they all boil down to two things:
For these reasons, it's very difficult to measure fielding ability using stats such as assists per game, putouts per game, total chances per game, or fielding percentage. It's sometimes possible to pick out the very best and very worst fielders using these numbers, but it's very hard to evaluate the majority of players. In 1999, for example, Troy O'Leary led the majors in putouts by a left fielder. Is this because:
Baseball analysts, ourselves included, have made many attempts to devise methods that deal with some of these other factors so that we can isolate the contribution the player is making. Let's review them, and then talk about some newer methods that we've been using for the past few years.

Range Factors and Defensive Innings

In the 1970s, Bill James introduced the idea of range factors to compensate for playing time. A player's range factor is generally computed as successful chances (putouts plus assists) per game. This was a good first step, even though Bill acknowledged at the time that it wasn't meaningful for pitchers, catchers and first basemen.

One thing that frustrated Bill was the fact that not all games played are equal. Some players play almost every inning of their games. Others split the playing time with a platoon partner. Late-inning defensive specialists often pick up a lot of games played without actually playing a lot. For a while, Bill devised methods to estimate how many innings each fielder was actually in the game at his position, but this is very hard to do.

Fortunately, companies like STATS Inc. have been publishing accurate counts of defensive innings for the last ten years. So we can now compute range factors on a per-nine-innings basis, just like we do for earned run averages. Using a range factor based on defensive innings, Brian Hunter moves to the top of the list of 1999 left fielders with 2.41 putouts per nine innings. O'Leary drops to seventeenth.

So can we now anoint Hunter as the best left fielder in baseball? Not yet. We still don't know how many chances he had to make plays. What if his pitching staff generated more fly balls and line drives to the outfield than others did? Without knowing more about the number, type and location of balls put in play, it's hard to learn anything meaningful from a simple ranking based on putouts per nine innings.
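To make the difference between the two range factors concrete, here is a minimal Python sketch. The function names and the two stat lines are hypothetical, invented only for illustration; they are not figures for any real player. It shows how a part-time defensive specialist can trail an everyday player in chances per game yet lead him in chances per nine defensive innings.

```python
def range_factor_per_game(putouts, assists, games):
    """Classic range factor: successful chances per game played."""
    return (putouts + assists) / games


def range_factor_per_nine(putouts, assists, defensive_innings):
    """Range factor scaled to nine defensive innings, like an ERA."""
    return 9.0 * (putouts + assists) / defensive_innings


# Hypothetical stat lines (not real players).
everyday = {"putouts": 300, "assists": 10, "games": 150, "innings": 1300.0}
specialist = {"putouts": 150, "assists": 5, "games": 140, "innings": 560.0}

everyday_per_game = range_factor_per_game(
    everyday["putouts"], everyday["assists"], everyday["games"])
specialist_per_game = range_factor_per_game(
    specialist["putouts"], specialist["assists"], specialist["games"])

everyday_per_nine = range_factor_per_nine(
    everyday["putouts"], everyday["assists"], everyday["innings"])
specialist_per_nine = range_factor_per_nine(
    specialist["putouts"], specialist["assists"], specialist["innings"])

print(f"per game:  everyday {everyday_per_game:.2f}, "
      f"specialist {specialist_per_game:.2f}")
print(f"per nine:  everyday {everyday_per_nine:.2f}, "
      f"specialist {specialist_per_nine:.2f}")
```

Neither version says anything about how many balls were actually hit toward the fielder, which is the limitation discussed above.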
Adjusted Range Factors

Even if we use defensive innings to measure playing time, we still haven't taken into account (a) the number of opportunities presented to each fielder and (b) the fact that some putouts and assists are harder to come by than others. So, about a dozen years ago, I developed a new type of range factor that adjusts for many of these variables in the following ways:
This approach produces much better information than does an ordinary range factor, and has the advantage of being useful even when we're using play-by-play data that doesn't contain detailed hit location data. But we're still left with the fact that we're using these adjustments to make an educated guess at how many opportunities each fielder had to make plays. It goes without saying that it's possible to do better when we have access to play-by-play data that records the location of every batted ball.

Fielding Runs

Before moving on, let me take a moment to say that the Fielding Runs numbers in the Total Baseball encyclopedia can be extremely misleading. I don't enjoy saying this, because they were developed by Pete Palmer, and Pete's a friend and one of the nicest guys I've ever met.

The first problem I have with fielding runs is that they're just a glorified range factor, with different weights for different events. So, like range factors, you cannot interpret them accurately unless you know the strikeout rate and groundball/flyball ratio of the pitching staff and what percentage of left-handed batters the fielder faced. For a good example of the distortions that often creep into the fielding runs numbers, see the comments on Frank White and Ryne Sandberg in an article I wrote for ESPN.com in September 1998 (www.diamond-mind.com/espn9809.htm).

I don't agree with some of the formulas, mainly because they put too much weight on some events. For example, the formula for outfielders is .20(PO + 4A - E + 2DP), meaning that catching a fly ball with the bases empty earns you .20 fielding runs, while catching the same fly ball and throwing out a runner for a double play earns you 1.4 fielding runs. In both cases, the fielder made the best play available, but one counts for seven times as much as the other.
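The seven-to-one weighting can be checked directly from the outfielder formula quoted above. This is a small Python sketch; the function name is my own, and only the formula .20(PO + 4A - E + 2DP) comes from the text.

```python
def outfielder_fielding_runs(putouts, assists, errors, double_plays):
    """Outfielder formula as quoted in the text: .20(PO + 4A - E + 2DP)."""
    return 0.20 * (putouts + 4 * assists - errors + 2 * double_plays)


# Fly ball caught with the bases empty: one putout, nothing else.
routine_catch = outfielder_fielding_runs(
    putouts=1, assists=0, errors=0, double_plays=0)

# Same fly ball, plus a throw that doubles off a runner:
# one putout, one assist, one double play.
catch_and_double_play = outfielder_fielding_runs(
    putouts=1, assists=1, errors=0, double_plays=1)

ratio = catch_and_double_play / routine_catch

print(f"routine catch:         {routine_catch:.2f} fielding runs")
print(f"catch and double play: {catch_and_double_play:.2f} fielding runs")
print(f"ratio:                 {ratio:.1f}")
```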
And suppose one center fielder reaches a ball but muffs it for a one-base error, while another lets it go up the gap for a double -- the guy who reached the ball has .20 fielding runs deducted and the second guy isn't penalized at all.

The fielding runs formula mixes range, errors and throwing into one number, which is appropriate for what Total Baseball is trying to accomplish (an overall player rating), but useless for someone like me who's trying to come up with separate ratings for these aspects of a fielder's play.

Zone Ratings

The next logical step beyond range factors is a system that counts actual opportunities to make plays. We weren't able to do that until 1989, because nobody tracked the location of every batted ball until then. The folks at STATS Inc. were the first to do it, and they were quick to develop the zone rating to take advantage of this new information. The zone rating should have been a tremendous breakthrough, but STATS made some serious errors in designing this statistic.

STATS says the "zone rating measures all the balls hit in the area where a fielder can reasonably be expected to record an out, then counts the percentage of outs actually made." This is a step in the right direction. Instead of having to estimate the number of opportunities to make plays from defensive innings, percentages of balls in play, the left-right composition of the pitching staff, and the staff groundball/flyball ratio, we can actually count the balls hit to each fielder while they are in the game.

The first problem is that they don't count all the balls. For example, no infielder is charged with an opportunity when a grounder is hit down the lines, in the holes, or up the middle. The only plays that go into the zone ratings are the ones where the ball is hit more or less at a fielder. The ones that are left out are the ones that only the best fielders get to. The net result is a system that places a lot more emphasis on good hands than range.
The second problem occurs when an infielder starts a double play. STATS credits him with two outs and one opportunity. Manny Alexander, for example, has a 1.017 zone rating for 1999, meaning that he created an out more than 100% of the time. While I agree that starting double plays is an important skill for an infielder, this approach gives a significant boost to infielders who play behind pitchers who put lots of runners on base and/or with a pivot partner who turns the DP well, and it clouds the effort to measure defensive range. The third problem is that errors are mixed in with the ability to get to the ball in the first place. For example, in 1999, Edgardo Alfonzo had a zone rating of .921, while the norm for his position was .905. At face value, you'd think this means that he covered more ground than the average second baseman. But he also made 8 fewer errors than the average second baseman, given the number of chances he handled. If you change 8 of his outs into errors, his zone rating drops to .902. Now he looks like a fielder with average range and very good hands. Once again, let me say that the idea behind the STATS zone rating is sound. Done properly, it would be an improvement over the adjusted range factors I talked about. But these problems are enough to remove much of its value for evaluating defensive range.
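Both the double-play problem and the error problem are simple arithmetic, sketched here in Python. The infielder in the first example is hypothetical. In the second, the opportunity count is not a published figure; it is only what the two rounded ratings quoted above (.921 and .902) imply if eight outs are turned into errors.

```python
def zone_rating(outs_credited, opportunities):
    """Zone rating: outs credited divided by balls hit into the zone."""
    return outs_credited / opportunities


# Problem two: crediting two outs on one opportunity for a started double play.
# Hypothetical infielder: 100 balls in his zone, 90 fielded for at least one
# out, 12 of those starting a double play.
opportunities = 100
plays_made = 90
double_plays_started = 12

two_outs_per_dp = zone_rating(plays_made + double_plays_started, opportunities)
one_out_per_dp = zone_rating(plays_made, opportunities)


# Problem three: errors hidden inside the rating.
# Moving k outs to the error column lowers the rating by k / opportunities,
# so the quoted before-and-after ratings imply an opportunity count.
def implied_opportunities(rating_before, rating_after, outs_moved):
    return outs_moved / (rating_before - rating_after)


alfonzo_opportunities = implied_opportunities(0.921, 0.902, 8)

print(f"two outs per DP: {two_outs_per_dp:.3f}")
print(f"one out per DP:  {one_out_per_dp:.3f}")
print(f"implied opportunities: about {alfonzo_opportunities:.0f}")
```

Because the published ratings are rounded to three places, the implied count is only approximate; the point is that eight plays are enough to move a second baseman from above the .905 norm to slightly below it.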
(March 29, 2000) In the STATS 2000 Baseball Scoreboard book, STATS announced that they have changed their zone rating calculation to credit an infielder with one out and one opportunity when he starts a double play. We applaud them for making this change. Keep in mind, however, that all of the zone ratings they have published to date (through the 1999 season) use the old two-outs-one-opportunity formula.

Defensive Average

For a few years, we used a type of zone rating called Defensive Average (DA). It was developed by Pete DeCoursey and Sherri Nichols and used play-by-play data from The Baseball Workshop. Like the STATS zone rating, defensive average works by counting the batted balls hit into each fielder's zone and the number of plays he made on them. But it covers the whole field and doesn't mix apples and oranges by double-counting double plays. We felt we got better results from defensive average than from the STATS zone ratings.

But DA isn't perfect either. One of the perplexing problems in any system is how to assign responsibility for balls hit between fielders. In both the STATS and DA systems, the player making the play gets one opportunity and one play. But things get tricky when the ball falls in for a hit. In DA, each player gets charged with half an opportunity when the play results in a hit. That means that someone playing next to a weak fielder tends to look worse than he is, because if the other guy makes the play, there is no opportunity charged, but if the ball falls in, it costs him a half of one.

In past years, when we put together our player ratings, we were aware of this limitation and did our best to make intelligent adjustments to compensate for it. But we always wanted to see if we could do better.
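The half-opportunity rule is easy to illustrate with a short Python sketch. It covers only balls hit into a zone shared by two fielders, and every number in it is hypothetical. The fielder being rated makes the same 30 plays on the same 100 balls in both scenarios; only the neighbor changes.

```python
def da_on_shared_zone(own_plays, hits_allowed):
    """Defensive-average-style rate on balls hit between two fielders.

    A play made by this fielder counts as one opportunity and one play.
    A play made by the neighbor charges this fielder nothing.
    A ball that falls for a hit charges each fielder half an opportunity.
    """
    opportunities = own_plays + 0.5 * hits_allowed
    return own_plays / opportunities


BALLS_IN_SHARED_ZONE = 100   # hypothetical
OWN_PLAYS = 30               # identical in both scenarios

# Strong neighbor takes 40 of the remaining balls, so 30 fall for hits.
beside_strong = da_on_shared_zone(
    OWN_PLAYS, BALLS_IN_SHARED_ZONE - OWN_PLAYS - 40)

# Weak neighbor takes only 20, so 50 fall for hits.
beside_weak = da_on_shared_zone(
    OWN_PLAYS, BALLS_IN_SHARED_ZONE - OWN_PLAYS - 20)

print(f"beside a strong neighbor: {beside_strong:.3f}")
print(f"beside a weak neighbor:   {beside_weak:.3f}")
```

The fielder's own performance is unchanged, yet his rate on these balls drops when the man next to him reaches fewer of them.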
The Diamond Mind System

In 1996, we came up with a new approach to computing zone ratings that does a better job in three areas -- it counts more batted balls, it handles the plays between fielders better, and it takes into account the difficulty of the play.

Other systems, like the STATS zone rating and DA, ignore certain types of batted balls. Bunts are excluded from both systems, so we get less help in trying to evaluate catchers, pitchers, third basemen and first basemen. Popups are excluded from both systems, in the belief that all popups are routine and don't therefore measure range. But some popups, like the not-so-high ones that an infielder must go back on, are plays that only the best fielders will make.

The system we've developed counts all batted balls except popups on the infield, which we omitted for two reasons. The first is that over 99% of these plays result in an out, so they don't help us distinguish the good fielders from the not-so-good. Second, because these plays are easy to make, most popups can be handled by any of several fielders. We noticed, for example, that the best defensive first basemen tend to take all the popups in their area, making their second basemen look less effective, while the weaker fielders would leave many of these plays to the second baseman.

Our new system handles plays between fielders better because it calculates the percentage of those plays that are made by the average fielder at each position. Instead of arbitrarily assigning half an opportunity to each fielder, it assigns responsibility more fairly. If a line drive between short and third is handled 20% of the time by the third baseman, 25% of the time by the shortstop, and is a hit the rest of the time, these are the percentages we use to assign responsibility. Each player gets credit for the number of plays made compared to the league average for his position given the mix of batted balls he faced.
This approach also takes into account the difficulty of the play. Other systems charge the player with one opportunity for every ball hit into the zone, regardless of type (grounder versus line drive) or location (at the fielder or in the gap), so a player can be made to look bad if he faces a higher-than-normal percentage of tough plays. Our system evaluates each type of batted ball and each zone separately, thereby giving a fielder more credit for making a more difficult play and penalizing him less if he fails to make that difficult play. This way, if a fielder happens to face a tougher array of chances, he can still look good.
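The idea of comparing plays made to a league-average expectation, bucket by bucket, can be sketched in a few lines of Python. This is only an illustration of the principle as described in the text, not the actual Diamond Mind software. The bucket labels, the league rates other than the 20% and 25% line-drive figures, and both shortstops are assumptions made up for the example.

```python
# League-average out rates by (position, ball type, zone).
# Only the two line-drive rates come from the text; the rest are hypothetical.
LEAGUE_RATE = {
    ("3B", "liner", "between SS and 3B"): 0.20,
    ("SS", "liner", "between SS and 3B"): 0.25,
    ("SS", "grounder", "at fielder"): 0.95,
    ("SS", "grounder", "up the middle"): 0.40,
}


def net_plays(position, record):
    """Plays made above or below the league average for this mix of chances.

    record maps (ball_type, zone) -> (balls_in_play, plays_made).
    """
    total = 0.0
    for (ball_type, zone), (balls, plays) in record.items():
        expected = balls * LEAGUE_RATE[(position, ball_type, zone)]
        total += plays - expected
    return total


def raw_out_rate(record):
    """Plays per ball in play, ignoring the difficulty of each chance."""
    balls = sum(b for b, _ in record.values())
    plays = sum(p for _, p in record.values())
    return plays / balls


# Two hypothetical shortstops who are exactly league average in every bucket
# but who face very different mixes of easy and hard chances.
easy_mix = {
    ("grounder", "at fielder"): (200, 190),
    ("grounder", "up the middle"): (50, 20),
}
tough_mix = {
    ("grounder", "at fielder"): (100, 95),
    ("grounder", "up the middle"): (150, 60),
}

# A third hypothetical shortstop who beats the league in two buckets.
rangy = {
    ("liner", "between SS and 3B"): (40, 12),
    ("grounder", "at fielder"): (200, 188),
    ("grounder", "up the middle"): (50, 26),
}

for name, rec in [("easy mix", easy_mix), ("tough mix", tough_mix),
                  ("rangy", rangy)]:
    print(f"{name:10s} raw rate {raw_out_rate(rec):.3f}, "
          f"net plays {net_plays('SS', rec):+.1f}")
```

The first two shortstops come out at zero net plays despite very different raw out rates, which is the sense in which a fielder who faces a tougher array of chances can still look good.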
Before I move on to talk about some of the results of using this new system, let me take a moment to discuss some remaining challenges in the seemingly never-ending pursuit of a better fielding metric.

Park Effects

In Coors Field, it's harder for outfielders to make plays because the ball doesn't stay in the air as long at that altitude. It's hard, especially for visiting outfielders, to pick up the ball against the Metrodome roof, so there are quite a few cheap doubles and triples in that stadium each year. Some balls that can be caught in Tiger Stadium hit high on the wall in Fenway. Some infields, such as Mile High Stadium in 1993-94, are so choppy that a lot more errors are made there. This, in my opinion, is where the next innovation in fielding ratings needs to be made.

To help us get a grip on these effects, our fielding analysis software creates a detailed report of home-road fielding performance. We use this report to quantify the effect on the percentages of outs made on different types of batted balls in various ballparks. With this information, we can make appropriate adjustments for the outfielders who play in Coors Field and other unusual parks, and this helps us come up with more accurate player ratings.

Pitcher Quality

Regrettably, we may never be able to separate completely the contributions of the pitcher and the fielder. If a ball drops in, is it because the pitching was bad or because the fielder failed to make the play? Our system helps a little here. If a bad pitching staff showers its defense with lots of line drives, that's OK. Our system measures fielders on how often they field those line drives and doesn't penalize them for facing more than the average number of line drives. But I still think good pitching makes fielders look good (just as good fielders make pitchers look good), so we keep the quality of the pitching staff in mind as we assign our ratings.
A Brief Recap

These are the points I've tried to make in this review of fielding statistics and approaches to evaluating them:
Converting Analysis into Player Ratings

I'm sure you've gathered by now that our approach to rating players is to look at the evidence as carefully as we can. Our analysis software computes our own version of a zone rating for each player, but more importantly, it also determines how each fielder's performance compared to the league average on each type and location of batted ball. That type of analysis tells us, for example, that a certain third baseman did well on balls hit down the line but was below average on bunts and other softly hit balls.

We then add up his ratings for each type of batted ball and give him an overall number that represents the number of plays he made above or below the league average given the mix of opportunities presented to him. Andruw Jones, for example, led the majors by making 51 more plays than the average center fielder in 1999. In any given season, the major league leader tends to make 30-50 more plays than the average fielder at his position, given the number and type of opportunities presented to him, though this varies from year to year.

But this "net plays versus the league" analysis isn't enough, by itself, because it doesn't tell us whether he has been affected by his home park and the other fielders he plays with. So we analyze the fielding data on a home/road basis and factor that into our thinking. And we use reports of team defense to clarify the relationship between neighboring fielders. Our team fielding reports show how a pair of fielders did as a tandem relative to the league average, and that helps us make sure the individual player ratings combine to form an accurate reflection of the team's defense. With these reports, we can look at the zones between fielders and get a very good picture of what happened there.
We can see, for example, that Scott Brosius of the Yankees was way above average on balls to his left, and that cut down on the number of plays Derek Jeter could make in that zone, but since the overall team defense in that zone was very good, there's no reason to penalize Jeter for it. And we can see that Lee Stevens of Texas was well below average on balls hit to his right, and while Mark McLemore turned a bunch of them into outs, quite a few went through for hits as well, meaning that Stevens was hurting the team by failing to reach more of these balls.

Another of our analysis programs counts the number of times a player is used as a defensive sub or is removed for a defensive sub. This information doesn't tell us anything about performance, of course, but it is very helpful to know that one fielder was regarded by his manager as being superior to another.

Like many of you, we read a lot and we watch games and highlight shows on Fox and ESPN and DirecTV, because it helps to have an image of a player when we evaluate the performance data. And we compile an extensive database of player notes, because it's helpful to know who's coming off a knee injury or a shoulder problem that might have affected their ability to make plays.

And when the evidence doesn't match the player's reputation, we double-check our work and look very, very hard for the reasons why. Whenever possible, we talk to people who really know baseball -- local writers, broadcasters and sophisticated fans -- and who have seen the player quite a bit, to see if we can gain some additional insight into each player's performance.

Other Approaches to Rating Players

Before I discuss the Gold Glove winners and some of the fielding performances from the 2001 season, I want to take a moment to talk about other sources of information that we could use to rate fielders if we didn't want to go to all the trouble of developing software, licensing play-by-play data, and spending weeks poring over the analysis.
We could rely on the opinions of sportswriters and members of the broadcast media, but this is problematic for several reasons:
We could rely on the opinions we hear from other players, managers, and team executives, but they don't see all the players either, and their remarks can be influenced by the needs of the team. It's to their advantage to talk about players in certain ways, whether it's to hype someone for marketing purposes, or to talk them down in a salary squabble. We need information that is less prone to bias.

We could use the opinions of professional baseball scouts. This is better than using the media because scouts are trained to see things that other people don't see. But it's difficult to find a collection of scouts who have seen every player and can make their evaluations available to people outside the organizations they work for. And, of course, scouting is not an exact science.

We could base our judgments on how often someone shows up on SportsCenter. But the photogenic play isn't always the best play. The exact same fly ball might produce a routine play for a great fielder, a diving catch for the average fielder, or a single for the poor fielder. The diving catch is the only one that makes the highlight films. The majority of highlight-film plays are made at the edge of the fielder's effective range, whatever that range happens to be.
We could give a lot of consideration to Gold Glove awards. But we can't award a player an Excellent range rating just because he won the Gold Glove. Why not? First, the Gold Glove is given for overall fielding performance, including range, throwing, and avoiding errors. Our fielders have separate ratings for these three factors. It should not be surprising that a fielder may occasionally win a Gold Glove by virtue of excellence in throwing and avoiding errors, while having only average or above-average range. Second, the voters do not have access to the information we compile in our fielding studies, and they make mistakes. Third, I'm not convinced that the voters put all that much effort into the process.

Comments on the 2001 Gold Glove Winners

Pitchers. There's a very strong tendency for Gold Glove voters to fixate on one guy and keep giving him the award year after year after year, as long as he doesn't get hurt or do anything to make it clear that something has changed. This tendency is especially strong for pitchers, perhaps because the voters don't get to see them as often.

At other positions, we can judge performance over a span of 1,000 to 1,400 defensive innings, but even the most durable starting pitchers are in the field only for 200-250 innings. And relievers get only a fraction of the innings of a starting pitcher. With 14 or 16 teams in the league, a voter might get to see a certain shortstop play 80 innings in the field. That's not much in the context of a whole season, but it sure beats the 10-20 innings they might see of a starting pitcher or the 4-5 innings a reliever might pitch in those games.

So it's hard for anyone to evaluate pitcher defense just by watching, because nobody is in position to watch enough pitchers in enough situations to get a complete picture. And it's hard to evaluate pitchers just by looking at their putouts and assists because a pitcher's tendency to induce ground balls can have a major impact on those numbers.
Even if you're a brilliant fielder, you're not going to look good next to Greg Maddux if you're a fly-ball pitcher and they're using traditional fielding stats to evaluate you. This year, Mike Mussina was chosen for the fifth time, and he's a pretty good pick. He had a good year, handling 43 chances successfully while participating in 5 double plays, making only one error, and doing a very good job holding opposing runners. But there are other deserving candidates. (By the way, I'll leave it up to you to decide whether holding runners is a pitching skill or a defensive skill. But I'll mention it for those of you who think it's relevant to a Gold Glove debate.) Freddy Garcia also participated in five double plays and made only one error while handling 68 chances successfully, more than half again as many as Mussina. On the other hand, Garcia creates more chances for himself because he's a ground ball pitcher, and he doesn't hold runners well. Steve Sparks had 62 successful chances, only one error, and held runners well despite throwing a pitch, the knuckleball, that is easy to run on. He was involved in one double play. Brad Radke had 57 successful chances, four double plays, and only one error, but wasn't quite as good as Sparks and Mussina at holding runners. Andy Pettitte was error-free in 49 successful chances with one double play and has a terrific pickoff move, though he is less successful holding runners close when he goes home with the pitch. Jeff Weaver also handled 49 chances without an error. He was in on four double plays and was in the middle of the pack in holding runners. All things considered, my vote would have gone to Garcia this year. In the other league, Greg Maddux won his 12th straight, and there's no question that he's a very good fielder. But it must also be said that he has a head start on his competition because he's an extreme ground-ball pitcher who creates for himself a ton of opportunities to make plays. 
This year, he led the majors by handling 72 chances successfully, making only one error in the process. But there are two arguments against Maddux's iron grip on this award. First, quite a few others have ranked above Maddux each year in plays made per batted ball in his zone. Second, Maddux has made 14 errors in the past five years; that's a lot for a pitcher, and only three other pitchers have made more in that span.

Consider Kirk Rueter. I'll bet if the voters had picked him a few years ago, they'd keep picking him every year just like they do with Maddux, because if Rueter had once been deemed the best, he's definitely doing enough to reinforce the view that he still is. This year, Rueter handled 61 chances without an error and took part in eleven (!) double plays. Among pitchers with at least 50 balls hit into their zone, he ranked #1 in converting those chances into outs. And he was almost impossible to run on. Last year, Rueter handled 52 chances without an error and took part in four double plays. He converted an extremely high number of batted balls into outs and was almost impossible to run on. In 1999, Rueter handled 45 successful chances but made one error.

Over the past five years, Maddux has made 14 errors in 424 chances for a fielding percentage of .967. In the same span, Rueter has made 3 errors in 265 chances for a fielding percentage of .989. Rueter has been involved in seven more double plays (26 to 19) despite pitching about 240 fewer innings. Rueter has converted a noticeably higher percentage of batted balls into outs. The only area where Maddux has the edge is raw totals, and that's only because he generates so many more comebackers than the average pitcher.

Getting back to the 2001 season, the pitchers who bested Maddux in converting opportunities into outs are Adam Eaton, Rueter, Chris Reitsma, Livan Hernandez, Russ Ortiz, Tom Glavine, Javier Vazquez, and Mike Hampton, in that order.
Eaton only pitched for half the season and made two errors, so I don't consider him to be in the same league as the others, though he's someone to watch for the future. Rueter, Reitsma, Hernandez, Glavine, Vazquez, and Hampton each handled more than fifty chances without making an error. Maddux was a good choice. Any of these guys I just mentioned would have been a slightly better choice. Rueter was the best of the bunch and deserved the Gold Glove this year. Just as he did last year.

Catchers. Ivan Rodriguez is the owner of one of the best throwing arms in history, and has been a lock for this award for many years. He had another great throwing year, and even though he missed a third of the season due to injury, he's the hands-down choice again this year. For some reason, the best arms have found their way into the other league in the past few years, and there's nobody left in the AL to challenge him.

A year ago, I argued that Brad Ausmus should have been the choice in the AL, partly because he had a great year defensively and partly because Rodriguez missed half the season. Ausmus is now in the NL and had another good year throwing, though others bested him in that department, and backed it up by allowing only one passed ball (best in the majors) and making only three errors (tied for second best in the majors).

There were other candidates, of course. Jason LaRue, Mike Matheny, and Henry Blanco threw out a higher percentage of enemy base stealers. But LaRue allowed 15 passed balls, second most in baseball, despite starting only 95 games behind the plate. Blanco started only 94 games himself, and didn't quite match up to Ausmus at any rate. In my eyes, it's almost impossible to choose between Ausmus and Matheny. Playing time was similar. Ausmus made one fewer error and was charged with five fewer passed balls. On the other hand, Matheny had a better year throwing, though he got more help from his pitchers than Ausmus did.
All in all, I think Ausmus was a worthy victor.

First basemen. Based on our analysis, there are four men who could reasonably be thought of as viable candidates at this position, two in each league: Doug Mientkiewicz and Tino Martinez in the AL, Kevin Young and Todd Helton in the NL.

The voters got it right when they chose Mientkiewicz over Martinez. Doug had a better fielding percentage, turned a higher percentage of batted balls into outs, and led the majors in highlight-reel plays. It's actually an easy choice, but I wanted to mention Martinez because he's a very good fielder who had another very good year, and he deserves some recognition.

It's not quite so clear in the NL. The voters picked Helton, who I thought should have won the award over J. T. Snow in 2000, but Young had a terrific year, too. Both the Diamond Mind and STATS methods for assessing range give Young a slight edge over Helton. And after making a boatload of errors in 1999 and 2000, Young got his act together and finished around the league average in fielding percentage. Helton led the league in this category. Over the past four years, Helton has shown more range than any other first baseman in baseball. Young is second. You rarely hear good things about Young's range because he made far too many errors in two of those four seasons. But the man can cover ground at first base. Helton and Young were almost on par with each other this year, but I'd agree with the voters and choose Helton. He's been the best in the league since 1998 and this year sustained his high level of play over 157 starts (compared to only 125 for Young).

Second basemen. Here's some of what I wrote a year ago: "Here we go again. Roberto Alomar won his ninth Gold Glove, and there isn't a baseball writer or television commentator who doesn't gush incessantly about Alomar's brilliance in the field. And I've seen him make some very spectacular plays myself.
Problem is, year after year, our analysis (and other measures such as range factors and the STATS zone rating) shows that he doesn't make many more plays than the average second baseman. Alomar was one of three Cleveland infielders to be rewarded with Gold Gloves this season. But that infield was below the league average in turning ground balls into outs. And according to the STATS Major League Handbook, they were fourth worst in the league in converting double plays when grounders were hit in double-play situations. And even though they used a lot of different pitchers this year, I don't think you can argue that this defense was made to look worse by a lousy pitching staff. They did, after all, get almost 600 innings from three good starting pitchers (Burba, Colon, Finley) and a bunch more from a group of veteran relievers who have fared quite well playing in front of other defenses in the recent past. The bottom line is that somebody isn't making nearly as many plays as people think ..." I'm repeating so much of last year's comment because it's still relevant. This season, Cleveland's infield was 13th in the league in the percentage of ground balls turned into outs. And they were only a hair above the league average in double-play percentage. You could argue that the infield looks bad because the corner guys -- Jim Thome at first, Travis Fryman and Russ Branyan at third -- don't cover much ground, and you'd be correct. Problem is, there's absolutely no evidence that their middle infielders are doing more than their share, either. The best case for Alomar's Gold Glove is that he won the fielding percentage title by making only five errors all season. His nearest rivals, Ray Durham and Bret Boone, made ten errors each. But Alomar's range factor was .12 below the league average despite playing behind a ground-ball staff. His STATS zone rating was thirty-five points below the norm for his position. 
According to our method, Alomar made 20 fewer plays than the average 2B, and he was consistently below average on all types of plays -- line drives, ground balls and popups. And he was 33 years old this year, an age when many middle infielders struggle to keep up with their younger rivals.

Those numbers are indicative of a player who deserves our Fair rating. But we gave him an Average rating anyway. Why? Because he has a great reputation and because it's possible that his pitching staff did indeed make him look worse than he really is. This is the fifth time in the past nine years that we've given Alomar a rating that's better than our analysis shows is justified. Not once in those nine years has his play-making score been far enough above the league average to merit a Very Good rating. But every year we say to ourselves that there must be some aspect of his ability that doesn't show up in fielding studies. But don't you think that if Alomar was truly the best at his position in the history of baseball, he'd score well at least once in nine years? Is it really possible that external factors or quirks in the data would make him look worse every single year?

I know that some people will look at this rating and conclude that (a) we're vastly underestimating his ability, (b) we have something against Alomar, and/or (c) we know nothing about baseball. Looking at all of the evidence, however, I have to say that, if anything, we've been generous in how we've rated him over the years.

I'll end this commentary with a quote from The New Bill James Historical Baseball Abstract: "[Alomar is] an overrated fielder, in my opinion; a good fielder, even a very good one, but no better than some guys who don't win Gold Gloves, like Fernando Vina." That was written before the 2001 data was available, and I agree with Bill's assessment of Alomar's career. We're now in the late stages of that career, however, and we're seeing evidence of a decline in Alomar's play-making ability.
Other worthy candidates for the AL Gold Glove were Adam Kennedy, Ray Durham, Bret Boone, and Jerry Hairston. Kennedy was the best of this group, but started only 123 games. Nevertheless, I'd go with Kennedy.

The other league's Gold Glove went to Fernando Vina. If Pokey Reese had played the entire year at second, instead of splitting his time between second and short, he would have gotten my vote. But he didn't, and that left things open for Vina, who I nominated as my choice a year ago. Vina had another good year, with above-average range and a low error rate, and the Cardinals were second in the NL in double play percentage. Those are solid credentials. And he played a lot more than some of the other guys (Ron Belliard, Damian Jackson, Mark Grudzielanek) who could be considered viable candidates.

Third basemen. The voters got it right at this position. Scott Rolen was so amazing that he managed to stand out in a league featuring several other very good players who had very good years. His closest rivals were Robin Ventura and Jeff Cirillo. But Rolen was so good that if there was an award for defense -- an MVP or Cy Young for defense, a single award that crosses all positions -- Rolen would be my choice for NL Defensive Player of the Year.

The AL produced three strong candidates, Eric Chavez (the winner), Corey Koskie, and David Bell. Of the three, Chavez was best in range and sure-handedness, and he played a lot more than Bell. So I agree with this selection, too.

Shortstops. As I mentioned above, the voters tend to settle on one guy and give him the award year after year as long as he doesn't blow it. By posting the second-best fielding percentage in the majors (.989, trailing only Rey Sanchez's .991), and by continuing to ply his trade with grace and style, Omar Vizquel did enough this year to keep the voters' trust, and he was rewarded with his ninth straight Gold Glove.
I'm not going to spend a lot more time writing about the Cleveland defense because I did that in the second base comment above. Suffice it to say that Vizquel's range wasn't all that good this year. If Rey Sanchez hadn't been traded out of the league, I'd nominate him, as he bested Vizquel in both range and steadiness. But Sanchez WAS traded out of the league, and in his stead, my vote goes to Toronto's Alex Gonzalez.

Interestingly, I don't recall hearing any gripes about Orlando Cabrera getting the nod in the NL. I figured that with Rey Ordonez healthy and playing a full season, some in New York would have pushed for him to get it back. But Ordonez' range was nothing special according to the measures we use, and it may be that the lingering effects of his arm and shoulder injuries affected his ability to make certain plays for at least part of the season. On the other hand, Cabrera showed above-average range and was among the steadiest fielders in either league. Rich Aurilia also looked quite good, but in my opinion, Cabrera was a deserving winner.

Outfielders. There are a lot of good outfield candidates this year, and with one major exception, all of the winners were drawn from that pool. In other words, five of the six choices were at least in the right ballpark.

According to our analysis, five center fielders stood out this year, and all of them are in the AL. They are, from top to bottom, Chris Singleton, Kenny Lofton, Mike Cameron, Darin Erstad, and Torii Hunter. Bobby Higginson and Jacque Jones were the two left fielders who separated themselves from the pack. In right, the top performers were in the NL, with Jermaine Dye and Ichiro Suzuki being the best of the AL contenders.

The voters and I agree on Mike Cameron, so I'll focus on the voters' selection of Torii Hunter and Ichiro. Given that center field is the most demanding outfield position and that we have a large number of deserving candidates there, I see no reason to choose a corner outfielder.
Furthermore, according to our analysis, Ichiro had above-average range and an above-average arm, but he wasn't as far above average as the media would have you believe. Ichiro's range factor was .26 above the norm, but he played behind a pitching staff that produced almost 200 more fly balls than the average AL team (according to the STATS Player Profiles book). His STATS zone rating was seven points below the major-league average for right fielders. Nevertheless, based on his reputation and the fact that our fielding analysis shows that Ichiro would almost certainly have made more plays if he wasn't playing next to Cameron, we believe he's worthy of a Very Good rating. But we don't see evidence of Gold Glove range here. In addition, he had only 8 assists, a below-average number for a RF who played as much as he did. And it's not as if nobody was willing to test him. Runners tried to advance on him a little less often than against the average RF, but not that much less. It does appear as if runners got a little more wary of his arm as the season progressed, but not a lot more wary. So we've rated him Very Good in throwing as well. The media seems to be saying that Ichiro is unquestionably excellent in all phases of the game. According to our methods, he's excellent at a lot of things (hitting for average, hitting in the clutch, sacrifice bunting, running the bases, stealing bases, avoiding errors, staying healthy), very good at some things (getting to balls in right and keeping runners from taking extra bases), and below average in some ways (drawing walks, hitting for power). That's quite a package, and I'd definitely want this guy on my team. But I just don't see the evidence that he's among the top defensive outfielders in the game. So, if Ichiro doesn't get my vote, then who does deserve the other two outfield Gold Gloves for the AL? Singleton topped the charts in plays-made-per-opportunity, but he only started 102 games. Lofton only started 123 games. 
Singleton and Hunter have subpar throwing arms. (Hunter tied for the league lead in assists by a CF with 14, but several of those came on plays where the lead runner scored, and he allowed lots of runners to take extra bases.) Hunter plays in a tough park -- it's easy to lose balls in the Metrodome roof -- so he's better than his numbers suggest, and his numbers are very good to begin with. Erstad made only one error all season, leading all major-league CFs in fielding percentage. It's a very close call, but there are some big differences in playing time to consider. Performance rates are very important, but when it comes to seasonal awards, the volume of performance is more important. So when someone performs at a high level for 145 games, that trumps someone else who performed at a slightly higher level for 120 games. On that basis, my other two votes would go to Erstad and Hunter. Over in the NL, the top candidates (in my mind) were Geoff Jenkins in left, Andruw Jones in center, plus Larry Walker, Vladimir Guerrero, and Brian Jordan in right. J. D. Drew would have been on this list were it not for the injury that cost him about 50 games. The voters chose Walker, Jones, and Jim Edmonds. I agree with the selections of Walker and Jones, but in my opinion, either Jenkins or Guerrero would have been a much better choice than Edmonds. Jenkins is a terrific left fielder, but I have to give it to Guerrero because (a) Jenkins started only 104 games, (b) Guerrero showed great range too, and (c) Guerrero has a cannon for an arm. Guerrero does make too many errors, but his range and arm more than compensate for them. Jim Edmonds has made some of the most amazing plays I have ever seen, but he simply doesn't cover as much ground as some of the younger players at this position. This year, he was below average in range factor and the STATS zone rating, and according to our method, made 16 fewer plays than the average CF given the opportunities presented to him. 
He battled groin, toe and knee problems, and he's starting to get up in years. I just don't see any reason to believe that he's a more valuable outfielder than the other guys I mentioned.

Recap. Here's how my selections would agree or disagree with those of the voters:

| Pos | Voters | Diamond Mind |
|-----|--------|--------------|
| P | Mussina, Maddux | Garcia, Rueter |
| C | Rodriguez, Ausmus | same |
| 1B | Mientkiewicz, Helton | same |
| 2B | Alomar, Vina | Kennedy, Vina |
| 3B | Chavez, Rolen | same |
| SS | Vizquel, Cabrera | Gonzalez, Cabrera |
| OF | Cameron, Walker | same |
| OF | Hunter, Jones | same |
| OF | Ichiro, Edmonds | Erstad, Guerrero |
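
As a quick check on the five-year fielding percentages quoted for Maddux and Rueter in the pitchers discussion, here is a minimal Python sketch. It assumes the standard definition of fielding percentage (chances handled without an error, divided by total chances); the function name `fielding_percentage` is illustrative and not part of any Diamond Mind tool.

```python
def fielding_percentage(total_chances: int, errors: int) -> float:
    """Return (chances - errors) / chances, the standard fielding percentage."""
    if total_chances <= 0:
        raise ValueError("total_chances must be positive")
    if not 0 <= errors <= total_chances:
        raise ValueError("errors must be between 0 and total_chances")
    return (total_chances - errors) / total_chances


# Five-year totals as quoted in the article's pitchers discussion.
maddux = fielding_percentage(total_chances=424, errors=14)
rueter = fielding_percentage(total_chances=265, errors=3)

print(f"Maddux: {maddux:.3f}")  # Maddux: 0.967
print(f"Rueter: {rueter:.3f}")  # Rueter: 0.989
```

Both results match the figures in the text. Note that fielding percentage only captures errors on balls a fielder reaches, which is why the article leans on plays made per opportunity as well.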
Other players

Now that we've offered our two cents' worth on the Gold Glove winners, there are some other players worth mentioning:

Bobby Abreu, RF -- According to our system, Abreu's play-making scores have been very erratic lately -- quite good through 1998, subpar in 1999, very good in 2000, and average this year. Looked at in the context of the past three seasons, it now seems as if the Excellent rating we assigned for his performance last year was generous, even though he was clearly in the top tier statistically that season. I'm at a loss to explain these ups and downs.

Craig Biggio, 2B -- This former Gold Glover missed the last two months of the 2000 season with a knee injury that required surgery. In January, his general manager warned that Biggio's range and baserunning ability would most likely be limited, especially early in the year. Those comments proved to be accurate, as Biggio's range was far below its previous level and he stole only seven bases, down from 50 only three years ago. His baserunning instincts are still good, so he was a little above average in that regard, but nowhere near the Excellent level he sustained before he hurt his knee.

Tony Clark, 1B -- A great athlete who has earned our Very Good rating for defense the past two years, Clark has been battling back problems that have kept him out of the lineup and hurt his power and defense. We downgraded his range rating to Fair as a result, but if he regains his health, you can expect it to rebound next year.

Ken Griffey, CF -- Spent much of the season trying to play despite a torn hamstring and its after-effects, and it clearly showed. In a little more than half a season of playing time, Griffey made ten fewer plays than the average CF, thereby earning a Fair rating. Expect that to rise next year if he's back at 100%.

Derek Jeter, SS -- I know we're going to take some heat from New York fans on this one, but I assure you that there is no bias in our decision to assign Jeter a Fair range rating this year. According to our analysis, Jeter made 32 fewer plays than the average shortstop given the opportunities presented to him. He was below average going to his right, below average going to his left, and below average on balls hit more or less at his position. His STATS zone rating was fifty points below average. His range factor was lowest in the majors among those who played at least 100 games at the position. At one time, Scott Brosius's superior range affected Jeter's numbers, but Brosius has declined from Excellent to Average in recent years and is no longer a factor in evaluating Jeter. The New York infield ranked 10th in the league in the percentage of ground balls that were turned into outs. And it was 13th in double play percentage. Alfonso Soriano probably deserves most of the blame for the low DP rate, but if Jeter was an outstanding fielder, he would have compensated for Soriano's limitations to some extent, and the team would have been closer to the league average. In his defense, he played behind a staff that produced 5% fewer ground balls than the average team, so his range factor was artificially depressed. Take that into account, and Jeter's range factor would have been only the second- or third-worst in the majors. And, of course, in the playoffs, he made a couple of very heady and gutsy plays that had everyone talking about his courage, his will to win, and his intelligence. But a couple of attention-getting plays aren't enough, in my opinion, to offset the mountain of evidence indicating that Jeter simply didn't get to as many balls as most of the other shortstops in the game.

Ryan Klesko, 1B -- Earlier in his career, before he was traded to San Diego, Klesko didn't show much range at first base in the limited amount of time he played that position for Atlanta.
In 2000, he showed average range in his first full season as a 1B. We gave him an Average rating for that performance, even though we weren't certain that he had improved that much. But there was a major drop this year, and his Poor rating reflects that. Klesko has surprised a lot of people by stealing 23 bases in each of the past two seasons, but his career record is quite poor in both left field and at first base, so it seems as if his 2000 season was the anomaly.

Carlos Lee, LF -- Different fielding metrics suggest that Lee's range in left was anywhere from a little above average to a little below average. Yet his defense was sharply criticized in Sports Illustrated's pre-season baseball issue and again late in the season in a Baseball Weekly note. He was replaced defensively 39 times, and that normally happens only to players who are major liabilities in the field. In this case, however, the guys replacing him were superior defenders like Chris Singleton, so it doesn't necessarily mean that Lee was terrible, only that the other guys were better. We asked several people who follow the Sox, and their opinions ranged from "he's under-rated" to "he looks awkward but gets the job done" to "he's as bad as they say." We've chosen to assign him an Average rating this year. That may be a little generous, and I wouldn't be surprised if he slips back to a Fair rating next year.

Raul Mondesi, RF -- Has a very good reputation for defense, but that's mostly based on his great arm. In terms of range, our analysis shows that he's been slightly above average throughout his career. In the spring, it was reported that Mondesi came to camp carrying some extra weight, and his defensive numbers took a big dive. Coincidence? Maybe, but we felt a Fair rating was an accurate reflection of his 2001 performance. He could easily rebound next year.

Todd Zeile, 1B -- A year ago, we wrote that his Excellent range came as a complete surprise even though third basemen often move across the diamond and look very good relative to the men who play first. But we were skeptical. He's never had a reputation as a good fielder, and we wondered whether he'd be able to keep it up. He didn't, so it may be that last year was a fluke or a case where the various fielding measures over-stated his value for some reason. We rated him Average this year.