Measuring team efficiency
By Tom Tippett
December 9, 2004
The baseball world seems to be gaining a better understanding of the relationships between (a) wins and runs and (b) runs and the underlying offensive events that produce runs. Maybe that's just wishful thinking on my part, but I think it's true, especially with prominent baseball writers like Rob Neyer, Alan Schwarz, and the Baseball Prospectus crew making frequent references to these ideas.
In a nutshell, you
win games by outscoring your opponents, so the connection between runs
and wins is very strong, even though every season produces
a few teams that win more or less than you'd expect given their run differential. And you score runs by putting together
hits, walks, steals, and other offensive events, and you prevent runs
by holding the other team to a minimum of those things. In most cases, there's a direct relationship between runs
and the underlying events that produce runs.
To explore the relationship between runs and wins, we'll use the pythagorean
method that was developed by Bill James. To explore the relationship between
offensive events and runs, we'll compare total bases plus walks with runs.
We'll use the term efficiency to represent the ability to turn events into runs and runs into wins. An efficient team is one that produces more wins than expected given its run margin, produces more runs than expected given its offensive events, or allows fewer runs than expected given the hits and walks produced by its opponents.
In the 2002 edition of this article, we showed that teams that are unusually efficient
(or inefficient) have exhibited a very strong tendency to revert
to the norm the next year. That's good news for some teams and bad news
for others. If you'd like to find out who falls into which category, read
on.
Converting runs into wins
The Bill James pythagorean method is a well-established formula based on
the idea that a team's winning percentage is tightly coupled with runs
scored and runs allowed. This method is gaining currency, as the expanded standings on ESPN.com and baseballprospectus.com include run
margins and expected win-loss records derived using formulas of this type.
Bill's formula is quite simple ... take the square of runs scored and divide
it by the sum of the squares of runs scored and runs allowed (RF = runs
for, RA = runs allowed):
Projected winning pct = (RF ** 2) / (RF ** 2 + RA ** 2)
The formula usually comes close to a team's actual record. In 2004, for instance, 19 of 30 teams finished with win-loss records within three games of their projected records, and 26 of 30 teams finished within five games. That 26 of 30 mark was the same in 2003.
The exceptions are what interest us for this article, and we had a big one this year. The Yankees won 12 more games than normal for a team with a run margin of +89. On a run-margin basis, they were more like an 89-win team than the squad that led the AL with 101 victories. Since 1962, when the 162-game schedule was first used in both leagues, no team has ever been more than 12 games better than its pythagorean projection.
But 43 years of baseball history tell us that such large deviations are unusual
and tend not to be repeated the following year. In other words, the Yankees must dramatically improve their run margin in 2005 if they are to come close to matching this year's win total. The same is true of the Reds, who finished 10 wins to the good in 2004 after going +7 in 2003, putting them on a very short list of teams that overachieved by a large number of games two years in a row.
The two teams that most underperformed their pythagorean records were the Tigers (-7) and the Cubs (-6). That Cubs deficit was enough to deny them a spot in the postseason.
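For readers who want to check these deviations, here is a minimal Python sketch of the calculation. The function names and the choice to round the projection to the nearest whole win are assumptions made for illustration. The run totals come from the league tables later in this article.

```python
def pythagorean_wins(rf, ra, games=162):
    """Projected wins from runs for (rf) and runs allowed (ra)."""
    pct = rf ** 2 / (rf ** 2 + ra ** 2)
    return pct * games


# (team, runs for, runs allowed, actual wins) for 2004
teams = [
    ("Yankees", 897, 808, 101),
    ("Reds", 750, 907, 76),
    ("Tigers", 827, 844, 72),
    ("Cubs", 789, 665, 89),
]


def deviations(rows):
    """Actual wins minus projected wins (rounded) for each team."""
    return {
        name: wins - round(pythagorean_wins(rf, ra))
        for name, rf, ra, wins in rows
    }


for name, diff in deviations(teams).items():
    print(f"{name}: {diff:+d}")
```

Under those assumptions the sketch reproduces the figures quoted above: +12 for the Yankees, +10 for the Reds, -7 for the Tigers, and -6 for the Cubs.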
Converting offensive events into runs
Just as there is a strong relationship between runs and wins, it's almost
always true that the more hits and walks you produce, the more runs you'll
score. Sometimes, of course, a productive team comes up short on the scoreboard
because they didn't hit in the clutch, didn't run the bases well, or hit line
drives right at people in key situations. But this relationship holds
up most of the time.
To shed some light on this relationship, we need a way to take batting
stats and turn them into a measure of overall offensive production. There
are several good options here, including Runs Created (Bill James), Batting
Runs (Pete Palmer), Equivalent Average (Clay Davenport), and OPS (on-base
average plus slugging average).
For this exercise, we'll use the sum of total bases and walks, or TBW
for short. TBW is not a perfect measure, but it does have a few things
going for it. It captures the most important things a team does to produce
runs -- singles, extra-base hits, and walks -- and it's easy to figure without
a computer.
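To make the definition concrete, here is a small Python sketch of the TBW calculation. The function names are ours, and the season line in the example is hypothetical, not taken from any real team.

```python
def total_bases(singles, doubles, triples, home_runs):
    """Total bases: one per single, two per double, and so on."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def tbw(singles, doubles, triples, home_runs, walks):
    """Total bases plus walks."""
    return total_bases(singles, doubles, triples, home_runs) + walks


# Hypothetical team season line, for illustration only
example = tbw(singles=1000, doubles=300, triples=30, home_runs=200, walks=550)
print(example)
```

The same function applied to the batting line a team's pitchers allowed gives the defensive TBW figure.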
As with other statistics, a team's TBW total can be significantly influenced
by its home park. For that reason, we focus on the difference
between the TBW produced by a team's hitters and the TBW allowed by its
pitchers. This effectively removes the park from the equation and helps
us identify teams that outproduced their opponents.
The following table shows the offensive and defensive TBW figures for
the 2004 American League, along with the difference between these two
figures and each team's league rank based on those differences. It also
shows runs for and against, the run differential, and the rankings based
on run differential. Finally, because we're trying to trace a path from
TBW to runs to wins, it lists the team's win total and league rank for
the year.
    -------- TBW ---------   ------- Runs -------    - Wins -
AL    Off   Def  Diff Rank    Off  Def  Diff Rank    Num Rank
NY   3200  2883  +317    2    897  808   +89    3    101    1
Bos  3361  2734  +627    1    949  769  +180    1     98    2
Bal  3004  2961   +43    7    843  830   +13    8     78    9
Tam  2690  2993  -303   12    714  842  -128   13     70   11
Tor  2744  3005  -261   11    719  823  -104   11     67   12
Min  2938  2748  +190    4    786  720   +66   4t     93    3
Chi  3028  3028     0    9    865  831   +34    7     83    7
Cle  3126  3129    -3   10    863  863     0    9     80    8
Det  3044  3007   +37    8    827  844   -17   10     72   10
KC   2662  3207  -545   14    720  905  -185   14     58   14
Ana  2885  2832   +53    6    836  734  +102    2     92    4
Oak  3086  2837  +249    3    793  742   +51    6     91    5
Tex  3064  2973   +91    5    860  794   +66   4t     89    6
Sea  2760  3066  -306   13    698  823  -125   12     63   13
As you can see, the team rankings using TBW and those using run differentials
are similar. The most efficient team was Anaheim, which turned a rather ordinary +53 TBW advantage into the league's second-best run margin.
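One rough way to spot the efficient and inefficient teams is to compare each club's rank on TBW differential with its rank on run differential. The Python sketch below does that for the 2004 AL figures in the table above. Treating the rank gap as an efficiency indicator is a simplification of ours, and the helper names are hypothetical.

```python
# (team, TBW differential, run differential) for the 2004 AL
AL_2004 = [
    ("NY", 317, 89), ("Bos", 627, 180), ("Bal", 43, 13), ("Tam", -303, -128),
    ("Tor", -261, -104), ("Min", 190, 66), ("Chi", 0, 34), ("Cle", -3, 0),
    ("Det", 37, -17), ("KC", -545, -185), ("Ana", 53, 102), ("Oak", 249, 51),
    ("Tex", 91, 66), ("Sea", -306, -125),
]


def rank(values):
    """Competition ranking: 1 is best, and tied values share the better rank."""
    return [1 + sum(other > v for other in values) for v in values]


def rank_gaps(rows):
    """TBW rank minus run rank; positive means better on runs than on TBW."""
    tbw_ranks = rank([row[1] for row in rows])
    run_ranks = rank([row[2] for row in rows])
    return {row[0]: t - r for row, t, r in zip(rows, tbw_ranks, run_ranks)}


gaps = rank_gaps(AL_2004)
most_efficient = max(gaps, key=gaps.get)
least_efficient = min(gaps, key=gaps.get)
print(most_efficient, gaps[most_efficient])
print(least_efficient, gaps[least_efficient])
```

By this measure Anaheim gains four places going from TBW to runs, the largest jump in the league, while Oakland drops three places, the largest fall.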
Boston's TBW differential of +627 was the third best in the past 30
years and a 33-base improvement over their 2003 mark. We already know why they didn't run away with the division. Had the Yankees not found a way to win 12 more games than usual given their run margin, Boston would have coasted to the division title.
At the bottom of the AL East, you can see that the Blue Jays weren't quite as bad as the Devil Rays statistically. Yes, Tampa Bay is improving, but they still have a long way to go, and their 4th-place finish was a bit of an anomaly.
In the AL Central, Minnesota was clearly the best team, but the differences among the next three were quite small. At the TBW level, they were close enough to finish within a game of each other. But Chicago was a little better at turning TBWs into runs, Detroit was a little inefficient, and the Tigers also fell far short of their pythagorean mark. Put another way, you can discount the differences in wins and argue that they're all starting from about the same place as they prepare for 2005.
The AL West was a good example of the importance of efficiency (if you believe these things are the result of team skills) or luck (if you don't). Anaheim was third in the division in TBW differential, barely in positive territory, but was efficient enough on both sides of the ball to turn that advantage into the division's best run margin. Oakland's downfall was an offense that produced only 793 runs, 49 short of the total projected by the Bill James Runs Created formula, thanks in part to a disappointing .239 batting average with runners in scoring position and two out. Otherwise, they might have clinched the division with ten days to spare.
Moving on to the National League:
    -------- TBW ---------   ------- Runs -------    - Wins -
NL    Off   Def  Diff Rank    Off  Def  Diff Rank    Num Rank
Atl  3002  2744  +258    3    803  668  +135    2     96    2
Phi  3144  3022  +122    7    840  781   +59    8     86    8
Flo  2729  2758   -29    9    718  700   +18    9     83    9
NY   2772  2848   -76   10    684  731   -47   10     71   12
Mon  2640  2974  -334   14    633  769  -136   14     67  14t
StL  3101  2648  +453    1    854  657  +197    1    105    1
Hou  2975  2816  +159    5    801  697  +104    4     92    4
Chi  3068  2722  +346    2    789  665  +124    3     89    6
Cin  2904  3312  -408   15    750  907  -157   15     76   10
Pit  2614  2837  -223   12    680  744   -64   11     72   11
Mil  2662  2795  -133   11    634  757  -123   13     67  14t
LA   2881  2748  +133    6    761  684   +77    6     93    3
SF   3134  2916  +218    4    850  770   +80    5     91    5
SD   2872  2814   +58    8    768  703   +65    7     87    7
Col  3104  3347  -243   13    833  923   -90   12     68   13
Ari  2618  3108  -490   16    615  899  -284   16     51   16
St. Louis ran the table in 2004, ranking first in TBW differential,
run differential, and wins, all by a very comfortable margin. The biggest surprise was the Dodgers, who finished third in wins but only sixth in TBW and runs. Their mirror image was the Cubs, who managed to lose the wildcard despite ranking second in TBW and third in runs. Cincinnati once again defied gravity by finishing only five wins short of a .500 season despite being second-last in both TBW and runs.
There's a clear division between the haves and the have-nots in the NL Central, with the top half of the division looking strong and the bottom half looking very weak in all categories. Interestingly, the underlying stats for the Astros were the least impressive among the haves, yet they qualified for the postseason and made an impressive run. The +453 TBW differential for the Cardinals was good enough to put them in the top twenty in the thirty years for which we've got the data to compute these numbers, and while the Reds' mark of -408 wasn't quite bad enough to put them in the bottom twenty, it wasn't far off.
In the NL West, the big story was the Giants. Our preseason simulations identified them as the team most likely to win the division, though they were anything but a juggernaut, averaging only 87 wins in our 100 simulated seasons.
But San Francisco struggled when the real season began, languishing in third place with a 28-28 record through the 6th of June. And that's the good news. With a TBW differential of -49 and a run margin of -35, they were fortunate to have won as many as they had lost. The Dodgers and Padres were three games ahead in the standings.
Over the next four months, the Giants turned on the jets, posting a very impressive TBW differential of +267 and outscoring their opponents by 115 runs. Meanwhile, the Dodgers were playing well, gaining 84 bases and 54 runs, but they were not all that close to the Giants statistically. Yet San Francisco was able to gain only one game on the Dodgers despite outproducing them by almost 200 bases and 61 runs. And that wasn't good enough to bring any postseason action to the Bay Area.
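The split-season figures for the Giants can be tied back to the full-season table with simple arithmetic. The short Python sketch below is only a bookkeeping check; the variable and helper names are ours.

```python
# (TBW differential, run differential), as quoted in the text
giants_through_june_6 = (-49, -35)
giants_after_june_6 = (267, 115)
dodgers_after_june_6 = (84, 54)


def add(a, b):
    """Element-wise sum of two (TBW, runs) pairs."""
    return tuple(x + y for x, y in zip(a, b))


def subtract(a, b):
    """Element-wise difference of two (TBW, runs) pairs."""
    return tuple(x - y for x, y in zip(a, b))


giants_full_season = add(giants_through_june_6, giants_after_june_6)
giants_edge_over_dodgers = subtract(giants_after_june_6, dodgers_after_june_6)

print(giants_full_season)        # full-season TBW and run differentials
print(giants_edge_over_dodgers)  # Giants minus Dodgers after June 6
```

The two halves sum to the +218 TBW and +80 run differentials shown for San Francisco in the NL table, and the gap over the Dodgers after June 6 works out to 183 bases and 61 runs.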
Looking ahead
As we've pointed out, it's unusual for teams that are especially efficient or inefficient to sustain those levels the next year. Instead, they tend to revert to the normal relationships between TBW and runs and between runs and wins. That means we can identify teams that are likely to improve or fall back even if they don't make moves that change their talent level significantly.
For example, the Baltimore Orioles were above average in TBW differential and run margin, but managed only a 78-84 record. History tells us that they're a few games closer to being a contender than their 78-win mark suggests.
Similarly, Oakland was unable to translate its impressive +249 TBW differential, which was in the top 20% of all team seasons since 1974, into a playoff spot. If the A's can make it through the winter without having to shed talent for the sake of their budget, they're likely to go into 2005 as our pick to win the division again, assuming their rivals don't improve in a big way.
One of the bigger spreads between statistical performance and win-loss record belongs to the Tigers. Statistically, they were more like a .500 team than a 72-win club. In our preseason projections article, our Tigers comment mentioned that the Oakland A's improved by 769 net TBW from 1979 to 1980. We were careful not to predict a similar bounce for the Tigers, but we wanted to let readers know that it wasn't unprecedented. As it turned out, the Tigers improved by 706 TBW this year.
What does this mean for the future of the Tigers? Unless 2004 turns out to be a fluke, they could enter 2005 as a legitimate contender for the division title. With a statistical base of .500, it wouldn't take more than a couple of good personnel moves and/or a couple of growth spurts from young players to put them in the hunt.
Historically, there are six other teams that have improved by at least 500 net TBW in one season:
Team | Change | Yr 1 | Yr 2 | Future
Oakland | +769 | 1979 | 1980 | Won 83 games in 1980, led the AL in wins in the strike-shortened 1981 season, but fell back to 68 wins in 1982 and stayed under .500 for five years
Detroit | +706 | 2003 | 2004 | ???
Milwaukee | +679 | 1977 | 1978 | Paul Molitor's rookie season ... began a six-year run when they were among the best teams in baseball ... the '78 Brewers had a TBW differential of +493, much better than the Yankees (+216) and Red Sox (+255), who staged the epic pennant race that culminated in the Bucky Dent homer
Detroit | +674 | 1996 | 1997 | Went from awful in 1996 to OK in 1997 to bad in 1998-1999 to OK in 2000 to awful in 2001-2003
Arizona | +600 | 1998 | 1999 | Won the division in 1999, settled back to 85 wins in 2000, won it all in 2001
San Francisco | +543 | 1992 | 1993 | Barry Bonds' first year with the Giants ... lost division to the Braves by one game despite 103 wins in 1993 ... under .500 in 1994 through 1996
Philadelphia | +508 | 1974 | 1975 | Start of a nine-year run featuring five division titles (six if you count the first half of 1981) and one world title
Obviously, there's no guarantee that the big leap just experienced in Detroit will lead to bigger and better things, but that certainly could happen.
The American League's biggest over-achievers, from an efficiency standpoint, were the Yankees. Their TBW differential was impressive but nowhere near that of the Red Sox, their run margin was nothing special for a winning team, yet their 101 wins led the league. This team has some work to do if it hopes to approach 100 wins again in 2005.
In the National League, there were fewer surprises. For the second year in a row, the Cincinnati Reds were awful on TBW and runs but still managed to finish within hailing distance of .500. That rarely happens two years in a row, and I'll be shocked if it happens again in 2005.
The Cubs were the league's biggest under-achievers, posting the second-best TBW mark (better than the Yankees, for whatever that's worth) and third-best run margin, but missing out on postseason play. They'll be a team to watch next season.
As we noted above, the Dodgers were good but didn't demonstrate the statistical underpinnings of a 93-win team. Still, it's wins that matter and they won the big games when they had to.
Finally, we should mention that Arizona and Seattle nearly broke the 30-year record for the biggest decline in TBW differential from one year to the next. Arizona was down by 672, Seattle by 601. Only the 1998 Florida Firesales (-793) went from penthouse to outhouse at a faster clip.
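The year-to-year changes quoted in this article, combined with the 2004 differentials in the league tables, imply each team's 2003 TBW differential. The Python sketch below backs those out. It assumes the quoted changes and the table figures are measured on the same basis, and the resulting 2003 numbers are derived here rather than reported directly.

```python
# 2004 TBW differentials from the league tables
diff_2004 = {"Det": 37, "Bos": 627, "Ari": -490, "Sea": -306}

# Change from 2003 to 2004 as quoted in the text
change = {"Det": 706, "Bos": 33, "Ari": -672, "Sea": -601}

# Implied 2003 differential = 2004 differential minus the change
diff_2003 = {team: diff_2004[team] - change[team] for team in diff_2004}

for team, value in diff_2003.items():
    print(f"{team}: {value:+d}")
```

On that basis Detroit started from roughly -669, Boston from +594, Arizona from +182, and Seattle from +295.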
Wrapping Up
A lot of things will change between now and opening day. This process
of looking at TBW differentials and run margins doesn't tell us how the
2005 season will unfold, but it can identify some teams that might have
more or less work to do this winter than you may have thought.
Last year, we used this sort of analysis to identify the Royals, who were among 2004's biggest disappointments, as a team
that wasn't as good as their record indicated:
"... the Royals had a winning record despite finishing with the league's 11th best TBW differential, so they could easily fall into the low-to-mid 70s in wins next year."
Two years ago, we mentioned the Cubs as a team likely to get a big bounce in 2003, and that club came within five outs of going to the World Series.
Obviously, this winter will bring a lot of roster changes and injury reports that will also affect the outlook for 2005 in a big way. But I think it's safe to say that the Tigers and Cubs are among the decent-to-good teams most likely to add to their win totals next season. The Mariners, Diamondbacks, and Royals are also in line for efficiency-related bounces, but it's harder to get excited about clubs that are starting from such a position of weakness.