Measuring team efficiency

By Tom Tippett
December 9, 2004

The baseball world seems to be gaining a better understanding of the relationships between (a) wins and runs and (b) runs and the underlying offensive events that produce runs. Maybe that's just wishful thinking on my part, but I think it's true, especially with prominent baseball writers like Rob Neyer, Alan Schwarz, and the Baseball Prospectus crew making frequent references to these ideas.

In a nutshell, you win games by outscoring your opponents, so the connection between runs and wins is very strong, even though every season produces a few teams that win more or less than you'd expect given their run differential. And you score runs by putting together hits, walks, steals, and other offensive events, and you prevent runs by holding the other team to a minimum of those things. In most cases, there's a direct relationship between runs and the underlying events that produce runs.

To explore the relationship between runs and wins, we'll use the pythagorean method that was developed by Bill James. To explore the relationship between offensive events and runs, we'll compare total bases plus walks with runs.

We'll use the term efficiency to represent the ability to turn events into runs and runs into wins. An efficient team is one that produces more wins than expected given its run margin, produces more runs than expected given its offensive events, or allows fewer runs than expected given the hits and walks produced by its opponents.

In the 2002 edition of this article, we showed that teams that are unusually efficient (or inefficient) have exhibited a very strong tendency to revert back to the norm the next year. That's good news for some teams and bad news for others. If you'd like to find out who falls into which category, read on.

Converting runs into wins

The Bill James pythagorean method is a well-established formula based on the idea that a team's winning percentage is tightly coupled with runs scored and runs allowed. This method is gaining currency, as the expanded standings on ESPN.com and baseballprospectus.com include run margins and expected win-loss records derived using formulas of this type.

Bill's formula is quite simple ... take the square of runs scored and divide it by the sum of the squares of runs scored and runs allowed (RF = runs for, RA = runs allowed):

                                RF ** 2
  Projected winning pct =  -----------------
                           RF ** 2 + RA ** 2

The projected records usually land close to the actual ones. In 2004, for instance, 19 of 30 teams finished with win-loss records within three games of their projected records, and 26 of 30 teams finished within five games. That 26 of 30 mark was the same in 2003.

The exceptions are what interest us for this article, and we had a big one this year. The Yankees won 12 more games than normal for a team with a run margin of +89. On a run-margin basis, they were more like an 89-win team than the squad who led the AL with 101 victories. Since 1962, when the 162-game schedule was first used in both leagues, no team has ever been more than 12 games better than their pythagorean projection.

But 43 years of baseball history tells us that such large deviations are unusual and tend not to be repeated the following year. In other words, the Yankees must dramatically improve their run margin in 2005 if they are to come close to matching this year's win total. The same is true of the Reds, who finished 10 wins to the good in 2004 after going +7 in 2003, putting them on a very short list of teams that overachieved by a large number of games two years in a row.

The two teams that most underperformed their pythagorean records were the Tigers (-7) and the Cubs (-6). That Cubs deficit was enough to deny them a spot in the postseason.
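
As a rough check on the figures quoted in this section, the formula can be applied in a few lines of Python. This is only a sketch, not necessarily how the numbers above were produced: the function names are made up for illustration, the exponent of 2 is the one given in the formula, a 162-game schedule is assumed, and the wins and runs are taken from the 2004 tables later in this article.

```python
def pythagorean_pct(runs_for, runs_against, exponent=2):
    """Projected winning percentage from runs scored and runs allowed."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


def wins_vs_projection(wins, runs_for, runs_against, games=162):
    """Actual wins minus projected wins (positive means overachieved)."""
    projected = pythagorean_pct(runs_for, runs_against) * games
    return wins - projected


# (wins, runs for, runs allowed) from the 2004 tables in this article
teams = {
    "Yankees": (101, 897, 808),
    "Reds": (76, 750, 907),
    "Tigers": (72, 827, 844),
    "Cubs": (89, 789, 665),
}

deviations = {name: wins_vs_projection(*row) for name, row in teams.items()}

for name, dev in deviations.items():
    print(f"{name:8s} {dev:+.1f}")
```

Rounded to whole games, these work out to +12, +10, -7, and -6, in line with the figures in the text.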

Converting offensive events into runs

Just as there is a strong relationship between runs and wins, it's almost always true that the more hits and walks you produce, the more runs you'll score. Sometimes, of course, a productive team comes up short on the scoreboard because they didn't hit in the clutch, didn't run the bases well, or hit line drives right at people in key situations. But this relationship holds up most of the time.

To shed some light on this relationship, we need a way to take batting stats and turn them into a measure of overall offensive production. There are several good options here, including Runs Created (Bill James), Batting Runs (Pete Palmer), Equivalent Average (Clay Davenport), and OPS (on-base average plus slugging average).

For this exercise, we'll use the sum of total bases and walks, or TBW for short. TBW is not a perfect measure, but it does have a few things going for it. It captures the most important things a team does to produce runs -- singles, extra-base hits, and walks -- and it's easy to figure without a computer.

As with other statistics, a team's TBW total can be significantly influenced by its home park. For that reason, we focus on the difference between the TBW produced by a team's hitters and the TBW allowed by its pitchers. This effectively removes the park from the equation and helps us identify teams that outproduced their opponents.
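
To make the definition concrete, here is a small Python sketch of the TBW calculation. The function names and the sample batting lines are hypothetical and chosen only for illustration. The one thing taken from the article is that TBW is total bases plus walks, with total bases counted in the usual way (one per single, two per double, and so on).

```python
def total_bases(singles, doubles, triples, homers):
    """Standard total bases: 1 per single, 2 per double, 3 per triple, 4 per homer."""
    return singles + 2 * doubles + 3 * triples + 4 * homers


def tbw(singles, doubles, triples, homers, walks):
    """Total bases plus walks."""
    return total_bases(singles, doubles, triples, homers) + walks


# Hypothetical season lines (not real team data): what a team's hitters
# produced and what its pitchers allowed.
offense = tbw(singles=1000, doubles=300, triples=30, homers=200, walks=600)
defense = tbw(singles=980, doubles=280, triples=25, homers=170, walks=520)

# A positive differential means the team outproduced its opponents.
tbw_diff = offense - defense
print(offense, defense, tbw_diff)
```

Because both totals are piled up in the same mix of parks, subtracting one from the other is what takes most of the park effect out of the comparison.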

The following table shows the offensive and defensive TBW figures for the 2004 American League, along with the difference between these two figures and each team's league rank based on those differences. It also shows runs for and against, the run differential, and the rankings based on run differential. Finally, because we're trying to trace a path from TBW to runs to wins, it lists the team's win total and league rank for the year.

      ---------- TBW ----------   ------- Runs --------   - Wins -
AL     Off    Def   Diff   Rank   Off   Def  Diff  Rank   Num Rank 

NY    3200   2883   +317     2    897   808  + 89    3    101   1
Bos   3361   2734   +627     1    949   769  +180    1     98   2
Bal   3004   2961   + 43     7    843   830  + 13    8     78   9
Tam   2690   2993   -303    12    714   842  -128   13     70  11
Tor   2744   3005   -261    11    719   823  -104   11     67  12

Min   2938   2748   +190     4    786   720  + 66    4t    93   3
Chi   3028   3028      0     9    865   831  + 34    7     83   7
Cle   3126   3129   -  3    10    863   863     0    9     80   8
Det   3044   3007   + 37     8    827   844  - 17   10     72  10
KC    2662   3207   -545    14    720   905  -185   14     58  14

Ana   2885   2832   + 53     6    836   734  +102    2     92   4
Oak   3086   2837   +249     3    793   742  + 51    6     91   5
Tex   3064   2973   + 91     5    860   794  + 66    4t    89   6
Sea   2760   3066   -306    13    698   823  -125   12     63  13

As you can see, the team rankings using TBW and those using run differentials are similar. The most efficient team was Anaheim, which turned a rather ordinary +53 TBW advantage into the league's second-best run margin.
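
One way to put a number on efficiency here is to fit a straight line through the origin relating run differential to TBW differential, then see how far each team sits above or below that line. This is an assumption made for illustration, not the method used in this article, and a fit on one league in one season is only a rough guide. The data are the two Diff columns from the AL table above.

```python
# (TBW differential, run differential) from the 2004 AL table
al_2004 = {
    "NY": (317, 89), "Bos": (627, 180), "Bal": (43, 13),
    "Tam": (-303, -128), "Tor": (-261, -104),
    "Min": (190, 66), "Chi": (0, 34), "Cle": (-3, 0),
    "Det": (37, -17), "KC": (-545, -185),
    "Ana": (53, 102), "Oak": (249, 51), "Tex": (91, 66),
    "Sea": (-306, -125),
}


def fit_slope(pairs):
    """Least-squares slope for y = slope * x (line forced through the origin)."""
    pairs = list(pairs)
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    return sxy / sxx


slope = fit_slope(al_2004.values())

# Residual = actual run differential minus the one implied by TBW differential.
residuals = {
    team: runs - slope * bases for team, (bases, runs) in al_2004.items()
}
most_efficient = max(residuals, key=residuals.get)

print(f"runs per base of TBW differential: {slope:.3f}")
for team in sorted(residuals, key=residuals.get, reverse=True):
    print(f"{team:4s} {residuals[team]:+6.1f}")
```

On these numbers the slope comes out to roughly a third of a run per base, and Anaheim sits furthest above the line, which agrees with the reading of the table given here.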

Boston's TBW differential of +627 was the third best in the past 30 years and a 33-base improvement over their 2003 mark. We already know why they didn't run away with the division. Had the Yankees not found a way to win 12 more games than usual given their run margin, Boston would have coasted to the division title.

At the bottom of the AL East, you can see that the Blue Jays weren't quite as bad as the Devil Rays statistically. Yes, Tampa Bay is improving, but they still have a long way to go, and their 4th-place finish was a bit of an anomaly.

In the AL Central, Minnesota was clearly the best team, but the differences among the next three were quite small. At the TBW level, they were close enough to finish within a game of each other. But Chicago was a little better at turning TBWs into runs, Detroit was a little inefficient, and the Tigers also fell far short of their pythagorean mark. Put another way, you can discount the differences in wins and argue that they're all starting from about the same place as they prepare for 2005.

The AL West was a good example of the importance of efficiency (if you believe these things are the result of team skills) or luck (if you don't). Anaheim was third in the division in TBW differential, barely in positive territory, but was efficient enough on both sides of the ball to turn that advantage into the division's best run margin. Oakland's downfall was an offense that produced only 793 runs, 49 short of the total projected by the Bill James Runs Created formula, thanks in part to a disappointing .239 batting average with runners in scoring position and two out. Otherwise, they might have clinched the division with ten days to spare.

Moving on to the National League:

      ---------- TBW ----------   ------- Runs --------   - Wins -
NL     Off    Def   Diff   Rank   Off   Def  Diff  Rank   Num Rank 

Atl   3002   2744   +258     3    803   668  +135    2     96   2
Phi   3144   3022   +122     7    840   781  + 59    8     86   8
Flo   2729   2758   - 29     9    718   700  + 18    9     83   9
NY    2772   2848   - 76    10    684   731  - 47   10     71  12
Mon   2640   2974   -334    14    633   769  -136   14     67  14t

StL   3101   2648   +453     1    854   657  +197    1    105   1
Hou   2975   2816   +159     5    801   697  +104    4     92   4
Chi   3068   2722   +346     2    789   665  +124    3     89   6
Cin   2904   3312   -408    15    750   907  -157   15     76  10
Pit   2614   2837   -223    12    680   744  - 64   11     72  11
Mil   2662   2795   -133    11    634   757  -123   13     67  14t

LA    2881   2748   +133     6    761   684  + 77    6     93   3
SF    3134   2916   +218     4    850   770  + 80    5     91   5
SD    2872   2814   + 58     8    768   703  + 65    7     87   7
Col   3104   3347   -243    13    833   923  - 90   12     68  13
Ari   2618   3108   -490    16    615   899  -284   16     51  16

St. Louis ran the table in 2004, ranking first in TBW differential, run differential, and wins, all by a very comfortable margin. The biggest surprise was the Dodgers, which finished third in wins but only sixth in TBW and runs. Their mirror-image was the Cubs, who managed to lose the wildcard despite ranking second in TBW and third in runs. Cincinnati once again defied gravity by finishing only five wins short of a .500 season despite being second-last in both TBW and runs.

There's a clear division between the haves and the have-nots in the NL Central, with the top half of the division looking strong and the bottom half looking very weak in all categories. Interestingly, the underlying stats for the Astros were the least impressive among the haves, yet they qualified for the postseason and made an impressive run. The +453 TBW differential for the Cardinals was good enough to put them in the top twenty in the thirty years for which we've got the data to compute these numbers, and while the Reds' mark of -408 wasn't quite bad enough to put them in the bottom twenty, it wasn't far off.

In the NL West, the big story was the Giants. Our preseason simulations identified them as the team most likely to win the division, though they were anything but a juggernaut, averaging only 87 wins in our 100 simulated seasons.

But San Francisco struggled when the real season began, languishing in third place with a 28-28 record through the 6th of June. And that's the good news. With a TBW differential of -49 and a run margin of -35, they were fortunate to have won as many as they had lost. The Dodgers and Padres were three games ahead in the standings.

Over the next four months, the Giants turned on the jets, posting a very impressive TBW differential of +267 and outscoring their opponents by 115 runs. Meanwhile, the Dodgers were playing well, gaining 84 bases and 54 runs, but they were not all that close to the Giants statistically. Even so, San Francisco was able to gain only one game on the Dodgers despite outproducing them by almost 200 bases and 61 runs. And that wasn't good enough to bring any postseason action to the Bay Area.
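
The split-season figures can be checked against the full-season NL table with a little arithmetic. The sketch below assumes the two periods (through June 6 and everything after) cover the whole season. The variable names are illustrative only.

```python
# Giants' (TBW differential, run differential) in each part of the season
sf_through_june6 = (-49, -35)
sf_rest_of_season = (267, 115)

# Dodgers over the same final stretch
la_rest_of_season = (84, 54)

# The two parts should add up to the Giants' full-season line in the NL table.
sf_full = tuple(a + b for a, b in zip(sf_through_june6, sf_rest_of_season))

# How far the Giants outproduced the Dodgers after June 6.
gap = tuple(s - d for s, d in zip(sf_rest_of_season, la_rest_of_season))

print("Giants full season:", sf_full)
print("Giants minus Dodgers after June 6:", gap)
```

The two parts add up to the +218 and +80 shown for San Francisco in the table, and the late-season gap over Los Angeles is 183 bases and 61 runs.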

Looking ahead

As we've pointed out, it's unusual for teams that are especially efficient or inefficient to sustain those levels the next year. Instead, they tend to revert to the normal relationships between TBW and runs and between runs and wins. That means we can identify teams that are likely to improve or fall back even if they don't make moves that change their talent level significantly.

For example, the Baltimore Orioles were above average in TBW differential and run margin, but managed only a 78-84 record. History tells us that they're a few games closer to being a contender than their 78-win mark suggests.

Similarly, Oakland was unable to translate its impressive +249 TBW differential, which was in the top 20% of all team seasons since 1974, into a playoff spot. If the A's can make it through the winter without having to shed talent for the sake of their budget, they're likely to go into 2005 as our pick to win the division again, assuming their rivals don't improve in a big way.

One of the bigger spreads between statistical performance and win-loss record belongs to the Tigers. Statistically, they were more like a .500 team than a 72-win club. In our preseason projections article, our Tigers comment mentioned that the Oakland A's improved by 769 net TBW from 1979 to 1980. We were careful not to predict a similar bounce for the Tigers, but we wanted to let readers know that it wasn't unprecedented. As it turned out, the Tigers improved by 706 TBW this year.

What does this mean for the future of the Tigers? Unless 2004 turns out to be a fluke, they could enter 2005 as a legitimate contender for the division title. With a statistical base of .500, it wouldn't take more than a couple of good personnel moves and/or a couple of growth spurts from young players to put them in the hunt.

Historically, there are six other teams that have improved by at least 500 net TBW in one season:

Team            Gain   Yr 1   Yr 2   Future

Oakland         +769   1979   1980   Won 83 games in 1980, led the AL in wins
                                     in the strike-shortened 1981 season, but
                                     fell back to 68 wins in 1982 and stayed
                                     under .500 for five years

Detroit         +706   2003   2004   ???

Milwaukee       +679   1977   1978   Paul Molitor's rookie season ... began a
                                     six-year run when they were among the
                                     best teams in baseball ... the '78
                                     Brewers had a TBW differential of +493,
                                     much better than the Yankees (+216) and
                                     Red Sox (+255), who staged the epic
                                     pennant race that culminated in the
                                     Bucky Dent homer

Detroit         +674   1996   1997   Went from awful in 1996 to OK in 1997 to
                                     bad in 1998-1999 to OK in 2000 to awful
                                     in 2001-2003

Arizona         +600   1998   1999   Won the division in 1999, settled back
                                     to 85 wins in 2000, won it all in 2001

San Francisco   +543   1992   1993   Barry Bonds' first year with the Giants
                                     ... lost division to the Braves by one
                                     game despite 103 wins in 1993 ... under
                                     .500 in 1994 through 1996

Philadelphia    +508   1974   1975   Start of a nine-year run featuring five
                                     division titles (six if you count the
                                     first half of 1981) and one world title

Obviously, there's no guarantee that the big leap just experienced in Detroit will lead to bigger and better things, but that certainly could happen.

The American League's biggest over-achievers, from an efficiency standpoint, were the Yankees. Their TBW differential was impressive but nowhere near that of the Red Sox, their run margin was nothing special for a winning team, yet their 101 wins led the league. This team has some work to do if it hopes to approach 100 wins again in 2005.

In the National League, there were fewer surprises. For the second year in a row, the Cincinnati Reds were awful on TBW and runs but still managed to finish within hailing distance of .500. That rarely happens two years in a row, and I'll be shocked if it happens again in 2005.

The Cubs were the league's biggest under-achievers, posting the second-best TBW mark (better than the Yankees, for whatever that's worth) and third-best run margin, but missing out on postseason play. They'll be a team to watch next season.

As we noted above, the Dodgers were good but didn't demonstrate the statistical underpinnings of a 93-win team. Still, it's wins that matter and they won the big games when they had to.

Finally, we should mention that Arizona and Seattle nearly broke the 30-year record for the biggest decline in TBW differential from one year to the next. Arizona was down by 672, Seattle by 601. Only the 1998 Florida Firesales (-793) went from penthouse to outhouse at a faster clip.

Wrapping Up

A lot of things will change between now and opening day. This process of looking at TBW differentials and run margins doesn't tell us how the 2005 season will unfold, but it can identify some teams that might have more or less work to do this winter than you may have thought.

Last year, we used this sort of analysis to identify the Royals, who were among 2004's biggest disappointments, as a team that wasn't as good as their record indicated:

"... the Royals had a winning record despite finishing with the league's 11th best TBW differential, so they could easily fall into the low-to-mid 70s in wins next year."

Two years ago, we mentioned the Cubs as a team likely to get a big bounce in 2003, and that club came within five outs of going to the World Series.

Obviously, this winter will bring a lot of roster changes and injury reports that will also affect the outlook for 2005 in a big way. But I think it's safe to say that the Tigers and Cubs are among the decent-to-good teams most likely to add to their win totals next season. The Mariners, Diamondbacks, and Royals are also in line for efficiency-related bounces, but it's harder to get excited about clubs that are starting from such a position of weakness.