The Diamond Mind Projection System

Written by: Tom Tippett
Last updated: February 24, 2005

In the 1985 Baseball Abstract, Bill James introduced two very important tools for predicting the future performance of baseball players. The first was the Major League Equivalency, or MLE, which demonstrated that it was possible to use AAA stats to predict how a player is most likely to perform at the major-league level. The second was the detailed description of his Brock2 system for projecting future performance based on past performance and the aging process (improving in the early years, peaking in the middle, and declining thereafter).

Since then, many others have taken these ideas and implemented projection systems of their own, most often for the purpose of helping fantasy baseball players prepare for their fantasy drafts. You can now buy projections from fantasy advisor services and you can find them online and in many books and periodicals.

When we began our work on our Projection Disks in 1998, we had every intention of licensing projected stats from a company that was already in the business of preparing them. There didn't seem to be much point in reinventing the wheel when we could instead focus on the game software, the team rosters, and manager profiles. But we quickly learned that these projections didn't meet the special needs of a full-season simulation, for a variety of reasons:

  • we needed projections for over 1500 players, including many who had yet to make their major-league debuts, and most of the other sources didn't cover that many

  • we wanted to use stats from the majors, Japan, AAA, AA, and A ball to make sure we had as much playing time as possible on which to base our projections. Some projection systems ignore minor-league stats or go down only as far as AAA.

  • to support a full-blown simulation, we needed to project many more statistical categories than the others provided

  • we needed to project left/right splits as well as overall totals for all batters and pitchers

So we built our own projection system. Actually, we expanded on a system that we originally developed in 1994. When that season came to a premature end, the Topps baseball card company hired us to simulate the missing games so they could produce CyberCards with full-season stats (real-life stats through August 11 and simulated stats for the rest of the season). To their credit, they wanted to include prominent minor-leaguers (such as Derek Jeter) who would have been called up in September had the season continued. So we developed a method for projecting major-league performance from minor-league statistics.

The Diamond Mind projection system is based on Bill James' MLE and aging ideas, though it uses different and more advanced formulas than those in the 1985 Baseball Abstract. We can do better because of the explosion in available data at both the major- and minor-league levels. Here are the key elements in our system:

  • we use both minor-league and major-league statistics from the past three seasons, ensuring that virtually all players have a large amount of playing time on which to base the projections

  • we use statistics from AAA, AA and A ball, plus the Japanese leagues, adjusting them all to their major-league equivalents. Bill James' published formulas cover AAA adjustments only, so we created our own adjustments for lower levels and for Japan.

  • all stat lines are evaluated with respect to league averages. This does two important things. First, it makes sure that stats from hitter-friendly leagues are suitably deflated. Second, it ensures that pitchers who faced the DH and those who didn't are evaluated properly. The DH adds roughly a third of a run per game, and if this isn't taken into account, NL pitchers will be rated better than their AL counterparts of equal ability.

  • all stat lines are adjusted for ballpark effects, including the minor-league parks. The published Bill James methods do not take minor-league park factors into account because those factors simply weren't available at the time.

  • recent performances are weighted more heavily. What a player did last year is more important than what he did three years ago.

  • performances at higher levels are weighted more heavily. What a player did in the majors is much more important than what he did in AA.

  • stat lines with more playing time are weighted more heavily. If someone batted .375 in 24 at-bats, that doesn't matter nearly as much as what he did in 400 at-bats at some other stop along the way.

  • the individual league- and park-adjusted stat lines are averaged (using the weights just discussed), then age-adjusted to produce a set of projected stats that are league- and park-neutral

  • adjustments are made to account for players whose stats are misleading because they were playing with an injury

  • additional adjustments are made for pitchers with unusually good or bad rates of hits allowed on batted balls that stay in the park, because recent research has shown that extreme in-play batting averages tend not to be repeated in the future

  • these neutral projections are then applied to the league and park in which the player will compete in the coming season

  • projected left/right splits are based on each player's composite splits for the past three seasons

  • the distribution of fly balls and ground balls is based on actual ratios compiled by each player in the past three seasons
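
To make the weighting ideas in this list concrete, here is a minimal sketch in Python. It is not the actual Diamond Mind formula: the recency and level weights, the `StatLine` fields, and the function names are all hypothetical, and it handles batting average only. It also skips the level translation (the MLE step) and the age adjustment, and it assumes `park_factor` is a single multiplier that already reflects the player's mix of home and road games.

```python
from dataclasses import dataclass

# Placeholder weights chosen for illustration only (assumptions, not
# values taken from the Diamond Mind system).
RECENCY_WEIGHT = {0: 1.0, 1: 0.7, 2: 0.5}  # seasons ago -> weight
LEVEL_WEIGHT = {"MLB": 1.0, "AAA": 0.8, "AA": 0.6, "A": 0.4}


@dataclass
class StatLine:
    seasons_ago: int    # 0 = most recent season
    level: str          # "MLB", "AAA", "AA" or "A"
    at_bats: int
    hits: int
    league_avg: float   # league batting average for that season and level
    park_factor: float  # above 1.0 means the park inflated batting average


def neutral_index(line: StatLine) -> float:
    """Batting average relative to league average, with the park effect removed."""
    raw_avg = line.hits / line.at_bats
    return raw_avg / line.league_avg / line.park_factor


def line_weight(line: StatLine) -> float:
    """Combine recency, level, and playing time into one weight."""
    return (RECENCY_WEIGHT[line.seasons_ago]
            * LEVEL_WEIGHT[line.level]
            * line.at_bats)


def blended_index(lines: list[StatLine]) -> float:
    """Weighted average of the league- and park-neutral indexes."""
    total_weight = sum(line_weight(line) for line in lines)
    weighted_sum = sum(line_weight(line) * neutral_index(line) for line in lines)
    return weighted_sum / total_weight


# A .375 average in 24 AA at-bats barely moves a 400 at-bat major-league line.
big_line = StatLine(0, "MLB", 400, 110, 0.265, 1.0)
small_line = StatLine(1, "AA", 24, 9, 0.260, 1.0)
blend = blended_index([big_line, small_line])
```

In this sketch the blended index ends up much closer to the 400 at-bat line than to the 24 at-bat line, which is the behavior the playing-time, level, and recency bullets describe. Multiplying the index by the batting average of the target league and park would correspond to the step of applying the neutral projection to next season's environment.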

That's the essence of the system. The other projection systems we looked at make some of these adjustments, but we're not aware of any that make them all. And we think it's necessary to make them all in order to evaluate past performance correctly and to support a realistic simulation of a season.
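
The article does not say how the in-play adjustment for pitchers is computed. One common way to express the idea that extreme in-play averages tend not to repeat is to shrink the observed rate toward the league rate. The sketch below does that; the function name and the `stabilizer` value are assumptions made for illustration, not part of the Diamond Mind system.

```python
def regressed_in_play_average(hits_in_play: int,
                              balls_in_play: int,
                              league_rate: float,
                              stabilizer: float = 2000.0) -> float:
    """Shrink a pitcher's in-play batting average toward the league rate.

    `stabilizer` is a hypothetical number of league-average balls in play
    mixed in with the pitcher's own record; the larger it is, the more the
    result is pulled toward `league_rate`.
    """
    return (hits_in_play + league_rate * stabilizer) / (balls_in_play + stabilizer)


# A pitcher who allowed a .260 average on 500 balls in play, in a .290 league.
adjusted = regressed_in_play_average(130, 500, 0.290)
```

With these made-up inputs the adjusted rate lands between the pitcher's observed .260 and the league's .290, and closer to the league figure, so an unusually good or bad season on balls in play carries less weight in the projection.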