2002 Predictions -- Keeping Score 
By Tom Tippett
November 25, 2002
In the spring of 1998, we released our first annual Projection Disk,
enabling you to play the coming season using Diamond Mind Baseball and
over 1500 established big leaguers and top minor-league prospects.
Players on our Projection Disks are rated to perform in accordance with
stats that we create using our projection system.
That system produces expected statistics for all batters and pitchers
based on a blend of major-league and minor-league stats from the past
three years, adjusted for factors such as the level of competition (majors,
AAA, AA), ballpark effects (including minor-league parks), league rules
(DH vs non-DH), and age (young players are projected to improve while
older players are projected to fade a little each year).
After the projected stats have been computed and the player ratings assigned,
we set up a manager profile for each team with the starting rotation,
bullpen assignments, lineups versus left- and right-handed pitchers, and
roles for bench players (such as platoons, spot starters, and defensive
replacements) based on our best assessment of how the players will be used
in the coming season.
We then simulate the season many times, average the results to come up
with our projected final standings for the season, and write an article
describing the results and commenting on the outlook for every team.
That first year, we were curious to see how our projected standings would
stack up against the pre-season predictions in leading magazines such
as Sports Illustrated, The Sporting News, Street and
Smith, and USA Today Baseball Weekly.
Assigning scores
To do those rankings, we needed a way to assign an accuracy score to
each prediction. So we turned to our friend Pete Palmer, the co-author
of Total Baseball and The Hidden Game of Baseball. Pete
has been projecting team standings for more than 25 years, and he routinely
collects predictions and ranks them at the end of the year.
Pete's rankings are based on a simple scoring system -- subtract each
team's actual placement from its projected placement, square the difference,
and add up the squares for all the teams. For example, if you predict a team
will finish fourth and they finish second, that's a difference of two
places. Square the difference, and you get four points. Do this for every
team and you get a total score. The lower the score, the more accurate
your predictions.
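For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of the scoring rule just described. The function name accuracy_score and the four-team division are made up for illustration; this is not Pete's own code.

```python
def accuracy_score(predicted, actual):
    """Sum of squared differences between predicted and actual placements.

    Both arguments map a team name to its place in the standings.
    Lower totals mean more accurate predictions.
    """
    return sum((predicted[team] - actual[team]) ** 2 for team in predicted)


# Hypothetical four-team division, for illustration only.
predicted = {"A": 1, "B": 2, "C": 3, "D": 4}
actual = {"A": 1, "B": 4, "C": 3, "D": 2}

# Teams B and D are each off by two places: 0 + 4 + 0 + 4
print(accuracy_score(predicted, actual))  # 8
```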
We don't try to break ties and name one team as having finished ahead
of the other. If, for example, two teams tie for first, we say that each
team finished in 1.5th place for the purposes of figuring out how many
places a prediction was off. If a team was projected to finish third and
they tied for first instead, that's a difference of 1.5 places. The square
of 1.5 is 2.25, so that would be the point total for this team. That's
why you'll see some fractional scores in the tables below.
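The fractional-place convention can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name placements_with_ties and the win totals are hypothetical, and ties are detected simply by equal win totals.

```python
def placements_with_ties(wins):
    """Map each team to its finishing place, averaging the places of tied teams."""
    ordered = sorted(wins, key=wins.get, reverse=True)
    places = {}
    i = 0
    while i < len(ordered):
        # Find the run of teams tied with the team at position i.
        j = i
        while j < len(ordered) and wins[ordered[j]] == wins[ordered[i]]:
            j += 1
        # Places i+1 through j are shared; each tied team gets their average.
        shared = (i + 1 + j) / 2
        for team in ordered[i:j]:
            places[team] = shared
        i = j
    return places


# Hypothetical division in which A and B tie for first.
wins = {"A": 95, "B": 95, "C": 80, "D": 70}
places = placements_with_ties(wins)

print(places["A"])             # 1.5
print((3 - places["A"]) ** 2)  # 2.25 points if A was predicted to finish third
```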
Keeping things in perspective
That first year, we created a little database with our projected standings
and those of fourteen national publications, and we were pleased to see
that we ended the year with the best accuracy score among those fifteen
forecasts as measured by Pete's formula. When we wrote up the results
and posted them to our web site, however, we were very careful not to
make any grand claims, saying:
"I'm not sure what to make of all this. It's just
one year, and it's entirely possible that we were just lucky. Time will
tell whether our approach to projecting seasons is consistently better
than average. But it sure is fun to make those predictions and then
take a look at them later, so we'll keep doing it."
Over time, we expanded our database to include the predictions of prominent
baseball writers from major newspapers and those of several ESPN.com staffers.
In the sections below, we'll show you how these prognosticators ranked
in 2002 and over a period of years, with the period varying in length
depending on when we added that forecaster to our database. We don't make
any claims of completeness here -- there are lots of other predictions
that are not in our database -- but I think you'll find that our sample
is an interesting one.
For several reasons, I want to emphasize that it's important that nobody
take these rankings too seriously.
First, this isn't the only scoring system one could use to rank these
projections, of course. The rankings
at ESPN.com use the same approach but don't square the differences.
A fellow named Gerry Hamilton runs a predictions contest every year (see
http://www.tidepool.com/~ggh1/index.html)
and assigns a score based on how many games each team finished out of
their predicted place in the standings.
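To see why the choice of scoring system matters, here is a small Python sketch comparing squared differences with unsquared (absolute) differences. The two sets of picks are invented for illustration, and the function names are hypothetical.

```python
def squared_score(predicted, actual):
    """Sum of squared placement differences."""
    return sum((predicted[t] - actual[t]) ** 2 for t in predicted)


def absolute_score(predicted, actual):
    """Sum of placement differences without squaring."""
    return sum(abs(predicted[t] - actual[t]) for t in predicted)


# Hypothetical five-team division.
actual = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

# Four teams each missed by two places.
several_small_misses = {"A": 3, "B": 4, "C": 1, "D": 2, "E": 5}
# First and last place swapped, everything else exactly right.
two_big_misses = {"A": 5, "B": 2, "C": 3, "D": 4, "E": 1}

for picks in (several_small_misses, two_big_misses):
    print(absolute_score(picks, actual), squared_score(picks, actual))
# 8 16
# 8 32
```

Without squaring, the two sets of picks look equally good. Squaring the differences penalizes the large misses more heavily.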
Second, it's not entirely fair to put all of these predictions into the
same group. Because of publishing deadlines, the predictions in the spring
baseball magazines are made long before spring training starts. Others
(including ours) are usually prepared in early-to-mid March, and some
are published just before opening day. Obviously, the later you do them,
the more information you have on player movement and injuries.
Third, many newspaper editors ask staff writers to make predictions so
their readers have something to chew on for a couple of days. Some writers
hate doing them but comply because their editors insist. Others may make
off-the-wall picks just for grins. We don't have a reliable way to decide
which are serious, so we include them all. But we do ask you to remember
that some of these predictions may have been made in jest.
Finally, our projections are based on the average of many simulated seasons.
That means that the normal ups and downs of a single season are smoothed
out. For example, in 2002, we had the A's winning the AL West with 96
wins, but their win total in any one simulated season could have been
much higher or lower, and they didn't always finish first. In two of the
fifty seasons that we ran last March, Anaheim won that division even though
the overall averages put them in last place.
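The effect of averaging can be illustrated with a short Python sketch. It does not run a simulation; it only summarizes a handful of made-up season results. The team labels, win totals, and the function name summarize are all hypothetical.

```python
from collections import Counter
from statistics import mean


def summarize(seasons):
    """Return (average wins per team, number of first-place finishes per team)."""
    teams = seasons[0].keys()
    avg_wins = {t: mean(s[t] for s in seasons) for t in teams}
    # Simplification: if two teams tie for most wins, the first one listed is counted.
    titles = Counter(max(s, key=s.get) for s in seasons)
    return avg_wins, titles


# Made-up win totals for a three-team division over four simulated seasons.
seasons = [
    {"X": 98, "Y": 90, "Z": 75},
    {"X": 95, "Y": 92, "Z": 80},
    {"X": 88, "Y": 84, "Z": 91},
    {"X": 99, "Y": 94, "Z": 74},
]

avg_wins, titles = summarize(seasons)
for team in ("X", "Y", "Z"):
    print(f"{team}: {avg_wins[team]:.1f} average wins, {titles[team]} title(s)")
```

In this toy data, team Z has the lowest average but still finishes first in one of the four seasons, which is the same kind of outcome described above for Anaheim.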
Rankings for 2002
It's interesting to see how everyone did this year, of course, and those
rankings are shown in the first table below.
More interesting, at least to my mind, is the process of looking back
at those predictions and identifying the patterns that emerge. Which teams
were consistently under- or over-estimated? Which divisions contained
the biggest surprises? Did anyone predict that certain teams would have
a sudden change of fortune? So we'll follow the table with a brief analysis
of the six division races.
Forecaster                           Score
Tony DeMarco, MSNBC.com              34
Athlon                               38
Bob Hohler, Boston Globe             38
Peter Gammons, ESPN                  38
Diamond Mind simulations             40
Baseball Weekly                      42
Brandon Funston, ESPN.com            42
Eric Karabell, ESPN.com              42
Jayson Stark, ESPN                   42
Lindy's                              42
Danny Sheridan, USA Today            44
Los Angeles Times                    44
Zack Scott, Diamond Mind             44
Dallas Morning News                  46
Las Vegas over-under line            46
2001 final standings                 48
Andy Latack, ESPN.com                48
Baseball America                     48
Chicago Tribune                      48
David Schoenfield, ESPN.com          48
Rob Neyer, ESPN.com                  48
Sean McAdam, ESPN.com                48
Sports Illustrated                   48
Tim Kurkjian, ESPN.com               48
Pete Palmer                          50
San Francisco Chronicle              50
Alan Schwarz, ESPN.com               54
The Sporting News (spring magazine)  54
Gordon Edes, Boston Globe            54
Matt Szefc, ESPN.com                 54
Phil Rogers, ESPN.com                56
Spring Training Yearbook             56
Bob Ryan, Boston Globe               58
USA Today                            58
Jim Caple, ESPN.com                  60
Steve Mann                           60
Bob Klapisch, ESPN.com               62
Joe Sheehan, Baseball Prospectus     62
David Lipman, ESPN.com               66
Dan Shaughnessy, Boston Globe        70
Street & Smith                       70
John Sickels, ESPN.com               74
Michael Holley, Boston Globe         76
Rany Jazayerli, Baseball Prospectus  78
Spring training results              86
The "Diamond Mind simulations" entry is the one representing
the average result of simulating the season 50 times. These simulations
were done about three weeks before the season started. Two weeks later,
Zack Scott of Diamond Mind made his own predictions. These were based
largely on the simulation results but also took his own hunches into account.
There are three entries in this list that don't represent the views of
a writer or a publication. If you predicted that the 2002 standings would
be the same as in 2001, your score would have been 48. If you put together
a set of standings based on the Las Vegas over-under line, you'd have
scored 46. And if you predicted that the regular season standings would
match the 2002 spring training standings, your score would have been 86.
In other words, for the second year in a row, the spring training results
were almost useless as a predictor of the real season.
Reviewing the divisions
Much more interesting than the overall scores, in my opinion, are the
details. Leaving out the entries that don't represent writers or publications,
here are some observations about how the others saw things last spring:
AL East. The teams have finished in the same order five years
in a row. Michael Holley of the Boston Globe was the only one to pick
Boston ahead of New York. Three people from ESPN.com (David Lipman, Eric
Karabell, and John Sickels) put Toronto in second. Holley and Dan Shaughnessy
of the Globe were alone in having Baltimore third, a pick that looked
mighty good before the Orioles collapsed in the last six weeks. Everybody
else had Baltimore and Tampa Bay finishing fourth and fifth, with the
O's getting the nod for fourth a little more often than not.
AL Central. The White Sox were the consensus pick as division
winner, but not by a lot. Twenty-four predictions put the pale hose in
first, fourteen correctly picked the Twins, and four thought Cleveland
would hold on for one more year. Two intrepid souls (Jim Caple and Steve
Mann) were brave enough to pick the Tigers for third, but everybody else
had KC and Detroit bringing up the rear, with Detroit picked fourth about
twice as often as KC. Three forecasters (Bob Hohler, John Sickels, and
The Dallas Morning News) got the division right from top to bottom.
AL West. It won't come as a shock when I say that most people
gave the division to Seattle after the Mariners' record-setting 2001 season.
But it was far from unanimous, and about a quarter of them gave the division
to Oakland, while Michael Holley picked the Rangers to finish first. Only
three (The Sporting News, Gordon Edes, and Jayson Stark) picked Anaheim
as high as second, and all three of them had Seattle first and Oakland
third. Twenty-seven predictions put Anaheim in the basement. Needless
to say, nobody was exactly right on this division.
NL East. This division was the undoing of many a predictor this
year. Eleven predictors thought the Mets would end Atlanta's run at the
top. Twenty-two more put them second, five had them third, two (Diamond
Mind and Steve Mann) ranked them fourth, and one (Joe Sheehan) had them
in the basement. By picking Florida first, Steve Mann was the only one
to have a team other than Atlanta or New York atop the division. Montreal
was picked last by everyone except Joe Sheehan and Pete Palmer, and neither
of them had the Expos higher than fourth. Another division that nobody
got right.
NL Central. For those who finished near the bottom of the rankings,
if the Mets didn't get them, the Cubs did. All but three submissions had
either St. Louis (27 times) or Houston (12 times) on top, but three (Dan
Shaughnessy, Bob Ryan, and John Sickels) picked the Cubs. Ten others put
Chicago second, with only two (Tony DeMarco and David Schoenfield) putting
them as low as fourth. It seems surprising now, but nine predictions saw
Milwaukee matching its fourth-place finish from a year earlier. Most, however,
had Cincinnati rebounding to take that fourth spot. Pittsburgh was picked
last on almost every list, but a few had them fifth, and the LA Times
was the only one to correctly pick them fourth. Cincinnati was picked
last on the only four submissions that didn't have Pittsburgh or Milwaukee
in that slot. Nobody picked this division correctly from top to bottom.
By the way, when we ran our first batch of simulations back in March,
Cincinnati finished one game ahead of Chicago. We had a few more minor
adjustments to make, and while we were working on those adjustments, I
was very worried that we'd end up picking the Cubs fourth and then spend
six months watching them win the division. Now we can see that those early
results would have made us look very smart. At the time, however, I was
very relieved to see the Cubs edge Cincinnati by one game when we ran
our fifty seasons for real.
NL West. For the second year in a row, we were among the most
optimistic about the Rockies' chances. Seven predictions put the Rockies
in third and everyone else had them lower than that. And Diamond Mind?
In our simulations, they averaged 85 wins and edged Arizona by one game
for second place. We were among the 25 forecasters who picked the Giants
to win it, while only 15 saw the defending world champion Diamondbacks
repeating atop the division. Two (Joe Sheehan and Rany Jazayerli) picked
the Padres. Most predictions had the Dodgers finishing either third or
fourth, but there were exceptions, as they were listed second three times
(Bob Ryan, Sean McAdam, Phil Rogers) and in the basement twice (Rob Neyer
and Rany Jazayerli). This division was picked correctly from top to bottom
six times, by Tony DeMarco, Street and Smith, Athlon, Lindy's, Eric Karabell,
and Steve Mann.
So, we had one division that was dead simple (24 predictions nailed the
AL East), three that nobody got right, one with three correct predictions,
and one with six. Based on the past five years, that seems about par
for the course. The baseball world looked a bit different back in March,
and a lot can change in six months.
Five-year rankings
Looking back over the past five years, here are the rankings for those
who were included in our sample every year. Disappearing are Baseball
Digest and Mazeroski magazines, neither of which was published in 2002.
In what is most likely just a coincidence, those two were at the bottom
of last year's four-year rankings:
Forecaster                     2002   2001   2000   1999   1998  Total
Diamond Mind simulations       40.0   54.5   68.0   42.0   44.5  249.0
Steve Mann                     60.0   38.5   58.0   54.0   44.0  254.5
Sports Illustrated             48.0   56.5   40.0   56.0   54.0  254.5
Baseball Weekly                42.0   46.5   58.0   51.5   60.0  258.0
Las Vegas over-under line      46.0   65.5   51.5   48.0   52.0  263.0
Pete Palmer                    50.0   70.5   54.0   40.0   58.0  272.5
Sporting News                  54.0   52.5   38.0   78.0   54.0  276.5
Athlon                         38.0   67.5   42.0   72.0   72.0  291.5
Street & Smith                 70.0   68.5   58.0   68.0   64.0  328.5
Previous season standings      48.0   64.5   56.0   70.0  100.0  338.5
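Totals like those in the table above can be tallied with a short Python sketch. The yearly scores below are copied from the tables in this article; the function name multi_year_totals and the dictionary layout are just one assumed way to organize them.

```python
YEARS = [2002, 2001, 2000, 1999, 1998]

# A few rows from this article's tables; Lindy's has scores only for 2001-2002.
scores = {
    "Diamond Mind simulations": {2002: 40.0, 2001: 54.5, 2000: 68.0, 1999: 42.0, 1998: 44.5},
    "Steve Mann": {2002: 60.0, 2001: 38.5, 2000: 58.0, 1999: 54.0, 1998: 44.0},
    "Sports Illustrated": {2002: 48.0, 2001: 56.5, 2000: 40.0, 1999: 56.0, 1998: 54.0},
    "Lindy's": {2002: 42.0, 2001: 36.5},
}


def multi_year_totals(scores, years):
    """Total the yearly scores, keeping only forecasters present in every year."""
    return {
        name: sum(by_year[y] for y in years)
        for name, by_year in scores.items()
        if all(y in by_year for y in years)
    }


five_year = multi_year_totals(scores, YEARS)
# Lower totals rank higher, so sort in ascending order.
for name, total in sorted(five_year.items(), key=lambda item: item[1]):
    print(f"{name:<28}{total:>7.1f}")
```

Passing a shorter list of years, such as [2002, 2001], lets forecasters with a shorter history (Lindy's in this sample) enter the ranking.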
Four-year rankings
In 1999, we added some writers from the Boston Globe and ESPN.com, so
the four-year totals include a few more names than did the previous table:
Forecaster                     2002   2001   2000   1999  Total
Gordon Edes, Boston Globe      54.0   56.5   26.0   28.0  164.5
Baseball Weekly                42.0   46.5   58.0   51.5  198.0
David Schoenfield, ESPN.com    48.0   56.5   56.0   40.0  200.5
Sports Illustrated             48.0   56.5   40.0   56.0  200.5
Diamond Mind simulations       40.0   54.5   68.0   42.0  204.5
Rob Neyer, ESPN.com            48.0   66.5   48.0   44.0  206.5
Peter Gammons, ESPN.com        38.0   56.5   48.0   66.0  208.5
Steve Mann                     60.0   38.5   58.0   54.0  210.5
Las Vegas over-under line      46.0   65.5   51.5   48.0  211.0
Pete Palmer                    50.0   70.5   54.0   40.0  214.5
Rany Jazayerli, BP             78.0   62.5   46.0   30.0  216.5
Athlon                         38.0   67.5   42.0   72.0  219.5
Sporting News                  54.0   52.5   38.0   78.0  222.5
Baseball America               48.0   54.5   54.0   70.0  226.5
Dan Shaughnessy, Globe         70.0   44.5   54.0   58.0  226.5
Previous year standings        48.0   64.5   56.0   70.0  238.5
Bob Ryan, Boston Globe         58.0   84.5   58.0   40.0  240.5
John Sickels, ESPN.com         74.0   68.5   58.0   58.0  258.5
Bob Klapisch, ESPN.com         62.0   57.5   78.0   62.0  259.5
Street & Smith                 70.0   68.5   58.0   68.0  264.5
Three-year rankings
The Diamond Mind simulations missed the mark by quite a bit in 2000,
so they rank lower here than in any of the other tables. We added a new
concept to our projection system that year, but we were very unhappy with
the results, and we took that out of the model before doing this again
in 2001. The results have been much better since.
Forecaster                     2002   2001   2000  Total
Sean McAdam, ESPN.com          48.0   32.5   38.0  118.5
Gordon Edes, Boston Globe      54.0   56.5   26.0  136.5
Peter Gammons, ESPN.com        38.0   56.5   48.0  142.5
Sporting News                  54.0   52.5   38.0  144.5
Sports Illustrated             48.0   56.5   40.0  144.5
Baseball Weekly                42.0   46.5   58.0  146.5
Athlon                         38.0   67.5   42.0  147.5
Baseball America               48.0   54.5   54.0  156.5
Steve Mann                     60.0   38.5   58.0  156.5
David Schoenfield, ESPN.com    48.0   56.5   56.0  160.5
Diamond Mind simulations       40.0   54.5   68.0  162.5
Rob Neyer, ESPN.com            48.0   66.5   48.0  162.5
Las Vegas over-under line      46.0   65.5   51.5  163.0
Dan Shaughnessy, Globe         70.0   44.5   54.0  168.5
Previous year standings        48.0   64.5   56.0  168.5
Pete Palmer                    50.0   70.5   54.0  174.5
Phil Rogers, ESPN.com          56.0   62.5   56.0  174.5
Matt Szefc, ESPN.com           54.0   56.5   68.0  178.5
Rany Jazayerli, BP             78.0   62.5   46.0  186.5
Street & Smith                 70.0   68.5   58.0  196.5
Bob Klapisch, ESPN.com         62.0   57.5   78.0  197.5
John Sickels, ESPN.com         74.0   68.5   58.0  200.5
Bob Ryan, Boston Globe         58.0   84.5   58.0  200.5
Two-year rankings
Finally, here's how things have looked in 2001-2002. As you can see,
Lindy's has had two very good years in a row and has to be regarded as
the top dog for the time being. Sean McAdam slipped a bit in 2002 after
two extremely good years, but is still right at the top. This doesn't
surprise me a bit; I have the good fortune of hearing Sean regularly on
Boston's main sports radio station, and he really knows his stuff.
Forecaster                     2002   2001  Total
Lindy's                        42.0   36.5   78.5
Sean McAdam, ESPN.com          48.0   32.5   80.5
SF Chronicle                   50.0   36.5   86.5
Baseball Weekly                42.0   46.5   88.5
Jayson Stark, ESPN.com         42.0   46.5   88.5
Diamond Mind simulations       40.0   54.5   94.5
Peter Gammons, ESPN.com        38.0   56.5   94.5
Steve Mann                     60.0   38.5   98.5
Tony DeMarco, MSNBC.com        34.0   67.5  101.5
Baseball America               48.0   54.5  102.5
Zack Scott, Diamond Mind       44.0   58.5  102.5
David Schoenfield, ESPN.com    48.0   56.5  104.5
Sports Illustrated             48.0   56.5  104.5
Athlon                         38.0   67.5  105.5
Sporting News                  54.0   52.5  106.5
Chicago Tribune                48.0   62.5  110.5
Gordon Edes, Boston Globe      54.0   56.5  110.5
Matt Szefc, ESPN.com           54.0   56.5  110.5
Las Vegas over-under line      46.0   65.5  111.5
Previous year standings        48.0   64.5  112.5
Rob Neyer, ESPN.com            48.0   66.5  114.5
Dan Shaughnessy, Globe         70.0   44.5  114.5
Los Angeles Times              44.0   73.5  117.5
Phil Rogers, ESPN.com          56.0   62.5  118.5
Bob Klapisch, ESPN.com         62.0   57.5  119.5
Pete Palmer                    50.0   70.5  120.5
Alan Schwarz, ESPN.com         54.0   70.5  124.5
David Lipman, ESPN.com         66.0   64.5  130.5
Street & Smith                 70.0   68.5  138.5
Rany Jazayerli, BP             78.0   62.5  140.5
John Sickels, ESPN.com         74.0   68.5  142.5
Bob Ryan, Boston Globe         58.0   84.5  142.5
Conclusions
Except for the 2000 season, our approach to developing projections seems
to be providing good results.
If there's any one thing that stands out, it's the system's ability to
identify over-rated teams. In 2002, for example, our simulations indicated
that (a) the Mets would have real trouble scoring runs even with the addition
of guys like Mo Vaughn and Roberto Alomar, (b) the Seattle offense would
come back to earth after a terrific 2001 season, and (c) the Cubs were
likely to be battling the Reds for third and fourth place, not challenging
for the division title.
On the other hand, we didn't anticipate the sudden emergence of some
of the game's best bullpens. (I'm not sure anyone else saw this coming,
either.) In our simulations, the relievers on the Angels, Twins, and Braves
were nowhere near as good as they were in the real 2002 season.
And if there's a theme from the past five years, it's that our simulations
occasionally project a team with great hitting and barely acceptable pitching
to do very well, and it seems as if their real-life counterparts often
blow a bunch of leads early in the year and then fall apart. The Rangers
were this year's example, though a very tough division and key injuries
on both sides of the ball were major factors as well.
I wish we were better at projecting which young teams will continue to
get better, as the Twins did this year, and which will backslide unexpectedly.
Before the season, I wouldn't have been surprised to see the Marlins outperform
the Twins, but it didn't work out that way.
Still, it's a lot of fun and a highly educational process for us. When
we run the simulations in the spring, we always end up learning something.
We always end up being surprised by some of the results. And we always
end up with a bunch of things to watch for as the real season unfolds.
So we'll take another run at it next spring and we'll report back after
the season.