Diamond Mind Email Newsletter
December 13, 2002
Written by Tom Tippett
Welcome to the sixth edition of the Diamond Mind email newsletter for
the year 2002. Through these newsletters, we will try to keep you up to
date on the latest product and technical information about the Diamond
Mind Baseball game, related player disks, and our ongoing baseball research
efforts. Back issues are available on our web
site.
Topics for this issue:
2002 Season Disk now shipping
2002 Season Disk update
Tips for using the 2002 Season Disk
Several new articles available
Leadoff walks
How age affects our view of the game
2002 Season Disk now shipping
We're happy to report that we began shipping the 2002 Season Disk on
schedule earlier this week, and all advance orders have been shipped.
If you ordered in advance for overnight or email delivery, the season
disk should already be in your hands. And if you requested delivery by
first class mail, priority mail, or air mail, your package should arrive
by the end of next week.
2002 Season Disk update
Earlier today, we updated our master copy of the 2002 Season Disk to
make a minor change to two ballparks.
If your copy of the season disk was shipped on or after December 13th,
you already have the new version and you don't need to do anything. If
you received the original version, please read the following so you can
decide what action, if any, you wish to take.
The weather this year was unusually cool in Pacific Bell Park in San
Francisco and Network Associates Coliseum in Oakland. As a result, we
assigned these parks an average temperature of Cool.
This setting has been supported by the game for many years and does not
cause any problems. But it can lead to a serious problem if you follow
the steps described in the next paragraph.
The Modify Park window has a drop-down list of available temperature
settings, but Cool is not one of the options on that list. As a result,
if you choose to Modify one of these parks and click on the weather tab,
you will see a blank value for average temperature. If you leave it blank
and click OK, an invalid temperature setting will be stored, and you'll
see ridiculously high scores for any future games played in that park.
(If you choose another temperature setting before clicking OK, or if you
click Cancel instead, you're fine.)
There's nothing wrong with the Cool setting other than the fact that
the Modify Park window doesn't recognize it. While testing the season
disk, we ran well over a hundred simulated seasons with excellent results.
So if you don't think you'll ever use the Modify Park window in exactly
this way, no action needs to be taken.
However, if you're worried that you might accidentally follow that sequence
at some point, we recommend that you use the Modify Park window to assign
the Comfortable setting to these two parks.
Changing the setting from Cool to Comfortable will not have an impact
on your simulation results. Both parks were right on the boundary between
these two settings and we could just as easily have coded them as Comfortable
in the first place.
The Cool setting has been supported within the Diamond Mind game engine
for many years even though it doesn't appear on the Modify Park window
and has almost never been used. (Prior to this year, the only other park
given this setting was Candlestick Park in 1999.) We will add the Cool
setting to the Modify Park window for version 9.
Tips for using the 2002 Season Disk
Here are a few tips regarding the use of this season disk:
1. We have prepared four notes that you can view through the Notes page
of the Organizer window. We recommend reading them soon, as they may
answer questions about using the season disk, the statistics and ratings
on the disk, and what you can expect when you start playing games with it.
2. The 2002 Season Disk is shipped with the real-life transactions and
game-by-game starting lineups feature turned on, real-life opening day
rosters (meaning that players who were disabled on opening day in real
life are also disabled on this season disk), and the "as-played" 2002
schedule installed. By "as-played", we mean that postponed games are listed
on the dates they were actually played.
The use of real-life transactions and lineups requires that the rosters
and schedule be exactly as they were in real life. Feel free to change
rosters or switch to the original ("as-scheduled") schedule, but if you
do, remember to change the settings in your organization or league so
the use of real-life transactions and lineups is turned off.
3. The season disk includes multiple player records for anyone who appeared
with more than one team this year. These players have one record for each
team and one combined record that reflects their overall performance.
If you wish to release all players into free agency and draft new rosters
from scratch, start by using the "Release all players" command and then
use "Delete team-specific records". Both commands can be found on the
Tools menu.
If you don't run the "Delete team-specific records" command, these multi-team
players will be drafted more than once. And this command must be used
AFTER releasing the players, because it deletes those team-specific records
from the list of free agents, not from team rosters.
4. If you ran a draft league using our 2001 Season Disk, remember that
you can use the Migrate command on the File menu to automatically set
up the 2002 Season Disk with the structure of your league and your team
rosters. See the DMB help system for more information on how to use the
Migrate feature.
If you use Migrate, remember that:
a) the "source" database is your 2001 league database and your "target"
database is the 2002 Season Disk. (You can install the 2002 Season Disk
more than once if you want to migrate your league to one copy and have
another with the real-life rosters still intact.)
b) Migrate does not assign home parks to each team, so you'll have to
do that yourself.
c) When Migrate is placing a multi-team player on a roster, it's the
combined record that is used. His team-specific records for the 2002 season
are placed in the free agent pool. Use the "Delete team-specific records"
command on the Tools menu to remove them before running a draft.
d) Migrate does not create manager profiles, so you'll need to generate
new ones or use the "Roster / manager profile" window to set them up the
way you want before playing games.
5. Before starting a season, take a look at the organization and league
options. The disk ships with the generation of game-by-game stats turned
on, but game accounts, boxscores and scoresheets turned off.
If you want faster autoplay results and you don't care about being able
to look at batting logs, pitching logs, or reports based on time intervals,
turn off the generation of game-by-game stats.
If you run a league and you're planning to use the Transfer features
to exchange game results, statistics, and manager profiles with the managers
in that league, you'll need to turn on the generation of game accounts.
And you may want to turn on the automatic generation of boxscores and
scoresheets.
6. If you plan to set up a pair of leagues whose champions will meet
in a "world series" at the end of your postseason, remember to create
an organization to link those leagues BEFORE your season begins. DMB won't
allow you to create an organization after the season starts, and you'll
need that organization in place to take full advantage of the game's support
for postseason play.
Several new articles available
In November and December, we added these new articles to our web site:
- a list of all of the players who made their big-league debuts this
season, along with their batting or pitching stats for 2002
- a recap of the preseason predictions that were made by various pundits
and publications, along with accuracy rankings for 2002 and for the past
several seasons
- an evaluation of the offensive production each team received from players
at each position, providing an interesting look at each team's offensive
strengths and weaknesses.
- our annual review of the Gold Glove selections along with a substantially
revised edition of our Evaluating Defense article.
- a new way to look at how efficient each team was in the real-life 2002
season. To measure efficiency, we used (a) the familiar Bill James Pythagorean
method that shows how each team's win/loss record related to the runs
it scored and allowed and (b) a new statistic that we're calling Run Efficiency
Average (REA). By relating offensive events (hits, extra-base hits, walks)
to runs scored, REA measures how efficiently a team turned those events
into runs and how efficiently it prevented the other team from scoring.
For both the Pythagorean method and REA, we look at several decades of
history to see what we can learn about the chances for each team in 2003.
Two of these articles were published by ESPN.com. All of them can be
found on our web site by clicking on the "Baseball Articles" link that
appears in the banner at the top of our web pages.
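For readers who haven't seen it, the classic form of the Bill James Pythagorean method estimates a team's winning percentage as runs scored squared, divided by the sum of runs scored squared and runs allowed squared. Here is a minimal Python sketch of that calculation. It assumes the classic exponent of 2 (the article may use a refined exponent), and the team totals are made up purely for illustration. REA is not sketched because its formula is not given in this newsletter.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    exponent=2.0 is the classic form; this default is an assumption,
    not necessarily the value used in the article.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team (not real 2002 data): 800 runs scored, 700 allowed.
expected_pct = pythagorean_win_pct(800, 700)
expected_wins = expected_pct * 162  # 162-game schedule

print(f"Expected winning percentage: {expected_pct:.3f}")
print(f"Expected wins over 162 games: {expected_wins:.1f}")
```

A team whose actual win total is well above or below this estimate won more or fewer games than its run totals would normally support.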
Leadoff walks
In response to comments by Tim McCarver during two postseason telecasts,
Dave Smith of Retrosheet (www.retrosheet.org) posted some very interesting
analysis to SABR's online forum. Dave was gracious enough to give us permission
to share the following with our newsletter subscribers . . .
Here is the analysis that I mentioned on SABR-L yesterday concerning
the consequences of starting an inning with a walk. I have three tables
of data which address the basic topic in different ways.
Recall that the immediate impetus was [another SABR member's] quote of
Tim McCarver, who said on Sunday night's broadcast something to the effect
that "there are more multirun innings that begin with a walk".
Last week, during one of the LCS games, McCarver asserted that "the one
thing I would tell a young pitcher is 'never walk the leadoff man, he
*always* scores; he *always* scores'" (repetition and emphasis in the
original).
I examined the second of these two quotes in 1998 at the request of the
San Diego Padres, although for the life of me I do not recall what use,
if any, they made of what I gave them. I have expanded my data set since
that 1998 study and for the present report I checked every game from 1974
through 2002. This 29-year period covered 61,365 games and 1,101,019 half
innings. There were over 4.5 million plate appearances in these games.
Table 1. For all methods for a leadoff batter to reach base, this table
shows the number of times each event occurred, the number of times that
batter scored, and the frequency of each. Note that the "E" category includes
all times the leadoff batter reached on an error, which includes those
cases when he went past first. The frequency for batters with leadoff
walks scoring is insignificantly different from the frequency for leadoff
singles; both are a tiny bit lower than the value for reaching via a hit
by pitch.
CONCLUSION: A leadoff batter who walks does NOT "always score"; the walk
has the same effect as the other ways to reach first base.
         Reach   Score    Freq
1B      183468   72841    .397
2B       48364   30961    .640
3B        6573    5753    .875
HR       27205   27205   1.000
BB       82637   33002    .399
HP        6217    2543    .409
INT         81      22    .272
E        12105    5298    .438
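For anyone who wants to check the Freq column, each value is simply the Score count divided by the Reach count. The short Python sketch below recomputes it from the Table 1 counts. The variable and function names are illustrative and not part of Dave's study.

```python
# Counts copied from Table 1: (times the leadoff batter reached, times he scored).
table1 = {
    "1B": (183468, 72841),
    "2B": (48364, 30961),
    "3B": (6573, 5753),
    "HR": (27205, 27205),
    "BB": (82637, 33002),
    "HP": (6217, 2543),
    "INT": (81, 22),
    "E": (12105, 5298),
}


def scoring_frequency(reached, scored):
    """Fraction of leadoff batters reaching this way who went on to score."""
    return scored / reached


freq = {event: scoring_frequency(*counts) for event, counts in table1.items()}

for event, value in freq.items():
    print(f"{event:<4}{value:.3f}")
```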
Table 2. For all possible outcomes for leadoff batters (the 8 categories
from Table 1 plus making an out), this shows the number of times the indicated
number of runs were scored. For example, batters led off an inning with
a single 183,468 times, and in 104,074 of those innings their team did not
score. One run was scored 35,868 times, two runs on 22,726 occasions,
and so on, with all innings of six or more runs combined.
         Total       0       1      2      3      4     5    >5
1B      183468  104074   35868  22726  11329   5375  2415  1681
2B       48364   17671   17657   6772   3427   1632   683   522
3B        6573     984    3696   1019    467    228   101    78
HR       27205       0   19690   4130   1816    871   386   312
BB       82637   46794   15837  10481   5167   2503  1100   755
HP        6217    3453    1209    776    427    203    93    56
INT         81      56       9      7      6      1     0     2
E        12105    6427    2726   1580    744    355   159   114
OUT     734369  616379   70656  28839  11379   4441  1679   996
Total  1101019  795838  167348  76330  34762  15609  6616  4516
These raw totals are not easy to compare, especially since the various
outcomes occur with very different frequencies. Therefore, I created Table
3.
Table 3 takes the data from Table 2 and normalizes it per number of occurrences
of each outcome. For example, a leadoff single led to no runs with a frequency
of .567 (56.7%), one run was scored after the leadoff single with a frequency
of .196, etc.
CONCLUSION: The values for leadoff singles and leadoff walks are virtually
indistinguishable. The hit by pitch data are only slightly lower in the
"no runs" category.
            0      1      2      3      4      5     >5
1B       .567   .196   .124   .061   .029   .013   .009
2B       .365   .365   .140   .070   .033   .014   .010
3B       .150   .562   .155   .071   .034   .015   .011
HR       .000   .724   .152   .066   .032   .014   .011
BB       .566   .192   .127   .062   .030   .013   .009
HP       .555   .194   .125   .068   .032   .014   .009
INT      .691   .111   .086   .074   .012   .000   .024
E        .531   .225   .131   .061   .029   .013   .009
OUT      .839   .096   .039   .015   .006   .002   .001
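The normalization behind Table 3 divides each run-count in a Table 2 row by that row's total. A minimal Python sketch of that step follows, using four of the Table 2 rows. The names are illustrative, and recomputed values may differ from the published ones in the last digit, depending on how the original figures were rounded.

```python
# Run-count columns copied from Table 2, in order: 0, 1, 2, 3, 4, 5, >5 runs.
table2 = {
    "1B": [104074, 35868, 22726, 11329, 5375, 2415, 1681],
    "BB": [46794, 15837, 10481, 5167, 2503, 1100, 755],
    "HP": [3453, 1209, 776, 427, 203, 93, 56],
    "OUT": [616379, 70656, 28839, 11379, 4441, 1679, 996],
}


def normalize(counts):
    """Convert a row of inning counts into fractions of the row total."""
    total = sum(counts)
    return [count / total for count in counts]


table3 = {event: normalize(counts) for event, counts in table2.items()}

for event, row in table3.items():
    print(f"{event:<5}" + " ".join(f"{value:.3f}" for value in row))
```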
OVERALL CONCLUSION: Both of McCarver's assertions are clearly contradicted
by this huge body of evidence. Having the leadoff batter reach base is
certainly an advantage for the offense (compare the values for the "OUT"
row in Table 3). The data for reaching on interference are far too limited
to be useful. When the leadoff man collects an extra base hit or reaches
on an error (with the occasional cases of going past first on the error
included), it is even better than reaching first, as expected. However,
if we just look at those instances when the leadoff batter reaches first,
then it does not matter how he got there.
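The "multirun innings" quote can also be checked directly against Table 2 by adding up the columns for two or more runs. The Python sketch below does that for leadoff singles, walks, and outs. It is an illustration built on the Table 2 counts, not a calculation taken from the original post, and the names are hypothetical.

```python
# Run-count columns copied from Table 2, in order: 0, 1, 2, 3, 4, 5, >5 runs.
table2 = {
    "1B": [104074, 35868, 22726, 11329, 5375, 2415, 1681],
    "BB": [46794, 15837, 10481, 5167, 2503, 1100, 755],
    "OUT": [616379, 70656, 28839, 11379, 4441, 1679, 996],
}

# A multi-run inning is any inning with two or more runs (columns 2 through >5).
multi_run_count = {event: sum(counts[2:]) for event, counts in table2.items()}

# Share of innings starting with each event that became multi-run innings.
multi_run_share = {
    event: multi_run_count[event] / sum(counts)
    for event, counts in table2.items()
}

for event in table2:
    print(f"{event:<5}{multi_run_count[event]:>7}  {multi_run_share[event]:.3f}")
```

Comparing the counts shows how many multi-run innings began with each event, and comparing the shares shows how often each kind of start turned into a multi-run inning.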
SUMMARY and personal views: Even if we allow Tim some poetic license
for his hyperbole (it is his job, after all), we do not need to accept his
opinion as authoritative. I have great respect for anyone who played in
the Major Leagues for 22 years, as McCarver did. However, anecdotal observations
and gut feelings are just that and have no inherent credibility, no matter
what the source. Since we can now check these opinions with evidence,
and McCarver definitely has at his disposal the talents of people who
can do such checking, then we should expect him and other announcers to
get it right.
Dave Smith
How age affects our view of the game
Andruw Jones was my pick for NL Defensive Player of the Year in 1998.
And among players with at least 1500 outfield innings from 1997 to 2002,
Jones ranks second in the majors in putouts per game despite playing about
20% of his games behind a ground ball machine named Greg Maddux. (The
top six on this list are Torii Hunter, Andruw Jones, Mike Cameron, Chris
Singleton, Darin Erstad, and Tsuyoshi Shinjo.)
But relative to the norms for his position, Jones was making a lot more
plays in 1997 and 1998 than in 2001 and 2002, and this year he was only
10th in putouts per nine innings among players with at least 500 innings.
Was that ranking depressed by the ground ball nature of Atlanta's staff?
Not really. Atlanta's pitchers were 7th in the league in ground ball percentage,
only a few points above the league average.
We again rated Jones as one of the best center fielders in the game.
But there's a big difference between being one of the best and being head-and-shoulders
better than anyone in the game today and perhaps the best who ever played
the position. The latter is how he looked four years ago and how he is
still widely regarded by the media.
In fact, the arc of Jones' career as a hitter and a fielder is more consistent
with someone who's 29 years old than his listed age of 25. It's widely
believed that hitters peak around age 27, and my experience in doing fielding
analysis for the past 15 years suggests that defensive range tends to
peak around 24 or 25. (Error rates may improve later, but the ability
to get to the ball tends to peak early.) Jones had his best year at the
plate in 2000 and his best defensive year in 1998.
Of course, there are at least three reasons why this pattern could mean
absolutely nothing. One, there's no evidence that I know of to suggest
that his listed age isn't legit. Two, things will look very different
if Andruw has a monster season in the next year or two. And, three, there
are plenty of other players who don't fit the "normal" career arc of building
to a peak around age 27, staying on that plateau for a while, and declining
slowly after that. After all, Barry Bonds has been a better hitter in
his late thirties than at any other time in his career.
But if it did come out that Andruw was really 22 or 23, not 19, when
we first saw him in 1996, our perception of his career to date and his
potential for future growth would be very different. We would be less
surprised that he was able to hold his own at the big-league level at
such a young age. Many of us would stop thinking that he's due to take
another great leap forward and start thinking that we may have already
seen him at his peak.
My point isn't that Andruw IS older than we think. As I said, I have
absolutely no evidence upon which to conclude that. I'm just saying that
his career LOOKS more like that of an older player SO FAR.
My point is that age has become a major factor in the thinking of baseball
analysts, perhaps too much so. Hundreds of player ages have been revised
in the last year or two, which makes me wonder whether there are more
that haven't been discovered, past and present. And even if we could assume
that all published birthdates are correct, many players don't follow the
"normal" career arc anyway.
Maybe it's time for baseball analysts (including us) to find better ways,
beyond age alone, to assess the past and likely future path of a player's career.