Saturday, October 27, 2007

Great Expectations, And Others

A baseball team's won-loss record over a full season usually has a close relationship to the difference between runs scored and runs allowed ("net runs"). This seems like a common-sense observation, since the way a team wins a game is by scoring more runs than its opponent. When Bill James, almost 30 years ago, nicknamed the general arithmetical relationship over a season between won-loss record and net runs the "Pythagorean" win expectation formula, he made it sound more evocative of high school geometry class than it really is. Unlike in that geometry class, there are no elaborate logical proofs behind baseball's "Pythagorean" formula. Baseball's Pythagorean formula is just plain old arithmetic -- if you look at all the seasons by all the teams in baseball history, it just so happens that teams tend to end a season with a winning percentage about equal to the square of the team's runs scored divided by the sum of the square of its runs scored and the square of its runs allowed.
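For anyone who would rather see the arithmetic than read it, here is a minimal Python sketch of the formula as described above. The function name `pythagorean_pct` and the `exponent` parameter are my own labels, not anything from Bill James's spreadsheet, and the 800/700 run totals are a made-up team used only for illustration.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic "squared" version of the formula; the
    parameter is a hypothetical knob, since other exponents are sometimes used.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team: 800 runs scored, 700 runs allowed over 162 games.
example_pct = pythagorean_pct(800, 700)
example_wins = example_pct * 162
print(f"expected winning percentage: {example_pct:.3f}")
print(f"expected wins in 162 games: {example_wins:.1f}")
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.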

Bill James recently posted a spreadsheet that includes, among other things, every major league team's winning percentage for every season since 1876, along with its Pythagorean win expectation (Bill now uses a slight variation on the formula, with a slightly different exponent replacing the simple "2" designated by the "squared" part of the formula, but that is a very minor technicality). Of 2,516 team-seasons since 1876, only 242 resulted in more than 5 wins above the Pythagorean expectation. So fewer than 10% of all teams exceed their Pythagorean win expectation by more than 5 wins.

Of those 242 teams, 236 had a "next season" (2007 teams obviously have no "next season" yet, and 19th century teams sometimes went out of business before they could have a "next season"), but only 26 of them could repeat the feat of winning more than 5 games above expectation. That is about 11% of the teams, barely above the rate for all teams. That suggests that winning more games than the Pythagorean expectation is not really a repeatable skill, and is largely just luck (Bill James has found, however, that the most extreme overperforming teams, such as the D-Backs this season, may have some amount of a repeatable skill). By the way, winning itself is a repeatable skill: of the 316 teams in history that have had a greater than .600 winning percentage and that had a "next season", 134 repeated the feat in that next season, and those 316 teams that finished over .600 averaged a .582 winning percentage in their next season.
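The percentages behind those counts are easy to check. This is just a sketch that divides the numbers quoted above; the variable names are mine, and nothing here comes from the spreadsheet beyond the counts already given.

```python
# Counts quoted in the text above.
overperformers, team_seasons = 242, 2516        # more than 5 wins above expectation
repeat_over, over_with_next_season = 26, 236    # did it again the next season
repeat_600, over_600_with_next_season = 134, 316  # finished over .600 again

base_rate = overperformers / team_seasons
repeat_rate = repeat_over / over_with_next_season
repeat_600_rate = repeat_600 / over_600_with_next_season

print(f"all teams beating expectation by more than 5 wins: {base_rate:.1%}")
print(f"overperformers who repeated the next season: {repeat_rate:.1%}")
print(f".600 teams who repeated the next season: {repeat_600_rate:.1%}")
```

The overperformers repeat at roughly the same rate at which any team overperforms in the first place, while the .600 teams repeat more than 40% of the time.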

All of this is an elaborate introduction to an oddity. Although Mets seasons represent only 46 of the 2,516 team-seasons that have been played in history (about 1 Mets season for every 55 team-seasons), the Mets have had two of the ten most Pythagorean-overperforming teams of all time, as well as the single most underperforming team of all time:

-- The 1984 Mets, Davey Johnson's first Mets team, gave up 676 runs while scoring only 652, yet somehow managed to finish with 90 wins and 72 losses. That's the fourth biggest Pythagorean overperformance of all time. The 2007 D-Backs, by the way, were the ninth biggest Pythagorean overperformers of all time.

-- The 1972 Mets, the first Mets team with Yogi Berra as manager, gave up 50 more runs than they scored in a slightly strike-shortened season, but still managed to finish 10 games over .500, the ninth biggest Pythagorean overperformance ever.

-- And then there is the astounding 1993 Mets team, managed first by Jeff Torborg and then by Dallas Green. This club was not good, but on paper it was at least competitive. Its runs scored and runs allowed predict a 73- or 74-win season: not good, but nothing grotesquely bad. By hook or by crook, however, this team finished a horrible 59-103, the fourth worst record in the majors in the 1990s. That is the largest single Pythagorean underperformance, and the largest variation from the Pythagorean expectation of any kind, up or down, in baseball history.
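To put a number on one of these, here is a rough Python sketch that runs the 1984 Mets' figures quoted above (652 runs scored, 676 allowed, 90-72) through the simple squared version of the formula. The function name is my own, and since Bill James's spreadsheet uses a slightly different exponent, his exact figures may differ a little from these.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage; exponent=2.0 is the classic squared form."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# 1984 Mets, using the run totals and record quoted in the text.
runs_scored, runs_allowed = 652, 676
wins, losses = 90, 72

games = wins + losses
expected_wins = pythagorean_pct(runs_scored, runs_allowed) * games
surplus = wins - expected_wins

print(f"expected wins: {expected_wins:.1f}")
print(f"actual wins:   {wins}")
print(f"wins above expectation: {surplus:.1f}")
```

With the plain exponent of 2, a team outscored 676 to 652 projects to about 78 wins, so the 90 wins that club actually posted are roughly a dozen more than the runs would suggest.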
