Thursday, May 18, 2006

Why aren't there more cricket statistics? Part 3

Though cricket and baseball share many characteristics, baseball has many more statistics than cricket. There are several external explanations for this disparity, including the lack of a clearly defined international cricket schedule and baseball's astronomical salaries. In addition to these external factors, there are at least two aspects of the game of cricket itself that have contributed to the relative lack of cricket stats compared to baseball. First, baseball simply has more things to count than cricket. Second, measuring players' contributions is far easier in cricket than in baseball. Together, these two phenomena go a long way in explaining why there aren't more cricket statistics.

At the most basic level, cricket is a simple game. Bowlers attempt to take wickets and prevent batsmen from scoring runs. The basic outline of baseball is similar; the batting side tries to score runs while the defense tries to prevent runs from being scored. In practice, however, baseball is more complex in terms of the possible outcomes of a single play. That increased complexity leads directly to a need for more sophisticated statistics.

The most fundamental unit of a cricket game is the ball (analogous to a pitch in baseball). A given ball can have one of 20 outcomes (I may be missing some; please feel free to correct me). These are:
- a dot ball (0 runs scored)
- 1 run scored
- 2 runs scored
- 3 runs scored
- 4 runs scored
- 5 runs scored (possible in principle but rare in practice)
- 6 runs scored
- Out, bowled
- Out, caught
- Out, handled the ball
- Out, hit the ball twice
- Out, hit wicket
- Out, leg before wicket
- Out, obstructing the field
- Out, run out
- Out, stumped
- No ball
- Wide
- Bye
- Leg bye

The corresponding list in baseball is longer. Individual pitches in baseball can have at least 39 outcomes. This list is close to comprehensive; feel free to skip right to the bottom.

- Ball, less than 4 balls
- Ball, 4th ball in an at-bat, resulting in a walk (runner is awarded 1st base)
- Swinging, called strike, or foul tip, less than 3 strikes
- Swinging, called strike, or foul tip, 3rd strike (batter is out)
- Swinging, called strike, or foul tip, 3rd strike, catcher drops pitch and 1st base is open (or there are two outs), batter is not out and can try to reach 1st base
- Foul ball, no catch, less than 2 strikes (counts as a strike)
- Foul ball, no catch, 2 strikes (no change)
- Foul ball, caught (batter is out)
- Foul bunt with 2 strikes (batter is out)
- Wild pitch (bad pitch by pitcher gets past catcher and baserunner(s) advance)
- Passed ball (playable pitch gets past catcher and baserunner(s) advance)
- Hit by pitch, batter is awarded 1st base
- Catcher interference, batter is awarded 1st base
- Balk (illegal movement by pitcher, baserunners advance 1 base)
- Ball in play, caught in the air, no runners advance (batter is out)
- Ball in play, caught in the air, runners advance (batter is out, sacrifice fly)
- Ball in play, interference (batter is out)
- Ball in play, batter touches ball (batter is out)
- Ball in play, ground out to 1st base (batter is out)
- Ball in play, sacrifice (batter is out, runner advances)
- Ball in play, tagged out (batter is out)
- Ball in play, infield fly rule (batter is out)
- Ball in play, fielder's choice (batter reaches 1st base but another baserunner is out)
- Ball in play, single (batter reaches 1st base)
- Ball in play, double (batter reaches 2nd base)
- Ball in play, triple (batter reaches 3rd base)
- Ball in play, home run, out of park (batter scores run)
- Ball in play, inside-the-park home run (batter scores run)
- Ball in play, ground-rule double (batter is awarded 2nd base)
- Ball in play, baserunner tagged (baserunner is out)
- Ball in play, fly out/line drive double play (batter and baserunner are out)
- Ball in play, groundout double play (batter and baserunner are out)
- Ball in play, fly out/line drive triple play (batter and 2 baserunners are out)
- Ball in play, groundout triple play (batter and 2 baserunners are out)
- Ball in play, runner interference (baserunner is out, batter is awarded 1st base)
- Stolen base by baserunner

Most of these events can occur in a variety of game situations, including no one on base, a baserunner on 1st base, a baserunner on 2nd base, a baserunner on 3rd base, baserunners on 1st and 2nd base, baserunners on 2nd and 3rd base, baserunners on 1st and 3rd base, and baserunners on 1st, 2nd, and 3rd base. So there are literally hundreds of possible events in baseball, far more than in cricket. More events means more things that you can count, which, in turn, means more potential statistics. Baseball has more statistics than cricket because, simply, there is more to count.
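For anyone who wants to check the arithmetic, here is a rough sketch in Python. It enumerates the eight baserunner situations and multiplies by the 39 pitch outcomes quoted above. The names (`BASES`, `base_states`, `OUTCOMES_PER_PITCH`) are my own, and the product is only an upper bound, since some outcomes can't happen in some situations (you can't steal a base with no one on).

```python
from itertools import product

# The three bases; each is either empty or occupied.
BASES = ("1st", "2nd", "3rd")

# Outcome count quoted in the post; treated here as a given, not re-derived.
OUTCOMES_PER_PITCH = 39


def base_states():
    """Return every combination of occupied bases as a tuple of base names."""
    states = []
    for occupied in product((False, True), repeat=len(BASES)):
        states.append(tuple(b for b, on in zip(BASES, occupied) if on))
    return states


states = base_states()

# Upper bound only: not every outcome is possible in every situation.
upper_bound = OUTCOMES_PER_PITCH * len(states)

for s in states:
    print(s if s else "bases empty")
print(len(states), "situations,", upper_bound, "outcome/situation pairs at most")
```

Adding the number of outs (0, 1, or 2) as a third dimension would triple the count again, which is why "hundreds" is a fair description.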

The simplicity of scoring in cricket also helps account for the relative paucity of statistics in cricket. One of the major developments in baseball stats in the past few years has been the creation of stats that accurately measure players' contributions to runs scored. For some events this is easy: a player who hits a home run with no runners on base is clearly and unambiguously responsible for 1 run scored. But what if there had been a player on first base? The player hitting the home run shouldn't get credit for 2 runs. After all, the player on first base did something right in the first place. As it turns out, a single with no one on base and no outs is worth 0.39 runs. Baseball statisticians (also known as sabermetricians) have done all sorts of fancy calculations to determine the precise run value of each possible event in a baseball game. A runner on first steals second with 2 outs? +0.27 runs. That runner attempts to steal second and is out? -0.58 runs. A pitcher striking out the batter with the bases loaded and two outs? -0.81 runs. And so on. Add up the value of all of a given player's actions and you've got a pretty good approximation of that player's contribution of runs.
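The "add it all up" step can be sketched in a few lines of Python. The four run values are the ones quoted in the paragraph above; the event labels, the `run_contribution` function, and the sample game log are hypothetical, made up purely for illustration. A real system would need a value for every event in every base and out situation.

```python
# Run values quoted in the post, from the batting side's point of view.
# The dictionary keys are hypothetical labels, not an established naming scheme.
RUN_VALUES = {
    "single_bases_empty_0_out": 0.39,
    "steal_second_2_out": 0.27,
    "caught_stealing_second_2_out": -0.58,
    "strikeout_bases_loaded_2_out": -0.81,
}


def run_contribution(events):
    """Sum the run values of a player's events (a simple additive model)."""
    return sum(RUN_VALUES[event] for event in events)


# Hypothetical log of one player's events across a game.
game_log = [
    "single_bases_empty_0_out",
    "steal_second_2_out",
    "strikeout_bases_loaded_2_out",
]

total = run_contribution(game_log)
print(f"Estimated run contribution: {total:+.2f}")
```

Notice that a player can finish a game with a negative total. That is the whole point of the exercise: the same box score line can hide very different contributions depending on when each event happened.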

In cricket, determining players' run contribution is far easier: simply count up the number of runs scored. Batsmen are almost entirely responsible for the runs credited to them. A player scoring a century has earned each of those hundred runs, though he may get a bit of luck along the way in the form of poor fielding or bad shots that somehow find the gaps.

This difference between cricket and baseball is a major factor in the relative abundance of baseball statistics. In both sports runs are the key measure of achievement: the team that scores more runs wins. In cricket, it is far easier to pinpoint who is responsible for each run. In baseball, this task is far more difficult. Aside from solo home runs, at least two offensive players deserve credit for each run scored. Baseball's wealth of batting statistics represents an effort to determine just how many runs each player contributes, a task that is wholly unnecessary in cricket.

All that said, cricket statistics are far from comprehensive. There are several areas where cricket statistics could become more sophisticated and, as a result, more accurately reflect cricketers' skills and achievements. Some of those areas will be discussed in the remaining posts in this series.

Sunday, May 07, 2006

Why aren't there more cricket statistics? Part 2

As I discussed in Part 1 of this series, baseball has far more statistics than cricket, in spite of numerous similarities. There are a number of reasons for this disparity, some related to the mechanics of cricket and some to the external circumstances in which cricket is currently played. This post looks at two factors that fall into the second category: (relatively) low salaries and a year-round cricket season. If cricketers earned higher salaries, fans would likely become more interested in accurate assessment of their performance. And if cricket had a clearly defined season, it would be far easier to evaluate players over a set period of time. If these circumstances were more similar to those in baseball, cricket fans would be more likely to develop a more robust set of cricket statistics.

The median salary for a major league baseball player is $1 million. Alex Rodriguez, the highest paid player in the game, will make $21.6 million this year. His teammate on the Yankees, Hideki Matsui, comes in at 25th, earning $13 million. Baseball players make a lot of money. The contrast with cricket is striking. I can't find the median salary figures for county cricket, but the BBC reports that "a six-figure deal for any player - from overseas or otherwise - is unusual." Even with the weakness of the dollar, the top players in domestic cricket make less than half of the median earnings of major league baseball players. Even players like Tomas Perez, the very definition of a replacement-level player, earn over half a million dollars.

The top cricket players in the world do not rely on domestic cricket contracts. They spend most of their time with their national sides and receive a salary from the national cricket organization. But even in the realm of international contracts cricketers earn far less than baseball players. The ECB salaries of Andrew Flintoff, Michael Vaughan, and Marcus Trescothick (three of the most important players in the England side) have been estimated at £400,000, which only puts them within shouting distance of the typical baseball player and nowhere close to baseball players with similar skills. Baseball players earn more than cricket players, period.

But with baseball's inflated salaries comes a greater scrutiny from fans. When faced with the absurd proposition of players receiving millions of dollars to play a game for a few hours a day, fans may very well start wondering if all that money is worth it, especially if those high salaries are financed through increased ticket prices. If it turns out that Alex Rodriguez created 138 runs in 2005, or, better yet, was worth 12.3 wins, paying him over $20 million each year might seem a bit less preposterous. I don't think it's a coincidence that Bill James and sabermetrics began gaining popularity in the early 1980s; Nolan Ryan became the first player to make a million dollars in one season in 1980.

In short, high salaries lead to greater interest in objective measures of performance. If cricket salaries increased tenfold in the next year, there would be far greater attention devoted to determining just who's actually earning those overblown salaries.

The year-round international cricket schedule also contributes to the relative lack of cricket statistics. While each cricketing nation has a clearly defined cricket season (April through September in England, October through March in Australia, etc.), cricket's international appeal ensures that cricket is always being played somewhere in the world and that international sides play year-round (much to the consternation of some players). It is not possible to speak of the 2006 international cricket season; the rhythm of the international game is governed by series, not seasons.

The lack of a clear cricket season, unsurprisingly, prevents the development of season statistics and records. While yearly statistics are occasionally highlighted (like Shane Warne's unprecedented 96 wickets in 2005), there's not nearly the same emphasis on season results as seen in baseball. It's true, of course, that season statistics do exist for the various domestic competitions. Scoring 2,000 runs or taking 200 wickets in a single season of county cricket remains a recognized achievement. But the demands of international cricket ensure that the best cricketers in the world (i.e. the ones most likely to be subjected to rigorous statistical analyses) play in just a few domestic matches each year. In short, the lack of a clearly defined international cricket season means that there are no ready-made periods of time over which international players' performances can be evaluated, and that, in turn, discourages the development of more cricket statistics.

Low salaries and a year-round international cricket calendar help explain why there are relatively few cricket statistics compared to baseball stats. But the most compelling explanations lie in the game of cricket itself. For details, check back for the next part in this series.

Why aren't there more cricket statistics? Part 1

Cricket and baseball share many features. Both are the traditional summer sports of their respective countries. Both involve a bat and a ball. Both use runs as the basic scoring measure. And both generate more interest in statistics and numbers than other sports. Numbers compiled with bat and ball hold greater meaning than those earned only through kicking, tossing, or running with a ball. Ted Williams's .406 batting average in 1941, Babe Ruth's 714 career home runs, and Joe DiMaggio's 56-game hitting streak hold a place in the American psyche unchallenged by, say, Jerry Rice's 207 career touchdowns, Wayne Gretzky's 894 goals, or even Wilt Chamberlain's 100 points in a single game. Donald Bradman's career batting average of 99.94 towers over all other cricket performances and might very well be the defining individual achievement in Commonwealth sports.

The reverence directed towards record performances is one thing. Baseball and cricket also exceed virtually all other sports in the sheer number of statistics their followers track, analyze, and dissect. MLB.com, far from the most comprehensive baseball stat resource, provides so many statistics that it has to display them on two separate pages. Cricinfo provides a sparser but equally bewildering array of stats.

A large part of this abundance of statistics can be attributed to the simple fact that baseball and cricket have more discrete events than other sports. Soccer statistics, for example, are fairly limited, due to its low-scoring nature and continuous play. In contrast, every pitch in baseball and every ball in cricket is recorded. 50 years after the fact, someone can look at a completed baseball or cricket scorecard and reconstruct the course of the game. In short, baseball and cricket have so many stats because they have so many things to count.

In spite of these similarities, baseball statistics are far more robust than their cricket counterparts. Baseball Reference includes 36 hitting statistics, 15 defensive statistics, and 24 pitching statistics. Baseball Prospectus goes further, with 59 hitting stats and 51 pitching stats. Cricinfo pales in comparison, with just 12 batting statistics, 2 fielding statistics, and 12 bowling statistics.

It's not just in quantity that baseball stats outstrip their cricket counterparts. In addition to the well-known batting average, home runs, and runs batted in, hitters these days are tracked by things like OPS+, VORP, and RCAA. Pitchers were once evaluated on the basis of their win-loss record and earned run average; now baseball fans look at stats like WHIP, DIPS, and BABIP (say that three times fast). Cricketers, meanwhile, are still evaluated almost exclusively on the basis of batting average and strike rate (for batsmen) and bowling average and economy (for bowlers). Sabermetrics has not come to cricket, and there is no Bill James of cricket (Bill Frindall has been suggested, but he's known for trivia, not statistical analysis).
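For readers coming from the baseball side, the four cricket measures named above are simple ratios. Here is a minimal Python sketch using the conventional definitions; the function names are my own, and returning `None` when the denominator is zero (a batsman who has never been dismissed, say) is just one reasonable way to handle an undefined average.

```python
def batting_average(runs, dismissals):
    """Runs scored per dismissal; undefined if never dismissed."""
    return runs / dismissals if dismissals else None


def strike_rate(runs, balls_faced):
    """Batting strike rate: runs scored per 100 balls faced."""
    return 100 * runs / balls_faced if balls_faced else None


def bowling_average(runs_conceded, wickets):
    """Runs conceded per wicket taken; undefined with no wickets."""
    return runs_conceded / wickets if wickets else None


def economy_rate(runs_conceded, balls_bowled, balls_per_over=6):
    """Runs conceded per over bowled (six-ball overs assumed)."""
    if not balls_bowled:
        return None
    return runs_conceded / (balls_bowled / balls_per_over)


# Bradman's commonly cited Test totals: 6,996 runs and 70 dismissals.
print(round(batting_average(6996, 70), 2))
```

That is essentially the whole toolkit. Compare it with the run-value bookkeeping baseball needs and the gap in sophistication becomes obvious.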

So why aren't there more cricket statistics? What explains the lack of robust statistical analysis of cricket when baseball fans have easy access to a host of stats that grow more sophisticated and more esoteric each year? There are a number of plausible explanations, some related to cricket itself and some to the circumstances in which cricket is currently played. Though these factors help explain the relative lack of cricket statistics, they do not preclude the development of new statistics and analysis that would provide more accurate assessment of cricketers' abilities and performance. In the remaining parts of this series, I'll look at some of the reasons why there aren't more cricket statistics and suggest possibilities for new types of cricket stats.