Do Moneyball principles apply in little league?
Youth baseball stats are hard to track and even harder to use effectively, but it can be done. Stats can help create strong lineups in critical games, of course. Stats can help reduce bias and blind spots from relying entirely on visual observation. Most importantly, stats can help guide youth baseball coaches to drill on issues most in need of improvement.
In this series of posts I’ll explain my approach towards gathering data for different age groups, discuss the stats that matter most, and provide examples of how such stats can best be applied.
Why Youth Baseball Stats are Hard
Statistical conclusions require lots of clean data. That is easy to come by at the major league level but hard at the youth level, where sample sizes are small.
A typical spring rec (recreation) season consists of fewer than 20 games. Scorekeeping standards are inconsistent with regard to errors and extra bases, and scorekeepers vary in experience. Player ability is also not static: skills often improve dramatically during the season, so stats from earlier games may not say much about how a player will perform by season’s end.
Consider a player with 60 plate appearances during a 16-game season. This player reaches first base 30 times for an OBP (On Base Percentage) of .500. If you calculate the standard error (about .065 for these numbers), the 95% confidence interval for this player’s true OBP runs from roughly .373 to .627.
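For readers who want to check the arithmetic, here is a minimal sketch of that calculation in Python. It assumes the usual normal approximation for a proportion (standard error = sqrt(p(1-p)/n), interval = p ± 1.96 standard errors), which is my reading of how these numbers were produced. The function name `obp_interval` is made up for illustration.

```python
import math

def obp_interval(times_on_base, plate_appearances, z=1.96):
    """Approximate 95% interval for true OBP (normal approximation, an assumption)."""
    p = times_on_base / plate_appearances
    # Standard error of a proportion observed over n plate appearances
    se = math.sqrt(p * (1 - p) / plate_appearances)
    return p, p - z * se, p + z * se

obp, low, high = obp_interval(30, 60)
print(f"OBP {obp:.3f}, 95% interval {low:.3f} to {high:.3f}")
# prints: OBP 0.500, 95% interval 0.373 to 0.627
```

The normal approximation gets rougher for very small samples or for rates near 0 or 1, so treat the endpoints as ballpark figures.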
.373 to .627 is a very wide range for true OBP. It means that OBP can only reliably separate players who reach first often (> .500) from those who reach first base rarely (< .250). Anyone could have told you which player was the better hitter without stats, just by watching. With such wide statistical variation, what’s the point?
I have several answers to this statistical objection:
- You can get more reliable stats by focusing on events that occur much more often than plate appearances: pitches. For hitting, this stat can be contact %, and for pitching it can be strike %. These two are discussed later in this series.
- It turns out there are other stats beyond pitch-by-pitch stats that remain stable after a relatively small number of repetitions. Both the original and updated articles by Russell Carleton about baseball sample sizes conclude that rates for strikeouts, walks, home runs, fly balls, and ground balls stabilize fairly quickly. Strikeout and walk rates stabilize faster than any of the other plate appearance stats, which is fortunate because they’re both very useful.
- You can even use unstable stats as long as they are not taken in isolation. For example, when examining a player with .520 OBP, be sure to also look at walks and strikeouts. It could be that the player is a poor hitter with a high strikeout rate who draws many walks due to having a tiny strike zone. This is a very different type of hitter than someone with identical OBP but few strikeouts and few walks. The first benefits from beginning pitchers not having the control needed to throw strikes to a small player. The second is putting the ball in play and learning to be a good hitter. Against improved pitching, the first player will strike out while the second player may get a hit. Many pitchers do improve control by season’s end.
- Imperfect as stats are, what happens when coaches don’t consider stats? Well, we all have our stories of the coach’s son. Does it really make sense to bat the coach’s son first in the order if, statistically, he gets to first less often than 7 of his 11 teammates? If the fastest guy on the team manages to get himself caught stealing in 1 out of 3 attempts, do you want your third base coach to keep giving him the steal sign?
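To put a rough number on the first point in the list above, the sketch below compares the margin of error for a rate measured over 60 events (the plate appearance example) with one measured over 240 events. The 240 figure is purely hypothetical, standing in for a pitch-level stat at about four pitches per plate appearance. The function name `margin_of_error` is also just for illustration, and the calculation again assumes the normal approximation.

```python
import math

def margin_of_error(rate, events, z=1.96):
    """Half-width of an approximate 95% interval for a rate observed over `events` trials."""
    return z * math.sqrt(rate * (1 - rate) / events)

# 60 = plate appearances from the OBP example; 240 = hypothetical pitch count
for events in (60, 240):
    print(events, round(margin_of_error(0.5, events), 3))
# prints:
# 60 0.127
# 240 0.063
```

Four times as many events cuts the margin of error in half, which is why pitch-level stats can separate players that plate-appearance stats cannot.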
I’ve seen all sorts of manager decision-making that relies on gut feel or biased observation that could be improved by tracking and using stats. Even stats based on imperfect scorekeeping and limited sample size. Even in little league.