Nate Silver’s new book, The Signal and the Noise, has a chapter on climate projection. The chapter shows that projections have a sweet spot at some point in the future where they are most accurate. Three sources of uncertainty shape the projection and determine where that sweet spot falls:
- The first is initial variability. With climate, a location may experience a very cold winter in the first year of the model. Unusually cold and unusually warm winters should even out over time. This error starts out high and eventually goes to zero.
- The second is long-term unknowns. With climate, maybe a new invention removes CO2 from the air, or a couple of volcanoes erupt at once and cool the earth. This error starts at 0 and grows steadily over time.
- The third is underlying unpredictability. This value is the steadiest of the three. Say climatologists want to create a 40-year prediction. From the beginning they will have a certain level of uncertainty. The further into the future they want to predict, the higher that underlying uncertainty is.
The sum of the three factors is the total uncertainty in the projection. The total error is lowest at the point in the future where the projection is aimed to be most correct. Here is a sample graph:
The total error is lowest 40 years into the future.
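To make the three curves concrete, here is a minimal sketch in Python. The curve shapes (exponential decay, linear growth, a flat level) and every constant are my own assumptions, picked only so the minimum lands near year 40. They are not taken from Silver’s book.

```python
import math

# All function names and constants below are hypothetical, chosen for illustration.

def initial_variability(t, start=10.0, decay_years=15.0):
    """Starts high and fades toward zero as odd years even out."""
    return start * math.exp(-t / decay_years)

def long_term_unknowns(t, growth_per_year=0.05):
    """Starts at 0 and grows steadily over time."""
    return growth_per_year * t

def underlying_unpredictability(t, level=2.0):
    """Roughly constant for a projection with a given target horizon."""
    return level

def total_error(t):
    """Total uncertainty is the sum of the three factors."""
    return (initial_variability(t)
            + long_term_unknowns(t)
            + underlying_unpredictability(t))

# Search whole years 0..100 for the point with the least total error.
years = range(0, 101)
sweet_spot = min(years, key=total_error)
print(f"Lowest total error at year {sweet_spot}: {total_error(sweet_spot):.2f}")
```

With these made-up numbers the sweet spot falls close to year 40. Changing the decay rate or the growth rate moves it earlier or later, which is the whole point: the shape of the curves determines when a projection is at its best.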
This same idea can be applied to baseball projections. Right now, projections only look at how a player will do next year, so they are tuned to have the least error over a single season’s worth of data. The one-year projection is good, but it does not fit every situation.
Wouldn’t it be nice to have a 2-, 3-, 4-, or more-year projection when looking at free agents? By the time a player has reached free agency, there should be enough information on the player to see what can be expected from them over a set number of years. This projection would start with a high level of underlying variability, but by limiting the unknowns curve it would have its lowest error for the years in question. The error graph would look like this, with the lowest amount of error at time 5 or 6:
Now, how about a pitcher’s projection once it is known they are throwing 2 MPH slower? Something like a 30-game (6-start) projection could be created that takes into account all the known factors about the pitcher right now. It would be really accurate for a short time, but its error would keep growing over time compared to the long-term projection. Here is an example:
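The trade-off between the two projections can be sketched as two error lines and the game where they cross. The linear shapes and all the numbers here are assumptions for illustration, not measured projection errors.

```python
# Hypothetical error curves, measured in games from today.

def short_term_error(games, start=1.0, growth_per_game=0.05):
    """A 30-game projection: very accurate now, worse with every game."""
    return start + growth_per_game * games

def long_term_error(games, start=3.0, growth_per_game=0.005):
    """A season-long projection: less accurate now, but much steadier."""
    return start + growth_per_game * games

def crossover_game(max_games=162):
    """First game at which the short-term projection has more error."""
    for g in range(max_games + 1):
        if short_term_error(g) > long_term_error(g):
            return g
    return None  # the short-term projection never falls behind

print("Short-term projection falls behind after game", crossover_game())
```

Under these assumptions the short-term projection wins for roughly the first 45 games and loses after that, which is why it would need to be rerun often instead of being treated as a season forecast.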
Ideally, I think it would be nice to have 30-game, 160-game, 1-year, 2-year, 3-year, 4-year, and 5-year projections for all players all the time. The 30- and 160-game projections will be a little tough to pull off, but multi-year projections should be possible now. Stay tuned.