
Is it better to overpredict, or underpredict?

Weather predictions are notoriously difficult for the Seattle area. I have seen many instances in which snow was predicted (say 1 to 3 inches) while the actual amount was much smaller. I assume that it is a bigger legal risk for statisticians to underpredict snow amounts than to overpredict them. But by overpredicting, forecasters eventually lose the public's trust. It seems that the same is happening with flu predictions. What are your thoughts on this?


Replies to This Discussion

Nate Silver's book, The Signal and the Noise, discusses this topic of weather prediction.  Different weather outlets issue slightly different forecasts in an attempt to moderate public sentiment particularly toward the outlet issuing the forecast.  Compare the National Weather Service forecast with the Weather Channel's forecast, and your local station (NBC, CBS, ABC, FOX).  In terms of flu, prediction has always been a challenge.  There are always stories about how people get the flu shot but still get sick with the flu.  Both weather and flu are multifaceted (and possibly interconnected) in the variables that can be used for prediction.  Consider this headline:  "Migration Routes Hold Key to Bird Flu Spread, Global Study Finds".

As for errors, I remember viewing a meme online that illustrated the difference between a Type I and Type II error.  There were two panes.  In the first pane was a doctor telling a male patient that he is pregnant -- obviously a false positive.  In the second pane was a doctor telling a female patient with an obviously protruding pregnant belly that she is not pregnant -- a false negative.  It's clear in that context to see which type of error can cause more harm.

This raises at least two issues regarding overpredicting and underpredicting: 

  • public sentiment -- will the prediction cause civil unrest or uprising?
  • unknown unknowns -- variables of which we have no knowledge or even awareness, and which never occur to us as possibly having an impact

I think James highlighted the most important point: the cost of false positives versus false negatives. Staying with James's example, in healthcare, classifying somebody as cancer-free when that person actually has cancer and dies from it is far more costly than operating on the person and finding no cancer...

You can transfer that logic to your examples. Snow especially can cause traffic accidents, people slipping because of the wrong footwear, or even frostbite... so underpredicting and leaving people exposed to those potential costs is worse than overpredicting and causing inconvenience. (Cost is always meant in the economic sense here and can include anything from injuries with lasting problems walking, to damage to vehicles, to the time commuters lose in traffic accidents; these are what economists call social costs.) Flu is much the same: if you predict a mild season and it turns out to be severe, people might die. The other way around, some people have the inconvenience of vaccination: still a cost in time lost visiting the doctor, but probably a smaller one.
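The asymmetric-cost argument above can be sketched with a few lines of decision theory. The two cost figures below are purely illustrative numbers, not from any real forecasting system; the point is only that when a miss is far more expensive than a false alarm, the rational break-even probability for issuing a warning drops very low:

```python
# Hypothetical costs: underpredicting snow (accidents, injuries, lost time)
# versus overpredicting it (inconvenience, unneeded caution).
C_FN = 1000.0  # cost of a missed snow event (false negative)
C_FP = 50.0    # cost of a false alarm (false positive)

def expected_cost(p_snow: float, predict_snow: bool) -> float:
    """Expected cost of a forecast decision given the true probability of snow."""
    if predict_snow:
        return (1 - p_snow) * C_FP   # pay the false-alarm cost if no snow falls
    return p_snow * C_FN             # pay the miss cost if it does snow

# Break-even probability: warning is worthwhile whenever P(snow) exceeds it.
p_star = C_FP / (C_FP + C_FN)
print(f"warn when P(snow) > {p_star:.3f}")   # ~0.048 with these costs

for p in (0.02, 0.05, 0.30):
    warn = expected_cost(p, True) < expected_cost(p, False)
    print(p, "-> warn" if warn else "-> no warning")
```

With a 20:1 cost ratio, it already pays to warn at roughly a 5% chance of snow, which is exactly the "overpredict and be cautious" behavior the thread describes.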

If you consider logistic regression, the decision of which error is more important -- determining the cut-off value -- is one essential factor in fine-tuning the model.
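A toy illustration of that cut-off tuning: the scores and labels below are invented for the example (in practice they would be a fitted model's predicted probabilities on a validation set), and the 10:1 cost ratio is an assumption, but the sweep shows how expensive false negatives push the optimal cut-off down:

```python
# Invented predicted probabilities and true labels (1 = positive class).
scores = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90]
labels = [0,    0,    0,    1,    0,    1,    1,    1]

COST_FN = 10.0  # missing a true positive is expensive (e.g. undetected cancer)
COST_FP = 1.0   # a false alarm is cheap by comparison

def total_cost(cutoff: float) -> float:
    """Total misclassification cost at a given probability cut-off."""
    cost = 0.0
    for s, y in zip(scores, labels):
        pred = 1 if s >= cutoff else 0
        if pred == 1 and y == 0:
            cost += COST_FP   # false positive
        elif pred == 0 and y == 1:
            cost += COST_FN   # false negative
    return cost

# Sweep a few candidate cut-offs and keep the cheapest one.
best_cost, best_cutoff = min((total_cost(c), c) for c in (0.1, 0.3, 0.5, 0.7))
print(best_cutoff, best_cost)  # a low cut-off wins when FN >> FP
```

The default 0.5 cut-off is rarely optimal once the two error costs differ; here the cheap-false-positive regime drags the best threshold down to 0.3.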

To the point of not believing... well, it is a forecast. Nobody has been to the future, observed the weather, and come back to tell the local radio station. This might be an issue of educating the public about quantitative models. 

Hi James, here is the link to the article you are referring to: Type I and Type II Errors in One Picture.


I think if you are considering weather, it is always better to overpredict and be extra cautious.




Thanks for the link!!

© 2018   Data Science Central ®