Suppose I have 10 independent variables, and I intentionally leave the NaN values in a few of them, convert the data to a NumPy array, and pass it to an ML algorithm. Will some algorithms raise an error? Can ML algorithms such as decision trees or SVMs handle NaN values? If an algorithm doesn't raise an error, how are the NaN values treated internally?

Replies to This Discussion

NaN stands for "not a number". If it actually represents a missing value, some algorithms, such as certain decision tree implementations, handle it automatically. See also imputation methods (google the term), which estimate what the missing value could have been.
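As a rough illustration of the simplest imputation method, here is a minimal sketch of mean imputation in plain Python. The function name `impute_column_means`, the toy data, and the choice to fill an entirely missing column with 0.0 are all assumptions made for this example, not something prescribed by any particular library.

```python
import math

def impute_column_means(rows):
    """Replace each NaN with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # Assumption for this sketch: a column with no observed values gets 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if math.isnan(value) else value for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical data: two variables, one missing value in each.
X = [
    [1.0, 10.0],
    [math.nan, 20.0],
    [3.0, math.nan],
]
X_filled = impute_column_means(X)
```

Mean imputation is only one option, and it can distort the variance of a variable, so it is worth comparing against other methods on your own data.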

Now if it is truly a NaN (for instance 0 divided by 0), I suggest you do some data cleaning to identify them. It is usually the result of an algorithm that has not been well coded (not being able to handle extreme cases), an ill-conditioned problem (a matrix determinant extremely close to zero causing problems in regression models), machine precision (you are dealing with very large numbers such as 10^1000), or even a data glitch (misaligned columns in a data set, and at some point you perform an arithmetic operation, say multiplication, on two values that are not numbers).

In any case, I would investigate what is causing these NaN values.
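One way to start that investigation is sketched below in plain Python: it shows a few floating-point operations that produce NaN, then counts NaN entries per column so you can see which variables are affected. The helper `nan_counts` and the toy data are hypothetical and only meant to illustrate the idea.

```python
import math

# Overflow: a product too large for a float becomes infinity, not NaN.
big = 1e308 * 10
inf = float("inf")

# Combining infinities (or zero with infinity) is what yields NaN.
examples = {
    "inf - inf": inf - inf,
    "0 * inf": 0.0 * inf,
    "inf / inf": inf / inf,
}

def nan_counts(rows):
    """Count NaN entries in each column of a list of equal-length rows."""
    counts = [0] * len(rows[0])
    for row in rows:
        for j, value in enumerate(row):
            if isinstance(value, float) and math.isnan(value):
                counts[j] += 1
    return counts

# Hypothetical data: the second column was produced by a faulty computation.
X = [
    [1.0, big - big],
    [math.nan, 2.0],
    [3.0, 4.0],
]
counts = nan_counts(X)
```

Note that NaN never compares equal to anything, including itself, so a check like `value == math.nan` will always be false; use `math.isnan` instead.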

Thanks. But just for my own knowledge: will an algorithm such as SVM or linear regression give an error if I pass it NaN values?

© 2019 Data Science Central ®