I am working on predictive maintenance and receive temperature data from assets. Sometimes an asset is down for a few days or even a few months, and during that time we get no temperature values. In this scenario I cannot fill the gaps with the usual missing-value techniques. I also cannot substitute a placeholder number, because even 0 and -1 are valid temperature values. How should I deal with such data?

I am thinking of filling such columns with a very large value that is impossible as a temperature. Please suggest.

Replies to This Discussion

I suppose six months of fake values would be of no use: they would distort the relationships in the data and lead to wrong predictions.

In the above scenario, the data cannot simply be filtered out or filled with a measure of central tendency (mean, median, or mode). The missing values can instead be imputed using the following techniques:

  1. Regression imputation: Use multiple regression to estimate each missing value from the other variables in the dataset. This works as long as there is enough complete data to fit a stable regression equation.
  2. Stochastic regression imputation: Predict each missing value by regressing it on other related variables in the same dataset, then add a random residual term to the prediction so that the imputed values keep a realistic amount of variability.
  3. MICE: Multivariate Imputation by Chained Equations (MICE) fills in the missing data multiple times. Multiple imputation (MI) is generally better than a single imputation because it reflects the uncertainty about the missing values. The chained-equations approach is also very flexible: it can handle variables of different types (i.e., continuous or binary) as well as complexities such as bounds or survey skip patterns.
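
As a rough illustration of techniques 1 and 2, here is a minimal sketch in plain Python. It assumes there is a second variable that is still recorded while the asset temperature is missing; the `ambient` readings, the `asset` values, and the helper names `fit_line` and `regression_impute` are all made up for this example. It uses a single predictor for brevity, whereas the list above talks about multiple regression. The last few lines repeat the stochastic fill several times, which is only the basic idea behind multiple imputation and not a full MICE implementation.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # n - 2 degrees of freedom: two parameters (a, b) were estimated
    resid_sd = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return a, b, resid_sd

def regression_impute(predictor, target, stochastic=False, rng=None):
    """Return a copy of target with None entries predicted from predictor.

    With stochastic=True, Gaussian noise with the residual standard
    deviation is added to each prediction.
    """
    observed = [(x, y) for x, y in zip(predictor, target) if y is not None]
    a, b, resid_sd = fit_line([x for x, _ in observed],
                              [y for _, y in observed])
    rng = rng or random.Random(0)
    filled = []
    for x, y in zip(predictor, target):
        if y is None:
            y = a + b * x
            if stochastic:
                y += rng.gauss(0.0, resid_sd)
        filled.append(y)
    return filled

# Hypothetical readings: ambient temperature is always recorded,
# asset temperature is None where no value was received.
ambient = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
asset = [31.0, 34.5, None, 43.5, None, 51.0]

deterministic = regression_impute(ambient, asset)
stochastic = regression_impute(ambient, asset, stochastic=True,
                               rng=random.Random(42))

# Multiple imputation in miniature: several stochastic fills, each of
# which would be analysed separately and the results pooled.
draws = [regression_impute(ambient, asset, stochastic=True,
                           rng=random.Random(seed)) for seed in range(5)]
pooled_means = [sum(d) / len(d) for d in draws]

print(deterministic)
print(stochastic)
print(pooled_means)
```

The deterministic fill puts every imputed point exactly on the fitted line, which understates the variance; the stochastic version and the repeated draws show how much the imputed values could plausibly vary. Whether any of this is appropriate for long downtime periods depends on whether a useful predictor is actually available during those periods.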

© 2020 Data Science Central ®