Hello, I'm working on a really interesting project, but I am getting a little lost. I have a time-series data set of performance data from thousands of individual modems. Each modem is probed every 15 minutes, and each probe adds a new entry to the database. I have 100 days' worth of data like this, about 300 million rows. A sample row (split in two because of its width):

MAC_Address   time                 duration  latency  down_speed  up_speed    down_power
AAAAAAAAAAAA  2015-10-13 00:47:12  312345    0.208    3534401000  1429432973  30

down_snr  up_power  status  next_status
437       507       OK      UNKNOWN

next_status is a column I created in R; it holds the status from the modem's next probe. I was able to build a logistic regression model that predicts the status at the next probe, but that only looks 15 minutes into the future. What I would really like is to predict, with some certainty, whether the status will change from OK to UNKNOWN within the next 24 hours. I am not sure how to rearrange my data or which modeling approach to take for this task. I would really appreciate any thoughts, ideas, or resources.
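
To make the 24-hour question concrete, here is a minimal sketch of one way the target column could be rebuilt so that it looks a full day ahead instead of one probe ahead. It is written in plain Python purely for illustration (the same idea carries over to R), and everything in it is an assumption rather than part of the actual pipeline: the function name `label_within_horizon`, the `(mac, time, status)` tuple layout, and the choice to leave the label undefined for probes whose 24-hour window runs past the end of the data.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime, timedelta


def label_within_horizon(rows, horizon=timedelta(hours=24), bad_status="UNKNOWN"):
    """Label each probe by whether the same modem reports bad_status
    at any later probe within `horizon`.

    rows: iterable of (mac, time, status) tuples (hypothetical layout).
    Returns a list of (mac, time, status, label) where label is
      1    -> a bad probe occurs in (time, time + horizon]
      0    -> the full window was observed and contains no bad probe
      None -> the window extends past the modem's last probe (unknown)
    """
    by_modem = defaultdict(list)
    for mac, ts, status in rows:
        by_modem[mac].append((ts, status))

    labelled = []
    for mac, probes in by_modem.items():
        probes.sort()  # chronological order within one modem
        bad_times = [ts for ts, status in probes if status == bad_status]
        last_seen = probes[-1][0]
        for ts, status in probes:
            # Index of the first bad probe strictly after this probe.
            i = bisect_right(bad_times, ts)
            if i < len(bad_times) and bad_times[i] <= ts + horizon:
                label = 1
            elif ts + horizon <= last_seen:
                label = 0
            else:
                label = None  # not enough future data to decide
            labelled.append((mac, ts, status, label))
    return labelled


# Tiny made-up example for one modem (not real data).
example = [
    ("AAAAAAAAAAAA", datetime(2015, 10, 13, 0, 0), "OK"),
    ("AAAAAAAAAAAA", datetime(2015, 10, 13, 12, 0), "OK"),
    ("AAAAAAAAAAAA", datetime(2015, 10, 14, 6, 0), "UNKNOWN"),
    ("AAAAAAAAAAAA", datetime(2015, 10, 15, 12, 0), "OK"),
]
labelled = label_within_horizon(example)
labels = [row[3] for row in labelled]
```

With a column like this, the rows whose current status is OK and whose label is not undefined could serve as training rows for the same kind of logistic regression, with the 24-hour label replacing next_status as the response.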

Tags: Big Data, Machine Learning, R, Revolution R, Time Series

Replies to This Discussion

I have an anonymized version of the data I can share.

https://drive.google.com/a/mtu.edu/file/d/0B6VvhxxLVGccMGpNRnNXYzZL...
