Hello, I'm working on a really interesting project, but I'm getting a little lost. I have a time series data set of performance data from thousands of individual modems. Each modem is probed every 15 minutes, and each probe adds a new row to the database. I have 100 days' worth of data like this, about 300 million rows. A sample row looks like this:
MAC_Address  time                duration latency down_speed up_speed   down_power
AAAAAAAAAAAA 2015-10-13 00:47:12 312345   0.208   3534401000 1429432973 30

down_snr up_power status next_status
437      507      OK     UNKNOWN
The next_status column is one I created in R: it is the status at the next probe. I was able to build a logistic regression model that predicts the status at the next probe, but my concern is that this only looks 15 minutes into the future. What I would really like is to predict, with some certainty, whether the status will change from OK to UNKNOWN within the next 24 hours (that is, within the next 96 probes). However, I am not sure how I should rearrange my data or what modeling approach I should take for this task. I would really appreciate any thoughts, ideas, or resources.
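To make the question concrete, here is a rough sketch of one relabeling I have been considering: instead of the status at the next probe, each row gets a flag saying whether the same modem reports UNKNOWN at any probe in the following 24 hours. It is written in Python with only the standard library, purely to keep it self-contained. The function name label_failure_within, its horizon argument, and the toy data are all made up for illustration, and I am not sure this is the right framing.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import datetime, timedelta


def label_failure_within(rows, horizon=timedelta(hours=24), bad_status="UNKNOWN"):
    """Hypothetical helper: rows is an iterable of (mac, time, status) tuples.

    Returns (mac, time, status, flag) tuples, where flag is
      1    if a later probe of the same modem within `horizon` has bad_status,
      0    if the full horizon was observed and no such probe exists,
      None if the data ends before the horizon does (label unknown).
    """
    by_modem = defaultdict(list)
    for mac, time, status in rows:
        by_modem[mac].append((time, status))

    labeled = []
    for mac, probes in by_modem.items():
        probes.sort(key=lambda p: p[0])  # chronological order per modem
        bad_times = [t for t, s in probes if s == bad_status]
        last_seen = probes[-1][0]
        for t, status in probes:
            # Index of the first bad probe strictly after time t.
            i = bisect_right(bad_times, t)
            if i < len(bad_times) and bad_times[i] <= t + horizon:
                flag = 1
            elif t + horizon > last_seen:
                flag = None  # not enough future data to decide
            else:
                flag = 0
            labeled.append((mac, t, status, flag))
    return labeled


# Toy data (invented): one modem, 200 probes 15 minutes apart,
# every probe OK except probe number 100.
start = datetime(2015, 10, 13, 0, 47, 12)
step = timedelta(minutes=15)
rows = [
    ("AAAAAAAAAAAA", start + i * step, "UNKNOWN" if i == 100 else "OK")
    for i in range(200)
]
labeled = label_failure_within(rows)
```

For training I would presumably keep only the rows where status is OK and flag is not None, and then fit the same kind of logistic regression on flag instead of next_status. Whether that is sensible, or whether something like a survival model fits better, is exactly what I am unsure about.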
I have an anonymized version of the data I can share.