What's the best model for a history of timestamped events vs. a binary response?

I have three tables:

  1. A list of client ids
  2. For those clients, 3 years of timestamped inclusions/exclusions in a credit bureau blacklist
  3. For those clients, 1 year of timestamped negotiations with the company I work for

I want to build a model that predicts the probability that a specific client will negotiate with us during a given time frame (anywhere from 1 week up to 1 year), based on that client's credit bureau history.

An easy way would be:

  1. Choose a date
  2. Bin the blacklist events (prior to that date) by time frame, e.g. monthly
  3. Choose a response time frame for the negotiations, e.g. 1 month after that date, and code the response as 0 = no negotiation happened, 1 = at least one negotiation happened
  4. Fit a logistic regression
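
To make the baseline concrete, here is a minimal sketch of the four steps in plain Python. The function names (`build_row`, `fit_logistic`, `predict_proba`), the 30-day bin width, and the toy clients are all assumptions for illustration, not part of the real data. The small gradient-descent fit only keeps the example self-contained; a library implementation of logistic regression would normally replace it.

```python
from datetime import date, timedelta
import math

def build_row(blacklist_dates, negotiation_dates, cutoff, n_bins=3, horizon_days=30):
    """Steps 1-3: 30-day counts of blacklist events before `cutoff`, plus a 0/1 label."""
    counts = [0] * n_bins  # counts[0] = most recent 30 days before the cutoff
    for d in blacklist_dates:
        if d < cutoff:
            b = (cutoff - d).days // 30
            if b < n_bins:  # events older than n_bins * 30 days are ignored
                counts[b] += 1
    end = cutoff + timedelta(days=horizon_days)
    label = int(any(cutoff <= d < end for d in negotiation_dates))
    return counts, label

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Step 4: plain batch gradient descent on the logistic loss."""
    w = [0.0] * (len(X[0]) + 1)  # last weight is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = predict_proba(w, xi)
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj
            grad[-1] += p - yi
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Probability of at least one negotiation in the response window."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: client id -> (blacklist event dates, negotiation dates)
cutoff = date(2017, 7, 1)
clients = {
    "a": ([date(2017, 6, 20), date(2017, 6, 5), date(2017, 4, 15)], [date(2017, 7, 10)]),
    "b": ([date(2017, 3, 1)], []),
    "c": ([date(2017, 6, 25), date(2017, 5, 20)], [date(2017, 7, 20)]),
    "d": ([], []),
}
rows = [build_row(bl, neg, cutoff) for bl, neg in clients.values()]
X = [counts for counts, _ in rows]
y = [label for _, label in rows]
w = fit_logistic(X, y)
```

With this layout, every client contributes exactly one row, and everything outside the chosen bins and the chosen response window is discarded.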

However, I feel I am losing a lot of information by doing that. What would be the state of the art for solving this? I imagine that with a good technique:

  • I wouldn't need to choose a base date, because training would use every possible cutoff in the available data (e.g. automatically start at the first available date, split the datasets into past/future, train on that, increment the date by 1, and iterate until the last available date)
  • I wouldn't need to bin the predictors
  • The response probability would be dynamic over the response time frame (i.e. I could train a single model, give it a test set and an arbitrary response time frame of up to 1 year, and it would predict the probabilities for that specific time frame)
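
One way to picture the three wishes above is a stacked training set with one row per (cutoff date, response time frame) pair for each client. The sketch below is only an illustration of that idea, not a claim about the state of the art: `rolling_rows`, the two unbinned features (event count and days since the last event), and the toy dates are assumptions.

```python
from datetime import date, timedelta

def rolling_rows(blacklist_dates, negotiation_dates, first_cutoff, last_date,
                 horizons=(7, 30, 90), step_days=1):
    """Yield (cutoff, horizon_days, features, label) for one client.

    Every cutoff from `first_cutoff` to `last_date` is used, and a horizon is
    kept only when its whole response window lies inside the observed data.
    """
    cutoff = first_cutoff
    while cutoff <= last_date:
        past = [d for d in blacklist_dates if d < cutoff]
        # Unbinned summaries of the history (hypothetical feature choices).
        n_events = len(past)
        days_since_last = (cutoff - max(past)).days if past else None
        for h in horizons:
            end = cutoff + timedelta(days=h)
            if end > last_date + timedelta(days=1):
                continue  # window not fully observed, so the label would be unreliable
            label = int(any(cutoff <= d < end for d in negotiation_dates))
            yield cutoff, h, (n_events, days_since_last), label
        cutoff += timedelta(days=step_days)

# Hypothetical single-client example
blacklist = [date(2017, 1, 5), date(2017, 1, 20)]
negotiations = [date(2017, 2, 3)]
rows = list(rolling_rows(blacklist, negotiations,
                         first_cutoff=date(2017, 2, 1),
                         last_date=date(2017, 2, 10),
                         horizons=(3, 7)))
```

Passing `horizon_days` to the model as an extra input is what would let a single model answer for an arbitrary response time frame. Rows from the same client at adjacent cutoffs overlap heavily, so they are not independent observations; any train/test split would probably need to be made by client rather than by row.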

Thanks!

© 2018 Data Science Central ®