Hi, I am new to analytics, and what I studied at college is totally different from what industry actually wants from a data engineer.
I am a summer intern at IDRC. We have a cab company as our client, and we want to run analytics on their
data to increase their revenue.
We have two types of data:
1. GPS data polled from 100+ taxis (position data), and
2. Client’s call-centre data (demand data).
Bookings with immediate requirements are easy to accept or deny with high certainty (you either have a cab available, or you don’t). The tricky bit is committing to bookings which ask for travel some time away, such as 4, 6, or 8 hours from the time of booking.
We want to create a probability dashboard for such bookings. Let’s say each booking begins at 60% certainty of being fulfilled; as the ‘factors’ change, this number should update in real time (for each booking). If the number falls below a threshold, say 30% (a red zone), we should be able to take an ‘intervening action’.
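To make the idea concrete, here is a rough Python sketch of the kind of per-booking score I have in mind. It is only an illustration, not something we have built: the two factors (free cabs near the pickup point, taken from the GPS data, and expected demand in the pickup window, taken from the call-centre data), the logistic form, and the weight of 0.4 are all assumptions I made up for the example. Only the 60% starting value and the 30% red zone come from our actual requirement.

```python
import math
from dataclasses import dataclass

BASE_CERTAINTY = 0.60  # starting certainty for a new booking
RED_ZONE = 0.30        # below this, an intervening action is needed


@dataclass
class Factors:
    # Hypothetical factors; the real ones still have to be chosen.
    free_cabs_nearby: int    # would be derived from GPS position data
    expected_demand: float   # would be derived from call-centre demand data


def fulfilment_probability(f: Factors, base: float = BASE_CERTAINTY) -> float:
    """Shift the base certainty up or down on the log-odds scale."""
    logit = math.log(base / (1.0 - base))
    # Made-up weight: each cab of surplus (or shortfall) moves the log-odds by 0.4.
    logit += 0.4 * (f.free_cabs_nearby - f.expected_demand)
    return 1.0 / (1.0 + math.exp(-logit))


def needs_intervention(p: float, threshold: float = RED_ZONE) -> bool:
    """True when the booking has dropped into the red zone."""
    return p < threshold


# Re-score a booking whenever its factors change.
balanced = fulfilment_probability(Factors(free_cabs_nearby=5, expected_demand=5.0))
shortfall = fulfilment_probability(Factors(free_cabs_nearby=2, expected_demand=8.0))
print(round(balanced, 2), needs_intervention(balanced))
print(round(shortfall, 2), needs_intervention(shortfall))
```

When supply and demand are balanced the score stays at the 60% starting value, and a shortfall of cabs pushes it down towards the red zone. What I do not know is how to choose the real factors and learn their weights from the data.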
Can anybody tell me how to proceed with building an algorithm on this type of data? I have attached an Excel sheet which contains the data.