We analyzed Glassdoor.com data to compile 10 interview questions for Data Scientist positions. How many can you answer?
“How do you take millions of users with hundreds of transactions each, across tens of thousands of products, and group the users into meaningful segments?”
You're about to get on a plane to Seattle. You want to know if you should bring an umbrella. You call 3 random friends of yours who live there and ask each independently if it's raining. Each of your friends has a 2/3 chance of telling you the truth and a 1/3 chance of messing with you by lying. All 3 friends tell you that "Yes" it is raining. What is the probability that it's actually raining in Seattle?
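One way to reason about this question is Bayes' rule. The sketch below is only an illustration: the question does not give the prior probability of rain, so `prior` is an assumed input, and the function name `posterior_rain` is hypothetical.

```python
from fractions import Fraction


def posterior_rain(prior, n_yes=3, p_truth=Fraction(2, 3)):
    """Posterior probability of rain after n_yes independent friends all say "Yes".

    prior is an ASSUMPTION: the question does not state how often it rains.
    """
    like_rain = p_truth ** n_yes        # it is raining and every friend tells the truth
    like_dry = (1 - p_truth) ** n_yes   # it is dry and every friend lies
    return prior * like_rain / (prior * like_rain + (1 - prior) * like_dry)


# With an assumed 50% prior, three "Yes" answers give 8/9.
print(posterior_rain(Fraction(1, 2)))
# With an assumed 25% prior, the same answers give 8/11.
print(posterior_rain(Fraction(1, 4)))
```

The answer depends heavily on the prior, which is probably the point an interviewer wants to hear.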
“How do you know if one algorithm is better than another?”
4) Goldman Sachs
There are two boxes: the first has 12 black and 12 red cards, and the second has 24 black and 24 red cards. If you draw 2 cards at random from one of the two boxes, which box gives the higher probability of getting two cards of the same color? Can you explain intuitively why the second box has the higher probability?
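The two probabilities can be checked by counting pairs. This is a minimal sketch assuming both cards are drawn without replacement from the same box; `p_same_color` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb


def p_same_color(n_per_color):
    """Probability that 2 cards drawn without replacement share a color,
    for a box holding n_per_color black and n_per_color red cards."""
    same = 2 * comb(n_per_color, 2)    # both black or both red
    total = comb(2 * n_per_color, 2)   # any pair of cards
    return Fraction(same, total)


print(p_same_color(12))  # 11/23
print(p_same_color(24))  # 23/47
```

The result simplifies to (n - 1) / (2n - 1): after the first card is drawn, n - 1 of the remaining 2n - 1 cards match it. That ratio moves closer to 1/2 as the box grows, because removing one card matters less.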
5) American Express
We have about a million card members along with their transactions. We also have 10k restaurants and 1k food coupons. Propose a method for distributing the coupons to users, given that some users have already received coupons.
Given two lists of sorted integers, develop an algorithm to sort these numbers into a single list efficiently.
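A common approach to this question is the two-pointer merge used in merge sort, which runs in time linear in the total number of elements. The sketch below assumes both inputs are sorted in ascending order; `merge_sorted` is a hypothetical name.

```python
def merge_sorted(a, b):
    """Merge two ascending lists into one ascending list in O(len(a) + len(b))."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller of the two current front elements.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # At most one of these tails is non-empty, and it is already sorted.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


print(merge_sorted([1, 4, 9], [2, 3, 10, 11]))  # [1, 2, 3, 4, 9, 10, 11]
```

In everyday Python, `heapq.merge` from the standard library does the same job lazily.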
“How do you find trending queries or topics? How do you test a website feature, i.e., given a set of web pages and a few changes, how would you determine whether the changes have a positive effect?”
You are compiling a report on user content uploaded every month and notice a spike in uploads in October, in particular a spike in picture uploads. What do you think might be the cause, and how would you test it?
Imagine you have N pieces of rope in a bucket. You reach in and grab one end, then reach in and grab another end, and tie those two together. You repeat this until no free ends remain. What is the expected number of loops in the bucket?
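A sketch of one way to check an answer here, assuming the tying is repeated until no free ends remain and that every remaining end is equally likely to be grabbed. With k ropes left, the second end belongs to the same rope as the first with probability 1/(2k - 1), which closes a loop. Either way, one rope is removed from play. The function names are hypothetical.

```python
import random
from fractions import Fraction


def expected_loops(n):
    """Exact expectation: sum of 1 / (2k - 1) for k = 1..n."""
    return sum(Fraction(1, 2 * k - 1) for k in range(1, n + 1))


def simulate_loops(n, trials=20000, seed=0):
    """Monte Carlo estimate of the mean number of loops."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        for k in range(n, 0, -1):
            # 2k - 1 ends remain after the first grab; exactly one of them
            # is the other end of the same rope.
            if rng.randrange(2 * k - 1) == 0:
                total += 1
    return total / trials


print(expected_loops(3))         # 23/15
print(float(expected_loops(3)))  # about 1.533
print(simulate_loops(3))         # should land close to the exact value
```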
“How do you test whether a new credit risk scoring model works? What data would you look at?”