19 Worst Mistakes at Data Science Job Interviews

This applies to many tech job interviews, but here we provide specific advice for data scientists and other professionals with a similar background. More advice is being added regularly.

Here's the list:

  1. Not doing any research on the company prior to the interview.
  2. Not understanding whether they want to hire a scientist or a developer. In many cases, they want to hire essentially a developer or a coder, but one that is also a scientist - in short, a unicorn. You might be able to convince them that you are good at both, by emphasizing your business acumen, backed by success stories verifiable with hard facts, and easy to quantify with some yield metrics.
  3. Talking too much about their competitors during the interview (unless you offer a credible solution to kill them).
  4. Believing that they want creative talent: hiring managers fear highly creative people more than incompetent employees. A little bit of creativity is OK, but not too much. You don't want to appear as a disruptor.
  5. Forgetting to explain how you can help.
  6. Providing sloppy code when asked to solve an algorithmic problem.
  7. Failing to bring a portfolio with you, featuring well-written and documented sample code, powerful visualizations (the ones that anyone can understand in 15 seconds), or dashboards.
  8. Ignoring business issues during the interview, and instead focusing exclusively on code, tech, or theory.
  9. Offering artificial answers to questions, aimed at seducing the interviewer by saying exactly what they expect to hear. Beware: hiring managers are not stupid and will detect this trick very fast.
  10. Failure to mention success stories, together with metrics that measure the success in question (like reduced costs by 30%, increased retention by 20%). 
  11. Not knowing the tools, techniques, platforms, or programming languages that your reports (if you are hired) are using. You should at least have a general idea about them: for instance, be able to explain the differences between Python and R even if you have never used either language. It is a good idea to ask HR who your interviewers will be, and to research their background on LinkedIn. Even connect with them on LinkedIn!
  12. Not mentioning any teamwork you did in the past.
  13. Not knowing the trends (or top leaders) in your industry, so that you can't answer questions such as "How do you think deep learning will evolve over the next 5 years?" or "Is IoT or AI here to stay?" Easy fix: read our Monday digests to stay informed about trends.
  14. Applying for the wrong job, for instance a coder position if you are a scientist. Be very clear upfront about who you are: this will save everybody a lot of time, including you.
  15. Making promises that you can't keep. Once hired, your life will become miserable, and you will be an easy target for getting fired.
  16. Failure to think like the interviewer. Just try to think, if you were the interviewer, what would you want from the successful job applicant, for this specific job? Then provide answers that address these questions. And if you have no clue about these "hidden" questions, just ask the interviewer what these questions might be! 
  17. Telling the same story to all interviewers. Rather, try to understand what each interviewer is interested in (a bit of research before the interview can help), then focus on customized topics to discuss with each interviewer. If in doubt, ask each interviewer what the great challenges are, for the job you are applying for, and especially as far as he/she is concerned. Then provide meaningful answers. It's OK to answer a question by saying that you don't know (but that you can research it and get the answer  very quickly). Some interviewers will appreciate that you don't know an answer, and will feel superior and not threatened by you. It may play to your advantage. Some interviewers will talk non-stop: let them talk and give them the impression that they are very knowledgeable. 
  18. Not knowing what kind of data scientist you are, and what kind of data scientist they are looking for. This article might help you with this. 
  19. Failure to ask what the urgent issues are. And if these issues are brought to you, failure to (1) recognize that they are very urgent and challenging, hence the need to hire someone, and (2) suggest any idea or path to address them, including how much time and how many resources are needed.
  20. Giving the impression that you know everything, and that the challenges the hiring manager is facing are trivial. And if they really are trivial, don't offer the solution. Negotiate a contract instead, knowing that the job will be easy for you.
  21. Failure to ask why the previous employee left, or why the position is open.
  22. Not differentiating yourself from other candidates by mentioning your selling points: a geek who understands business talk, a girl who strives to complete all projects on time, a guy who loves to automate his tasks when possible to save time and handle bigger workloads, someone who works very well in a team, knows how to delegate, and brings inspiration and positive energy to colleagues, a self-learner who learned Python and R on their own, a guy who develops popular apps in his spare time, a respected author with his own blog and articles, someone with experience with really big data (terabytes), or a girl who loves optimizing processes and can provide examples.
  23. Calling a data set with 100,000 observations 'big data'.
  24. Thinking that the techniques you learned at school can be applied to any kind of problem or data with little, if any, adjustment, and not being aware of modern, robust, scalable techniques not taught at school. One solution is to get some data (there are tons of free data sets) and use a modern tool, such as a cataloguer algorithm, to automatically process a few gigabytes of data. Now you have something interesting to talk about during your job interview, especially if you can describe the benefits it offers (automated, fast indexing of big unstructured data, creation of search engines or taxonomies such as Amazon's big product listing).
  25. Being unable to say much about the speed (computational complexity) of various algorithms, offering slow or inefficient solutions when asked to solve a problem, and not knowing where the complexities and bottlenecks are in modern platforms.
  26. Believing that data is king. Not being able to guess where sources of bias and variance might come from. Having no experience working with messy data. Not knowing how data is produced, and how metrics are identified. Being able to talk only about static data.
  27. Not being able to tell the pros and cons of two popular platforms, products, architectures, programming languages, or algorithms. You need to read the literature to become familiar with this: for instance, R versus Python, the 8 worst predictive techniques, 10 types of regressions and which one to choose, or Hadoop versus Spark.
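
To make item 25 concrete, here is a minimal Python sketch of what "slow versus fast" can look like in an interview answer. The task (checking a list for duplicates) and both function names are illustrative assumptions, not taken from any particular interview. The first version compares every pair of values; the second keeps a set of values seen so far.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: O(n^2) time, O(1) extra memory."""
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Remember seen items in a set: O(n) expected time, O(n) extra memory.

    Assumes the items are hashable (numbers, strings, tuples, ...).
    """
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both functions return the same answers, but a candidate is usually expected to state the trade-off: the set-based version is much faster on large inputs at the cost of extra memory, and it only works when the items are hashable.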


Comment by Dave Dyer on March 2, 2016 at 9:20am

This is a great list, thank you.  I'd love to see a list of common mistakes data science INTERVIEWERS make, also. For instance:

  • Basing your opinion on questions that are irrelevant for the position (e.g. do you really *need* to grok neural nets when all your problems are actually stats problems?)
  • Fixation on a particular tool or technique based on personal bias.
  • Unwavering confidence that every data scientist should know "this one thing" or else they're not a real data scientist. 
  • Not doing your research about a) who is applying and b) the position they're applying for.

Unfortunately, a lot of these issues are enabled/reinforced by the hundreds of "how to spot a fake data scientist" posts. It's getting ridiculous. It's super easy to spot a fake data scientist -- they don't solve complex business questions using statistics, programming, ML, etc. Choosing a t-test over an inverse matrix algorithm should never be grounds for dismissal based on your personal opinion (if the technique matches the problem).

Comment by Richard Ordowich on March 2, 2016 at 7:38am

The most important skill and knowledge for anyone working with data is to be Data Literate. If you can't explain the semantics of data, including the importance of  pragmatics, if you can't build a taxonomy or ontology or if you think that there is such a thing as raw data, you are Data Illiterate. Unfortunately most people working with data remain Data Illiterate, including those who profess to be data scientists. Scientists in other domains are literate in the raw materials they use but few data scientists have this knowledge.  Without this knowledge you are a "data pusher". 
