How To Identify A Good/Bad Data Scientist In A Job Interview?

Data scientists are in notoriously high demand, so when your company is ready to make the leap into big data, it pays to understand how to tell if you’re getting a good one.



Because of the vast amounts of money at stake with some big data projects, every data scientist wants you to believe that he or she is the kind of genius who can tease industry-changing information from a set of numbers and some code. And some can. But some can’t.


If you’re ready to hire a data scientist for your project or organization, there are some important questions to ask to make sure you get the right person for the job:


  • Does the candidate have solid programming skills?
    A data scientist needs the skills not just to view and analyse the data, but to manipulate it as well. A statistician who reviews and interprets a set of data is very different from a data scientist who can change the code that collects the data in the first place.
  • Do they excel at producing analytics for computers or humans? (And which do you need?)
    There are two main types of big data analytics: those whose end user is solely a computer, and those whose end user is a human. If your end result is a machine learning algorithm that, for example, chooses which ads to show on a website or makes automatic stock trades, your analytics are for computers. If, on the other hand, a human will make a choice based on the analytics, your analyst needs a different set of skills: chiefly, the ability to tell a story through data and to visualize that data well.
  • Can they provide concrete examples of when they’ve improved a business process through their work?
    As with any position, look for real-world examples of improvements to a business process that the candidate has successfully implemented.
  • Are they a good communicator?
    Stereotypes would have us believe that it’s OK for scientists and techy types to be introverts with poor communication skills, but that’s not really an option for a data scientist. He or she needs to be able to communicate effectively with people who don’t “speak the same language,” tell a story through data, and use visual communication effectively.
  • Can they be creative and open-minded?
    Big data is a rapidly changing and expanding field that requires a certain open-mindedness and creativity. To innovate, a good data scientist must be able to look beyond what came before. If a candidate has implemented the same processes or procedures at multiple companies, ask yourself seriously if he or she is able to innovate and try something new.
  • Have they got a scientific mind-set?
    As the name suggests, data scientists should be scientists who apply the scientific method to data. This means being able to experiment with data to find models and algorithms that are useful for businesses and can be used to predict future events. Scientists are inquisitive, but they follow the scientific method in their endeavour to find models that are useful in the real world.
  • Do they have solid business understanding?
    It’s one thing to understand the science and mathematics behind analysing huge data sets. It’s another thing entirely to truly understand how that data affects profitability, user experience, and employee retention — or any of a myriad other factors important to the business. Someone with a background in business will be better at spotting trends that will benefit your business.

If you are a data scientist or have hired one for your company, what other traits would you add to the list? What differentiates a good data scientist from a mediocre one? I’d love to hear your thoughts in the comments.


About: Bernard Marr is a globally recognized expert in analytics and big data. He helps companies manage, measure, analyze and improve performance using data.

His new book is Big Data: Using Smart Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance.



  • Sione Palu

    It's the wrong question to start with, and it will lead companies to hire the wrong people.

    We advertise for data scientist as a role, but when we screen candidates, we don't put heavy emphasis on those who have previously held data scientist job titles, simply because data scientist is a recently invented term. We state in the job ad that we're looking for candidates with a Masters or PhD in any quantitative field: math, stats, physics, bioinformatics/bio-medical engineering, signal processing, and so forth. We know that candidates from those fields can step up to the mark. They may not be software architects or software engineers, but that area is covered by the engineering team. So we look for candidates who can come up with their own ideas or do their own R&D (i.e., who can research the literature for potential solutions to our business problems). Anyone who looks only for a data scientist background or qualification is simply being slack in identifying what he or she is looking for. If that's the case, then why not just advertise for a data analyst? That's what most companies were doing all along, before the term data scientist became popular. We look for candidates who can go the extra mile, such as being able to understand the latest research and prototype it for evaluation. IMO, when companies specifically look for CVs that mention data scientist (as a past job title or a current certification), they're narrowing their options. They may miss the forest for the trees if they only go for CVs that state past data scientist job titles or data scientist qualifications.

  • Chang Hsiung

    Excellent point.

  • Sunitha V Thampi

    As the designation indicates, 'scientists' are people who start their work with a 'question'. Ideally, every analysis should start with a question, and the outcome of the analysis should be the answer to that question. Data is only a means of reaching that answer. Hence every piece of information is important in that process. To many data-related professionals, data means only structured, numerical data. But you may get the real answer from verbal or written conversations, or from other forms of expression (such as stories or pictures). The real efficiency comes from structuring the unstructured data and coming up with efficient solutions or answers to the questions they started with.