Datapalooza: Compose Your Data-Science Music

Guest blog post by James Kobielus

Data science is a creative problem-solving exercise. People become data scientists for many reasons, and the creative challenge is probably high up on the list.

Data scientists, like most creative people, are adept at what’s often called “pattern thinking.” This is the ability to discover beautiful regularities in the world around us that others may have overlooked. To search for statistical regularities, data scientists use big-data platforms, high-powered analytical tools, and interactive visualizations to find correlations that might otherwise elude them.

How do you identify pattern-thinking aptitudes in candidates for your company's next data-scientist position? This article highlights an interesting approach that at least one organization has taken to identifying strong pattern thinkers for their data science practices. An executive at Booz Allen Hamilton says the consulting firm, in addition to hiring statisticians, computer scientists, and domain experts, has had success with both physicists and music majors. "Both groups tend to bring curiosity and experimentation into play...with physicists 'exuding the scientific method' -- moving from conjecture to hypothesis to testing -- and music majors offering 'amazing creativity and quantitative skills.'"

Where data-science teams are concerned, you are quite likely to find one dominant personality type: people who have ample curiosity, intellectual agility, statistical fluency, analytical acuity, research stamina, and scientific discipline. Of course, these aptitudes are not evenly distributed throughout the population. If you’re assembling a data-science practice, you need an aptitude for social pattern thinking to determine what types of individuals would complement each other best. Some data scientists are awesome polymaths who have mastered a wide range of skills, while others are strict specialists. Some are closer to the statistical-analyst end of the skills spectrum, whereas others take pride in being the subject-matter expert that all the data scientists run to when the question turns to marketing, finance, or another business domain.

The productivity of the entire data-science team depends on being able to balance this mix of people, aptitudes, skills, and roles. But more than that: it depends on being able to incorporate new roles into the team as the nature of big data and data science initiatives evolves. For example, the notion of a "customer experience modeler" is of fairly recent vintage, and that person is usually not the same expert you would hire for, say, log-linear regression modeling. It may be someone with a degree in the humanities, not mathematics and statistics.

This new reality is the focus of an InformationWeek article, "How To Build An Analytics A-Team." The piece discusses a study by Blue Hill Research in which that firm outlines several important roles within data-science organizations. I've arranged the bulleted list of roles from well-established (in business intelligence and data management generally) to newer and less frequently found in traditional data-science organizations:

  • Data visualizer: visual orientation, focusing on innovative ways of presenting data-driven insights within "instinctual graphics"
  • Data custodian: quality orientation, focusing on data governance, data cleansing, and master data management
  • Data evangelist: application orientation, focusing on identifying new uses for big data analytics
  • Contextual analyst: narrative orientation, focusing on interpreting and describing quantitative insights within the larger business context
  • Neuro-analyst: cognition orientation, focusing on how humans can best interact with data-driven analytics to drive comprehension, exploration, and insights

If you've already included all or many of these as distinct jobs in your initiative, you're in the forefront of businesses that have committed to building data-science centers of excellence (CoEs). And you may have even created a position for CoE administrator, whose core job is to build up the environment where cross-role "chemistry" takes hold. Here are some tips for finding the best blend of data science professionals and for orchestrating their efforts in a collegial environment:

  • Conduct regular data-science competitions where teams can win awards for tackling tough analytic challenges;
  • Put job candidates through rigorous interview processes where they must defend their research theses;
  • Offer advanced training courses and opportunities for self-improvement;
  • Allow data scientists to participate in professional conferences;
  • Encourage data scientists to publish in external journals and other channels;
  • Organize regular gatherings that encourage data scientists to talk, present their work, and learn from each other;
  • Let data scientists pursue their curiosities and research agendas with minimal interference;
  • Offer data scientists the opportunity to collaborate in a steady stream of new, challenging projects in which they can develop new skills and experiment with new approaches.

Want to engage with a creative community of top-notch data science professionals? Get your ticket here for the first Datapalooza, which will take place next week, November 10-12, at Galvanize in San Francisco. Sponsored by the Spark Technology Center, Datapalooza will enable you to take your data-science skills to the next level. You’ll gain hands-on experience, enjoy one-on-one coaching, and learn how to build a practical data-science product in just three days. In doing so, you’ll be addressing real-world data-science challenges that require creative pattern thinking, machine learning, cognitive computing, natural language processing, and stream computing.

You should also explore this informational IBM Analytics resource page on Spark.

Kobielus is an industry veteran and serves as IBM Big Data Evangelist; Senior Program Director for Product Marketing in Big Data Analytics; and Team Lead, Technical Marketing, IBM Big Data & Analytics Hub. He spearheads IBM's thought leadership activities in Big Data, Hadoop, enterprise data warehousing, advanced analytics, business intelligence, and data management. He works with IBM's product management and marketing teams in Big Data. Kobielus has spoken at such leading industry events as IBM Insight, Hadoop Summit, and Strata. He has published several business technology books and is a popular provider of original commentary on blogs and social media.

