
Research Brief: Four Functional Clusters of Analytics Professionals

Authored by:

Pasha Roberts, Chief Scientist

Greta Roberts, CEO

July 2013

Key Findings

Digging into a cross-industry study of analytics professionals, we identify four distinct patterns of how these workers spend their week: (1) Data Preparation, (2) Manager, (3) Programmer, and (4) Generalist.  These functional clusters are defined by unique time-usage patterns, and further, exhibit important differences across dimensions of education, demographics, and mindset.  This brief quantifies these characteristics, and their deep implications for sourcing, hiring, managing, and retaining analytics professionals in each category.

Four Functional Clusters of Analytics Professionals 

The groundbreaking 2012 Analytics Professionals Study by Talent Analytics, Corp and the International Institute for Analytics used many measures to understand the characteristics of modern analytics professionals and data scientists [1]. The study examined 302 active analytics professionals in a diverse sample of companies, industries, sizes and circumstances. For the purpose of this paper we will use the terms data scientist and analytics professional interchangeably.

Our premise? Data scientists have been discussed as if they performed a single role. We suspected this role was wider, containing a broader workflow of tasks. Clarifying the analytics workflow and tasks performed by data scientists could also provide insight into those looking to hire and retain professionals in this important role.

To this end, we asked participants in our study how many hours per week they spent in various analytics-related activities. Every attempt was made to capture and reflect tasks in a modern analytics workflow.

The study also gathered 11 factors pertaining to an individual's “raw talent”, also known as aptitude or mindset. Aptitude is distinct from achievement.  This study will show how aptitude and other factors differ across job types.

Time Spent on Tasks in Analytics Workflow

In aggregate, the study sample spent roughly the same percentage of time performing tasks in the following 11 categories:

However, upon deeper analysis it became clear that there were several “types” of workweeks at hand in this sample.

Fuzzy Clustering Yielded Best Results for Understanding Analyst Work

When we reviewed the 11 different analytics functions, 4 categories emerged, each containing multiple functions.

Several algorithms were examined to perform cluster analysis upon this sample, and “Fuzzy Clustering” delivered the best results. This method implies that each item belongs to each cluster to some degree, which makes sense given the fluidity of most analysts' work. The best results were found with four clusters, which were named based on their dominant activities.
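To make the idea of partial membership concrete, here is a minimal sketch of fuzzy c-means, one common fuzzy clustering algorithm. The brief does not say which fuzzy method or software was used, so the algorithm choice, the `fuzzy_c_means` function, and the tiny `respondents` table of weekly time shares are all illustrative assumptions, not the study's data or procedure.

```python
def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Basic fuzzy c-means: returns (centers, memberships).

    memberships[i][k] is the degree (0..1) to which point i belongs to
    cluster k; each row sums to 1. `m` > 1 controls how soft the split is.
    """
    n, dim = len(points), len(points[0])
    # Deterministic start (an assumption of this sketch): evenly spaced
    # rows of the data serve as the initial cluster centers.
    centers = [list(points[i * n // n_clusters]) for i in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        # Membership step: closer centers receive larger membership.
        memberships = []
        for p in points:
            dist = [
                max(sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for c in centers
            ]
            inv = [dk ** (-2.0 / (m - 1.0)) for dk in dist]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center step: membership-weighted mean of all points.
        centers = []
        for k in range(n_clusters):
            w = [row[k] ** m for row in memberships]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum
                 for d in range(dim)]
            )
    return centers, memberships


# Hypothetical share of the work week spent on
# (data preparation, programming, managing) for six made-up respondents.
respondents = [
    [0.50, 0.10, 0.05],
    [0.45, 0.15, 0.05],
    [0.10, 0.35, 0.05],
    [0.15, 0.30, 0.05],
    [0.05, 0.05, 0.55],
    [0.10, 0.05, 0.50],
]

centers, memberships = fuzzy_c_means(respondents, n_clusters=3)
for row in memberships:
    print([round(v, 2) for v in row])
```

Each printed row gives one respondent's degree of membership in each cluster. Naming a cluster after its dominant activity, as the brief does, corresponds to reading off the largest coordinate of each cluster center.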

Reading Density Plot Chart Types (as seen below):

This brief uses a chart type known as a “Density Plot”.  A “bell curve” is a density plot. It displays the estimated population percentage for each possible value of a variable, that is the “density” of that variable. The horizontal X axis measures a single raw talent metric (like Curiosity) on a scale of 1 – 100. The farther to the right, the closer the score is to 100, and the more Curious the individual is. The vertical Y axis shows the percentage of the population estimated at each raw talent score. The higher up, the more of the population will have this score at X.

In a random sample of people, the same number of people would have a Curiosity score of 29 as would have a Curiosity score of 78. Therefore, a random sample of people would display as a flat line at 1% (we drew a dotted line at 1% to show where a random sample of people would score.)

When viewing our Density plots, the most interesting information is found where the curve rises above or falls below the dotted line, that is, where the group differs from a random sample.
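The reading rule above can be sketched in a few lines. This is an illustration only: the brief does not state how its density curves were estimated, so the Gaussian kernel density estimate, the `gaussian_kde` helper, the bandwidth of 8, and the `curiosity_scores` list are all assumptions made for the example.

```python
import math


def gaussian_kde(scores, bandwidth=8.0):
    """Return a function that estimates the density of `scores` at x."""
    n = len(scores)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in scores
        )

    return density


# A random sample spread evenly over 100 score values would sit at a
# flat 1% per score: the dotted line in the brief's charts.
UNIFORM_BASELINE = 1.0 / 100

# Hypothetical Curiosity scores for ten made-up analysts.
curiosity_scores = [62, 70, 74, 78, 81, 83, 85, 88, 90, 93]
density = gaussian_kde(curiosity_scores)

for x in (20, 50, 80):
    side = "above" if density(x) > UNIFORM_BASELINE else "below"
    print(f"score {x}: estimated density {density(x):.4f} ({side} the 1% line)")
```

Scores where the estimate sits above the baseline are over-represented in the group relative to a random sample; scores below it are under-represented.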

Analysts in All Functional Clusters Have Important Similarities

For the purpose of this paper we focus primarily on teasing apart differences in the 4 functions inside of the data scientist role.  These differences have implications for hiring, promoting and retaining. Before we tease apart those differences, it is worth noting the similarities among the functional clusters.

Similarities Seen in All Functional Clusters

Analysts in all 4 functional clusters have two things in common: 1) very strong intellectual curiosity (Theoretical Drive, see Figure 1) and 2) a strong drive to create out-of-the-box solutions (Creative Drive, see Figure 2).

Figure 1: Level of Intellectual Curiosity. 
(the further right, the more curious.)

Figure 2: Level of Creativity
(the further right, the more creative)

Four Functional Clusters

Cluster 1: Data Preparation Analysts

Analytics professionals in the first group, the Data Preparation cluster, spend a significant amount (46%) of their time gathering and preparing data for analysis used later on in the analytics workflow.

Figure 3: Time by Analytics Function for Data Preparation Cluster

How are Data Preparation Analysts Different from other Analysts?

  • One interesting difference is that they are decidedly less Competitive (less Politically motivated).  This group is less interested in advancing up the hierarchy, competing for an internal promotion as a reward or engaging in power struggles with management, other analysts, employees or their customers.  (See Figure 4)

Figure 4: Density of Drive to Compete and Win
(the further right, the more motivated to compete)

 

  • Our analysis showed that Data Preparation analysts have the strongest aptitude for detail and are least likely to make mistakes. This makes sense, as a Data Preparation role requires constant attention to detail and attracts those who embody this quality. (See Figure 5)

 

Figure 5: Density of Drive to be Exact, Accurate, Mistake-free
(the further right, the more detailed, exact, precise in their work)


Sourcing, Hiring, Managing and Retaining Data Preparation Analysts

  • Sourcing:  Data Preparation candidates are likely to be found in other areas of your organization, particularly roles that are detail rich.  Remember that aptitude for detail and accuracy needs to be reviewed in addition to the requirement for strong intellectual curiosity and creativity.  Of course, it is also vital to evaluate intelligence and training for such a role, but Data Preparation requires the least statistical domain knowledge of the four clusters.
  • Hiring: In trying to attract candidates to this role, do not focus on political growth, career advancement, or a future with lots of senior level visibility.
  • Managing: Data Preparation staff will want details about their goals and performance vs. general comments.  They will keep details on their own projects and performance and be disappointed if this is not similarly tracked.
  • Retaining: The work in the Data Preparation cluster is definitely on the “back office” side of analytics, but it is a large and essential function.  Professionals in this role share much of the creativity, intellectual interests and beliefs as other analytics professionals.  They will easily become bored and leave for another role that satisfies their intellectual curiosity more fully.  Ultimately they are looking for a mentally challenging role that appeals to their natural curiosity and creativity more than career advancement.

Cluster 2: Analytics Programmers

The second functional cluster identified consists of analytics professionals whose workweek is weighted more heavily toward programming: writing computer code to manipulate and process data.  They spend more than 3 times as much time programming as any of the other clusters. That said, analytics Programmers still spend only one third of their time, on average, programming.  They spend the rest of their time on other analytics-related activities, like other analysts.

Figure 6: Time by Analytics Function for Programmer Cluster

  • This is the youngest age group; almost half are under 29 years old.
  • Programmers are the least experienced group; over half have less than 5 years of either business or analytics experience.
  • This cluster is similar to the Data Preparation cluster in their lack of desire to climb the corporate ladder.  They have no goals of heading a large organization to increase their stature inside of an organization.  (See Figure 4)
  • From an aptitude perspective, Programmers have the strongest desire to collaborate, making sure they gain alignment on their work. (see Figure 10)

Sourcing, Hiring, Managing and Retaining Analytics Programmers

  • Sourcing: Look for analytics Programmers inside your organization in current programming roles.  Make note of those that install beta-versions of software, pushing the limits of functionality or who constantly experiment.  Recent college graduates could be a good source; ask what projects they’re working on even if just for their own personal projects.
  • Hiring:  Candidates in this role are most interested in learning new software, staying on the leading edge of technology and analytics, being given some free rein to experiment and explore, and being involved in continuous learning. The farther away they get from doing hands-on work, the more bored and dissatisfied they will get.
  • Managing: Given the age and experience level of Programmers, managers would be wise to take the time to mentor them about general business knowledge, business expectations and perhaps how to maneuver politically inside of an organization.  Their lack of political savvy could land them in trouble if not given some insight and boundaries.  They will be a quick study, and will be happy to learn.
  • Retaining:  Like all analytics professionals this cluster will easily become bored. Financial incentives and promises that they will move up the ladder will not appeal to them nor make them feel valued or challenged. Ultimately they are looking for a mentally challenging role that appeals to their natural curiosity and creativity more than career advancement.

Cluster 3: Analytics Managers

The third cluster, Analytics Managers, report they spend more than half their time, on average, managing their analytics team and performing a variety of administrative tasks.  Their workload leans towards managing direct reports and projects, and then presenting results of projects to their customers.

Figure 7: Time by Analytics Function for Managers Cluster

 

How Analytics Managers Differ from other Analysts

  • Managers are the oldest of all the functional clusters – less than 10% are under 29.
  • Along with Analytics Generalists, Managers showed broader experience and more years as an analyst than other groups (which makes sense along with their more advanced age).
  • Managers accounted for the largest number of Ph.D.s (27%), yet the largest share of Managers (40%) held only a Master's degree.
  • Managers had a strong competitive and political aptitude, quite different from the other 3 clusters.
    • Managers were more competitive than any other cluster (see Figure 4), reflecting their choice to lead, compete and engage with the corporate hierarchy.  
    • Interestingly, Managers also had higher and more consistent “altruistic” scores, signaling a drive to mentor and help their team, as well as empathy with clients who discover that tried-and-true older approaches could easily be improved by newer analytics models. (see Figure 8)  That said, very few of the sample exhibited “altruistic” scores that were at the top of the scale.

Figure 8: Density of Drive to be Compassionate and Empathetic
(the further right, the more compassionate)

  • Finally, our sample of Managers displayed less of a drive to focus on “Economic” results than all other groups.  This was curious and unexpected to the study team. It indicates that Managers (at least today) are focusing more on knowledge and leadership-oriented drivers such as “being the smartest person on the team” and “taking care of their team”, while analysts at a lower level focus more on attaining tangible results. (see Figure 9)

Figure 9: Density of Drive to Achieve Bottom Line Results, or to See ROI
(the further right, the more focused on results, including personal financial results)

  • Managers (and Generalists) showed a high score in having a “Command and Control” approach to managing. This is quite a contrast to the other two clusters. (see Figure 10)

Figure 10: Density of Drive to use an Assertive Management Style
(the further right, the more bold and confident the approach)


Sourcing, Hiring, Managing and Retaining Analytics Managers

  • Sourcing:  Management candidates can be located from existing analytics professionals or other management areas inside the organization.  They will be easy to identify, as they will be very focused (even from the point of being a candidate) on what they need to do to advance and move up the management chain.
    • As the discipline of analytics progresses, we wonder if the management role will change from one where the manager needs to be the smartest person in the room to a role where they have a team of smart, curious problem solvers, and they are more focused on results while keeping everyone else on target.
    • It might be interesting to source management candidates from other internal business areas where a manager’s analytics expertise is less but their management aptitude is more advanced.
  • Hiring:  Management candidates will be focused on advancing and one day managing their own team of direct reports. If advancement is a real opportunity this would be something to point out in a job ad, job description or interview. But know, if they come on board, they will not forget that they were told they could advance and will feel robbed if this isn’t addressed. 
  • Managing: Those with management aspirations will feel less motivated until their path for advancement is clear.  They will see leadership and visibility to other leaders as a bonus and something to strive for.  Given that they have less of a focus on results, perhaps their advancement could be tied to goal achievement and completing projects on time, etc.
  • Retaining: Managers in our Study showed that what they care most about is learning, not being bored, and being able to advance inside the organization. Ironically, our sample of managers was least interested in financial rewards.  So, as with other types of analytics professionals, additional money won’t be turned down, but it won’t be a factor in retaining them in a job they don’t enjoy.

Cluster 4: Analytics Generalists

One cluster of analytics professionals, Analytics Generalists, did not report spending significant time in any focused area.  Generalists in our study were found in a wide variety of companies and industries.  Contrary to the study’s original hypothesis, Generalists are found in very large organizations, as well as small companies.

We suspect Generalists work in companies of all sizes because:

  • Analytics team sizes continue to be small even in the largest organizations with the largest analytics teams (78% of analytics professionals in our study worked on teams of 10 or fewer people), and
  • Analytics hiring managers are early in realizing that Data Science isn’t a single role. We suspect the role will continue to be defined over the next 12 – 24 months as the discipline advances and matures.

Figure 11: Time spent by Analytics Function for Generalists Cluster

How Generalists Differ from Other Analysts

  • Analytics Generalists appear to be a hybrid of the “raw talent” traits contained in the other 3 Functional Clusters. 
  • They could be described as most like Managers with less inclination to be political or controlling, and more inclination for tangible results while tending towards doing careful and detailed work.

Tips for Hiring, Sourcing, Managing and Retaining Generalists

  • We don’t necessarily advocate actively looking for or hiring a Generalist.  Given the differences we’ve seen in the data, including in how Data Scientists spend their time, we suggest actively considering how your analyst is going to spend their time and hiring to those requirements. Being clear about role requirements will increase business performance and job satisfaction, and will reduce the number of top analysts leaving (and saying it was because of money, which we know it isn’t).

Business Conclusions

Division of Labor

The field of data analytics is going through rapid change as new data sources and new business opportunities emerge. By nature, very few people are well suited to do everything on the spectrum of analytics – to clean data AND program AND analyze AND present AND manage.  This is unrealistic and does not scale.

This Study reveals an ongoing trend to divide the work up between Preparation, Programming, and Management.  Ironically, the analysis and visualization stage is rather small by comparison.

It appears that some Generalists are in this role for organizational reasons rather than aptitude or personal preferences.  Meaning, it could be that today’s Generalists have been placed in this role not because they are great at this, but because their analytics role is less well defined and they were hired to “do everything”. Generalists are found in small organizations, where it may be necessary to do everything, and in large organizations, which could easily specialize, but do not seem to deploy a division of analytics labor.

If this proves to be the case, over time today’s analytics discipline will mature and analytics teams will begin to divide workers into more specialized tasks – like the clusters we’ve identified. When this happens, it could be that a group of “True Generalists” will remain, or perhaps these will emerge as the true “Analysts”.

Suggestions around Promoting Analysts into Management Positions

  • The optimal pattern for a Manager is different from patterns for other roles that emerged in the Study.  Specifically, only the Manager Cluster has the mentoring and coaching mindset required for effective managers and for moving up the organization.
  • It is important to identify and promote analysts who are “made for a management role” and to offer this role to Analysts that see it as a true reward, rather than as something a great analyst needs to do to keep their job.  Otherwise, the firm will potentially lose a good analyst, and gain a bad Manager. 
  • Hiring and promoting individuals into management roles with aptitude results close to the optimal pattern will produce the best managers. While few will exactly match the Manager benchmark to a digit, the differences from the pattern will indicate areas for coaching and growth.  

About the Authors:

Pasha Roberts is Co-founder & Chief Scientist of Talent Analytics, Corp.

As Co-Founder and Chief Scientist, Pasha is responsible for architecture, development, and algorithms for Talent Analytics. He wrote the first implementation of the software over a decade ago, and today he continues to drive new features and platforms for the company.

As is often found in data science, Pasha has decades of experience/education that span computing, quantitative, artistic, and business categories.

Pasha holds a bachelor's degree in Economics and Russian Studies from The College of William and Mary, and a Master of Science degree in Financial Engineering from the MIT Sloan School of Management. His thesis at MIT prototyped the application of advanced 3D graphics to massive financial “tick” datasets.

He has founded two companies: WebLine Communications Corporation, a web call-center enterprise software company, and Lineplot Productions, a financial visualization/animation service company.

Pasha’s passion with Talent Analytics is to develop new analytics to focus business performance, and to extend the Talent Analytics model to a useful set of software platforms. He hopes to discover new information about people and the work they do, with every new project. Follow Pasha on twitter @PashaRoberts.

Greta Roberts is Co-founder & CEO of Talent Analytics, Corp.

As Co-founder and CEO, Greta is responsible for charting a predictive analytics approach and software platform to solving employee challenges. In addition to her role as CEO, she was elected as The Program Chair for Predictive Analytics World for Workforce and continues as Faculty at the International Institute for Analytics.

Greta brings a unique perspective to solving complex, long-term challenges. This is never more evident than in the firm’s early direction to use analytics to solve “line of business” challenges instead of “HR” challenges, modeling business outcomes instead of HR outcomes. This approach has led Talent Analytics to become a recognized leader in predicting employee performance and attrition. Talent Analytics focuses its work on high value, high turnover positions like Sales positions, Bank Tellers, Insurance Agents, Customer Service Reps and Data Scientists; all areas where reduced attrition or increased performance can yield $ millions in bottom line savings or income.

Greta is a sought-out international thought leader, presenter, and author. She has been a multi-year presenter at Predictive Analytics World (PAW), keynoting in 2014 at PAW Toronto, the ADMA Global Forum in Sydney, Australia, the INFORMS Analytics Conference & SAP Sapphire Now. In addition to speaking, she is often quoted in the press in a variety of influential business publications.

Follow Greta on twitter @gretaroberts.

Note:  To license Talent Analytics Data Scientist benchmark to help build your analytics bench, please contact Talent Analytics directly for more information:  617-864-7474 x.101 or [email protected]


[1] See IIA Research Brief Quantifying Analytical Talent for additional results of the study.

©2013 Talent Analytics, Corp.
All rights reserved


Comment by Harshvardhan Jaipuriar on June 11, 2014 at 7:32pm

Great article!

Comment by Sumedha Sengupta on May 23, 2014 at 7:37am

This is an analytically informative and very interesting  article to me, a Statistician, who is trying to understand  the role of a Data Scientist in an organization, a role that requires the person not be a Statistician. Some think Statisticians only work with Regression Analysis, which may be for a very specific job, but surprisingly, Statisticians also work with Data, its presentation, analysis and interpretation. Working in cross functional teams frequently developing  methodologies suitable for the process at hand.

This article clarifies lot of fuzzy areas in a Data Scientist's job description. 

Comment by Andrew Troemner on May 22, 2014 at 11:51am

Daren, I think part of the problem is that different ways to visualize the data depend on availability, cleaning, and so on. A number of times at work I was asked to put together a graph on such-and-such statistic, and it required jumping through two or more aggregation levels to really get at what they were looking for. Being able to spend significant amounts of time in visualization for data exploration requires having a really solid and thoroughly cleaned data set, which can take 5-10 times longer to produce than the actual visualizations.

Comment by Daren Scot Wilson on May 22, 2014 at 11:23am

Interesting.  I'm surprised how little visualization stands out among the activities.  Generalists seem to have the longest horizontal bars in that area, but clearly not in a special way compared to other activities.  I would have imagined some smart creative people would love plotting data, making charts, and such, so much that there'd be a cluster of them too.

Comment by Dorothy Hewitt-Sanchez on May 22, 2014 at 8:53am

I really like the article. 

Comment by Andrew Troemner on May 22, 2014 at 8:10am

Great article! Thank you very much for sharing.
