Big Data and Data Science. Some reflections on compensation levels

Guest blog post by Harry Powell, Head of Advanced Data Analytics at Barclays.

I was at a meetup in Oxford recently and one of the speakers, the CEO of a tech start-up, brought up the subject of Data Scientists’ pay. Apparently they are paid too much. I am not sure whether the data supports this assertion, but it seems to be a common complaint amongst highly-paid CEOs. What surprised me was his tone of moral indignation: How dare an engineer demand to be paid as much as a general manager? He was really upset.

I guess it is ingrained in British culture that if you want to earn more, you stop doing what you are good at and become a general manager. But it got me thinking. What determines how much someone should be paid? What has changed so that Data Scientists can ask for more? And can it last?

After 2008, no one still believes that value creation determines pay. Clearly it must determine the sustainable upper bound of pay. But given that you can only be paid less than you contribute, how much less? It all comes down to bargaining power.

Currently there is a lot of demand for people who can build data products. Companies like Google and Uber have shown not only that you can make money by embracing data science, but also that you will become vulnerable to disruptors if you don’t. Small companies want to emulate Uber. Big companies want to fight Uber. To do so they need data scientists. But the supply of data scientists is inelastic. It’s a new technology, and only a subset of existing data professionals (analysts, architects and developers) have the skills or aptitude to adapt.

But demand exceeding supply in itself doesn’t drive individual wages up.

It’s not unusual for a firm to have one employee who is said to be “irreplaceable”. This is the one guy that has been there from the beginning, that understands the product and the technology, that the company couldn’t function without. On the face of it that employee has the same level of bargaining power as these star data scientists. Unfortunately the same knowledge that makes him valuable also acts as his prison. His knowledge is only useful within that company. In any other company he has no special value. He has no outside option and so no credible threat of exit. So he has no bargaining power at all and is often lowly paid.

Why would data scientists be any different? They too are technical specialists, but unlike engineers in industry, the data scientist’s skills are much more easily transferred between businesses. I think this is because they work at a level of abstraction which is not tied to a particular product but more to a set of concepts and methodologies.

But it is this same kind of abstraction that has capped the wages of software engineers, whose jobs are at risk of being outsourced to India (or Romania or wherever) if wages escalate. They can be outsourced precisely because the same set of concepts and methodologies can be abstracted and applied anywhere on the planet. Just as they are not tied to a job, so the job is not tied to them.

I don’t think that this rule applies to Data Scientists (at least not yet). The reason you can outsource software development is that you can write requirements in a document and send it overseas. I have written elsewhere that data science is exploratory and iterative – you only know your detailed requirements once the project is delivered. To do this you have to be close to both your client (who will own the system) and your ultimate customer (who will use it). While one can easily send data overseas, it is hard to send clients and customers.

And there is another reason it’s hard to move work to lower-cost locations: data science is new and fast-changing. For example, every few months there is a major release of Spark, each one containing significant new functionality. Every week I hear about new ideas, frameworks or applications. Documentation and training can’t keep up. Data Scientists learn what’s possible from talking to people at work, at meet-ups or from friends in the pub. Only a few places in the world have the critical mass of people and companies actually using this technology, and at the moment the technology is being created faster than it is diffusing out, so the gap is growing. London is one such place, and it attracts the best talent from across Europe and further afield. I would be hard pressed to outsource even to many other cities in the UK (Cambridge, Edinburgh, Oxford maybe), let alone overseas (Berlin, certainly, but not sure about anywhere else).

So it’s hard to see any immediate economic pressures coming from the labour market to put upstart data scientists back in their place.

But there is another factor which might threaten the data scientist’s position: automation, the classical substitution of labour by capital. Just as hand weavers were replaced by power looms, can data scientists be replaced by a general data science machine? A number of vendors are betting big that they can. Managers across the world are hoping they are right.

I think it’s a fascinating proposition, and it’s the subject of an upcoming blog post.

* Have a look at the O’Reilly Data Science Salary Survey. It’s a fascinating analytical report and gives lots of insights into our changing industry: https://www.oreilly.com/ideas/2015-data-science-salary-survey

About the Author


Harry earned an MBA from Saïd Business School at Oxford University. He also earned a physics and economics degree. He has built and is managing a team of world-class data scientists, creating a vision and a strategy for data science at Barclays, applying pattern recognition technology to diverse data sets, and developing Big Data applications in Scala and Spark. In particular, Harry has implemented big data machine learning applications that deliver automated analytical content directly to users, including an automated system that presents customers with relevance-ranked analytical insights about their business in natural language form.

