50 Shades of Grey – The Psychology of a Data Scientist

Unless you’ve recently graduated from one of the new Data Science courses that have been popping up online and in universities around the world, becoming a Data Scientist was most likely slightly accidental, and more about the journey than the destination.

 

Here’s my journey. See if you recognise any of it in your own:

I started out as a physicist and had a strong mathematical grounding, but I had a passion for medicine. After completing my bachelor’s degree I took a master’s degree in medical physics. This is where I gained an appreciation for the importance of image analysis and the role that data plays in medicine. I created a virtual model of a human torso by segmenting images from the Visible Human Project. Each slice had dimensions of 2048 x 1216, each in 24 bit colour, which is approximately 7.5 megabytes. Not too large, but when you put all the slices together, the full dataset is around 40 gigabytes. This may not be in Big Data territory, but it’s pretty big for a desktop PC and you get quite familiar with handling large amounts of data.
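
As a rough check on the figures above, here is a minimal Python sketch of the arithmetic. It assumes uncompressed slices at 3 bytes per pixel and decimal units (1 MB = 10^6 bytes). The function names are hypothetical, and the implied slice count is only a back-of-envelope estimate, not a figure taken from the Visible Human Project.

```python
# Figures quoted in the text
WIDTH, HEIGHT = 2048, 1216      # pixels per slice
BITS_PER_PIXEL = 24             # 24-bit colour
DATASET_BYTES = 40 * 10**9      # "around 40 gigabytes" (assumed decimal GB)


def slice_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed size of one image slice in bytes."""
    return width * height * (bits_per_pixel // 8)


def implied_slice_count(dataset_bytes: int, per_slice_bytes: int) -> int:
    """How many whole slices of the given size fit in the dataset."""
    return dataset_bytes // per_slice_bytes


per_slice = slice_size_bytes(WIDTH, HEIGHT, BITS_PER_PIXEL)
n_slices = implied_slice_count(DATASET_BYTES, per_slice)

print(f"One slice: {per_slice / 1e6:.2f} MB")
print(f"Slices implied by a 40 GB dataset: roughly {n_slices}")
```

The per-slice figure comes out at just under 7.5 MB, which matches the number quoted above. The 40 GB total then implies a few thousand slices.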

Incidentally, there is no shortage of blog posts talking about the necessary skills of Data Scientists, but very rarely does anyone mention image analysis. I predict that image analysis and video analysis will shortly become very useful skills for a Data Scientist to have, not just in medical data analysis, but in many other areas of data analysis too.

Anyway, I digress.

After my master’s degree I then did another master’s degree in bioinformatics. During this time, the results of the Human Genome Project were published and I was honoured to be able to do some analysis of the resultant data. The Human Genome Project produced huge amounts of data, so my newly-discovered data handling skills came in very handy. Here I learned about artificial intelligence and created a number of predictive models for a variety of purposes.

At the end of my master’s research I did a PhD in artificial intelligence where I created a predictive system that prevented a terrorist attack on a public water supply. Well, actually, that part isn’t strictly true. I wrote an article that was published in New Scientist about how an artificial neural network system could be created that would prevent a terrorist attack on public water supplies…

Now here’s where my journey comes full circle. At the conclusion of my PhD I left bioinformatics and returned to medicine, where I was offered the role of medical statistician to one of the world’s best breast cancer research departments. I wasn’t appointed because I was a statistician, but rather because I wasn’t a statistician. Although I had a working background in stats, they were more interested in using my skills as a bridge between disciplines. I was not a specialist in microbiology, pathology, cancer, surgery or stats, but I had sufficient working knowledge of each to be able to communicate and translate effectively between all of them.

It was a really interesting time, but I realised that I didn’t actually like stats. What I did like was programming stats. Most of my time as a medical statistician involved creating programs to automate data analysis, stats and predictive systems that helped researchers reach the story of their data in a fraction of the time that it would take to analyse the data manually.

And that sort of brings me to where I am today. A few years ago I left my job to form a start-up company, Chi-Squared Innovations, that creates automated data analysis programs, but that’s a story for another day.

 

OK, so that is my story, but there wasn’t really a destination. I didn’t actually plan all of that out, it just sort of happened. I think the journey is an important one, because it tells you a lot about what Data Scientists are all about, and the skills they use every day.

I started out as a scientist, and have worked in many different scientific fields, but I’m not a specialist in any of them. I learned a lot about computer programming, data handling and image analysis, but I don’t specialise in any of these either. I guess my strongest areas (at the moment) are in artificial intelligence and statistics, but I don’t claim to be an expert. Right now I’m working on improving my skills in business development, data visualisations, shell scripting, python and GUIs, but – yes, you guessed it – I’m not an expert.

 What is a Data Scientist? - Cartoon courtesy of Philip Riggs (Twitter: @ProductiveEgg)

For me, this journey typifies the life of the Data Scientist. Most of us aren’t experts in more than one or two disciplines (or any, in my case), and to the traditional academic we are ‘jacks of all trades’. Our skills are neither those of the expert nor those of the novice, but somewhere in between. Neither black nor white, but varying shades of grey.

 

What we need to recognise though, is that Data Science – as broad a subject as it may be – is a specialist subject of its own. To me, Data Science is the glue that binds together distinct areas of specialisation. It is the ultimate multi-discipline.

 

Here’s an unfunny joke I used to tell when I was still an academic:

Q: What do you get if you put together the best physicist, mathematician, biologist, surgeon, programmer, statistician and AI guy in the world into one room?

A: An unholy mess, the potential for 1000 arguments and the waste of $50 million.

 

Of course, this is exactly the type of multi-disciplinary dream team that universities, government bodies and companies regularly set up and call a ‘think tank’, so why does it often fall apart?

The answer is because there is no glue. Each specialist is trained to see the problem from their own perspective and has little knowledge and understanding of other points of view.

This is why Data Scientists are becoming so important. They are the glue that pulls together disparate disciplines.

 

Oh yes, and to those that say that all you need to do to be a Data Scientist is to do an online course to learn Hadoop, MapReduce, R, Python and d3 I say this: it’s about the skills, not the tools. To learn the skills of a Data Scientist takes years, if not decades. If you don’t have any grey hairs yet, then you’re not a Data Scientist (but don’t give up – you’ll get there eventually)!

 

So to all Data Scientists the world over: stop using the Grecian 2000 and celebrate the grey – all 50 shades of them…

 

 

So what was your journey to becoming a Data Scientist? I’d love to hear your story. Just lie back on the couch and tell me all about it…

About the Author

Lee Baker is an award-winning software creator with a passion for turning data into a story.

A proud Yorkshireman, he now lives by the sparkling shores of the East Coast of Scotland. Physicist, statistician and programmer, child of the flower-power psychedelic ‘60s, it’s amazing he turned out so normal!

Turning his back on a promising academic career to do something more satisfying, as the CEO and co-founder of Chi-Squared Innovations he now works double the hours for half the pay and 10 times the stress - but 100 times the fun!

Connect with me on Twitter:

@eelrekab


Comment by Lee Baker on March 16, 2017 at 2:42am

Hi Giulia,

thanks for sharing your story.

My path from academia to the commercial world is quite a familiar one. As a non-specialist I didn't really fit into an academic world that values only specialisation. Despite all my skills, my career was stuck on the first rung of the ladder and there was little prospect of rising above it. And yet, the work I was doing was increasingly valuable to those around me and I was becoming busier and busier with researchers wanting me to work on their projects.

I realised that although my skills weren't appreciated by the institution, they were really useful to the researchers that worked on the front line with data.

That's when I made the decision that I needed to get out and forge my own path.

Interestingly, at the time I was leaving, the university was just setting up an MSc in Data Science. Despite being one of the most suitable candidates to teach on the course, they didn't want me to be a part of it (it was set up and is being run by programmers - I guess they figured that programming is the most important part of working with data...). Now that I'm out of the uni they keep calling and inviting me to come in and give guest lectures!

You really couldn't make it up...

Comment by Giulia Savorgnan on March 15, 2017 at 3:17pm

Hi Lee, 

Thanks for sharing your journey, I enjoyed reading your post a lot :)

I come from a very similar background. I started with a BS in Physics, then continued with an MS in Astrophysics and Space Physics, and eventually completed a PhD in observational Astrophysics last year. After a few months of "post-doccing" I finally made it out of academia and jumped into industry. Now I'm a data scientist for a company that does predictive people analytics, a field that I find tremendously fascinating. Basically, I replaced galaxies with people, physics with psychology, the cosmological model with the five factor model, and research publications with short-deadline business projects. But I'm still doing the same mathematical modelling, statistics, probabilistic calculus, analytics predictions, and lots of coding and hacking in my favourite languages, plus I get to learn a lot about the commercial applications for data science, and I have sharpened my data visualisation skills.

It seems that Data Science is a natural evolutionary path for any physicist, as you know very well, and I would say that astrophysicists are no exception.

I'm curious, why did you decide to leave academia? 

Looking forward to reading more of your posts!

Comment by Lee Baker on April 27, 2016 at 4:32am

@Nahum

Thank you for taking the time to comment - and what a comment it is, almost the same length as the article! :-)

I love your analogy with the BORG. Does this mean that if we glued our heads together we'd be a BORG collective?

I think I agree with you that medicine is mostly out of step with modern developments. Individuals might get excited with the latest tech, but when they want to drive forward its implementation it's often a case of 'we need to review this at a higher level before we can move forward'. Fast-forward 10 years, they're still reviewing it and the tech has become obsolete, replaced by something newer and better - and round and round they go again.

As a commercial organisation, it's incredibly difficult to get started in public medical institutions like this because they're happy to talk and talk without ever making a decision - just like their Think Tanks!

Anyway, I loved reading your story, so thank you for sharing it with us.

Comment by Nahum Kovalski on April 27, 2016 at 2:18am

"The answer is because there is no glue" - IMHO, this is the most powerful statement in your post, which I greatly enjoyed. 

I studied computer science in the early 1980s [and I had the great pleasure of working in DEC assembler on a PDP 11-73, to develop software that was used to study a cell system, for my parallel major in physiology]. After this, I studied medicine [because my mother told me that doctors never starve] and then I did  4 1/2 years of a surgery and urology residency. But I left it to develop a company in Israel dedicated to urgent care. Fast-forward 21 years, I fulfilled multiple tasks in this company, including medical director and head of IT. I had the incredibly unique and rewarding experience of writing my own electronic health record. I worked with a brilliant young lady who was my ASP.NET expert. We built a data warehouse and on top of this, a query system that has been used on a daily basis for various operational and clinical statistics, as well as the basis for multiple published articles and presentations.

I am dedicating this coming year to relearning all of my senior math [which was wiped from my memory when I was absorbed into the BORG]. After this, I hope to do at least a couple of online courses in machine learning, with a specific interest in image analysis, which understandably is very important in medicine. I was already incredibly fortunate to work with a deep learning company in the US (Enlitic), and we succeeded in demonstrating the ability of a deep learning system to diagnose a wrist fracture. I was over the moon when the CEO, Jeremy Howard, presented this success at the last exponential medicine conference.

Particularly in medicine, there is a huge disconnect between clinical practice and science. And this disconnect is growing as days go by. I built a system that allowed me to review x-rays sufficiently well on my smart phone, to provide remote consultations to multiple clinics. I was literally laughed at by multiple physicians when I told them of my success with this system. They brushed me aside and said that this would never work. They refused to look at my phone to see the quality of the image. And as a physician, practicing 25 years, I have to tell you that this is where most doctors are today. They are absolutely not qualified to even appreciate the significance of a great deal of modern technology, yet they brush it aside on the principle that if it was not taught in medical school, it is not important.

Before I make this comment book-length, let me say the following. There is an absolute need for the glue you spoke of. Because of my dual background, I was able to translate between tech needs and doctor needs.  That is the only reason I was able to develop the EHR that I did, in the time that I did, thankfully to the great satisfaction of its users.

I think there is a desperate need for a think tank that is made up of people with combined backgrounds in (not exclusively) technology, general science, math and health sciences. I think such a think tank could offer its services to major hospitals that simply do not have the mindset to employ the right team of data scientists to analyze their own data. Believe me, if I had the money to find a great location and hire, at top dollar, various data scientists from around the world, I think such a service would change the face of medicine for the better. There would be some resistance from the medical community because they would feel excluded from tremendous discoveries being made, based on data they themselves collected but never fully analyzed.

I thank you again for this excellent post  and I wish you all the best and success

Comment by Lee Baker on April 21, 2016 at 1:30pm

@Bryan

Thank you for sharing this. I suspect there are many Data Scientists who found the field almost by accident and have similar stories to share.

For many years I had an identity crisis. People asked me what I do, and I honestly didn't know how to answer until I heard of the new title 'Data Scientist'. When I read more about it I realised I'd accidentally become something very different to what I expected.

Here's the next big thing though - in the next few years there will emerge various 'flavours' of Data Science, and people will start asking what kind of Data Scientist we are, and the whole identity crisis thing will start all over again...

Comment by Bryan Hudson on April 21, 2016 at 6:31am

Thanks for sharing. My story is also a bit wandering. Graduated with an MS in Experimental Psychology back in '99; emphasis on the word experimental. That's the scientist portion of the field, not the counseling/couch portion. Tried my hand at running a mental health study after graduating and the red tape drove me insane. And quickly came to realize that no matter how good a plan you put together, the decision to change is ultimately left with the patient (also translates to a great business lesson for data scientists). Migrated into data analysis using my statistical skills, which led to programming and code development, then to software administration, which got really stressful and boring. So circled around back to data analysis but with a lot of experience and programming skills (starting to sound more like a data scientist). Now, I am enrolled in one of the MS in Data Science programs while working as a data scientist. And that is my story.
