Sixteen data scientists across sixteen different industries were interviewed to understand both how they think about data science theoretically and, very practically, what problems they are solving, how data is helping, and what it takes to be successful.
Here are some thoughts that inspired me and made me happy.
1. Chris Wiggins
Chief Data Scientist at The New York Times & Associate Professor of Applied Mathematics at Columbia University
Most of the knowledge in the world in the future is going to be extracted by machines and will reside in machines
There are just not enough brain cells on the planet to even look or even glance at that data, let alone analyze it and extract knowledge from it
Knowledge is some compilation of data that allows you to make decisions, and what we find today is that computers are making a lot of decisions automatically
Diversity of point of view is a very important thing
You don’t want to just hire clones of the same person, because then they will all want to explore the same things. You want some diversity
The idea that somehow you can put a bunch of research scientists together and then put some random manager who’s not a scientist directing them doesn’t work. I’ve never ever seen it work
Management skills are a little overrated in the sense that managing research scientists is like herding cats
The only way to make intelligent machines was to get into learning, because every animal is capable of learning. Anything with a brain basically learns
It’s useful for a company to have its scientists actually publish what they do. It keeps them honest
The data sets are truly gigantic. There are some areas where there’s more data than we can currently process intelligently
The amount of human brainpower on the planet is actually increasing exponentially as well, but with a very, very, very small exponent. It’s very slow growth rate compared to the data growth rate
4. Erin Shellman
Statistician and data scientist in the Nordstrom Data Lab
As a data scientist, even if you don’t have the domain expertise you can learn it, and can work on any problem that can be quantitatively described
The most interesting types of data are those collected for one purpose and used for another
Presentation is the ability to craft a story
Presentation skills are undervalued, but are actually among the most important factors contributing to personal success and creating successful projects
If you talk to somebody who has something you want, follow up
What companies want is a person who can rigorously define problems and design paths to a solution
The best way to become a data scientist is to do data science
Anything that looks interesting is probably wrong
Intuition is really a well-trained association network
As data scientists, our job is to extract signal from noise
Query understanding offers the opportunity to bridge the gap between what the searcher means and what the machine understands
Search is the problem at the heart of the information economy
Where things get interesting is in the details
Our goal is to fail fast. Most crazy ideas are just that: crazy
It’s easy to be lazy and look at aggregates. Drilling down into the differences and looking at specific examples is often what gives us a real understanding of what’s going on
One thing we’ve learned is that there’s no such thing as over-communicating
Technology is like exercise equipment in that buying the fanciest equipment won’t get you in shape unless you take advantage of it
Always put talent before technology
Data scientists need to have strong critical-thinking skills and a healthy dose of skepticism
Failure is a great teacher
Experience is not only the best teacher, but also perhaps the only teacher
Our computers, mobile devices, and web-based services are witnesses to many of our daily decisions
You don’t have to know everything, but you should have a general idea
E-mail data is powerful, because as a communications channel it generates more revenue per recipient than social channels
Twitter is probably the best place to start conversations about data science
Talking to users is crucial because they point you in the right direction
What we focus on, and this is going to sound goofy for a data scientist - is the happiness of our users
Vendors are there to sell you a tool for a problem you may or may not have yet, and they’re very good at convincing you that you need it whether you actually need it or not
I find it tough to find and hire the right people
Data scientists are kind of like the new Renaissance folks, because data science is inherently multidisciplinary
It’s essential for a data science team to hire people who can really speak about the technical things they’ve done in a way that nontechnical people can understand
If you’re solving problems appropriately and you can explain yourself well, you’re not going to lose your job
The biggest lesson is to have a very clear set of customers that you’re going to serve, notwithstanding the fact you may be building something that can ultimately help many different types of customers
8. Claudia Perlich
Chief Scientist at Dstillery (formerly Media6Degrees) and teaches data mining for business intelligence in the MBA program of the Stern School of Business, New York University
Data is the footprint of real life in some form, and so it is always interesting. It is like a detective game to figure out what is really going on
The conversation is based around how to properly deal with even more sensitive information about where exactly people spend their lives
So a large part of how things are presented, communicated, and represented carry very different messages from very different angles, depending on what you are reading, so you probably need a very broad depth to understand the issues
My primary challenge as a data scientist is to use the right algorithm to connect the right data to the problem you actually want solved
In the real world data is not like data they saw in classrooms and in books
I prefer somebody who has done ten different things in ten different domains because they will have hopefully learned something new about data from each of the different places and domains
“Data scientist” is a completely undefined job description. Today, if you hire a data scientist, you do not know what you are getting
Learning how to do data science is like learning to ski. You have to do it
9. Jonathan Lenaghan
Head of Data Science at PlaceIQ, a mobile geolocation intelligence company aggregating and analyzing spatial data for marketers
People under pressure to find patterns are prone to fall into the common human fallacies of over insufficient data and overreading correlation as causation.
Losing somebody else’s money is one of the most horrible sinking feelings in the world
Having no competitors is bad
I always try to look at the problem from the end. When you start from the beginning and everything is blue sky, there are hundreds of ideas to chase as well as thousands of ideas to try and, since everything is possible, nothing ever gets done
Keeping your eyes on the final deliverable is essential to solving the right problems
Being self-critical is important
It is your location history that is important, not necessarily where you are right now
It is very important to be self-critical: always question your assumptions and be paranoid about your outputs
10. Anna Smith
Analytics engineer at Rent the Runway, an online and offline fashion company that rents designer dresses and accessories
Conversations that happen in machines are different from the ones that happen in the physical world. In the physical world, it lasts a long time and we are able to use a lot of cues other than just text or audio. In computers, interactions are usually very short and many times there are many more people involved
These days, if you build a community around yourself, the news and people start to find you
Success is when what you do is adopted by someone else. The ability to actually deliver something tangible - that’s the main index of success.
Getting through life, through those uncertainties in a way, when you look back and see things still connect and exist, that’s the biggest measure of success
There is a big part of intuition in choosing the most important problem.
We are positioning ourselves to be lucky. We follow the adage that luck is being prepared for an opportunity and seizing it when it appears
The core lesson from tool-and-method explorations is that there is no silver bullet
To build successful teams and projects, I strongly believe in the Kaizen approach. Kaizen was made famous in part by Japanese car manufacturers involved in continuous improvement. I believe you should always be looking for ways to improve things, just small things. Just try it out
Financial gain is a second-order result: if you do the right things, everything else will follow.
Maybe the most important thing is to surround yourself with people greater than you are and to learn from them
The idea or the initial enthusiasm is just a small part of doing something great
Everyone is right, depending on the situation and context
12. Amy Heineike
Director of Mathematics at Quid, an intelligence platform that combines natural language processing, machine learning, network science, and data visualization
The key is figuring out how you get those three things: the right problem, the right data, and the right methodology to meld
There are a lot of different roles that are going under the name “data science” right now, and there are also a lot of roles that are probably what you would think of as data science but don’t have a label yet because people aren’t necessarily using it
In general, it’s very hard to hire people who are a complete package, who know what to do and how to do it
13. Victor Hu
Chief Data Scientist at Next Big Sound, an online music industry platform that tracks artist popularity and probability and fan behavior across social media, radio, and traditional sales channels as reported at a granular level by record labels
One of the big challenges of being a data scientist that people might not usually think about - is that the results or the insights you come up with have to make sense and be convincing. The more intelligible you can make them, the more likely it is that your recommendations will be put into effect
People are more interested in their projects because they have selected them
Hiring data scientists is very exciting at this time because in some ways there are no established guidelines on how to do it. People have skills in so many different areas
It is hard to know what you really need until you dig into it
14. Kira Radinsky
CTO and Co-Founder of SalesPredict, a company using machine learning and predictive analytics to provide Customer Lifecycle Intelligence
The hope is that if we can start building the right models to find the right patterns using the right data, then maybe we can start making progress on some of these complicated systems
One way of understanding the data issue we face is by imagining that you are in a stadium and you can listen to 500 people at once. Your goal is to figure out what’s going on in the game just from listening to those people.
Graduate students, perhaps because of an adherence to sunk cost fallacy, often write really great surveys of the field at the beginning of their PhD thesis
What really matters is who’s actually using and paying for it
In reality, almost no one actually cares about predictive accuracy because in almost all the cases, their starting point is nothing. The number of industries where the difference between 85 versus 90 percent accuracy is the rate-limiting factor is very small
My list of goals is to learn everything, be able to build anything, save everyone, and have fun doing it. That’s a nice simple list
Startup culture teaches you to be like Steve Jobs, in that you’re right, everyone else is wrong, and your vision will power through.
Academic culture teaches you that you’re dumb and that you’re probably wrong because most things never work, nature is very hard, and the best you can hope for is working on interesting problems and making a tiny bit of progress
Some of the best scientists out there are the ones who are extremely opportunistic - when they see novel ideas and how things suddenly fit together, they drop everything else and work on that for a while.
The right thing to do is to not build a tool company but to build a consultancy based on the tools. Identify the company, identify the market, and build a consultancy. Later, if that works, you can then pivot to being a tool company.
Life is too short to not be having fun
I have a clock that shows the estimated number of days until I’m 80, which is a reasonable life expectancy. It helps to remind me that each day actually really matters.
You don’t really know if what you’re working on is the right problem to be solving sometimes until years out, but you hope you’re on the right track.
When I evaluate machine learning papers, what I am looking to find out is whether the technique worked or not. This is something that the world needs to know ...
I basically can’t hire people who don’t know Git.
I actually think a lot of the future is in small data .... As the big data hype cycle crests, we’re going to see more and more people recognizing that what they really want to be doing is asking interesting questions of smaller data sets.
Life’s too short to work with assholes.
In academics or industry, if you’re not actually speaking in a language that your customers understand, then you will have a nice time talking, but no one will really listen to you
The biggest thing people should be working on is problems they find interesting, exciting, and meaningful
16. Jake Porway
Founder and Executive Director of DataKind, a nonprofit dedicated to using data science to tackle the world’s biggest problems
The world will be more effective if everyone can at least converse about data science
Data scientists in the business world are all generally well-compensated
Data scientists can apply their skills for good
Data is new eyes
Data science is a way to see the world through the lens of this new macroscope to learn the patterns of society and nature so we can all live better lives
There’s almost no limit to where data and data science can be applied
Every company has data that can help make the world a better place