Should Data Science Become a Profession?

On April 10, 2013 Gregory Piatetsky-Shapiro (KDnuggets), Eric Siegel (Predictive Analytics World) and Michael Walker (Rose Business Technologies) discussed whether data science should be an independent profession with a code of professional conduct and self-regulation. See the video here.
Regulation of data science is under consideration (read here and here) and Michael Walker argued that either data science becomes a profession and regulates itself or Congress will impose draconian regulations that defeat the purpose of data science: to make life, business and government better. He has drafted a proposed "Data Science Code of Professional Conduct".
In support of data science as a profession is the following:
1) Data science is in the pre-industrial stage and needs to develop a "Canon" (a body of principles, rules, standards, or norms) of scientific methods, principles and best practices for practitioners. Data science incorporates a number of disciplines, is wide open for innovation, and requires guidance to ensure it is used to make life, business and government better and to prevent abuse. Ninety percent (90%) of the world's data has been produced in the past two (2) years, and the volume will grow exponentially. How we extract meaning from all this data without creating an illusion of reality is important.
2) To protect both consumers of data science and data scientists from charlatans, illegal and unethical conduct, and data science malpractice. A Data Science Code of Professional Conduct is needed to protect individuals' privacy and clients' confidential data, to prevent conflicts of interest, and to ensure data scientists have a duty to the greater good of society, and not just blind loyalty to the client.
3) Self-regulation versus imposed regulation. Either data science becomes a profession and regulates itself or Congress will impose both good and bad regulations. It is better for data scientists to architect and implement a regulatory scheme than to trust Congress to enact an appropriate regulatory structure that may defeat or limit the development of data science.
4) To create a check and balance against big government and big business using data science at the expense of the majority in society. Some argue that the internet, mobile smartphones and computers are a big spying machine that big government and business use to collect information on people, further eroding civil liberties. The potential for abuse is significant and the professionalization of data science can mitigate harms.
Reasons to oppose data science becoming a profession include:
1) Professions tend to create artificial barriers to entry causing artificially higher prices.
2) Professions tend to be self-serving at the expense of consumers.
3) Professions - after a period of time - tend to stifle innovation to protect vested interests.
Michael Walker argued that - on balance - the equities favor data science becoming a profession. He pointed out that in many disciplines like medical research, economics and psychology, data manipulation is common and the scientific method has not been honored, resulting in a diminished reputation and the erosion of society's trust. Future data scientists need to preempt this outcome by not only honoring the traditional scientific method, but by developing new data science "canons" and scientific methods to liberate meaning from data without creating an illusion of reality.
Eric Siegel is agnostic about whether data science needs to become a profession. Mr. Siegel agreed that data science can be abused and that a code of professional conduct may be useful, and stated that a certification to establish a base level of competency may be prudent. He voiced concern over the civil liberties aspect of the use and potential abuse of data.
Gregory Piatetsky-Shapiro argued against data science becoming a profession. He asserted that other established organizations - like ACM (computing professionals) - are considering The Pledge of the Computing Professional, which touches upon many themes relevant to data science - and also pointed out that INFORMS has Analytics Certification programs. He thinks these organizations will be adequate to develop data science.
Mr. Piatetsky-Shapiro asserted that while a code of professional conduct is a noble goal, it is meaningless without a central organization that promotes and enforces this goal, and currently data science is such a diverse field that a central organization is very unlikely. Just looking at currently listed data science related meetings, we see meetings sponsored by research societies like ACM, IEEE, INFORMS and SIAM, commercial companies like O’Reilly, GigaOM and IEG, big data companies like IBM, SAS and EMC, and many others. It looks very unlikely that all these diverse interests will agree to a single organization to enforce any code of conduct.


Further, a recent KDnuggets Poll (March 2013) found that a majority of data scientists voted against a pledge. Yet a majority of non-data scientists supported the pledge, suggesting that consumers of data science would welcome and favor a data science code of professional conduct.

Mr. Walker responded that data science is a new field that encompasses a variety of skill sets from different disciplines and desperately requires a professional body to develop canons that incorporate and blend scientific methods from a myriad of disciplines. The blend of scientific methods will create something new and relying on the scientific methods of math, statistics, computer engineering and others - alone - is not sufficient. Data science requires its own professional canons.
Mr. Walker also asserted that - while a majority of data scientists may not at this time favor a "pledge" - a large majority of data science consumers would likely favor hiring a data scientist who is certified and is required to honor a code of professional conduct - similar to certified public accountants, lawyers and physicians. Considering the significant damage data science malpractice can cause, Walker speculated that the market would favor certified, professionalized data scientists. Moreover, a professional code can protect data scientists from unethical and illegal client conduct.
Mr. Walker suggested that we should learn from other professions like law and medicine - adopt the good and remove the bad to mitigate the negatives of a profession. To earn and maintain trust and credibility, data science must follow traditional scientific methods, innovate new methods and follow a code of professional conduct.




Comment by Paula Lackie on April 22, 2013 at 7:39am

Professionalizing Data Science doesn't have to be a rigid process. Consider working through the increasingly active data-related professional organizations (for example http://www.CODATA.org and a wonderfully long list of others). If these overlapping yet disconnected organizations can come together to help define shared principles, shared protocols and definitions, this will go a long way toward more generalized recognition of the value of being a data scientist.

So far, however, I'm finding it quite challenging to even come up with a definitive list of organizations which are about data science - at the center of the diagram in this article. If anyone would like to help me with this project, please comment here & we'll see what we can do! I'd be even happier if someone could tell me who else is already working on such a collaboration!

Comment by Michael Walker on April 20, 2013 at 6:59am

Vincent: Do you trust Congress to design a prudent regulatory scheme for data science?

Comment by Vincent Granville on April 18, 2013 at 1:39pm

Michael: Yes, being certified would help you land a job if your competitors are just as good as you but not certified. But at the end of the day, it's about selling yourself and building trust with clients. If you have fantastic success stories, deep expertise, a Ph.D. in an analytic field from a respected university, and can sell yourself well, you'll beat all your competitors even if you are not certified and not licensed. And you can always call yourself something different (non-regulated), such as decision scientist, analytics scientist, analytician, data architect, data engineer or something else.

Comment by Michael Walker on April 18, 2013 at 12:56pm

Vincent: You make great points. I agree with you about enforcement challenges and exam flaws. Yet I would rather self-regulate than trust Congress. The unregulated alternative may be unacceptable to consumers of data science and society. I say let's copy the good from existing professions like law and medicine and leave out the bad.

For example, eliminate licensure - let the free market decide - I bet you lunch the market will favor certified, professionalized data scientists required to follow a code of professional conduct. See certified public accountants.

Further, data science needs a professional organization to develop canons that incorporate and blend scientific methods from a myriad of disciplines. Otherwise you are likely to see the bastardization of data science.

Yes, professionalization has downsides - yet I respectfully suggest on balance the advantages to being a profession outweigh the bad. 

Comment by Vincent Granville on April 18, 2013 at 9:06am

Hi Michael, what do you mean by profession? It is already a profession, for instance if someone asks me what my profession is, I will answer data scientist. Do you mean a regulated keyword, like lawyer, financial adviser? It would be impossible to enforce. You could always claim that you are a data scientist according to UK standards (if someone tries to sue you) if the keyword is regulated by US laws, but not by UK laws. Or that the translation of your job title, from Russian to English, is data scientist. And no one will ever waste her time suing you for calling yourself data scientist, unless you create significant damage through your job. But in that case, you would be sued even if you are an "official" data scientist.

Creative data scientists have better things to do than spend time on ridiculous exams (similar to the Series 66 exam required to call yourself a financial adviser). I doubt any real data scientist would spend her time complying with such regulations. These types of exams tend to favor people good at passing exams, which is very different from identifying data scientists with good potential. We've seen how bad financial advisers have been, most of them far worse than me at managing finances, despite the fact that I manage finances, have no official title and never will.
