Understanding Data Roles
By Michael Burke
The rise of Big Data has brought an explosion in roles that involve data in some way. Most people who work with enterprise technology know these roles at least by name, but it helps to look at them through a comprehensive lens that shows how they all fit together. To understand how data roles mesh, think of them as two pools: one is responsible for making data ready for use, and the other puts that data to use. The former includes such roles as Database Administrator, Data Architect and Data Governance Manager; the latter includes the tightly woven roles of Data Analyst and Data Scientist.
Ensuring the Data Is Ready for Use
Making Sure the Engine Works. A car is only as good as its engine, and in a data system the engine is the database. According to PC Magazine, the Database Administrator (DBA) is “responsible for the physical design and management of the database and for the evaluation, selection and implementation of the DBMS.” Techopedia defines the position as one that “directs or performs all activities related to maintaining a successful database environment.” A DBA’s responsibilities include security, optimization, monitoring and troubleshooting, and ensuring that there is enough capacity to support the organization’s activities. This of course requires a high level of technical expertise, particularly in SQL and increasingly in NoSQL. But while the role is technical, TechTarget maintains that it may also involve managerial functions, including “establishing policies and procedures pertaining to the management, security, maintenance, and use of the database management system.”
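To make the "optimization" part of the DBA's job concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The table, column and index names are hypothetical and chosen only for illustration; a production DBA would work with the tooling of whichever DBMS the organization runs. The sketch asks the database how it plans to run a query before and after an index is added.

```python
import sqlite3

# Hypothetical in-memory database standing in for a production system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(connection, sql, params):
    """Return SQLite's description of how it would execute the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return "; ".join(row[3] for row in rows)


# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, QUERY, (42,))

# Adding an index on the filtered column lets SQLite search instead of scan.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, QUERY, (42,))

# The exact wording of the plan text varies between SQLite versions.
print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a full table scan, while the second reports a search that uses the new index. Checking plans like this is one small, routine example of the monitoring and tuning work described above.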
Directing the Vision. With the database engines in place, the task becomes one of creating an infrastructure for taking in, moving and accessing the data. If the DBA builds the engine, then the Enterprise Data Architect (EDA) builds the freeway system, laying the framework for how data will be stored, shared and accessed by different departments, systems and applications, and aligning that framework with business strategy. Bob Lambert describes the required skills as including an understanding of the system development life cycle and software project management approaches, as well as data modeling, database design and SQL development. The role is strategic, requiring an understanding of both existing and emerging technologies (NoSQL databases, analytics tools and visualization tools) and of how they may support the organization’s objectives. The EDA needs enough knowledge to direct the components of enterprise architecture, but not necessarily the practical skills to implement them. That said, Monster.com lists typical responsibilities as determining database structural requirements; defining physical structure and functional capabilities, security, backup and recovery specifications; and installing, maintaining and optimizing database performance.
Creating and Enforcing the Rules of Data Flow. A well-architected system requires order. A Data Governance Manager organizes and streamlines how data is collected, stored, shared, accessed, secured and put to use. But don’t think of the role as a traffic cop: the rules of the road are there not only to prevent ‘accidents’ but also to ensure efficiency and value. The governance manager’s responsibilities include enforcing compliance, setting policies and standards, managing the lifecycle of data assets, and ensuring that data is secure, organized and accessible to appropriate users, and only to them. In doing so, the Data Governance Manager improves decision-making, eliminates redundancy, reduces the risk of fines and lawsuits, and protects proprietary and confidential information, so that the organization achieves maximum value at minimum risk. The position implies at least a functional knowledge of databases and associated technologies, and a thorough knowledge of industry regulations (FINRA, HIPAA, etc.).
Making Use of the Data
We create a system in which data is well organized and governed so that the business can make maximum use of it, both by informing day-to-day processes and by drawing on the insights of Data Analysts and Data Scientists to improve efficiency or drive innovation.
Understanding the Past to Guide Future Decisions. A Data Analyst performs statistical analysis and problem solving, taking organizational data and using it to support better decisions on matters ranging from product pricing to customer churn. This requires statistical skills and the critical thinking needed to draw supportable conclusions. An important part of the job is to make data palatable to the C-suite, so an effective analyst is also an effective communicator. MastersinScience.org refers to data analysts as “data scientists in training” and points out that the line between the roles is often blurred.
Modeling the Future. Data Scientists combine advanced mathematical and statistical abilities with advanced programming abilities, including a knowledge of machine learning and the ability to code in SQL, R, Python or Scala. A key differentiator is that where the Data Analyst primarily analyzes batch or historical data to detect past trends, the Data Scientist builds programs that predict future outcomes. Furthermore, Data Scientists build machine learning models that continue to learn and refine their predictive ability as more data is collected.
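The difference between the two roles can be sketched in a few lines of Python. The churn figures below are made up purely for illustration, and the straight-line model is a deliberately simple stand-in for the machine learning models a Data Scientist would actually use. The first function summarizes what already happened, as an analyst might; the second and third fit a trend and project it forward.

```python
from statistics import mean

# Hypothetical monthly customer churn rates (percent) for six past months.
months = [1, 2, 3, 4, 5, 6]
churn = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5]


def describe(values):
    """Analyst-style summary: describe what has already happened."""
    return {"mean": mean(values), "change": values[-1] - values[0]}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Scientist-style step: use the fitted model on an unseen month."""
    slope, intercept = model
    return slope * x + intercept


summary = describe(churn)          # looks backward
model = fit_line(months, churn)    # learns from history
forecast = predict(model, 7)       # looks forward to month 7

print("average churn so far:", summary["mean"])
print("forecast for month 7:", forecast)
```

Refitting the model each time a new month of data arrives is the simplest version of a model that refines its predictions as more data is collected.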
Of course, as data increasingly becomes the currency of business, as it is predicted to, we expect to see more roles develop and the ones just described evolve significantly. In fact, we haven’t even discussed a role that the EU’s GDPR now requires many organizations to fill: the Data Protection Officer, or ‘DPO’.