Understanding Data Agility
I recently co-authored a book on the next iteration of Agile. Over the years, as a programmer, information architect and eventually editor for Data Science Central, I have seen Agile used in a number of companies for a vast array of projects. In some examples, Agile works well. In others, Agile can actually impede progress, especially in projects where programming takes a back seat to other skill-sets.
One contention that I’ve made for a while has been that data agility does not follow the same rules or constraints that Agile does, and because of that the approach in building data-centric projects, in particular, requires a different methodology.
Many data projects today follow what’s often called Data-Ops which involves well-known processes – data gathering, cleansing, modeling, semantification, harmonization, analysis, reporting, and actioning.
Historically, the process through harmonization falls into the realm of data engineering, while the latter steps typically are seen as data science, yet the distinction is increasingly blurring as more of the data life-cycle is managed through automation. Actioning, for instance, involves creating a feedback loop, where the results of the pipeline have an effect throughout the organization.
For instance, a manufacturer may find that certain products are doing better in a given economic context than others are, and the results of the analytics may very well drive a slowdown in the production of one product over another until the economy changes. In essence, the data agility feedback loop acts much like the autopilot of an aircraft. This differs considerably from the iterative process of programming, which focuses primarily on the production of software tools, and instead is a true cycle, as the intended goal is a more responsive company to changing economic needs.
Put another way, even as the data moves through the model that represents the company itself, it is changing that model, which in turn is altering the data that is passed back into the system. Such self-modifying systems are fascinating because they represent a basic form of sentience, but it is very likely that as these systems become more common, they will also change our society profoundly. Certainly, they are changing the methodologies of how we work, which is, after all, what Agile was all about.
This is why we run Data Science Central, and why we are expanding its focus to consider the width and breadth of digital transformation in our society. Data Science Central is your community. It is a chance to learn from other practitioners, and a chance to communicate what you know to the data science community overall. I encourage you to submit original articles and to make your name known to the people that are going to be hiring in the coming year. As always let us know what you think.
DSC Featured Articles
Picture of the Week