In this blog, I will explain how an approach to handle small amounts of data can be reconstructed to handle much larger amounts. This reconstruction is the product of an anomalous perspective or mutation relating to the attribution of performance.
Many businesses share certain common features. Assuming that operations are not fully paperless, an office might maintain supplies to handle the physical artefacts of data: staples, paperclips, pens, highlighters, post-its, elastic bands, and binders. Companies generally have comparable computers and operating systems to deal with all sorts of information. There are also policies and practices that share much in common; these help to shape what gets recorded, when, how, and for what purpose. Professionals from human resources and accounting can sometimes be moved between companies without much preamble. This is because professions represent pools of core competencies: their perspectives over facts and figures are actually prescribed by the profession itself. Methodologies to measure performance also tend to be shared: it is possible to find examples in textbooks explaining how to apply industry-recognized methods. There might even be supporting software to carry out evaluations - thus confirming a market of similar users. So the business landscape contains players sharing numerous traits in relation to their data. In a manner of speaking, this represents a kind of organizational "species." So when I raise the issue of mutation, I mean a change that is relevant in relation to the entire species: the emergence of something distinctive and structurally persistent. In relation to a biological species, mutation affects genetic details. In an organization, data represents the primary means of conveyance; the data helps an organization decide how to structurally adapt to its environment. Consequently, I would argue that for many companies, the issue of mutation actually involves their data systems.
Common Steps to Monitor Production and Productivity
In order to bring the nature of the mutation to light, I must first introduce what is being mutated. I have worked in a number of different types of companies that seem to follow the same basic pattern or methodology to gather data and make it accessible for decision-making purposes. I consider the following to be common steps useful for assessing operational performance: 1) determining the key indicators of performance relevant to the operation; 2) ascertaining the presence of these key indicators in workflows; 3) routinely tabulating the number of events or incidents; 4) retaining an historical record of the periodic totals; 5) maintaining summary statistics for individual agents, project teams, functional groups, and strategic divisions; and 6) presenting the numbers in a manner that people can understand and to meet specific organizational needs. The exact details vary of course. A factory might count skids sent to shipping while a call centre would likely focus on the number of accounts opened and orders processed. I remember once working for a coffee factory. I was exposed to a great many things counted for quality and production statistics: number of canisters packed; amount of coffee per canister; bags of coffee beans opened; barrels of beans emptied; skids loaded with boxes of product; and trucks loaded with skids. There was a lot of counting because there was a great desire for control. The act of counting and the data collected from it represent control over critical processes.
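As a minimal sketch of steps 3 to 6 above, the tabulation could be expressed in a few lines of Python. Everything here is an assumption made for illustration: the event log, the agent names, the indicator name, and the function names are hypothetical, and a real operation would substitute its own indicators and periods.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical event log: (period, agent, indicator, count). All values are invented.
EVENTS = [
    ("week-1", "Paul", "skids_shipped", 2),
    ("week-1", "Paul", "skids_shipped", 1),
    ("week-2", "Paul", "skids_shipped", 3),
    ("week-1", "Mary", "skids_shipped", 4),
    ("week-2", "Mary", "skids_shipped", 4),
]

def tabulate(events):
    """Steps 3 and 4: total each indicator per period and agent, kept as a historical record."""
    totals = defaultdict(int)
    for period, agent, indicator, count in events:
        totals[(period, agent, indicator)] += count
    return dict(totals)

def summarize(totals, indicator):
    """Step 5: summary statistics for each agent across all recorded periods."""
    per_agent = defaultdict(list)
    for (_period, agent, ind), total in totals.items():
        if ind == indicator:
            per_agent[agent].append(total)
    return {
        agent: {"total": sum(values), "mean": mean(values), "periods": len(values)}
        for agent, values in per_agent.items()
    }

def report(summary):
    """Step 6: a plain-text presentation, one line per agent."""
    return "\n".join(
        f"{agent:<10}{stats['total']:>6}{stats['mean']:>8.1f}"
        for agent, stats in sorted(summary.items())
    )

totals = tabulate(EVENTS)
summary = summarize(totals, "skids_shipped")
print(report(summary))
```

Grouping by a department or production line instead of an agent would only require changing the key, which is why the same methodology can serve individuals, teams, and divisions alike.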
I once received comments indicating the following: the market suffers from a documentation deficit in relation to data gathering for quality and productivity. I'm not actually sure this is true, but I certainly need to offer readers a basic framework at least to help explain the mutation. In the illustration to follow (Fig. 2), the preceding six points have been expressed as basic questions that a person might ask in order to facilitate data collection. The business of gathering information and reporting the individual production contributions of workers and factory departments can be facilitated using old technology: just sheets of paper and pencils would suffice in a pinch. Given good bookkeeping habits and adequate time to complete daily tasks, businesses can probably get along fine using archaic methods to assess quality and performance. This is assuming that the desired outcomes serve a purely prescriptive purpose - to confirm that particular tasks are carried out in specific ways within certain time periods. In other words, if the purpose of data is simply to ensure conformance behaviours, little sophistication is required in the data system. Conventional approaches serve prescriptive requirements well. I consider it a major accomplishment in record-keeping to be able to support management efforts to assess quality and performance. At the same time, I am certain that almost every company that has ever gone bankrupt did so under some kind of management; however, this isn't the focus of the blog.
Fig. 2 - Main Procedural Points
The so-called "clipboard" in point #1 represents the formal criteria or standards against which events within the workplace are to be assessed: for example, since it would be illogical to apply investment criteria to automotive quality, clearly the choice of clipboards is rather critical. The clipboard might be in the form of a checklist. The criteria used to generate compliance-type data often rely on the expertise of those doing the checking: a check-box such as "gas pressure regulators working properly" might require some training to properly interpret. In terms of points #2 and #3, there is a difference between assessing the amount of work that has been completed and gathering those specific aspects that are relevant to the criteria. Over the course of producing 50 pallets or skids of merchandise, it is necessary to determine what aspects of production are being managed and must therefore be set to data. So I am saying that there is a clear delineation between the data gathered for central control purposes and that used for decentralized monitoring of routine events. Points #4 and #5 both relate to record-keeping, the idea being to maintain data in accessible formats that can later be used to generate desirable reports; this requires an interpretation not just of the original criteria but also of the intent. The only part of the process that might receive any significant recognition occurs at point #6 involving data presentation; as such, it stands to legitimize or delegitimize everything else - perhaps unfairly so.
Ideally by going through the above steps, an organization can ascertain the contribution levels for each worker (in relation to behaviours considered relevant by the central authority to production). This is the idea, anyways. It isn't necessary to characterize production in relation to workers. For instance, factory operations can be broken up into a number of functional parts or specializations. Rather than collect data attributable to specific employees, it is possible by applying the same methodology to compile statistics for the organizational parts. In other words, the parts can be treated as employees. (Conversely, it seems to have become fashionable to treat employees as factory parts. It is difficult to say whether this perspective is any more or less correct.) Statistics can be maintained for each important aspect of production to support comparisons over different time periods. If small changes are implemented in one section, then the results can be examined against similar operations in other sections.
The listing of points that I just provided doesn't have to involve a process that already exists. The methodology can be used to gather data that has never existed before in order to support decisions that are completely new to an organization. Within the framework of my graduate studies, I examined one pilot project intended to determine the feasibility of centralizing employee counselling services. The purpose of the pilot was to collect data since so little of it existed at the time. A number of facilities were chosen to participate in the experiment. Organizations that have never had a coherent system of data collection, tabulation, and presentation would likely experience some difficulty coping with more sophisticated approaches. I suspect that the leap to big data would be quite challenging indeed without exposure to small data. Nor would it be necessary. As I mentioned earlier, computers aren't required to support the traditional method. However, as I will discuss in my next blog a few weeks from now (covering the Push Principle), just a slight twist in how processing occurs from the mutation can radically alter the meaning of the data. For now I just want to provide some reassurance that intellectual capital conforming to the points I described earlier can "evolve" into something quite unique and promising. The question of whether or not the community has a place for it remains to be determined.
What Is the Mutation?
I use the term "TIME" to describe a special data-rich environment containing organizational contexts and data events; there are also algorithms designed to characterize the level of relevance between these contexts and events. "TIME methodology" refers to the approach I use to create the environment. I will be providing some structural details after describing what brought on the methodology. The mutation giving rise to the methodology and therefore a higher level of sophistication to access data involves a slight aberration in the interpretation of performance. Under the old methodology as I noted earlier, it is possible to obtain the contribution levels for individual employees in a department or organization; this is presented below on the left-hand side of the illustration (Fig. 3). The use of employee metrics to examine performance constitutes only a single perspective or context. While managers might show great interest in this particular context, this should not prevent data scientists from exploring other types of organizational contexts. Rather than associate events with people, it is possible to associate events with different contexts including those that form performance rankings as shown on the right-hand side of the same illustration. Instead of maintaining a single theme such as employee performance, an organization can systematically deal with many hundreds or perhaps even thousands of contexts. The objective of the TIME methodology is to create and sustain a data environment that preserves these important associations.
Fig. 3 - Structurally Identical Performance Gradients
Central to the aberration is the deconstruction of assignment operations. Applying a rather conventional interpretation of assignment, one might say that Paul is responsible for 2 + 1 + 3 + 2 + 2 = 10 units of production, where the production level of 10 is simplistically assigned to Paul. A rather faulty perspective that is probably pervasive is to say that Paul equals or can be equated to 10 units of product, as if Paul's reason for existence is to produce units. Assignment in relation to the TIME methodology is more like immersion. For example, all of the data events that occurred in the City of Toronto during an ice storm could be assigned to Terrible. I realize that it might not be obvious how a person goes about performing such a curious assignment, but the barrier is more due to lack of familiarity than technology. First of all, let us accept that an ice storm is indeed terrible, just to get over the initial hurdle. Further, I would say that a sunny cheerful day in May is rather Terrific. Further yet, I underline how the types of events found in the data of an ice storm differ from the data of a sunny cheerful day. Finally, performance metrics need not be limited to units of product. (What does the 3 next to "Terrible" mean? That means there were 3 applicable days. Sorry for leaving that out.) So let's examine the sleight of hand closely. We would normally have said that Paul is responsible for 10 units. I'm saying that the direction can be the exact opposite: furniture, pants, wrist watch, wallet, spouse, children, and education can all be assigned to Paul. An unlimited amount of data can be assigned from the right to the left; and doing so makes sense per my ice storm and sunny day analogy.
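The two directions of assignment can be contrasted in a short Python sketch. This is only an illustration under stated assumptions: the day records, event types, and the function name immerse are invented here and are not taken from the TIME methodology itself.

```python
from collections import defaultdict

# Conventional assignment: a single quantity is attributed to a person.
paul_units = [2, 1, 3, 2, 2]
paul_total = sum(paul_units)  # 10 units "assigned" to Paul

# Immersion: whole events of any shape are placed into a context.
# The days and event fields below are hypothetical.
DAYS = [
    {"date": "day-1", "context": "Terrible",
     "events": [{"type": "power_outage", "households": 1200}, {"type": "transit_delay"}]},
    {"date": "day-2", "context": "Terrible",
     "events": [{"type": "power_outage", "households": 300}]},
    {"date": "day-3", "context": "Terrible",
     "events": [{"type": "late_shipment", "orders": 14}]},
    {"date": "day-4", "context": "Terrific",
     "events": [{"type": "park_visit"}, {"type": "late_shipment", "orders": 2}]},
]

def immerse(days):
    """Place every event recorded on a day into that day's context, with no fixed event schema."""
    contexts = defaultdict(lambda: {"days": 0, "events": []})
    for day in days:
        bucket = contexts[day["context"]]
        bucket["days"] += 1
        bucket["events"].extend(day["events"])
    return dict(contexts)

immersed = immerse(DAYS)
print(paul_total, immersed["Terrible"]["days"], len(immersed["Terrible"]["events"]))
```

The number kept beside each context is a count of applicable days, as with the 3 next to "Terrible", while the events themselves remain attached as complex objects instead of being reduced to a total.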
Technologically speaking, "big data assignment" is nothing difficult. If a handful of data events can be assigned to Paul, then an astoundingly massive number of data events can be assigned to Terrible. Just do it - obviously not using conventional software. It might not be possible to handle the data as a primitive quantity given that it is a complex object having no fixed structure or size limit. Consider setting aside such apparent obstacles to focus on the beauty and simplicity of massive assignment. Traditional assignment served to limit the meaning of data and to set boundaries. Massive assignment helps to expand the meaning and broaden our horizons - but only if we let the technology take care of the details. Big data requires computers. The more power, the better. How much data should be used? Use as much data as necessary and more if desired. Assignment or immersion doesn't mean equality in an algebraic sense. Assignment is only part of the process. It is still necessary to determine the transpositional relationship between the data and the contexts; this is done through the use of algorithms as noted earlier. I refer to these algorithms as "relevancies." A relevancy that I frequently mention in my blogs is called the Crosswave Differential. (I have other types planned.)
Reflecting on the Need for Change
I believe that much of how we handle data relates to its simple origins. The development of society has become increasingly dependent on the development of methods to handle its data. Expressed differently, the level of complexity that society can attain cannot go beyond the level of complexity possible in its data. This conceptualization of society suggests that social problems might depend on how a society chooses to handle its information. So if we got rid of computers, society as we know it would likely have to revert to a simpler form. Moreover, if we eliminate books and stop teaching basic writing skills, society would simplify even further. If we lose the gift of language completely, civilization as we know it would probably collapse. I suppose that the data-complexity argument is debatable in relation to society as a whole; but to me its relevance in organizational settings is difficult to dispute. I want to emphasize that in the TIME methodology, current methods are not being replaced. Everything that currently exists can continue to do so; but another aspect is added in parallel. The data environment resulting from the TIME methodology extends from current resources in several ways. I will spend a few moments explaining the three major components of the methodology as indicated in the illustration below (Fig. 4): 1) the data itself; 2) its intended organizational context; and 3) an algorithmic assertion of the relationship between the data and the context. Really, only the last item is new. The first two items already exist although perhaps not articulated as part of a formal methodology.
Fig. 4 - Major Components of TIME
(1) Concerning the Data
In relation to big data, I believe there is a bit of a psychological preoccupation questioning whether or not an organization should collect so much information - as if big data were an issue of self-control, choice, or restraint. If an online retailer simply decided one day to stop collecting large amounts of data, it wouldn't be able to operate. Much of the data gathered is needed to support operations: e.g. the identity of the person placing an order; the merchandise included in the order; the person responsible for filling the order and by what time; the destination of the order; and how the order should get there. I describe this type of data as the Data of Direction. The business model of an organization "directs" its operations either to collect or make use of the data, and there is really little choice in the matter. However, there has been some conflation between this type of data and another type that I call the Data of Articulation. I believe that a fairly common type of articulation shared by many organizations is present in marketing data. Articulation exists in relation to consumer feedback and after-sales support. A rather big problem involving big data might not be the amount of information but rather its increasingly eclectic nature as more data is collected. Standardization of diverse facts can render data more cohesive but less meaningful. TIME can accommodate any type of data, keeping in mind that all of the data is destined to be contextualized.
(2) Concerning the Organizational Context
The preceding two types of data can sometimes be confused with another type that I call the Data of Projection. Management criteria result in projection data. This is quite different from direction and articulation data. In certain respects, direction data emerges as a consequence of managerial efforts to bring about the business of the organization. Articulation data generally exists irrespective of what management might say or do, although its recognition probably depends on managers. Projection data, on the other hand, is specifically the result of the application of metrics by management as a part of a control or regulation process. This body of data is conceptually smaller than articulation data, which can be of infinite size. One would also hope that it is smaller than direction data. While projection doesn't create a great deal of data in relative terms, the metrics of criteria represent an important concept; unlike the other forms of data, projection data supports deliberate intervention and change management. The outcomes of projection can be examined using existing tools such as statistics. I believe that this creates a much needed bridge between old and new methods of analysis while also helping to extend the role of management into big data.
(3) Concerning Relevancies
A relevancy is an algorithm that provides portrayals of relevance. Managers can therefore become aware of how particular data might be relevant to important organizational contexts; this provides guidance in relation to intervention efforts. Perhaps more importantly as it relates to big data, a relevancy can systematically sift through all of the data available and provide managers with a listing of the data that seems most relevant, albeit without rational explanation. While poor design remains possible when using relevancies, at least the designers gain the means to deliberately minimize the presence of subjective inclinations. Relevancies provide logic to accompany the data that can be reproduced decades later, applied to other environments, and openly shared and discussed in teams, meetings, and academic papers. So although mistakes might still be made, the setting can be preserved, critically examined, and used to improve future decision-making.
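To make the idea of a relevancy more concrete, here is a deliberately simple stand-in written in Python. It is not the Crosswave Differential, whose details are not given in this blog; the scoring rule, the function name frequency_gap_relevancy, and the sample event types are all assumptions chosen only to show the shape of the three components: data (event types), context (Terrible versus Terrific), and an algorithm that ranks the data by apparent relevance to the context.

```python
from collections import Counter

def frequency_gap_relevancy(target_events, baseline_events):
    """Hypothetical relevancy: rank event types by how much larger their share
    of events is in the target context than in the baseline context."""
    target = Counter(target_events)
    baseline = Counter(baseline_events)
    n_target = len(target_events) or 1      # avoid division by zero on empty input
    n_baseline = len(baseline_events) or 1
    scores = {
        etype: target[etype] / n_target - baseline[etype] / n_baseline
        for etype in set(target) | set(baseline)
    }
    # Highest score first: most characteristic of the target context.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented event types for two contexts.
terrible = ["power_outage", "power_outage", "transit_delay", "late_shipment"]
terrific = ["late_shipment", "park_visit", "park_visit", "transit_delay"]

ranked = frequency_gap_relevancy(terrible, terrific)
for event_type, score in ranked:
    print(f"{event_type:<15}{score:+.2f}")
```

Because the rule is written down as code, the same ranking can be reproduced later, run against other environments, and criticized openly, which is the property attributed to relevancies above. The ranking itself offers no explanation of why an event type matters.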
Mutation as a Prelude to Evolution
Recall how Paul was changed to Terrible earlier in the blog using the same structure of performance. This reinterpretation of the gradient demonstrates how structure might change on the outside in relation to or as a consequence of a mutation: I explained the three major components - data, context, and relevancy. I also said that a fundamental aspect of the change occurs at a deeper level in relation to the issue of assignment or immersion. I would say that in most organizational settings, only a limited number of data events would ever be associated with Paul. The formal role of any employee in an organization - the aspect that necessitates compensation through payroll - is fairly constrained and theoretically possible to summarize in a job posting. Since there might only be a few dozen or perhaps hundreds of recurring events, there is little need for contextual multiplicity in performance assessments. It might make little sense to go through an elaborate algorithmic process to attach relevancy to the data since the relevance is predefined as part of the job. In other words, the simplicity of the relationship makes it possible to apply simple data methods. However, if we hope to handle much more diverse data, it is probably necessary to make drastic changes.
In this blog, I have described the structural implications of mutation allowing for the strategic assignment of highly complex data to the contextual multiplicities of organizations; this brings organizational development within the realm of information systems. I suggest that more traditional methods were used during simpler times. The times were simple because the world was full of resources and ripe with opportunities. All sorts of approaches sufficed not so much because they worked well but because the companies tended to operate regardless of the approach. Times were good. Today, simple companies can be replaced by foreign imports. I'm a bit nervous to use the term "low-brain economies" since it might become popular and later attributed to me, possibly triggering comparisons to "high-brain economies." Well, competition has become fierce, and it is important to consider the evolutionary nature of data systems in the equation. I have detailed a way to build on existing knowledge while moving forward. This level of discourse will divide people. It will surely delineate between the casualties and survivors of technological change. The divisions are caused by fear and ambivalence rather than barriers posed by the technology itself. In terms of the technology, the time for change is upon us, and the means to do so are at hand.