Given the nature of the community, presumably many visitors already have a strong understanding of quantitative data. Perhaps more mysterious is the idea of qualitative data, especially since it can sometimes be expressed in quantitative terms. For instance, "stress" as an internal response to an externality differs from person to person; yet it would be possible to canvass a large number of people and express stress levels as an aggregate based on a perceptual gradient: minimal, low, medium, high, and extreme. We have to recognize that the rationale for doing so relates to a pervasive quantitative normative. Within the framework of an organization, where much that transpires does so in qualitative terms, we nonetheless produce metrics to help characterize that reality within the normative. There is, however, a problem. The metrics can convey aspects of state and progress while providing few insights and alternatives. A company could, for instance, gain a solid grasp of its employee accident levels from the numbers alone. The organization might decide to allocate or budget more resources to a responsible department in response. But taking corrective action requires awareness of the underlying events and circumstances; and depending on the configuration of the data system, the company might not have the foggiest idea what aspects of production seem to be contributing to accidents. Arguably, one of the reasons why companies are required to report accidents to regulators might be to encourage adequate book-keeping; this imposition can help them become more aware of prevention opportunities.
It would be reasonable to suggest that companies are in the business of making money rather than maintaining records; and this might explain the bias towards quantitative details. However, I would argue that a fair number of companies actually don't have any idea why their clients buy their products. There are specific individuals in some companies, such as in the marketing department, who hold reasonable perceptions explaining the demand for products. These perceptions are reasonable until clients stop buying, of course. Then it becomes necessary to pull different perceptions out of a hat to explain the change. The point is, there might not be a place either in the data system or in the rational ecology of boardrooms to accommodate qualitative facts. Ignoring the events giving rise to the metrics is a structural weakness, I believe, triggered by an overemphasis on the metrics and a frail belief that people can control reality by exercising authority over its depiction.
I will use an example I have given a few times in other blogs. In my own research, in light of increasing costs for disability claims, there was this idea that the problem could be addressed through organizational restructuring. Restructuring can have the effect of redistributing costs - on the surface making the amounts seem smaller and therefore easier to bury. However, I found that about half of the insurance claims related to problems that could reasonably be attributed to the work environment: e.g. circulatory and musculoskeletal problems. How exactly corporate restructuring can reduce repetitive strains and poor circulation is baffling - and it points to the quasi-intellectualism of people estranged from the realities faced by workers. It also shows the impact of being focused on the metrics while disconnected from their qualitative details. In other words, the business problem was at least in part a data problem: the clarity of the quantitative metrics influenced decision-making; the obscurity of the qualitative events concealed the solution. Statistics only provide half the answer. We routinely ignore the phenomena giving rise to the stats, I believe, due to the lack of a compatible data system.
This blog describes a simulation environment that I will be developing based on what I have learned from a research prototype called Tendril. I routinely create simulations, usually designed to generate streams of data that resemble real-life conditions in quantitative terms. For instance, stock-market data is quite easy to imitate by following some basic trading criteria: the trading price should be near the previous price; small fluctuations are common while large changes happen only periodically; prices tend to be choppy rather than smooth and predictable. The simulator, on the other hand, is going to be unique in that it is meant to make use of actual work data that can be both quantitative and qualitative in nature. Similar to Tendril, the simulator will be "event-driven"; this is something that I will explain further in this blog. The simulator will employ a rather non-statistical approach focused on organizational and business events.
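As a rough illustration, the trading criteria above can be sketched in a few lines of Python. The parameters - the fluctuation size and the jump probability - are arbitrary choices for the sketch, not values drawn from any real market model.

```python
import random

def simulate_prices(start=100.0, steps=250, seed=42):
    """Generate a choppy price series: each price stays near the
    previous one, small moves are common, and larger jumps occur
    only occasionally."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(steps):
        # Common case: a small fluctuation around the previous price.
        change = rng.gauss(0, 0.5)
        # Occasionally (about 2% of steps) inject a larger jump.
        if rng.random() < 0.02:
            change += rng.choice([-1, 1]) * rng.uniform(2, 5)
        prices.append(max(0.01, prices[-1] + change))
    return prices

series = simulate_prices()
print(len(series), round(series[-1], 2))
```

Because the steps are independent noise rather than a smooth trend, the resulting series is jagged in the way real quotes tend to be.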
Tendril has a number of screens. The image below is the "original" screen that was introduced about 18 months ago. The name of the simulation environment is Shatterdome. I picked up this term from a science-fiction film called Pacific Rim. As indicated in its title, Tendril is Shatterdome Enforcement. Perhaps this relationship will make more sense closer to the end of the blog.
I am providing details of the simulation environment here despite its preliminary nature. It would be fair for readers to question how I can be certain that the simulator will work even before I commence writing the code. Well, the prototype was meant to be my trial balloon, and my confidence stems from its success. Tendril already performs certain jobs that I would expect the simulator to do; however, it doesn't generate events, this being the main task of the simulator. Normally, events are based on real-life situations pertaining to myself. Shifting these events to study an organization is simply a matter of redirecting my attention. So instead of generating events relating to my personal circumstances, I can carry out the same process for different departments of a company. (I can do all of this for a fictitious company with imaginary departments, of course.) For "simulation" purposes, bundles of events will be released at particular times under specific circumstances. I consider the event-distribution method fairly straightforward.
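To give a sense of what releasing bundles of events at particular times might look like, here is a minimal Python sketch of an event scheduler. The class name and the sample events are hypothetical - they are not part of Tendril or the Shatterdome.

```python
import heapq

class EventScheduler:
    """Release bundles of events at particular simulated times."""

    def __init__(self):
        self._queue = []  # min-heap of (time, bundle) pairs

    def schedule(self, time, bundle):
        heapq.heappush(self._queue, (time, bundle))

    def run_until(self, end_time):
        """Pop and flatten every bundle due at or before end_time."""
        released = []
        while self._queue and self._queue[0][0] <= end_time:
            time, bundle = heapq.heappop(self._queue)
            released.extend((time, event) for event in bundle)
        return released

sched = EventScheduler()
sched.schedule(5, ["machine jam reported", "line stopped"])
sched.schedule(12, ["overtime request filed"])
released = sched.run_until(10)
print(released)  # the time-5 bundle only; time 12 is still pending
```

The heap keeps bundles ordered by release time, so advancing the clock is just a matter of popping everything that has come due.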
Tendril handles two types of data: event data and metrics. The theory behind the "proxy-phenomena differential" is fairly in-depth. I leave it to readers to check out my earlier blogs if they are interested in the conceptual distinction between proxies and events. Similar to Tendril, the simulation environment will accommodate both types of data. Below, on the left, is another screen from Tendril, kind of reinforcing that Shatterdome imagery: in Pacific Rim, the Shatterdome housed super-robots. On the right I present the contents of an actual data file showing a list of events. It deconstructs processes, converting them into little fragments of data - like pieces of glass from a shattered dome. The robot, by the way, is only a splash image. Other images appear in place of the robot depending on the circumstances - if Tendril finds certain things during processing. I should explain that I often leave Tendril running for long periods of time so I can get other things done. I find it useful having large images that I can see from across the room to tell me what the processor is doing.
I'm not aware of this type of simulator already being in use. Part of the reason I am posting this blog is to get feedback from the community. So if, after reading a bit further, any readers find that they have come across similar systems, by all means please send me details. I mostly want to avoid the unnecessary invention of terms if others already exist. The Shatterdome will make no effort to mimic the behaviour of people or machines; rather, it will attempt to generate the same data that people and machines produce. The program will draw from large spools of events based on current policies, protocols, practices, and production patterns. It should at least in theory be possible to create a simulation of any organization from its observable routines and artifacts.
As indicated in the image below, event data can be gathered through questions and observations using a clipboard, pen, and sheets of paper. Or, the events can be collected and conveyed electronically. I have worked in all sorts of production settings. I realize that conditions are not always ideal for research purposes. I think that some simulations attempt to be "scientific" in that there is some effort to create elaborate experiments. In contrast, I am concerned more with compiling non-experimental field data. Noise is fine. I don't have problems with production peculiarities. I consider complexity a challenge rather than an opportunity for simplification. It is true that life would be easier if the data were clear-cut. But then again, it would hardly be necessary to hire scientists and other professionals to deal with straightforward situations. The fact that a highly skilled individual exists in a process is due in part, I feel, to the messiness of the data and the blurriness of the facts. I rather enjoy data weirdness - scenarios in which people are so baffled, they are willing to take chances on something new.
In my blog on the "Geography of Data," I describe the idea of transposition. Consider as a foil the concept of reductionism, where there is some effort to capture the most salient aspects of particular phenomena. I find that this process of simplification tends to be instrumental in nature: the information we extract might only serve our immediate production needs as efficiently as possible. Within this characterization of reality, we might embed little of the reality that exists; for we focus only on the aspects that we need and want. The potential for social disablement is quite high in such an instrumental framework. Moreover, we bring to the new technology our old methodologies, which might likewise limit the expression of phenomena. Yet another potentially alienating force is the distance that can emerge between data scientists and the production settings that give rise to data.
The idea behind transposition is to create some means of retaining not just the metrics but also the relevance of events to those metrics. For instance, it isn't good enough just to have sales figures; it is necessary to also maintain data about the contributing events and how those events are relevant to the metrics. Relevancy is a type of data that is difficult to capture. We are sort of stuck with tradition in terms of current methods, technologies, and behavioural normatives from antiquated belief systems. For instance, there is a great emphasis on metrics rather than on gaining an understanding of the underlying phenomena; I believe this is due in part to the longtime emphasis on exploiting resources. In terms of the Shatterdome, I plan to deliberately map out both circumstantial events and their multifarious proxies.
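A transposed record might look something like the sketch below: the metric is stored together with its contributing events and a note on how each event is relevant to it. The field names, events, and figures are invented purely for illustration.

```python
# Hypothetical transposed sales record: the metric travels with its
# contributing events AND a statement of each event's relevance.
sales_record = {
    "metric": {"name": "monthly_sales", "value": 48200},
    "events": [
        {"label": "trade-show booth", "relevance": "generated new leads"},
        {"label": "price reduction", "relevance": "lifted repeat orders"},
    ],
}

# Anyone reading the metric can also see why it moved.
for ev in sales_record["events"]:
    print(ev["label"], "->", ev["relevance"])
```

The point of the structure is that the qualitative context is never separated from the number it explains.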
Handling and Placement of Events
I will be focused on two general types of events during this project: articulation and actuation events. An articulation event is a discrete piece of data reflecting something that has happened. An actuation event, on the other hand, is a response to something that has happened. Below I show the basic triggering process for Shatterdome. If an event occurs, the program will generate actuation events at specific times and "locations." I apologize if the narrative seems almost like a computer program. Suffice it to say, this simulation environment is meant to be event-driven - not based on logic or psychology but rather on production and reality. It hardly matters if a book tells me that people respond in a certain way to given stimuli or that a production system should behave in a particular manner. My objective is to create a realistic simulation based on tangible evidence.
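The articulation/actuation distinction can be sketched as follows, assuming a simple lookup table of triggers. The event labels, the locations, and the `TRIGGERS` table are all hypothetical - they stand in for whatever rules the Shatterdome would actually carry.

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str      # "articulation" (something happened) or "actuation" (a response)
    location: str  # e.g. "organization", "human_resources"
    label: str

# Hypothetical trigger table: an articulation event releases
# actuation events destined for specific locations.
TRIGGERS = {
    "workstation injury": [
        Event("actuation", "human_resources", "open disability claim"),
        Event("actuation", "organization", "reassign shift coverage"),
    ],
}

def actuate(articulation):
    """Return the actuation events triggered by an articulation event."""
    return TRIGGERS.get(articulation.label, [])

injury = Event("articulation", "organization", "workstation injury")
responses = actuate(injury)
print([e.label for e in responses])
```

An unknown articulation simply triggers nothing, which matches the idea that responses come from observed rules rather than assumed psychology.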
What do I mean by locations? In the diagram below, think of Tendril as the box called "Hub" to the very right. It can receive data from many different locations: the environment or business setting; the organization itself; and different departments such as human resources. (Throughout development, I will in fact be focused on events derived from human resources.) The actuation events can be destined for any part of the organizational system. Actuation events at the Organization (the location) will be mostly focused on production, whereas events at human resources (a department) will pertain to personnel. Now, I know that in terms of statistical distribution, there is the idea of maintaining different performance metrics for employees. However, irrespective of whether employees exist or how many of them exist, the events pertaining to their behaviours can nonetheless be incorporated into the Shatterdome. I make no attempt to have the program mimic individual employees.
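A minimal sketch of the hub idea, assuming the hub simply files incoming events by their location of origin; the class name and the sample events are invented for illustration.

```python
from collections import defaultdict

class Hub:
    """Central receiver (Tendril's role in the diagram): collects
    events from any location and files them by origin."""

    def __init__(self):
        self.inbox = defaultdict(list)

    def receive(self, location, event):
        self.inbox[location].append(event)

hub = Hub()
hub.receive("environment", "new safety regulation announced")
hub.receive("human_resources", "training session scheduled")
print(sorted(hub.inbox))
```

Nothing about the hub depends on how many employees exist; it only sees the events their behaviours produce.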
In order to run the simulation, Shatterdome would commence its actuation of events. There is a question as to how one can be certain that particular events will lead to the desired metrics. At the moment, I hope to use an approach that I call metric actuation. This means that metrics will be triggered by events. There might be a range or spectrum of metrics associated with different events. I would obtain not really a number but rather a distribution of potential results. It is going to be a really interesting journey. The deeper point of this exercise is to create an operating abstraction of the real thing. Once there is a stable simulation of a particular organization, I hope to run many different types of hypothetical situations. This is why I can't simply create "a simulation." A simulation might not reflect the organizational reality. I need something that has the capacity to parallel the organization.
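One way metric actuation might work is sketched below: each event carries a range of possible metric values, and repeated sampling yields a distribution of potential results rather than a single number. The event names, metric names, and ranges are invented for illustration.

```python
import random

# Hypothetical mapping: each event contributes a range of possible
# metric values rather than one fixed figure.
EVENT_METRIC_RANGES = {
    "open disability claim": ("claim_cost", 2000, 15000),
    "line stopped": ("downtime_minutes", 10, 90),
}

def actuate_metrics(events, trials=1000, seed=7):
    """Sample each event's range repeatedly to build a distribution
    of potential outcomes; return its min, mean, and max."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        total = 0.0
        for ev in events:
            _, low, high = EVENT_METRIC_RANGES[ev]
            total += rng.uniform(low, high)
        totals.append(total)
    return min(totals), sum(totals) / len(totals), max(totals)

lo, mean, hi = actuate_metrics(["open disability claim", "line stopped"])
print(round(lo), round(mean), round(hi))
```

The spread between the minimum and maximum is the point: the simulation reports what the events could plausibly produce, not a single pretend measurement.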
Event Invocation Objectives
The primary objective at the different locations shown above - the environment, organization, and department - is to generate events that to the greatest extent possible reflect real-life conditions. Since conditions are dynamic, one would expect the events to be likewise. The starting point for the simulation doesn't have to be hypothetical. Ideally, data-collection specialists would routinely canvass the organization for data to feed into the simulation. If intervention occurs, existing events will change, and more events will be generated. The Shatterdome might be useful not merely for simulation purposes but also to manage change - to keep track of event data relating both to environmental transformations and specific business initiatives.
Apart from the human resources applications, I will try to use the same approach in relation to security. It can be quite expensive and impractical to stage mock emergency scenarios, for example in a mall or university. A "response protocol" involves generating events en masse as a reaction to an emergency or urgent situation. I am not too interested in tinkering with little events but rather in creating different response protocols as part of a risk management strategy. (At this time, I'm not certain how feasible this approach would be in relation to a non-enclosed environment.) Tendril, which is used to detect and analyze data, might never be released to the public. However, there is a chance that I might make the Shatterdome public domain. I certainly hope that others find the project worthwhile in relation to their own efforts.
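In this sketch, a response protocol is simply a rule that fans a set of base response events out across the zones of a facility, producing events en masse from a single trigger. The trigger names, events, and zone scheme are hypothetical.

```python
def response_protocol(trigger, scale):
    """Generate events en masse in reaction to an urgent situation.
    Protocol contents here are illustrative, not a real procedure."""
    protocol = {
        "fire alarm": ["notify security desk", "open exits", "page floor wardens"],
    }
    base = protocol.get(trigger, [])
    # Fan the base events out across `scale` zones of the facility.
    return [f"{ev} (zone {z})" for z in range(1, scale + 1) for ev in base]

events = response_protocol("fire alarm", scale=3)
print(len(events))  # 3 base events x 3 zones = 9 events
```

A staged drill would produce this burst of events physically; the protocol produces the same data on paper, which is the whole appeal for risk management.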
I have given a brief overview of the Shatterdome project. I now want to take a few moments to discuss the technological setting. Usually in relation to big data, the conversation tends to focus on the need to handle enormous amounts of data - amounts that many smaller organizations might never encounter. I consider this "discourse of enormity" rather exclusionary. It immediately distances large segments of the business market from big data. On the other hand, I have observed an extreme perspective in the opposite direction: some practitioners seem obsessed with the reduction and simplification of data. This philosophy likewise discourages businesses from embracing the technology. In both scenarios, the focus seems to be on perceived normatives rather than substantive business issues. If data reduction can make money, it is a good idea. If data expansion can make money, the option has to remain on the table. If dancing like a duck can make money, that too is worthy of some consideration. There is a lot of talk about what people should and shouldn't do - as if there is some kind of prize waiting if an argument is won. Essentially, if an approach fails, its proponents become irrelevant to the production process. Their beliefs and arguments become superfluous. The problem with data science, if it can be characterized as a problem, is how easy it is, given the availability of data, to confirm failure.
So I personally find the data reductionists and data enormity people rather curious in their deliberate estrangement of the market. The idea of creating a simulation through observation and asking people questions might seem a little bit peculiar. No, the peculiar part is creating a simulation without making observations and asking questions. The average person in an organization has much to offer both in terms of information and perspective. We also have to consider the great opportunity to bring people closer to the data. Data should be something to help liberate and deliver - not impose and alienate. I hope the idea of the Shatterdome helps to normalize involvement and participation in a substantive way towards positive business outcomes. I don't have a problem with case studies, gut instincts, tarot cards, horoscopes, tea leaves, or whatever fanciful ideas come to mind. But why not also try out a simulation - if not for business insights then at least to facilitate administrative initiatives? Even if the simulator is never used, not even once, I'm going to suggest that something like the Shatterdome would be worthwhile purely as an administrative tool. Making business decisions based on metrics without any understanding of the contributing events might be one of the most entrenched non-business-like practices around today. It is this glass dome that I hope to shatter through the Shatterdome project.