When the performance of an employee is evaluated, ideally there are no externalities to complicate the analysis. If the employee has a computer that is constantly freezing up - or the servers in the company frequently operate slowly - the employee's performance data will reflect the functionality and effectiveness of these systems. If the company occupies a highly competitive market, declining sales data is attributable at least in part to competition rather than the behaviours of employees. If managers implement significant restructuring, the outcomes would be apparent in the performance data of employees. Although it is desirable for performance data to be free of externalities, in practice this scenario seems unlikely. In performance, there is usually "internal association" of data - that is to say, the metrics are attributed to the behaviours of employees internally. But there are also "external antecedents" or determinants outside the direct control of employees. Since a worker tends to exercise little personal autonomy in many modern production settings, it is necessary to recognize how the data collected might be "internally disassociated": it is disconnected from the broader reality of production and also of the individual. Although internal attribution is pervasive, those responsible for managing data must take into account the diminished role of employees in highly controlled environments. I suggest that the narrative has been mostly about "individual failure" despite important structural challenges outside the employee. In this blog, I will be examining the Internal Disassociation of External Antecedents ("IDEA" for short) using a number of examples and considering the implications in relation to data science.
In the area of occupational health, the blame for accidents can sometimes be "attached" to the worker. If anything goes wrong in the workplace, the employer would shoulder no responsibility. Underlying this tradition or social convention is the concept of the "injury prone worker"; this suggests that some workers are just plain "cursed." (Blame is laid on the worker who then becomes the focal point for all health and safety metrics.) As the argument goes, a cursed employee gets into accidents more often perhaps due to bad luck, natural absence of gumption, wrong-headedness, or unfortunate genetic defect such as the inability to smell toxic chemicals. A data scientist operating under this slanted paradigm must be aware that not only is data collection focused on the individual, but it is also distanced from external determinants. A tremendous amount of data can be centred on individual behaviours especially if the performance system is designed to distribute workplace incidents in this manner. Consequently, in order to "lift the curse," remedial measures would be worker-oriented. In associating the curse with the individual, we disassociate the external antecedents giving rise to stress, injury, and disease in the workplace. The data shaping the structural capital of the organization becomes insulated from operations. Even worse, business intelligence becomes hypersensitive to the common frailties of individuals. If it were offered, I am certain that Scapegoat 101 would be quite a bird course in business school. I doubt that targeting individuals can effectively lift a curse.
I wonder how many people continue to watch television. Presumably all sorts of data still get collected including demographics and programming preferences. Data pertaining to the television viewing audience seems largely internally associated. I consider it a fascinating topic - the extent to which television metrics measure individual preferences (defined by the person) rather than those that are socially constructed (extending from society). In recent years, I believe that we have become much more exposed to social construction in data through online aggregates. For example, people are sometimes curious about what online searches seem to be "trending up." These are social readings that are not necessarily tied to any particular person. If for the sake of argument I do a search at 2 PM, and a trend in real-time indicates an increase in interest at 4 PM, this does not mean that I personally became "more" interested or did a search at that time; but rather, while I and many other people seem to be conducting searches, relatively more people decided to do so later. It is probably still a novelty attempting to correlate sales to trends - e.g. the sale of emergency equipment to online searches about hurricanes and natural disasters. Although the technology and methods might be new, we are entering an age in our understanding of data where disassociating external antecedents from the internal metrics might limit our understanding of the underlying phenomena. Even if I do a search, there could be something external to me invoking a group response that includes me. The measurement might not be of me specifically but rather society as a whole. (I appreciate that the delineation is not easy to make.)
In the popular game show "Family Feud," the contestants attempt to determine how, as an aggregate, respondents to a survey responded to a number of non-expert questions. In order to play this game properly, it would be useful for contestants to distinguish the most "correct" answers - these being irrelevant - from the answers provided by respondents. The most appropriate answers to succeed in the game are not necessarily the most correct in a factual sense but rather from the perspective of the vernacular. I would probably do poorly in this game because - more often than not - I provide responses that fall outside cultural norms. (This means - even after checking their dictionaries for clarification - people still often say, "Huh?" to me.) If a fool is described as a person without "common" sense, sadly I probably seem rather foolish too frequently; my sense is almost chronically uncommon. Those who regard "gumption" as internally defined rather than socially constructed - and whose sense of gumption conforms to the broader normative - would probably do well on Family Feud without even trying. However, for people like me, the process of "fitting in" and "playing the game" means deliberately melting into foreign thought patterns; and it takes some effort and practice. The game measures the extent to which a person is a social insider or outsider - or to which he or she can successfully pass as an insider at least in relation to responses. The score is a rather complicated metric, really.
In the television program "The Bachelorette," a young lady hoping to find her perfect mate typically asks a number of men all sorts of candid questions in different settings and situations - all on television. There is always the threat of sexual intercourse complicating what seems like a highly organized and rational decision-making process. I originally considered this show something of a circus. However, when I come across the program usually by accident, these days I find myself asking some deep questions. Like a scientist pulling specimens from their natural habitats for study in a laboratory, the Bachelorette is strongly premised on the internalization of identity, attitudes, and behaviours. In other words, there is a disassociation of external antecedents. I am not saying that the male participants are socially constructed or in any way artificial. But their unnatural placement in a show combined with snapshots of their current lives might provide little guidance of their actions outside the laboratory. Similarly, their responses to questions from the Bachelorette might reflect their indigenous environments. I know it might not seem like it on the surface, but the Bachelorette is actually assessing both past and potential future performance. She controls the circumstances of the show. This control will be missing in real life. Consequently, on the show, in any assessment she makes, it is at least in part constructed by her. She measures a truth of her own fabrication.
So far I have been discussing IDEA in relation to people. However, I deliberately chose the term "internal" in order to be entity-neutral. I will take a moment to consider Greece's debt crisis. At time of posting, Greece had secured another bail-out package - its third in five years. In its most recent attempt to gain more funds, this country actually failed to meet its debt obligations. (It missed a payment.) Of course, when we talk about a person taking out three loans in five years and refusing or being unable to make payments, this is usually a bad sign. "Lifting the curse" for Greece has been about restructuring Greece. The severity of the austerity measures has motivated me to suggest that neighbouring countries hope to cast out demons through exorcism: outside forces have found deficiencies "inside" Greece that need to be corrected. In internalizing economic problems, the community seems to be trying to contain an epidemic or disease. As with my occupational health and safety example earlier in the blog, I think an argument can be made that at least some of Greece's difficulties relate to its hostile environment. If the main obstacles to growth are external to Greece rather than inside, harsh remedial measures might only provide distraction. It is necessary during analysis to decide whether to focus on internal metrics - for example, relating to Greece's conformance to austerity measures - or the external antecedents reflected in Greece's economic data.
Elements of Social Disablement
Social disablement is the idea that disability is created by society. In contrast, there is the more traditional perspective that disability is chiefly an aspect of the person. I have been interested in the argument that social disablement - rather than simply being "attitudinal" - also contains "structural" components. If social disablement is structural, then it is persistent in organizations even as staffing changes; it can be handled and observed; it can be structurally corrected; and rather relevant here, it can be reduced to its data constituents and examined on a logistical level. I do not dispute the relevance of any actual disability that a person might have. However, it is a powerful concept to compare social disablement to other socially- and externally-conceived phenomena such as discrimination and racism; and its potency is amplified when computers and big data are added to the discourse. In relation to data science, I invoke social disablement in what I consider to be its purest form - through data structures. Humans rely greatly on lifelong intermediaries such as governments to make existence possible. We exist in a reified state as facts, numbers, and events. We persist within support systems as data. My welfare can be compromised by how data is constructed - its sensitivity to the realities that I face. I am vulnerable to structural exclusion. Disablement through data is a pure form of social disablement; it is free of the confusion sometimes caused by physicality. As much as society might define or label certain segments of the population - often contributing to their disadvantage - so too can society disregard and disassociate from the identities of those that are vulnerable.
In an effort to pose the discussion in data-handling terms, I offer a rather abstract conceptualization. Imagine an earlier time in human history when production materials were much closer to those doing the producing. I mean this in physical terms. In a workplace operating without the assistance of robots, computers, or for that matter electricity, people must handle many more materials with their own hands. If it isn't feasible to order wood from overseas to manufacture toy trains locally, it might be necessary to cut the wood from nearby forests. One would be aware of natural resources being gradually depleted due to their close proximity. In the images below, consider blue a flow of data from management (perhaps an actual manager); grey from labour (workers); and green from resources (such as a forest). Management is aware of both workers and the forest on the level that I describe as "Holistic & Ecosystemic" where all things are interconnected. These are not merely interconnections of physicality but also of ontology. There is a recognition that one cannot operate without the other. It is possible to organize the data such that neither workers nor management are aware of the impacts to natural resources. The scenario is shown in the "Adversarial" image, which represents a common theme in environmental studies. (I refer to the disassociation between people and nature.) Wood can arrive at the factory nicely chopped into pieces ready for carving. Consequently, if performance criteria are disconnected from the natural resources, it becomes possible to ignore the seasonality of wood supplies and the resource depletion. Is such an adversarial scenario attitudinal? It might be in some cases. However, I believe that structural entrenchment stems from the data system itself - either by accident or deliberately - as organizations attempt to carry out their instrumental functions.
I call the final level "Alienated." In this case, the forest is not on the radar at all. Moreover, some distance has grown between workers and those instigating the work. It isn't unusual for production requirements to be set by a person or device that is distant from production processes; for example, performance benchmarks might be set by a computer at a regional office. Additional rings can be added to show other categories of separation. For example, to the right of the blue ring, a silver ring might be placed to represent consumers. I want to emphasize that I am not referring to people per se but focal points for data. (If personification helps, I am certainly not opposed to the practice.) IDEA therefore does not lead to a cut-and-dried delineation but rather a spectrum of disassociation. Taking workers as the data source, it is possible to internalize performance and failure as if they were part of the individual; but in order to do so, it is necessary to disassociate the externalities - such as the forest in this particular example. We can also disassociate family problems, illness, lack of proper nutrition and housing, traffic congestion, crime and vandalism, reliability of computer systems, network crashes, declining customer demand, and a great many other things from data expressed in relation to the individual. I suppose in a worst-case scenario, organizational leaders might associate metrics of organizational success with the regional office and attach any metrics of failure to the individual. If the same problems keep returning even after individuals are replaced, presumably the challenges are materially "outside" the individual.
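To make the last point a little more concrete, here is a minimal sketch in Python of how one might check whether a problem follows the individual or stays with the setting. Everything in it is hypothetical: the station names, the worker labels, and the incident counts are invented purely for illustration, and the helper rate_by is my own. The idea is simply to index the same records twice, once by worker and once by workstation, and see where the incident rate persists after a worker is replaced.

```python
from collections import defaultdict

# Hypothetical records: (station, worker, months_observed, incidents).
# worker_2 replaced worker_1 at the carving bench; worker_4 replaced worker_3.
history = [
    ("carving_bench", "worker_1", 12, 6),
    ("carving_bench", "worker_2", 12, 5),
    ("paint_shop", "worker_3", 12, 1),
    ("paint_shop", "worker_4", 12, 0),
]

def rate_by(rows, index):
    """Incidents per month, grouped by the field at position `index`."""
    months = defaultdict(int)
    incidents = defaultdict(int)
    for row in rows:
        key = row[index]
        months[key] += row[2]
        incidents[key] += row[3]
    return {key: incidents[key] / months[key] for key in months}

by_worker = rate_by(history, 1)   # the "internal" reading
by_station = rate_by(history, 0)  # the same facts indexed outside the worker

print(by_worker)
print(by_station)
```

In this invented data, both occupants of the carving bench look "injury prone" when the records are indexed by worker, yet the rate stays with the bench after the replacement. That pattern would at least suggest looking at the setting before the person, although a real analysis would need far more data than four rows.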
Attribution - Assigned Versus Attached
I was fascinated when I first learned of the practice of distributing fixed costs in management accounting. Variable costs can be directly associated with production - that is to say, the assignment is a multiple of units of production. But fixed costs cannot be handled in the same manner. Fixed costs are assigned for accounting purposes; this makes it possible to influence profitability through assignment. I can make a profitable department appear to be much less so by assigning costs. Similarly in relation to performance, aspects of production could be regarded as either variable (direct) or fixed (indirect). I would say that it is customary to distribute units of production to those closest to the process of production. As such, a manager might be considered relatively unproductive given his or her disconnection from production. It seems questionable to attempt to "assign productivity" in order to give the manager credit. Within the context of production, there is usually a presumption of internal performance (direct): any assignments would normally be attributable to an employee. If this were not the case, it would be just as valid to assign performance for example to network servers, photocopiers, public transit, all of which contribute to production albeit externally (indirectly). The act of assigning performance specifically to a person disassociates production generally from everything else. This is a way of making sense of the world; determining how to direct resources; legitimizing compensation levels; and of course holding individuals accountable for performance.
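The effect of assigning fixed costs can be shown with a small worked example. This is only a sketch under assumed figures: the two departments, their revenues and unit costs, the total fixed cost, and the allocation weights are all hypothetical, and the helpers allocate and profits are names I made up for the illustration. Nothing changes between the two runs except the rule used to assign the fixed cost.

```python
# Hypothetical departments: revenue, variable cost per unit, units produced.
departments = {
    "A": {"revenue": 100_000, "unit_cost": 40, "units": 1_000},
    "B": {"revenue": 60_000, "unit_cost": 30, "units": 1_000},
}
TOTAL_FIXED = 50_000  # hypothetical fixed cost shared by both departments

def allocate(total_fixed, weights):
    """Split a fixed cost across departments in proportion to the weights."""
    total_weight = sum(weights.values())
    return {name: total_fixed * w / total_weight for name, w in weights.items()}

def profits(allocation):
    """Profit = revenue - (variable cost per unit * units) - assigned fixed cost."""
    return {
        name: d["revenue"] - d["unit_cost"] * d["units"] - allocation[name]
        for name, d in departments.items()
    }

# Rule 1: split the fixed cost equally.
equal_split = profits(allocate(TOTAL_FIXED, {"A": 1, "B": 1}))
# Rule 2: assign nine-tenths of the fixed cost to department A.
skewed_split = profits(allocate(TOTAL_FIXED, {"A": 9, "B": 1}))

print(equal_split)   # A: 35000.0, B: 5000.0
print(skewed_split)  # A: 15000.0, B: 25000.0
```

Total profit is 40,000 under both rules. Only the apparent ranking of the departments changes, which is the sense in which profitability can be influenced through assignment.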
I would compare assignment from a coding standpoint to indexing. It is possible for example to index production data in relation to employees by using their names or identification numbers. The concept of indexing makes sense in relation to tables. The structure holding the data might also be characterized as a folder, although we have to remind ourselves that such folders are not actual folders but symbols that at some point can be reduced to streams of 0s and 1s. In the image below - to the left - I assign production amounts to Rick. There might be a table indexing the names of employees with columns for X, Y, and Z. Or the data could be loosely formatted as text and presented as a webpage or script; tightly formatted and expressed as an object; offered as ordinary sentences and phrased in the vernacular. Conveyance can take all sorts of forms. Nonetheless, the assertion of facts is premised on internal association and therefore assignment of production data to the individual employee. In contrast, it is possible to make Rick's production data on par with other coincidental facts by indexing "outside the employee" as shown on the right-side image under "Attachment." In this case, the production setting is not conceptually internalized to Rick but coincidental. This alternate approach has a number of important implications: 1) an unlimited amount of data can be assembled as part of the coincidence since there is no implied internalization (no need to fret over how a photocopier can be internally assigned to Rick's production); and 2) although I hate harping on this point so frequently in my blogs, a relational database approach seems rather inappropriate since there can be such a large number of incidental facts.
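The contrast between assignment and attachment can be sketched in a few lines of Python. This is one possible reading of the two images rather than a definitive implementation. Rick and the columns X, Y, and Z come from the description above; the second employee, the server and photocopier facts, the individual amounts, and the helper total_by are all hypothetical additions for the sake of illustration.

```python
from collections import defaultdict

# Assignment: production is indexed by the employee, with fixed columns.
assigned = {"Rick": {"X": 12, "Y": 7, "Z": 3}}

# Attachment: each record is a bundle of coincidental facts. The employee
# is just one fact among many, and records need not share the same fields.
attached = [
    {"employee": "Rick", "X": 5, "server": "slow", "photocopier": "jammed"},
    {"employee": "Rick", "X": 7, "server": "normal"},
    {"employee": "Dana", "X": 4, "server": "slow", "transit": "delayed"},
    {"employee": "Dana", "X": 8, "server": "normal"},
]

def total_by(records, key, metric="X"):
    """Sum a metric over the records, grouped by any fact they happen to carry."""
    totals = defaultdict(int)
    for record in records:
        if key in record:
            totals[record[key]] += record.get(metric, 0)
    return dict(totals)

by_employee = total_by(attached, "employee")  # the familiar internal view
by_server = total_by(attached, "server")      # the same facts, indexed externally

print(by_employee)  # {'Rick': 12, 'Dana': 12}
print(by_server)    # {'slow': 9, 'normal': 15}
```

The employee totals can still be recovered from the attached records, so nothing is lost for performance purposes. The difference is that any other coincidental fact can serve as the index, and new kinds of facts can be added without redesigning a table.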
Any discussion about structural appropriateness should extend from logistics rather than the "apparent" structure of the data. Using a method of attachment, it becomes possible to rebalance the contributing impacts of employees and the external antecedents in the environment without compromising the integrity of employee metrics. I recognize that performance metrics enable the management of resources. I therefore do not dispute the partitioning of facts to fit or conform to this highly instrumental reality. However, except in simple production scenarios, I would argue that great emphasis on internal association can lead to our loss of understanding of the underlying dynamics of external antecedents and determinants. What might seem like a highly mathematical concern on the surface begins its existence as an ontological and transpositional framework - later entrenched in code to persistently define and impose on the meaning of the data. I won't be elaborating on coding or logistical issues in this particular blog; but I want to emphasize that internal disassociation definitely has corresponding analogs within the context of code and the day-to-day administration of information.
Practical Applications and Problems
Internalizing production might involve counting the number of units produced. However, if one concentrates on the client rather than the worker, there might be different perspectives attached to production. For example, the fact that an agent handled a large number of clients doesn't mean that those clients are satisfied with the experience. The data generated by operations depends on the "placement" of metrics. Placement yielding some level of clarity is likely to reduce visibility in other areas. Measurement is a strategic management decision. This means that a lowly number-cruncher is merely part of the apparatus reinforcing a particular line of reasoning; and all of his or her observations add to the decor. An agent at a call centre handling relatively few calls might generate the most interest for high-end specialty products: perhaps the company's mass-market products have deep-discount pricing. Sensitizing the metrics to high call volume could impair profitability. Consequently, alongside internal disassociation is the threat of association by poor metrics. I would argue that the latter contributes to the former - and that the former brings about the latter. Suffice it to say that companies continue to struggle with the issue of performance metrics. While data science has concentrated on the analysis - something a computer can do - there is actually a need for stronger insights on strategic development.
I didn't want to trouble readers near the beginning of the blog, but let us say for the sake of argument that the level of association between the internal and external is rather fluid. Sometimes people have direct control - sometimes not at all - and this condition is constantly changing. I go to work every day - or at least I try. I have little control over the amount of daily business that I receive. A sales person might be able to generate sales assuming customers enter the store; but just how many people enter each day is difficult to internalize. How then does one truly account for internally- or externally-conceived sales - if the relationship is floating? Mediocre sales people might be richly rewarded during boom times - talented sales people unfairly penalized during austerity. A company built on performance misconceptions seems likely to both grow and shrink incidentally. It is like deadwood drifting over the water and rolling with the tides - trying to find a lucky break but not genuinely adapting. To do anything deliberately, it is necessary to have data that can embody its phenomena. But as data as a resource becomes more abundant and little thought is given to its actual meaning, it can alienate both people and organizations. Moreover, the continued use of primitive associative methods can counteract the effectiveness of data as an asset, imperiling not just the value of the data but the benefits of any decisions extending from this important resource.
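The boom-versus-austerity problem can be illustrated with a rough sketch. The figures below are invented, and so is the simplifying assumption that store traffic is wholly external to the sales person while the conversion of visitors into sales is wholly internal. In reality the split floats, which is the whole difficulty; the sketch only shows how much the ranking depends on which metric is chosen.

```python
# Hypothetical records: one sales person observed in a boom, another in austerity.
records = [
    {"person": "P1", "period": "boom", "visitors": 200, "sales": 20},
    {"person": "P2", "period": "austerity", "visitors": 50, "sales": 15},
]

def conversion(record):
    """Share of store visitors converted into sales (assumed to be 'internal')."""
    return record["sales"] / record["visitors"]

# Raw sales absorb the external traffic; conversion sets it aside.
top_by_sales = max(records, key=lambda r: r["sales"])["person"]
top_by_conversion = max(records, key=conversion)["person"]

for r in records:
    print(r["person"], r["period"], r["sales"], f"{conversion(r):.0%}")
print("Top by raw sales:", top_by_sales)
print("Top by conversion:", top_by_conversion)
```

On raw sales the boom-time employee comes out ahead; on conversion the austerity-time employee does. Neither number is the "true" measure of performance, since conversion can also be shaped by external factors such as pricing or the mix of customers who happen to enter the store.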