
About a month ago, I posted a blog on “Technical Deconstruction.” I described this as a technique to break down aggregate data to distinguish between its contributing parts: these parts might contain unique characteristics compared to the aggregate.  For instance, I suggested that it can be helpful to break down data by workday - that is to say, maintaining separate data for each day of the week.  I said that the data could be further deconstructed perhaps by time period and employee: the basic idea is that trends based on the aggregate conceal details that might be important.  There might be an employee - such as the manager - whose behaviours are quite different from others in the department.  With all of this deconstruction, how does a person go about deciding what is important - given that the division of data into smaller parts seems likely to take some effort?

It is reasonable to expect less volatility when data is deconstructed, for example, by employee: this is because an employee tends to perform within his or her abilities, which generally do not fluctuate dramatically from day to day.  Dividing data by workday on the other hand, assuming the same employees work day after day, seems likely to also measure changes in the market.  Arguably, examining metrics can lead to information about the market, the skills of those interacting with it, and the impacts of the operations on the organization.  In short, there are many metrics to study - and an abundance of contributing elements.
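As a rough illustration of deconstruction, the sketch below splits one small set of production records by workday and by employee. The records, the employee names, and the helper `deconstruct` are all hypothetical; this is only one simple way to arrange the data, not a prescribed method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical production records: (workday, employee, units produced).
records = [
    ("Mon", "Sam", 52), ("Mon", "Ann", 48),
    ("Tue", "Sam", 55), ("Tue", "Ann", 47),
    ("Wed", "Sam", 50), ("Wed", "Ann", 41),
    ("Thu", "Sam", 53), ("Thu", "Ann", 45),
    ("Fri", "Sam", 51), ("Fri", "Ann", 39),
]

def deconstruct(records, key_index):
    """Group units by the chosen field and return the mean of each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: mean(values) for key, values in groups.items()}

# The aggregate hides whatever differs between the parts.
aggregate_mean = mean(units for _, _, units in records)
by_workday = deconstruct(records, 0)   # one series per day of the week
by_employee = deconstruct(records, 1)  # one series per employee

print("aggregate:", aggregate_mean)
print("by workday:", by_workday)
print("by employee:", by_employee)
```

The same records feed every view; only the grouping key changes, which is why adding another level of deconstruction (for instance, time period) is mostly a matter of adding another field.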

I suggest that the maintenance of deconstructed data resources is an ongoing process.  It is not meant to prove a thesis - where the need for proof would end at some point - but rather to assist in the process of management.  An employee found to be exceptionally productive during an initial investigation might not remain so.  There is a need to constantly follow his or her progress.  “Sam is productive” is not a compelling thesis.  However, establishing the relevance of evaluating data by employee - before proceeding to do so - is certainly worthwhile.  “Would it be useful to consider how the prioritization system affects production outcomes?” is a more engaging question.  In response, I created “technical boundary analysis,” which is meant to quickly test metrics for “signals” of potential relationships.  Technical boundary analysis involves obtaining guidance using a combination of technical and event data.  It is an approach that would likely attract people who spend a lot of time with tables - assuming they maintain or have the capacity to generate both types of data.

My use of the term “events” is specialized.  I borrow the term from the Java programming language.  I mean it in the same way that the language uses it.  I do not mean a “social event” but rather the occurrence of something - perhaps an exceptional event - or an incident that fits particular criteria.  An event can bring about a particular response - thereby giving it the characteristics of a control mechanism or trigger.  Consider interpreting its meaning loosely as “something that might be important.”  Not all companies have a tolerance for this type of data.  It should be apparent that an event can be defined or constructed by the company - e.g. by management.  Or, it can be brought about by the client.  An event can be positioned to reflect the client’s narrative - or it can be commandeered by the organization.  Events are not quantitative - although they might be expressed using quantities.

The chart below was originally introduced in my blog on technical deconstruction.  It shows a simulation of many days of production.  I applied controls to the data: production on Monday and Tuesday is trending up; on Wednesday and Friday it is trending down.  Data on Thursday is in between - i.e. not being pushed up or down but allowed to fluctuate from its original point.  The influence of these controls might not be obvious from the aggregate (this was my whole reason for entering into the discussion about technical deconstruction).
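A simulation along these lines could be sketched as follows. The drift values, noise level, starting point, and number of weeks are assumptions chosen for illustration, as are the names `simulate` and `mean_by_day`; the original simulation's parameters are not given here.

```python
import random

WORKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]
# Assumed weekly drift per workday: Mon/Tue pushed up, Wed/Fri pushed down,
# Thu left to fluctuate around its starting point.
DRIFT = {"Mon": 0.5, "Tue": 0.5, "Wed": -0.5, "Thu": 0.0, "Fri": -0.5}

def simulate(weeks=52, start=100.0, noise=3.0, seed=1):
    """Return a list of (week, workday, production) tuples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rows = []
    for week in range(weeks):
        for day in WORKDAYS:
            production = start + DRIFT[day] * week + rng.gauss(0.0, noise)
            rows.append((week, day, production))
    return rows

def mean_by_day(rows):
    """Average production for each workday across all simulated weeks."""
    totals = {day: 0.0 for day in WORKDAYS}
    for _, day, production in rows:
        totals[day] += production
    weeks = len(rows) // len(WORKDAYS)
    return {day: total / weeks for day, total in totals.items()}

rows = simulate()
aggregate_mean = sum(p for _, _, p in rows) / len(rows)
print("aggregate mean:", round(aggregate_mean, 1))
print({day: round(value, 1) for day, value in mean_by_day(rows).items()})
```

In this sketch the assumed drifts cancel one another, so the aggregate stays close to its starting level even though four of the five workdays are trending. That is the situation in which the controls are hard to see without deconstruction.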

Boundary analysis begins from this point.  The next step isn’t necessary if the interpreter is a machine.  However, to help me visualize or conceptualize the placement of events - or to help me explain boundary analysis to other people - I sort the data from lowest to highest production as shown below.  It then becomes possible to examine which events fall above or below certain predefined boundaries.  It is reasonable to assert that the events near the top might be associated with superior performance; those near the bottom contribute in some way to inferior results.

The next chart shows the distribution of the workdays above and below the boundaries.  It reflects the underlying design of the simulation as I explained earlier:  the best production occurs on Monday and Tuesday; worst on Wednesday and Friday; and the in-between results are on Thursday.
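The sorting and counting steps can be sketched in a few lines. The fifteen records below are hypothetical, and placing the boundaries at the bottom third and top third of the ranked data is an assumption made for the example; any predefined boundaries could be substituted.

```python
from collections import Counter

# Hypothetical (workday, production) records covering three weeks.
records = [
    ("Mon", 102), ("Tue", 101), ("Wed", 99), ("Thu", 100), ("Fri", 98),
    ("Mon", 106), ("Tue", 105), ("Wed", 95), ("Thu", 103), ("Fri", 94),
    ("Mon", 110), ("Tue", 109), ("Wed", 91), ("Thu", 97), ("Fri", 90),
]

def boundary_counts(records):
    """Sort by production, then count workdays below and above the boundaries."""
    ranked = sorted(records, key=lambda record: record[1])  # lowest to highest
    third = len(ranked) // 3  # assumed boundary placement: bottom and top thirds
    below = Counter(day for day, _ in ranked[:third])
    above = Counter(day for day, _ in ranked[len(ranked) - third:])
    return below, above

below, above = boundary_counts(records)
print("below lower boundary:", dict(below))
print("above upper boundary:", dict(above))
```

With these made-up numbers, Monday and Tuesday account for four of the five records above the upper boundary, Wednesday and Friday for four of the five below the lower boundary, and Thursday appears once on each side.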

Finally, I present a simplification of the outcomes: I call this chart “the split.”  Boundary analysis is fast, simple, and easy to understand.  However, I use it primarily to obtain signals to determine what needs to be studied in greater detail - probably using more conventional quantitative techniques.  In this particular example, it seems clear that the workday has some influence over production.  The phenomenon should be studied further in order to bring to light the nature of the influence.

In more abstract terms, the day of the week as data is a condition bringing about a symbol or object reference.  It is necessary to determine whether something exists in order to attach object references; these can then be studied for their technical relevance.  The study of existence, how things come to exist, how things gain relevance and become recognized to exist is called “ontology.”  I define four basic types of events: 1) inductive - bringing about the metrics; 2) reactive - only responding to the metrics; 3) conducive - enabling the metrics; and 4) constructive - involved in the metrics indirectly.  The day of the week for instance is constructive.  It is unlikely that the day of the week per se affects production.  There is “something about” the days involved.

One of my favourite event objects involves time distribution conditionals: e.g. “A > C && B > C” . . . or “Steve spending more time on A than C (and) more time on B than C.”  Different metrics can be examined against these event objects: units of product X; percentage of units of product Y having conformity violations; number of defective units of product Z.  It is possible to obtain “signals” to roughly determine how Steve’s involvement at different stations seems to affect aspects of production.  (Note that I don’t use exact times as event objects since these would likely be reactive.  Workflow placement on the other hand is more inductive.  In any case, this is part of the discourse.)  Obtaining magnitudes would require more detailed analysis: “Steve’s time distribution has an effect - but how much of an effect?” is a question that requires further investigation.
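A time distribution conditional can be expressed as a simple predicate and then checked against a metric. In the sketch below the daily records, the field names, and the use of the mean of product X as a single boundary are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical daily records for Steve: hours at stations A, B and C,
# plus units of product X produced that day.
days = [
    {"A": 4.0, "B": 3.0, "C": 1.0, "units_x": 120},
    {"A": 2.0, "B": 2.0, "C": 4.0, "units_x": 95},
    {"A": 3.5, "B": 3.0, "C": 1.5, "units_x": 118},
    {"A": 1.0, "B": 4.0, "C": 3.0, "units_x": 101},
    {"A": 3.0, "B": 4.0, "C": 1.0, "units_x": 125},
    {"A": 2.5, "B": 1.5, "C": 4.0, "units_x": 92},
]

def event_a_and_b_over_c(day):
    """Event object for the conditional A > C && B > C."""
    return day["A"] > day["C"] and day["B"] > day["C"]

def event_signal(days, event, metric, boundary):
    """Count event occurrences above versus at or below a metric boundary."""
    above = sum(1 for d in days if event(d) and d[metric] > boundary)
    below = sum(1 for d in days if event(d) and d[metric] <= boundary)
    return above, below

boundary = mean(d["units_x"] for d in days)  # assumed boundary: the mean
above, below = event_signal(days, event_a_and_b_over_c, "units_x", boundary)
print("event above boundary:", above, "| event at or below boundary:", below)
```

The counts only suggest a direction. They say nothing about magnitude, and swapping `"units_x"` for another metric (a defect count or a conformity percentage, for example) leaves the rest of the sketch unchanged.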


Because the objective is to obtain signals, it isn’t necessary to fret over small details such as the nature of the metric: for instance, events can be attached to percentages, ratios, composites, scores, grades, weights, and sums.  The signals are not impaired by the nature of the metric.  On the other hand, a person can get creative with the event objects without complicating the analysis.  For me, technical boundary analysis suggests how to deconstruct the technicals to obtain details that are more relevant to operations.  After I obtain interesting signals, I check for more convincing evidence.  Obtaining guidance is fast.  Getting proof takes longer.  Boundary analysis is part of a process of discovery.



© 2018   Data Science Central™   Powered by
