Embodiment is comparable to the idea of an “ecosystemic” or “holistic” approach. In an ecosystem, each thing affects everything else. In light of this interrelationship, a person would not attempt to correct a problem by considering only a single piece of the puzzle. Instead, there is a need to bring together many aspects of the body. To understand embodiment, it is necessary to recognize how “the body” separates an organism from its environment; in a manner of speaking, the body represents the system through which existence occurs. It is a line of defence against non-existence. It provides a buffer. Life buffers us from death. Without a body, there is no means of transition but only direct contact. I have said that data tends to be rather disembodied. Sales figures have few connections to the outside world; drastic declines might be difficult to explain until after the fact. People love a product until they hate it. People live forever until they die. The world loves Americans until an act of terrorism. There is no lack of data. It is just that so much of the data is disembodied, reflecting its intended use. The data that we collect is almost never meant to hold anything beyond what we would like to know. I believe that data embodiment is a step towards true understanding.
In order to achieve data-embodiment, giving shape to the system of the data, it is necessary to perceive the system within the context of its changing environment. The tendency will be to yank out a stream of data in a linear form as if this were the only way data can be handled. It might be the only way data can be comfortably handled by people. This is not the case when processing is done by computer. I would say that the interaction with environmental change is easiest to understand through the use of an index, ranking, ordered scale, procedural pattern, or fractal. The system can be shaped through event distribution, a separate and rather detailed topic. I am going to use some random data just to demonstrate the dynamics, although a bit later I will try to include some real data. Assume that an organization has 200 events available. The organization can distribute these events each day to delineate its existence in the local environment. A public school might distribute a few hundred events to note incidents of bullying. An event selection might include the following: stranger noticed in building; teacher absent; unauthorized vehicle at perimeter; assembly of children at bad location; poor lighting; gang members spotted. In terms of the scale or gradient, assume 20 divisions: at 0, one extreme, there is no bullying; at 19, the other extreme, there is violence and bullying with multiple perpetrators. Each day, the events are distributed under the applicable gradient measurements.
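The random setup described above can be sketched in a few lines of Python. This is only a minimal illustration, not the program used for the illustrations in this blog: the 200 events and 20 gradient divisions come from the text, while the function names, the number of events invoked per day, and the uniform random choices are assumptions made for the example.

```python
import random

N_EVENTS = 200     # size of the event pool, as in the text
N_DIVISIONS = 20   # gradient divisions 0..19, as in the text


def simulate_days(n_days, events_per_day=30, seed=0):
    """Return one (gradient, events) pair per simulated day.

    events_per_day and the uniform random draws are assumptions
    made only for this illustration.
    """
    rng = random.Random(seed)
    days = []
    for _ in range(n_days):
        gradient = rng.randrange(N_DIVISIONS)  # the day's measurement, 0..19
        events = set(rng.sample(range(1, N_EVENTS + 1), events_per_day))
        days.append((gradient, events))
    return days


def tally_by_gradient(days):
    """Count how often each event was invoked at each gradient value."""
    counts = {}
    for gradient, events in days:
        for event in events:
            row = counts.setdefault(event, [0] * N_DIVISIONS)
            row[gradient] += 1
    return counts
```

Calling `tally_by_gradient(simulate_days(100))` gives, for every event that was invoked at least once, a row of 20 counts: one plausible way to hold events "distributed under the applicable gradient measurements."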
Since I have chosen to discuss a random sample at this time, I would expect the “Crosswave Differential” to be somewhere near or at the middle of 0 and 19. A Crosswave pattern (XW) is established by how the events interact with the gradient: as one goes up the gradient, fewer events will exceed the set point, and more events will fall below it. Intersection occurs when there is a balance of events on either side of the set point. On the other hand, a Crosswave Differential (XD) is created when there are two sets of measurements, as in the case of “treated” and “non-treated” events. This does not mean the event itself is treated; it means that the event has been invoked in order to signify treatment. Strictly speaking, the absence of a treated event does not necessarily mean the absence of treatment. One cannot safely infer details in the absence of data using the event-distribution method. Only the invocation of an event that explicitly signifies non-treatment can confirm that treatment never occurred. It might be difficult to appreciate the logic in this until a person has to deal with an actual problem scenario. I believe that the illustrations below provide a reasonably good explanation of the differential.
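For readers who prefer code to pictures, here is one possible reading of the Crosswave idea, offered as a sketch rather than the actual calculation behind the illustrations. It assumes that the XW location is the set point where the count of measurements above and the count below are closest to equal, and that the XD is the treated location minus the non-treated location. Both the function names and the sign convention are assumptions.

```python
def crosswave(gradients, divisions=20):
    """Return the set point where measurements above and below balance.

    gradients is a list of gradient values (0..divisions-1), one per
    invoked event. Ties go to the lowest set point (an assumption).
    """
    best_point, best_gap = 0, None
    for set_point in range(divisions):
        below = sum(1 for g in gradients if g < set_point)
        above = sum(1 for g in gradients if g > set_point)
        gap = abs(above - below)
        if best_gap is None or gap < best_gap:
            best_point, best_gap = set_point, gap
    return best_point


def crosswave_differential(treated, untreated, divisions=20):
    """XD as treated XW minus non-treated XW (sign is an assumption)."""
    return crosswave(treated, divisions) - crosswave(untreated, divisions)
```

Under this reading, two groups drawn from the same random pool should land on roughly the same set point, giving a differential at or near 0; a negative value would mean the treated group balances closer to the low end of the gradient.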
The next illustration is from the actual simulated data. I selected a handful of events to signify “treated.” Both the treated and non-treated events result in the same general XW location, resulting in an XD of 0. In order to improve visibility, I left out some details such as the bars and gradient values. The XW locations are about the same when I randomly select events from a body of random events, which is my main point. (TO means treated optimal; TS treated sub-optimal; UO non-treated optimal; and US non-treated sub-optimal.)
Ad nauseam, as Mr. Howard, my high school math teacher, used to say, we reach the midway point for all these illustrations. I now introduce another type of illustration that I call the “Push-Pull Analysis” (PPA). Without much preamble, I feel that an objective person would find that the treated events have little bias either to their right or left. (Although this is not necessarily the case for other monitors, in this simulation the left represents the least amount of bullying while the right is the most.)
I can’t say that I have a strong attachment to numbers. That is why I write computer programs to do all of the work for me. All right, so now I will make a “non-random” selection of events. As part of a diligent risk-management program to minimize bullying in the school, assume that I have been asked to systematically identify those events that might contribute to bullying. I don’t know what these events will be ahead of time. I only know them by their XD score. I will pick a number of events with extreme differential values and then use the selection to define my next treated group. The previous events in the treated group were as follows: e10, e20, e30, e40, e50, e60, e70, e80, e90, and e100.
The “non-random” event choices are e20, e148, e47, e171, e158, e43, e167, e4, e192, and e132. Usually the resulting illustrations are not as clear-cut as those shown below. I feel that in real life, changing events might not necessarily lead to any particular outcome. However, the rationale is to target the most pertinent events and then use actual field results to confirm effectiveness. An event might indeed be entirely random – as is the case for each event here – meaning that the outcome should not necessarily change even if I change the events. An intervention is a type of event where the outcome can be incorporated into the metrics. I feel that an act of intervention combined with its control-effect does more to suggest effectiveness than ineffectiveness, but I would agree that there is much room for debate. An event can be as simple as “patrolled hallway between 2:30 – 3:00 PM.” Below I show that the XD is no longer 0; the point of equilibrium for the treated events is now further to the left (closer to 0). The objective is to try to push the pattern towards the desired direction.
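The selection step can be sketched as follows. This is an assumed procedure, not necessarily how the events above were chosen: each event is scored by comparing the days on which it was invoked against the days on which it was not, and the events with the most negative scores (those whose balance point sits furthest to the left) are kept. The `(gradient, events)` day format and all function names are hypothetical.

```python
def crosswave(gradients, divisions=20):
    """Set point where counts above and below are closest to equal."""
    return min(
        range(divisions),
        key=lambda s: abs(sum(g > s for g in gradients)
                          - sum(g < s for g in gradients)),
    )


def event_differentials(days, n_events=200, divisions=20):
    """Score each event: XW of days with it minus XW of days without it.

    days is a list of (gradient, events) pairs; events is a set of ids.
    Events that appear on every day or on no day are skipped.
    """
    scores = {}
    for event in range(1, n_events + 1):
        with_e = [g for g, evs in days if event in evs]
        without_e = [g for g, evs in days if event not in evs]
        if with_e and without_e:
            scores[event] = (crosswave(with_e, divisions)
                             - crosswave(without_e, divisions))
    return scores


def pick_most_negative(scores, k=10):
    """Return the k event ids whose differential is furthest to the left."""
    return sorted(scores, key=lambda e: scores[e])[:k]
```

With purely random data, the events picked this way are simply the ones that happened to line up with low-gradient days, which is why field results are still needed to confirm that a selection means anything.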
The next illustration is from the same non-random selection. The bars are rather flawlessly biased, which I must emphasize is extremely peculiar. I expect some bias, just nothing quite this clean. I wonder if I should show a second random selection where in fact I doctored the algorithm to be a bit biased for specific events; this would probably better reflect real-life conditions where events simply aren’t distributed randomly. For instance, there might be existing security protocols to prevent bullying and violence. The existence of events (that is to say, their distribution) can therefore be entirely deliberate, as is the case for an intervention event. I will ask readers to take my word for it: even the non-random distribution of events leads to similar dynamics. When events are deliberate rather than random, just by “chance” sometimes the random event might seem superior! Make use of statistical tools to examine the situation further. In a worst-case scenario, learn by trial-and-error. I should also mention that in real life, I suspect that the reality is probably dynamic. People will adapt to each other. What works for a little while might cease to do so. There is a perpetual need to gather information, study, strategize, and take action.
All of that effort has led to the successful identification of events that seem to reduce bullying, as indicated by the distribution above; but this is not to say that taking action in real life will actually lead to a reduction. In fact, as I mentioned earlier, the data being entirely fabricated, it is possible to target events that are purely random in nature, thus rendering the selection superfluous. So what is a person to do? Well, if bullying were a game of chance, then it would indeed make no sense to try to control events. If however bullying were merely quite difficult to understand and control due to its complexity, then I genuinely feel that an embodied approach gives us the best chance of dealing with the problem. The knowledge would persist in an open framework. Every past strategy that has succeeded or failed is retained along with the context of both intervention and outcome. I therefore suggest that, even in the event of intervention failure and irrespective of the decision-making regime, data embodiment represents an excellent approach to record-keeping. We gain a more robust understanding of implementation problems; this can help organizations better learn from past experiences.
All right, before I end this blog I will go ahead and introduce some data from my own records. I keep track of how well I sleep. My assessment is entirely phenomenological although anchored. “Phenomenological” as it applies to the data in this case means that I make a qualitative assessment based on the reality as I perceive it rather than through a method that is objectively verifiable by somebody else. Not all my monitoring programs are based on such gradients. For instance, I also keep track of my weight, which is not a matter of perception but confirmation. When I say “anchored,” this means that I try to maintain consistency by having firm reference points. So I often ask myself whether or not a particular reference is applicable. In the left image, there are 54 samples in the untreated, becoming 164 in the right image; there are 79 treated samples becoming 140 in the right. Based on all of this, it seems that the first image is for the first 133 days of data (54 + 79); the second is for the first 304 days of data (164 + 140). I think that the treatment product is pretty familiar to most people. However, not being a doctor, I will exclude the name of the product from this blog. Let’s just say it is a legal product that has been around for hundreds of years with a good reputation, apparently still used by some people for their headaches.
My sleeping is normally between “better” and “super.” The software that I created for my record-keeping, which I call Tendril, has the ability to incorporate lag into the analysis. This essentially means that it can connect what I did today with what happens tomorrow. I just want to point out a few things from the images. The first thing I want to point out is how I have images pertaining to my personal health, which is really a splendid thing. I also use Tendril to keep track of my exercise routine and driving routes. It makes life much easier being able to put all of the experiences into a safe and accessible place. An interesting idea about “lag” is how benefit can persist over a period of time after treatment. The images show from left to right a slight shift towards super for the untreated samples. I can’t say that the shift is entirely about the product in question; in fact, the shift is probably about a lot of things. I also find that the differential can narrow as a result of rationalization: that is to say, as I apply treatment over a longer period of time, the apparent benefits might decline as I encounter different circumstances.
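The idea of lag can be illustrated with a small sketch. This is not Tendril's code; it only assumes that each day has a treated/untreated flag and a numeric sleep rating, and that a lag of 1 pairs today's flag with tomorrow's rating. The function name and the integer ratings are hypothetical.

```python
def split_with_lag(treated_flags, outcomes, lag=1):
    """Split outcomes into treated and untreated groups using a lag.

    The flag on day t is paired with the outcome on day t + lag, so
    the last `lag` flags and the first `lag` outcomes go unused.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(treated_flags) != len(outcomes):
        raise ValueError("one flag and one outcome are needed per day")
    treated, untreated = [], []
    for flag, outcome in zip(treated_flags, outcomes[lag:]):
        (treated if flag else untreated).append(outcome)
    return treated, untreated


# Four days of made-up records: was the day treated, and a sleep rating.
flags = [True, False, True, False]
ratings = [3, 4, 2, 5]
treated, untreated = split_with_lag(flags, ratings, lag=1)
```

Here the rating of day 2 is attributed to the treatment on day 1, and so on; increasing the lag is one way to check whether an apparent benefit persists for several days after treatment.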
By no means am I saying that this product helps people to sleep. I am saying that for me it has been associated with improved sleep perceptions. I neither encourage nor discourage people from using such a product or any product. Nor should people consider taking this or any product without talking to a doctor. I just want people to understand that I am not, nor do I claim to be, a medical professional. Please do not misinterpret this post as any type of medical advice. This is a blog about data science focused on the handling of data and certainly not on the treatment of disease. Just to emphasize the point, I close the blog with one final image from my “curling” below. In Canada, curling is a competitive sport. I understand that we have among the best curlers in the world. However, when I say curling for me I mean lifting 10-pound weights, which has likewise been associated with improved sleep perceptions for me. I would like to mention that in fact many of the more promising “events” seem to be exercise related, which makes perfect sense given that I work under conditions of limited mobility. During a long journey, given the right tools, I imagine that a person can build up a substantial pool of useful data through embodiment. Data-embodiment is about understanding the context of problems concealed in the murky fens of facts and details. How people might choose to deal with such insights is a separate issue.