My favourite explanation of the "butterfly effect" so far is as follows: under particular conditions, even the tiniest movements of a butterfly can trigger storms and hurricanes. This principle is not limited to butterflies, of course. I think that many of us face pivotal moments in life that leave lasting effects. Perhaps no different than other students, I remember running out of cash during my undergraduate years. I consider this my personal butterfly moment. I had no money for food. I couldn't cover the next payment for rent. My old Skylark was running on vapours. I lived away from my parents, who were also out of the country at the time. I really thought that was going to be "the end." However, sitting on my bed when I got to my room was a letter from the federal government containing a cheque for about $980. This educational disbursement, my final grant from the government before completing my studies, was a critical turning point. A wisp of wind could have sent me in a different direction that evening. In this blog, I will be exploring the butterfly effect, although probably not in the manner many would expect. I will focus less on the butterfly and more on the setting that enables the effect. I call this setting the "event threshold" or edge: it forms part of a broader discussion on the potential power contained in massive amounts of data. However, before proceeding to discuss edges, I should probably revisit what I mean by events. I generally use this term in my blogs without much preamble. From my explanation of events, I will progress towards a technique that I use to map out events called "tracing." Then hopefully it will be blue sky for butterflies.
A sentiment that I find frequently expressed in regard to big data is as follows: it is difficult to pin down the exact meaning. I completely agree. However, rather than mull over the exact meaning, I find it important to emphasize its absence (i.e. of meaning) not just in terms of the data system but the data itself. In the science of past centuries, the meaning of any data collected fell within the experimental boundaries: the data either supported or refuted hypotheses. We controlled the outcomes of experiments by imposing on the data. In big data, we face a situation where the data might lack clarity. The data could be dirty: it hasn't been collected experimentally but rather experientially, as part of day-to-day routines. The data might contain all sorts of questionable details, causing its exact meaning to be evasive. Any attempt to externally define the data and impose simplicity over it diminishes its usefulness; we lose sight of the underlying phenomena giving rise to the data. To skim through something complex with simple metrics is to violate its essence, purge its meaning, and subjugate our understanding of reality. People exist in the data system as data. So to herd the data, butcher it, and systematically dismember bits - to disembody it - is to assault a relationship of trust. People participate in decision-making through their data. "Institutional perspectives" will fail us in relation to big data: these are perceptions that assume authoritative understanding. Institutions for me include the science of past centuries and its methodologies: the focus has been on rendering a response or verdict rather than embracing the meaning of the underlying phenomena. I suggest that we need different ways to detect, rationalize, and describe phenomena through big data.
Throwing Events Rather than Taking Measurements
When most people take measurements, they have a standard of measurement against which to evaluate the things being measured. For instance, a yard stick or a meter stick measures distance. Expressed a bit differently, a person uses such a device to obtain measurements of distance. Distance isn't necessarily part of the object; rather, distance is what a person gains through the act of measuring. I hope that isn't too confusing. Try measuring poverty using a yard stick. Phenomena don't necessarily have distance. Through the use of a measuring stick, it is possible to acquire readings of distance and nothing but distance. A basis of measurement can represent an external imposition over the intrinsic reality of an object. A pervasive metric that exists in western society is "cost." It is possible to measure a great many things on the basis of cost. A married couple can evaluate their children by acquiring information about costs associated with raising offspring: e.g. food, clothing, medical care, supervision, and education. A measurement is normally performed when the person doing the measuring has an expectation or preconception in relation to the thing being measured; further, the measurement normally serves to satisfy specific instrumental needs. Arguably, the act of taking measurements is itself rather instrumental in nature. On a spreadsheet containing sales figures, one does not enter tidal temperatures. We are more interested in extracting what we want and expect to gain from the data than allowing the phenomena to freely offer what they have to give or say. As such, our understanding is limited by the social construction of our desires, expectations, and plans for the data.
An "event" is conceptually different from a measurement. When I start throwing events, it is because I don't already know what to expect from the phenomena. If I don't know what poverty is about or why it occurs, it can be detrimental to knowledge for me to make use of measurements that limit the expression of the phenomena. If I don't know why people are sick so often in an office, then applying a cost or benefit analysis limits or steers my understanding of the situation to these metrics. The "metrics of criteria" can obscure the nature of phenomena since these metrics were never intended to explore the nature of anything. They are only meant to adhere to our prescriptive needs: to affirm or dispute what is being sought but not to undermine the broader assertions. (I have some examples of this perhaps to share in other blogs.) To understand the "nature" of things, it is necessary to use the "metrics of phenomena." When animals are caged in a zoo, they do not necessarily behave naturally. When information is forced into prescriptive boundaries, it likewise might not behave as it would under normal circumstances. "Tracing" involves distributing events that create delineations within data. I will be using some personal examples in just a moment. However, at this time I just want to emphasize that delineation is not about imposing what I think I know; rather, it helps me understand what I don't know or what might ultimately be beyond my full grasp. In the Java programming language, exceptions are "thrown," often for diagnostic purposes, to help the user come to grips with something that doesn't quite fit or make sense. My use of events is more elaborate. I throw events continuously to determine their conceptual placement in relation to dynamic feedback (as in systems theory).
Around this time last year, I started experimenting with an alpha (i.e. a development prototype intended mostly for research) that contains two relatively unique features: 1) it makes use of data that generally lacks any prescribed format; and 2) it is driven by events rather than measurements. The program holds data in a manner that is fundamentally different from a spreadsheet; in the case of the latter, the format is normally preset, and the main task of the user is to fill fields with the required data. A spreadsheet is well suited to fulfill prescriptive needs and criteria. However, I hoped to use my alpha on many different types of data where I might not know the structure in advance. For instance, in the real-life examples to follow shortly, I use the program to organize personal information. On any particular day, I don't necessarily know everything that I am going to do or everyplace I am going to go. So it seems structurally unsound to use a spreadsheet containing fixed rows and columns to hold rigidly defined data: this almost implies that I know what to expect, and I would only have to count each occurrence. Instead, the prototype retains those events that are relevant. I keep a record of these events not because I already know the outcome. I know far less than the outcome. I don't even know what the data means. I would like to determine how the events relate to me. In contrast, in the financial industry, for example, a company generally knows how its data relates to clients. If I have an account with a company, the transactions are the financial events that are important to me in terms of my relationship. This does not mean that every event associated with a person has relevance that can be clearly defined. It represents something of a departure to be unaware of the relevance of data and yet collect it anyway; this is particularly so given our lack of methodologies to extract meaning.
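To make the contrast with a spreadsheet concrete, here is a minimal sketch of how such schema-free event records might look. This is purely illustrative: the class name `Event`, the field names, and the sample entries are my hypothetical inventions, not the alpha's actual internals.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical sketch: an event carries a date, free-form text, and a
# set of descriptors (keywords) -- no preset rows or columns.
@dataclass
class Event:
    day: date
    text: str
    descriptors: set = field(default_factory=set)

# The event log is just an append-only list; no structure is imposed
# up front, so any kind of event can be recorded as it occurs.
log = []
log.append(Event(date(2014, 5, 1), "30 push-ups", {"exercise"}))
log.append(Event(date(2014, 5, 1), "2 sticks cheddar", {"food"}))
log.append(Event(date(2014, 5, 2), "multivitamin", {"vitamin", "food"}))

# Retrieval happens by descriptor rather than by a fixed column:
exercise_events = [e for e in log if "exercise" in e.descriptors]
```

The point of the sketch is that the "schema" emerges from the descriptors attached to events after the fact, rather than from columns decided in advance.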
I normally associate events with different contexts in a structural sense. A context might be a scale indicating the quality of a product; perhaps a series of steps towards completion; maybe a theme or fractal. One context that I am sure everybody knows in relation to personal data is weight. (In practice, I associate personal data with about forty contextual interests; but of course I check my weight periodically.) This measurement is among the easiest to objectively confirm. I know which contexts are important to me, but I don't know how the events relate to the contexts. It is possible to examine the apparent contribution of each event towards particular contexts. However, I find this approach awkward given that there can be a large number of throwable events. A person can instead handle events as an aggregate - that is to say, as a class of events. In order to group events, I make use of keywords or "descriptors" for sorting purposes. Below is an image for the descriptor "exercise" in relation to a context on the alpha called "xbreathe." I call the visualization a sliding balance or usually just a slider. The xbreathe context is associated with events that lag by a day: this means if I do push-ups or sit-ups on one day, the association would occur on xbreathe the following day. Handling events as an aggregate, it would appear that for me, exercise is associated with improved breathing perceptions for 52.77 percent of the events; decline in perceptions for 8.33 percent of events; and no clear improvement or decline for 38.88 percent of events. However, from the 38.88 percent, 30.55 percent tilts towards decline and 8.33 percent towards improvement. I offer these numbers merely as orientation to help readers understand the nature of the slider. This blog is certainly not about my exercise routine or xbreathe.
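The aggregation behind a slider can be sketched roughly as follows. Everything here is a hypothetical reconstruction under stated assumptions: I assume a context is a series of dated readings (+1 for perceived improvement, -1 for decline, 0 for no clear change), and the toy dates and the `slider` function name are mine, not the alpha's. The figures it produces are toy numbers, not the percentages quoted above.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical context readings keyed by date: +1 = perceived
# improvement, -1 = decline, 0 = no clear change.
context_readings = {
    date(2014, 5, 2): +1,   # the day after an exercise event on May 1
    date(2014, 5, 4): -1,
    date(2014, 5, 6): 0,
}
# Days on which events of one class (e.g. "exercise") were thrown.
event_days = [date(2014, 5, 1), date(2014, 5, 3), date(2014, 5, 5)]

def slider(event_days, readings, lag_days=1):
    """Aggregate a class of events against a lagged context,
    returning the share of events in each slider region."""
    tally = Counter()
    for day in event_days:
        reading = readings.get(day + timedelta(days=lag_days))
        if reading is None:
            continue  # no context reading for that day; skip the event
        if reading > 0:
            tally["improve"] += 1
        elif reading < 0:
            tally["decline"] += 1
        else:
            tally["neutral"] += 1
    total = sum(tally.values())
    # Express each region as a percentage of all matched events.
    return {k: round(100.0 * v / total, 2) for k, v in tally.items()}

regions = slider(event_days, context_readings)
```

The one-day lag mentioned above appears here as the `timedelta` offset: an event is matched not against the same day's reading but against the following day's.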
I can also offer numbers and some general ideas about weight, which as I mentioned is another context that I maintain. I want to remind readers that I am not a medical professional. The sliders are not meant to guide the general public. Indeed, the data is entirely about me in a literal sense. It contains no information about other people. The next three illustrations seem to point to the following generalities about how events relate to my weight (of great interest to many people, I'm sure): 1) taken as a whole, weight is inversely related to events of food consumption; 2) physical activities contribute to weight loss; and 3) taking ordinary food supplements (i.e. excluding weight-loss formulations) hardly affects weight but if anything adds to it. I associated the context xweight with the descriptors "food," "exercise," and "vitamin." (Events can be sorted in different ways: e.g. fruit, protein, fast-food, natural, and so forth. With proper design, events and descriptors can be used to record specific retailers, brands, lots, and product variations.) For me, it seems that exercise hasn't been particularly relevant in weight reduction; but it is still more relevant than food and vitamins. As for exercise, I admit that I tend to do little. Nonetheless, as a class of events, exercise seems to offer a promising lead towards activities that might become more relevant. I present these sliders not so much to inform readers of my weight circumstances but rather to demonstrate how sliders might be used to examine different aspects of phenomena.
The prototype can generate lists showing the events for each area on the slider. For the events associated with blood pressure, I found that most fast-foods and most salty foods are associated with higher blood pressure: 16 of the items listed on the high side taste rather salty compared to only 5 items on the low side. Interestingly, the event signifying 2 sticks of cheddar cheese (20 percent of the recommended daily allowance of sodium) appears on the high side; but 1 stick of the same cheese appears on the low side. I was surprised by how these events were distributed despite the 1-day lag; this means that elevated pressures were detectable the following day. On the lower blood pressure side, I found many plain (non-processed) foods and beverages along with a suspiciously large number of desserts. I'm uncertain whether the foods and beverages themselves contribute to lower pressures or whether it is their relationship to broader eating habits. Consider a slider showing certain exercises associated with elevated blood pressures; equally intriguing is the possibility of reducing blood pressure by doing specific exercises. I have no background in kinesiology. However, if I had to guess at a common theme, I would say that for me, exercises associated with elevated blood pressure seem rather torso oriented. Once more for me and not necessarily anybody else, exercises away from the torso seem associated with reduced blood pressure. I will have to gather more data, although I can't say that blood pressure has ever been much of an issue in my life. Again, this exercise is meant to show how tracing can be used to map out the relevance of events given different contexts. I hope others find the sliding mechanism - or at least the idea of using one - worthwhile.
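Generating the per-area lists might look something like the sketch below. Again, this is a hypothetical reconstruction: the encoding of blood-pressure readings (+1 elevated, -1 reduced, relative to a personal baseline), the sample events, and the `sides` function are my assumptions for illustration.

```python
from datetime import date, timedelta

# Hypothetical blood-pressure readings by date: +1 = elevated,
# -1 = reduced, relative to the personal baseline.
bp_readings = {
    date(2014, 6, 2): +1,
    date(2014, 6, 4): -1,
}
# Dated events (day, free-form text); toy examples only.
events = [
    (date(2014, 6, 1), "2 sticks cheddar"),
    (date(2014, 6, 3), "plain oatmeal"),
    (date(2014, 6, 5), "1 stick cheddar"),  # no reading the next day
]

def sides(events, readings, lag_days=1):
    """List the events falling on the high and low sides of a
    lagged context, mirroring the slider's per-area lists."""
    high, low = [], []
    for day, text in events:
        reading = readings.get(day + timedelta(days=lag_days))
        if reading is None:
            continue  # event has no matching lagged reading
        if reading > 0:
            high.append(text)
        elif reading < 0:
            low.append(text)
    return high, low

high_side, low_side = sides(events, bp_readings)
```

The same mechanism that produced the percentage regions can thus also surface the concrete events behind each region, which is where observations like the cheddar-cheese split come from.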
This is probably a good time for me to add a side note. The process of event distribution giving rise to sliders is highly systematic in nature in a way that to me seems relatively unique: it associates events with different contexts in a "big way." It is not ambivalent but obsessive. It is taxing on resources. If there are 10,000 different types of events, these are associated with the contexts all at once. (Of course, this is a conceptual explanation. Sadly in a literal sense, operations are carried out line-by-line exhaustively for all the data.) No attempt is made to pre-screen the applicability of events. As such, the alpha might find that "missing batteries" relates to "inventory shrinkage" despite the absence of clean or clear experimental evidence or rational explanation. Unless a person already knows the relationship to inventory, missing batteries might merely point to lack of portable lighting during rare power outages. From the standpoint of problem-solving through the use of massive amounts of data, confronting different concerns from a position of ignorance wouldn't work if our solutions require understanding a priori. We would simply be promulgating vacuous preconceptions throughout the data system at an immense level. Through event distribution, our objective isn't really to determine whether or not batteries are missing but rather what it might mean to have missing batteries. I suspect that security personnel would encounter difficulty carrying out routine inspections given missing battery packs for their flashlights and communications equipment. This is not to say that a person necessarily has to know the meaning of missing batteries in a rational sense: it is unnecessary to know the "meaning" of anything in advance (a product of reasoning) to determine its meaning in terms of contextual relevance (a product of math and algorithms).
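The "obsessive" cross-association described above, where every event type is scored against every context with no pre-screening, can be sketched as a simple exhaustive tally. The event types, contexts, and `co_occurrences` helper are hypothetical stand-ins chosen to echo the missing-batteries example.

```python
from itertools import product

# Hypothetical event types and contexts; note that nothing screens
# "missing batteries" out of consideration for "inventory shrinkage".
event_types = ["missing batteries", "exercise", "fast food"]
contexts = ["inventory shrinkage", "xbreathe", "xweight"]

def co_occurrences(observed_pairs):
    """Tally how often each (event type, context) pair co-occurs,
    initializing every combination so none is pre-screened away."""
    counts = {pair: 0 for pair in product(event_types, contexts)}
    for pair in observed_pairs:
        if pair in counts:
            counts[pair] += 1
    return counts

# Toy observations pairing events with contexts in which they occurred.
observed = [
    ("missing batteries", "inventory shrinkage"),
    ("missing batteries", "inventory shrinkage"),
    ("exercise", "xbreathe"),
]
counts = co_occurrences(observed)
```

Even in this toy form, the resource cost mentioned above is visible: the table grows with the product of event types and contexts, which is why 10,000 event types make the line-by-line approach taxing.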
Event Thresholds & Event Horizons
I refer to the vertical line that partitions the area in ambient green on a slider as the "edge," which represents a conceptual threshold. I find it difficult to dismiss the separation that occurs at the edge as incidental. In particular, if there is a change to the underlying circumstances of the data, the class leaders would likely emerge first along the edge. When seeking out intervention opportunities, I routinely forage for interesting events along the edge. To me, the edge contains aspects of the underlying phenomena that can sometimes be characterized as pivotal. For example, I consider consumers to be transient edge participants. They are not necessarily comfortable occupying extremes; therefore they might be found more frequently perusing the boundaries. I wonder if edge studies can be applied to the organization of aisles at a retailer. Although I routinely throw a great many events, this is not to say that every possible event has been thrown. The reality as it is presented to me likely represents only a partial depiction that poorly reflects the whole. The balance remains accessible only through events that I have yet to throw. So it is great fun venturing along the edge and setting up events almost like antennas to detect things that pass nearby. In ecology, an edge delineates where two different habitats meet each other. For instance, there might be grassland followed by a transition to a forest. Or there could be an aquatic environment near some wetlands. The edge is where different species have the opportunity to interact. In the human ecology, we have edges for the conveyance of ideas, goods and services, and even risk-reward trade-offs. The edge is where people place their bets.
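One way to "forage along the edge" computationally is to look for event classes whose aggregate tilt sits near the threshold between the two sides of a slider. The tilt values and the `edge_events` helper below are entirely hypothetical, a minimal sketch assuming each event class can be summarized by a net tilt in [-1, 1].

```python
# Hypothetical net tilts per event class: negative values lean towards
# the decline side of a slider, positive towards improvement.
tilts = {
    "exercise": 0.44,
    "fast food": -0.61,
    "desserts": -0.05,
    "vitamin": 0.03,
}

def edge_events(tilts, band=0.10):
    """Return event classes lying within a narrow band around the
    edge -- the conceptual threshold separating the two sides."""
    return sorted(name for name, tilt in tilts.items() if abs(tilt) <= band)

near_edge = edge_events(tilts)
```

Classes sitting in this band are the ones whose placement could shift first if the underlying circumstances change, which is what makes the edge a natural place to look for intervention opportunities.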
In an organization, there are also edges. Many people would probably separate management, administration, and production workers. Within these delineations there are many intersections and overlaps. Those familiar with my explanation of different informational sources in an organization would find that an edge that I call "direction" separates the organizational constructs "projection" and "articulation." (This is too in-depth to expand upon here.) In Kurt Lewin's Force Field Analysis, there is something of an edge. He said that for change to take place in an organization, unfreezing has to occur. To me, he was discussing equilibrium dynamics in a system containing many edges: interests collide, overlap, and mesh along these edges. Lewin seemed to invoke edge dynamics in relation to change and intervention. My conceptualization is more mathematical and algorithmic. While I do not suggest that an edge represents an event horizon - i.e. a critical turning point spawning different outcomes - I believe that event horizons are delineated by edges. Understanding a horizon ultimately requires knowledge of these intricate partitions in the data. A wayward butterfly can land at a sensitive delineation and trigger a shift. It might be possible to someday conduct warfare across the event horizon through silent campaigns against entire populations. Some nations might struggle while others prosper without the use of soldiers or guns. Maybe hackers and supercomputers come closest to interfering with the natural distribution of events. So perhaps I'm a little bit like Herodotus preparing to witness the formation of epic battles to come. My tools are just a bit different. Like him, I'm perfectly fine taking omens and signs into account since I need not know the meaning of things to understand the relevance. 
I chase ghosts and butterflies not so much because they threaten civilization; but rather I think that data-dependency makes us vulnerable to intervention by the least detectable and most intangible agents.
Kinetic Intervention (KI)
Perhaps the biggest obstacle in my use of the term "event horizon" is public preconception. In the movies that I have seen - being quite a fan of science fiction, as many may have noticed in my use of specialized terms - an event horizon is where different time-lines are spawned, usually by those with time-travel capabilities. Sadly, this is not an ability that I possess. Although I have pondered reverse temporal intervention, I couldn't think of a way to detect success even if one could influence a time-line. Without detection, intervention would likely lead to chaos and mayhem, whether over the time-line or anywhere else. For me, the idea of an event horizon simply extends from a discussion of probabilities. However, a person need not know probabilities in order to intervene. When children learn how to catch a ball, they usually don't have formal knowledge of physics or probabilities; they instead learn how to place themselves in the flow of sensory and motor data to alter the course of events. I call this principle of influence through involvement in the flow of events "kinetic intervention." I consider this concept important when the underlying phenomena exhibit great complexity. One cannot hope to fully internalize or grasp something massive and complex; attempting to do so can lead to the most trivial and yet destructive outcomes. Nonetheless, perhaps as a general rule, every situation that a person faces was at some point too complex for him or her to understand entirely. While frequent exposure and learning can mitigate some consequences, this does not alter the inherent need to intervene or be involved in the "flow." The question is whether or not intervention can be executed deliberately and thoughtfully. A child catching a ball is unlikely to lead to anything resembling a catastrophic butterfly effect. But I have concluded that we are altering the settings that humans occupy such that more of our activities can cause thresholds to tilt or slide undesirably.
Although the prospects might seem intriguing for kinetic interveners, there are complications for any aspiring KI master. For instance, data viewed in relation to events lacks absolutes. A person is unlikely to definitively know the beginning or ending of any underlying phenomenon or its placement in relation to other phenomena. This is because there might be no beginning, end, or placement in human scale: these are metrics of criteria meant to characterize phenomena in terms convenient to our needs. Imagine an evil KI master bent on destroying a particular government, maybe in exchange for cash. Actually, just to be safe, consider a good KI master that hopes to do away with a terrorist organization. He or she would have to know the relevance of this organization or institution to other people. The building where the organization is situated is simply a piece of capital. The migration of influence over important segments of society and further towards the vernacular is not easy to pin down. So it is unclear what events to throw and what contexts might be important. I drew a diagram to show the complexity of phenomena-realm determination. (The little stick people are standing atop the base-structure for data-embodiment.)
I don't know whether to describe that symbol on the right as infinity or a butterfly. I was trying to show that symbolic aggregation can occur indefinitely and even fold into other parts of an event horizon; but perhaps this is precisely what can make a butterfly so lethal. Another important constraint that I feel prevents humans from exploiting the horizon is the issue of detection, as I previously mentioned. We can understand much more than what we can detect. Moreover, we might incorrectly detect things. Or we might misrepresent what we detect. For instance, my prototype can give me a list of events that seem associated with improved perceptions of mental concentration. But a "perception" of concentration is a fairly precarious thing. I also admit that attempting to make use of lists of events that seem to make me "feel smarter or dumber" is a radically fringe concept. (He writes as he pops another smart pill, imbued with the unbridled power of vitamins.) I describe the process of gradually mapping the relevance of events to different contexts as "tracing." Through tracing, phenomena can tell me what metrics seem to apply or are relevant to them; these are the metrics of phenomena. So in this blog, I have described a process where the phenomena themselves give rise to their own metrics; this is a pretty complicated concept, really. But consider the benefits of interacting with something massive and organized, using it as capital rather than dissolving its value through instrumental preconceptions and behaviours.