When I read a blog, I often find myself in deep thought as I approach the end, trying to determine if the author has said anything that I might be able to use. A blog doesn't have to say anything. Nor does it have to be useful to me specifically. It might simply offer a personal reflection on life. As a person who also writes blogs, I generally try to pin down my narrative by making references to a particular technology. In the background, I usually have something to say about development, people, society, and computers. I connect everything to our use of data. This blog is no different. It occurred to me recently that the idea of "development" might be unfamiliar to segments of the data science community. I have chosen to share the process that I go through when designing and developing new applications. Despite my use of computer code - which is possibly a rather dry and mundane subject on the surface - I consider programming a creative process. Many distinguish between science and art. I don't. Some claiming to be scientists have declared computer science and indeed data science to be non-scientific. I'm not prepared to broad-brush science as others have. I would certainly suggest that writing code can indeed be regarded as an extension of art. Whether a person is sketching wiring diagrams, doing a computer-assisted drawing, or perhaps trying a bit of freehand as in my pieces above, I think we often underestimate or understate the relevance of art in our technological society. Further, we don't always recognize that art is present even if it is the most important part of the blog.
The act of drawing or painting is only a certain aspect of a particular type of art. Before that brush or pencil meets the medium, there are impressions and thoughts of expression that consume the artist. This is where the art begins its journey. The trip to paper or canvas merely enables closure. In this blog, I will be considering not coding per se but all of the processes in the background; perhaps these are often subsumed under "development." Many might find it difficult to concur with me that development is a precursor to art or that the outcome is anything artistic. As part of this discussion surrounding development, I will be introducing "hopscotch and robots," which is designed to generate large amounts of data intended to mimic organizational processes. It will become apparent shortly that hopscotch and robots involves converting visual information into structural data. I regard hopscotch patterns as a form of abstract expressionism; and I like to think of myself as a type of artist. I will be pointing out in this blog how the development of these patterns is not a mechanical process but rather something deeply creative and philosophical; that it can form the basis of thought itself.
Train Conductor Versus Explorer
Discussions over development paths help bring to light a key distinction between data science roles. One path enables a supporting role over predefined tasks and responsibilities: e.g. a fixed route for the role of a train conductor. This person is not allowed to take the train off the rails. The other route is far less rigid: it might be associated with figures such as Christopher Columbus and Ferdinand Magellan, both having led Spanish armadas over poorly explored parts of the world. (Poorly explored by Europeans.) There was a business requirement imposed on these explorers: gold, slaves, and other desirable resources. Neither Columbus nor Magellan can be described as a train conductor, for their ships were free to move about over open waters. The technology in those days did not permit Spain to control the day-to-day activities of these expeditions. Such technological constraints no longer exist today. It is difficult to find many people in our time occupying roles that allow for much personal autonomy. Perhaps too many companies lumber about in meeting and strategy rooms trying to micro-manage operations. Development doesn't happen well in these surroundings. There might be dozens of managers and only a couple of developers. In many respects, these developers can sail away on their own, leaving the managers to find other useful things to manage. The idea of managing development is a bit like controlling how an artist draws. It can be done if we ignore what is inside the artist. I guess I mean to say control or suppress rather than simply ignore; for we cause the artist to produce what we want. We ensure that the patterns reaching the canvas conform to our expectations.
A company or agency might be rich enough - and occupy a sufficiently protected monopoly - to pay people to do and accomplish nothing but attend meetings. One would not expect the activities of these elites to interfere with art, but perhaps this is precisely what we expect to happen in relation to business. In organizations, the art might be sterilized. Specifically in regard to data-science related developments, I believe that the temptation is quite extreme for managers to provide little if any freedom. This is because the fledgling field is complex, is treated with ambivalence and skepticism, and has poorly defined metrics of performance. I don't deny that such sentiments are relevant and perhaps well-placed in some cases. A great deal can indeed go wrong. While I have come across various ethical concerns about Columbus, I believe that few would question his competency at sea. (The monarchy didn't leave armadas to just anyone.) Arguably most developers lack the kind of expertise to produce useful and worthwhile products. But really, how many managers produce useful products? They don't actually produce, right? This is a topic that I might develop further in future blogs. For now I just want to emphasize how, when I use the term "development," it is outside the context of most controlled organizational settings. The areas of data science that I come across in personal projects might never be encountered in a corporate setting. The research is independent and rather unique. However, I have no funding for pure research. It is market-oriented research occurring outside a corporate setting. I believe "entrepreneurial research" might be a suitable term. But as I say, I think of it as art. One man's sociopath is another man's visionary. The things we do in secret define us far more than our willingness to do what others want.
Market Prefers Cake - Bread I Forsake
Last year I posted a blog describing a qualitative event engine that I was planning to develop. I called it the "Shatterdome." I now have an operational prototype. I spent about two months thinking of the design and two days writing the prototype. This reflects how I tend to divide my time over the course of development. The more peculiar the program, the more time I generally spend considering my approach or strategy. Using the Shatterdome, it is possible to map out procedures and processes including those that do not yet exist. The basic functionality is surprisingly straightforward - using hopscotch and robots to be explained shortly. I don't believe the simulation engine would alienate reasonably tolerant people in a business meeting. However, it certainly represents a different way of dealing with management concerns. When I undertake developments primarily to satisfy my curiosity, having no assurance of any return on investment, the idea of "giving up" constantly comes to mind. When I do something to appease intellectual interest, it doesn't take long for me to satisfy it. I am usually left trying to put together a business justification. For this reason, by the time I have a functional prototype, whatever initially motivated the development is probably gone. The future of the project then rides on a strand of gossamer. If the market prefers cake over bread, I offer cake. The client is always right. If we dig deep into the mind of an artist, thoughts like this come out. The client gets what the client wants. Were this not so, there would be no market. I go along with this arrangement since it is the only one that exists. If there is no market, development terminates. I take away the chicken. I bring out the fruit-cake. People complain about fruit-cake, but they keep buying it. Since I tend to get what I want from development near the beginning, I often find myself eager to move on. I just resign myself to the truth. Enjoy the cake.
Too Easy to Take - Better Not to Make
Being a member of a community, I make time to "share" my ideas and findings. One aspect of development for me is gaining something worth sharing. More specifically, I want something that I can freely share but which cannot be easily taken away, if that makes any sense. This means, among other things, that if what I have to share can be easily taken away, there is no point continuing down that development path. It occurred to me some time ago that others by reading my blogs could easily copy the materials. When I suspect this has occurred, I am extremely flattered, by the way. The contents are meant to be shared. This is the whole idea behind sharing - to get the word out. But the development shouldn't be so straightforward that something like a blog can carry a significant amount of what I have to say. Nor should the ideas be particularly easy to understand - even if I do my best to explain things. Nonetheless, since there is always a risk of catastrophic loss (in a manner of speaking), I generally seek out a business reason to write my blogs. My reason is to have readers spread my ideas almost in a marketing sense. I am not normally trying to sell anything; but rather I am attempting to create a market for the ideas. Others can freely develop derivative products and ideas. Basically, I want their self-interest to promote my cause; they can certainly decide to work on the same things. I assume that plagiarism and intellectual infringement will occur. I build my blogging around this "intellectual infringement model." However, if other people can take something substantive from my work - that is to say, without payment - then I only have my curiosity as a source of inspiration. As I said, often this isn't enough to keep a project going. If the business reason disappears, there is no reason for the project to persist.
That Which I Shake - Now Must I Bake
Okay, that's it with the rhyming headers. I am eager to share hopscotch and robots. The Shatterdome uses a type of robot that I call a "hopper." A new robot is deployed for each simulation run. These robots are meant to "hop" on an array or lattice of events that could be generated for example by organizational processes. The events might reflect the system operations of a transit system; departmental divisions; and tasks and functions perhaps pertaining to human resources. This hopping activity generates a considerable amount of simulation data. Workloads represent a specific type of data. I personally have a great interest in how structural changes influence loads. I believe that hopscotch and robots can be used to examine many different types of management concerns. The engine is relatively simple based on my description thus far. I can't say that the source code is particularly sophisticated. The only complication that comes to mind involves controlling how the hopping takes place: entirely random hopping would be unproductive. If readers would like to implement their own robot hoppers and perhaps share what seems to work, I would certainly be interested in the material. I have been experimenting with my own approaches of course. I will be sharing some of the results in this blog shortly.
The "robot" part of "hopscotch and robots" hopefully makes sense based on my explanation: these robots are designed to jump from event to event using the instructions contained in the events. The "hopscotch" aspect of the term relates to how the hopping area resembles a hopscotch pattern. I am essentially talking about a type of flowchart, although a hopscotch pattern for the prototype uses ASCII characters. The patterns are shown below for Hopscotch #1, #2, and #3. I lacked the ability to make use of mechanical drawings during Hopscotch #1. I had to configure the events manually. My objective at the time was simply to get the robots hopping. After the advent of event-extraction through mechanical drawings, I found it much easier to increase the complexity of the patterns. The hopscotch pattern doesn't have to be linear or fully contained on a single chart. The squares can be distributed over several charts. The paths between the squares can be recursive. For example, loops can be formed. Hopping over an array of events is not complicated from a programming standpoint. Controlling the hopping in a coherent manner is more difficult. The deeper and more difficult question from a data science standpoint relates to event design or construction. The robots are hopping, but what exactly are they hopping over? I said that I spent two months thinking about this project and two days writing the prototype. This isn't precisely the case. I spent two months strategizing over how to make the hopping coherent. I have been grappling with event design for many years.
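For readers who would like to try their own hoppers, here is a minimal sketch of the basic idea in Python. It is not the Shatterdome source. The event names, the dictionary layout, and the helper names `run_hopper` and `simulate` are all assumptions made for illustration. Each event simply lists the events a robot may hop to next, an empty list is treated as a dead end, and the robot picks its next square at random.

```python
import random
from collections import Counter

# Hypothetical event map (illustrative names only, not taken from the
# Shatterdome). Each event lists the events a robot may hop to next;
# an empty list marks a dead end where the robot stops.
EVENTS = {
    "entry": ["desk"],
    "desk": ["queue", "lamp"],
    "queue": ["lamp", "exit_a"],
    "lamp": ["desk", "exit_b"],
    "exit_a": [],
    "exit_b": [],
}


def run_hopper(events, start, rng, max_hops=10_000):
    """Hop from event to event until a dead end is reached; return the travel log."""
    log = [start]
    current = start
    for _ in range(max_hops):  # guard against an improbable endless loop
        options = events[current]
        if not options:
            break
        current = rng.choice(options)  # entirely random choice of next square
        log.append(current)
    return log


def simulate(events, start, robots, seed=0):
    """Deploy one new robot per run and count the traffic seen at each event."""
    rng = random.Random(seed)
    traffic = Counter()
    logs = []
    for _ in range(robots):
        log = run_hopper(events, start, rng)
        logs.append(log)
        traffic.update(log)
    return logs, traffic


if __name__ == "__main__":
    logs, traffic = simulate(EVENTS, "entry", robots=20)
    print("robot #1:", "".join(f"[{name}]" for name in logs[0]))
    for name, count in traffic.most_common():
        print(f"[{name}] {count}")
```

The loop between "desk", "queue", and "lamp" is there on purpose: it shows how recursive paths produce traffic counts higher than the number of robots deployed. The sketch makes no attempt at the harder problem of making the hopping coherent.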
Qualitative Event Streams (QES) and Quantitative Metrics Streams (QMS)
I'm uncertain if my colleagues would agree. For me, data science is really about dealing with constant streams of data. The taps never shut down. When I was doing my undergraduate degree, I recall receiving, after the cut-off date, responses to a public survey I had distributed. I was explaining to my supervisor how terrible it seemed to turn away data. In many applications found in real-life situations, there is no need for a cut-off date. The inflow of data is constant. To make use of the data, it is necessary to determine the parameters for inclusion; in my case, this is often a matter of conforming to a periodicity scheme or cycle. There might be weekly and monthly performance comparisons for example. I consider it important to distinguish the comparative criteria, which are external in nature, from the qualitative and quantitative manifestations of data internal to the phenomena. Considering hopscotch and robots, we see how internal organization can give rise to both types of data. The data extends from organizational phenomena. I believe that the data is most apparent during runtime - amid motion, movement, rhythm, pace, percussion, compression, expansion. Therefore, by embracing streams of data, we make visible to data science aspects of reality that might be difficult to appreciate under more static conditions. This is one of the reasons why I think physics might have an upper hand over statistics in certain situations. This is not to say that a person must be a physicist to appreciate a dynamic ontological basis for the recognition of phenomena.
On the second day of development, the user interface gained the appearance shown below. Any interface is always in a constant state of development for me; so I expect significant changes to occur over time. (As I finish writing this blog, I can already say that the interface has become larger and more complex.) It is possible to set up the robots without using an integrated interface: the task can be performed manually using a text editor (such as Notepad) and a file manager (Explorer). A manual approach offers much flexibility, but it is time-consuming and perhaps prone to mistakes. So as I previously mentioned, I created a method of extracting primary robot behaviours from text-based mechanical drawings. The interface interprets the drawings much as a person might, and then it automatically sets up the files and folders for the event objects, a task that would otherwise have required a person. While the hopping mechanism only took several hours to write, I did the drawing extractor over a longer period of time the following day. In the background on the next illustration is Hopscotch #4, which on closer inspection some readers will recognize as a retail outlet: it contains 6 departments and 4 cashiers. I won't be covering this particular pattern here since it is a bit detailed. A more elaborate mechanical drawing that I am currently developing is simply called "Me": it is meant to simulate the many different events that I normally generate each week. Thus, a key benefit of the interface is being able to rapidly generate and maintain different databases irrespective of the underlying subject matter.
Checking for Functionality
I ran a simulation on Hopscotch #3 so we can infer the functionality of the robots. Some readers will be familiar with my extensive use of "plough charts." On the plough chart presented here, the slope indicates the pace of robot traffic. Hopscotch #3 has four implied exits: [aux2], , [bank], and [outr]. These exits are implied because robot traffic may only enter these points and never leave (although there are no other places to go). The only entry point is . The most robot traffic occurs at [pat1]: this is probably because all the robots must pass [pat1] although a significant number are likely to return to [pat1] from [lamp]. A robot might do a loop from [pat1] to [line] to [lamp] back to [pat1]. It might even go from [pat1] to [lamp] then right back to [pat1]. Many interesting routes are possible. Infinity is possible although highly improbable. I plan to introduce a mutation in the future to help the robots become more aware of themselves and each other although they operate at different deployment periods and are really rather brainless. I am thinking about depositing robot runtime sequences at the events. This idea is inspired by my extremely brief exposure to Buddhist phenomenology.
Consider the travel log for robot #1 noted below: during the simulation, it found itself on [pat1] a total of 4 times. This indicates adherence to the hopscotch pattern.
Opening trip:  ... [pat1]
Then loop: [line][lamp] ... [pat1]
Short jaunt: [lamp] ... [pat1]
Big voyage: [lamp][pah2][stat][pah2][stat][pah2][stat][pah2][lamp] ... [pat1]
Okay enough: [line][lin7][line][bank]
The illustration above shows that, with the exception of [aux1], all of the route events had significantly more robot traffic than the exit events. This is because an "exit event" is triggered when a robot enters a dead-end with no return. A "route event" on the other hand occurs when a robot is simply passing by. Since I sent out 20 robots during the simulation, any traffic beyond 20 indicates some repetition of visits; this generally occurs only in relation to route events. One might reasonably ask, why not simply calculate probabilities or use engineering formulas - e.g. for parallel circuits? I am not interested in the likelihood of events per se but rather their consequences. These robots have the ability to "carry" items given to them at different locations; for instance, they might be given "stress"; or this might be taken away or reduced by positive events. I am interested in what the robots experience, when, where, and why. These questions can be addressed in relation to design: e.g. event X occurs as a result of condition Z. The fact that event X occurs only 19 percent of the time in real life and 35 percent during the simulation is not necessarily all that important (although it might be). Thus, an important aspect of the simulation is related to its ability to emulate an internal context rather than replicate external metrics or criteria.
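The idea of robots carrying something like "stress" from event to event can be sketched roughly as follows. Again this is only an illustration under my own assumptions: the event names, the stress amounts, the rule that the load never drops below zero, and the helper names `run_robot` and `summarize` are hypothetical and do not come from the Shatterdome.

```python
import random
from collections import Counter

# Hypothetical event map. "next" lists the onward hops (an empty list marks
# an exit event); "stress" is added to the robot's load on each visit, and a
# negative value stands for a positive event that relieves stress.
EVENTS = {
    "entry": {"next": ["desk"], "stress": 0},
    "desk": {"next": ["queue", "break"], "stress": 2},
    "queue": {"next": ["desk", "done"], "stress": 3},
    "break": {"next": ["desk", "done"], "stress": -4},
    "done": {"next": [], "stress": 0},
}


def run_robot(events, start, rng, max_hops=10_000):
    """Return a list of (event, stress_after_visit) pairs for one robot."""
    stress = 0
    history = []
    current = start
    for _ in range(max_hops):  # guard against an improbable endless loop
        event = events[current]
        stress = max(0, stress + event["stress"])  # assumed: load never goes below zero
        history.append((current, stress))
        if not event["next"]:
            break
        current = rng.choice(event["next"])
    return history


def summarize(events, start, robots, seed=0):
    """Run several robots; collect traffic per event and each robot's peak load."""
    rng = random.Random(seed)
    traffic = Counter()
    peaks = []
    for _ in range(robots):
        history = run_robot(events, start, rng)
        traffic.update(name for name, _ in history)
        # Where, and at what level, this robot's stress was highest.
        peaks.append(max(history, key=lambda visit: visit[1]))
    return traffic, peaks


if __name__ == "__main__":
    traffic, peaks = summarize(EVENTS, "entry", robots=20)
    for name, count in traffic.most_common():
        kind = "exit" if not EVENTS[name]["next"] else "route"
        print(f"[{name}] {kind} traffic={count}")
    print("peak stress per robot (event, load):", peaks)
```

Because each robot keeps a history of where it was and what it carried at the time, the output can speak to what a robot experienced, where, and when. A bare probability calculation would not provide that. With 20 robots, any route event showing more than 20 visits reflects repeat visits through the loop.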
Some Issues Raised - Probability Versus Choice
As is normally the case on the introduction of a prototype, issues come to mind during runtime that I don't anticipate at design. One of the more interesting questions that attracted my attention involves "choice versus probability": rather than make a random choice between available options, I could program the system to instead go by historical or preset probabilities. "Yes, it is a random choice for the robot, but it is a random 20 percent option A; 40 percent option B; and 40 percent option C": that is to say, it is random but only within defined parameters. At an early stage of development when I first encountered this issue, I was predisposed towards a simulation model that allowed for externally defined boundaries. But as I initiated coding enhancements to accommodate this behaviour, it occurred to me that I was short-circuiting internal design through the imposition of external boundaries - thereby rendering the internal design superfluous.
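The difference between the two approaches is easy to see in a few lines of Python. This is a sketch using the standard `random` module; the option names and the helpers `choose_uniform`, `choose_weighted`, and `tally` are made up for illustration, and the 20/40/40 weights are the ones quoted above.

```python
import random
from collections import Counter

OPTIONS = ["A", "B", "C"]
PRESET_WEIGHTS = [0.2, 0.4, 0.4]  # the 20/40/40 split quoted in the text


def choose_uniform(rng, options):
    """Completely random choice: every available option is equally likely."""
    return rng.choice(options)


def choose_weighted(rng, options, weights):
    """Random choice within externally imposed probabilities."""
    return rng.choices(options, weights=weights, k=1)[0]


def tally(chooser, trials, seed=0):
    """Call chooser(rng) repeatedly and count how often each option comes up."""
    rng = random.Random(seed)
    return Counter(chooser(rng) for _ in range(trials))


if __name__ == "__main__":
    trials = 10_000
    uniform = tally(lambda rng: choose_uniform(rng, OPTIONS), trials)
    weighted = tally(lambda rng: choose_weighted(rng, OPTIONS, PRESET_WEIGHTS), trials)
    for name in OPTIONS:
        print(name, f"uniform={uniform[name] / trials:.2f}",
              f"weighted={weighted[name] / trials:.2f}")
```

Over many trials the weighted tallies simply hand back the weights that were fed in. In the uniform case, any unevenness in traffic across a whole pattern has to come from the arrangement of the events themselves.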
One of the benefits of complete randomness pertains to the consistency of the results - ironic though it might be to assert randomness as consistent. Once I externally impose a distribution, the metrics gained from the simulation start to reflect not so much the impacts of design but rather me. I would be measuring my external definition thereby reinforcing my assertion of probabilities - effectively rendering invisible what otherwise would have been brought to light. It would make more sense, upon acquiring a historical distribution, if one is available, to make design changes to reflect that distribution. It seemed to me unproductive and contrary to the objectives of the underlying exercise to maintain faulty design while imposing a historical distribution to make the design "appear" accurate. Another problem relates to any qualitative interpretation of the events: e.g. if I choose between "sadness" and "happiness," it seems extraordinary to assert an 80 percent chance of being happy as if this were a reliable human predisposition. I'm uncertain if this makes sense to the casual reader. Suffice it to say, I came to a point in development that caused me to question the role and significance of design.
I decided that it would be far more desirable to adjust design to reflect probabilities than to impose probabilities on inadequate design. However, I also took note of the fact that a significant number of my peers in this forum are statistically inclined. They can cover the territory as it relates to probabilities even without taking organizational structure into account. In contrast, I am quite interested in, if not borderline obsessed with, how different structures lead to distributions. I therefore leave the determination and imposition of statistical distributions to my colleagues who are in a better position to carry out such tasks. I have chosen as my focal point the evaluation of impacts resulting from organizational construction (to facilitate the implementation of strategies of a rather specific material nature): e.g. if this department were created to handle this series of tasks at this point in the process, how would this affect the flow of data and production on a more systemic level? I am therefore seeking out precise insights to help me determine how capital resources might be configured to gain specific data streams.
Blending and Partnering
Occupational health and safety for some organizations is associated more with human resources than with industrial engineering. This means that an administrative professional might be responsible for work that would otherwise have been performed by somebody more engineering-oriented - perhaps an industrial hygienist. The reason for this has little to do with differences in accreditation or ability. Accident-prevention tends to demand administrative resources. Although data science has been described as an exciting new field, I want to emphasize how problematic it can be not to be anchored to quotidian workplace concerns. If data science didn't have a role to play in a company in the past, this means that the company might be able to get along without it in the future. Flowcharts have been around for some time. Although I don't suggest that data science commandeer flowcharts, I believe that all fields making use of flowcharts would benefit from a data-science orientation. Rather than simply present the flow of operations, it should always be possible to generate data streams to ascertain loads and stress. Anyone can make a flowchart. But in the future, I would like data scientists to be recognized for their ability to generate data and interpret the impacts of structural and procedural changes from flowcharts.
Another rather administrative type of concern relates to the documentation of organizational processes: this is a time-consuming and complex activity that might be addressed differently by different people depending on their backgrounds, methodologies, and management systems. After the Shatterdome gained the ability to rapidly generate event objects from mechanical drawings, I decided that I could use a separate program to convert the contents of these objects into webpages. Apart from specifications important to robot operations, the events could be used to hold supporting administrative data: the responsible individuals, departments, and managers; cost-benefit characteristics; history of changes; metrics normally associated with the events; abstract qualification requirements; actual qualifications of those that have successfully performed the duties. The list can be exhaustive. By means of structural design experimentation, it should be possible to systematically extract human resource requirements and financial impacts from these simulations. The simulations are meant to enhance rather than replace the involvement of those in positions of responsibility. Datasite management can be part of an integrated paradigm. (The Shatterdome already generates "datasites.")
Integration with Tendril
In 2014, I frequently wrote about another prototype that I call "Tendril." Tendril has the ability to identify which events seem important in relation to any number of organizational metrics either taken in isolation or combined as a system. I want to take a moment to distinguish between different types of events: 1) those asserted as important during design; 2) those found to be important at runtime; and 3) those that actually seem to be important in relation to real-life operations. It is fine in the absence of data to start off with a mechanical drawing based on research and reasonable expectations. However, for me the initial mechanical drawing is simply a starting point. Change and development is an ongoing process. Quite simply, life is constantly changing. The idea of "optimal design" is fundamentally flawed since the starting point is never the ending point except in death I suppose. It is necessary to determine what events actually seem important in real life "at the time" and then make adjustments to the drawing. This is a perpetual process. An application with features similar to Tendril would be worthwhile to make hopscotch patterns "effective." A mechanical drawing is an ontological assertion of how events should be recognized or gain recognition in an organization. We need a means of examining how things come to exist and of navigating through reality such that we attach existence to those events critical to an organization.
In my applications, reality is made visible through event design. When organizations fail, they do so with their eyes wide open. When they make a mistake, it is from accomplishing exactly what they set out to do. I consider it hazardous to make assertions and assumptions with a tool like the Shatterdome without some means of confirming effectiveness and guiding design changes. Internal design is what we draw, but external design is actually what generates real-life data. These designs share only rudimentary similarities - those that are closest and most important to us. But transpositional relevance can be adversely affected by large amounts of data, idealism, social construction, giving rise to perceptions that might be pathological. However, I believe the biggest culprit particularly in an emerging discipline like data science is simply the learning process. It takes time to cope with complexity. There might be thousands of important activities taking place among a body of individuals in an organization; but in my drawing I might simply blackbox these behaviours as an inline process. If blackboxing is temporary then so too is the transpositional differential or conflict. Otherwise, blackboxing complexity might simply be a contemporary manifestation of an old problem: it is the fallacy of nominalism. It can trigger a reversion into positivism and an estranged form of rationalism concealing the true nature of barriers confronting an organization. In short, without Tendril, the Shatterdome might give us a maelstrom of spinning shards of glass.
A language interface is currently absent on the Shatterdome. However, later this year, I plan to work on one both to create new event maps and access those that already exist. For technical reasons, I intend to use "Tagalog" as the interface language as opposed to a Germanic or Romance language. I don't have a linguistics background. I'm under the impression that Tagalog is just one of many languages that make heavy use of suffixes and prefixes in word construction. I feel it should be easier to supply the Shatterdome with a Tagalog rather than English interface. English is fairly linear. It brings about linearity - as if there is a door prize for linearity. Well, I suppose there are door prizes for linearity, come to think of it. But I will be trying something more recursive, multi-dimensional, and unstructured. It's certainly not going to be anything like Watson. I leave the supercomputer market to those with lots of resources. I'm trying to gain functionality that I call "Accommodating Disorder." This is not a disease that Canadians have because we tend to be accommodating. This is where things don't make sense and never have to make sense; but nonetheless there has to be a means of navigating towards desirable outcomes. So we are coping with the absence of order. I don't believe the human mind is properly configured for this type of journey without help from our tools.
Usually after completing a prototype, I think about all that has transpired. The thoughts that enter my head at this point greatly affect the future course of development. Despite its imperfections and need for continued development, a program that allows a person to extract simulated data streams from different organizational structures is nothing to sneeze at. I ask myself whether or not development should continue: if so then at what pace and in what direction in light of my personal circumstances? This is when development tends to be influenced by social events. Would it be better for society if something were left undeveloped, I sometimes ask myself. Development is how I choose to involve myself in the inner-workings of society. How this role manifests itself tends to dictate the things that I produce. I have found that uncertainty over role often suspends development. When the "market" is unclear, and I recognize a need to reinvent myself, I stop trying to "sell" into the market; perhaps this is quite a logical response to uncertain conditions. The ensuing search wherever it takes me might be to a place for example that has no Shatterdome. The future of a prototype and its supporting technologies might, as I noted earlier, ride on a thread of gossamer. I want to emphasize that using this development model, the market strongly influences which developments continue - if they continue at all.