Leverage AI to Create Autonomous Policies that Adapt without Human Intervention
Policies are the foundation for any successful organization. Policies are the rules, or laws, of an organization. Policies document the principles, best practices and compliance guidelines that aid decision-making in supporting the consistent and repeatable operations of the business. Heck, one could argue that an organization’s culture is better defined by its policies than it is by the character of its leadership team.
Unfortunately, the management, creation and execution of policies haven’t changed much since the days of “time-and-motion studies”. In many cases, policies are nothing more than a static list of if-then rules that govern what workers are to do in well-defined situations. For example, [If your car has been driven over 3,000 miles since the last oil change, then change the oil] or [If you haven’t visited the dentist in more than 6 months, then visit the dentist].
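To make the "static list" concrete, here is a minimal sketch of the two bracketed rules written as code. The names `Situation` and `static_policy` are hypothetical, chosen only for illustration; the thresholds come straight from the examples above. Note that nothing in this policy ever changes unless a person edits the constants.

```python
from dataclasses import dataclass

# Fixed thresholds taken from the two example rules in the text.
OIL_CHANGE_MILES = 3000
DENTIST_MONTHS = 6


@dataclass
class Situation:
    """Hypothetical snapshot of the facts the rules look at."""
    miles_since_oil_change: int
    months_since_dentist: int


def static_policy(s: Situation) -> list[str]:
    """Apply the static if-then rules and return the recommended actions."""
    actions = []
    if s.miles_since_oil_change > OIL_CHANGE_MILES:
        actions.append("change the oil")
    if s.months_since_dentist > DENTIST_MONTHS:
        actions.append("visit the dentist")
    return actions
```

Calling `static_policy(Situation(3500, 2))` recommends only the oil change, no matter how the car is actually driven or what conditions it operates in.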
But what if…what if these policies weren’t just static if-then rules but were instead AI-based models that changed to optimize the actions based upon the constantly evolving state of the environment in which the business operates…without human intervention?
In much the same way that we are seeing AI being used to create autonomous vehicles, robots and devices that learn and adapt without human intervention, can we leverage AI to create autonomous policies that learn and adapt without human intervention?
First, let’s modernize our definition of “policy”:
A “Policy” is a codified set of agent-based (human or machine) analytics that guide actions (or decisions) based upon current state (or environment) that optimize, automate and operationalize (scale) an organization’s business and operational models.
And my hypothesis is this: if a policy can be documented and automated, then it can be integrated with AI / ML to become autonomous, so that the policies and procedures learn and adapt without human intervention. The result would be policies that self-monitor, self-diagnose, self-learn and self-change or evolve.
Identify → Document → Codify → Automate + AI/ML/DL yields Autonomous policies that learn and continuously evolve without human intervention
If an autonomous device can gain information about the environment through interaction, learn through those interactions and update its operating model without human intervention, then why can’t the policies that support the operations of the business do the same thing?
Achieving autonomy hinges on the ability to apply AI, specifically Deep Reinforcement Learning, to the governance and evolution of these policies. Deep Reinforcement Learning is the combination of a deep neural network (for example, a Convolutional Neural Network used for image recognition and classification) with Reinforcement Learning, in which autonomous agents learn and improve their operational effectiveness. Combining convolutional neural networks (CNN) with Reinforcement Learning allows the agent to recognize its current state and rank the best actions to perform given that current state.
The goal of Deep Reinforcement Learning is for an autonomous “agent” to learn a successful strategy from continuous engagement with the environment. With the optimal strategy, the agent can actively adapt to the changing environment to maximize rewards (current and future) while minimizing costs (see Figure 1).
Figure 1: An agent interacts with its environment, trying to take actions to maximize cumulative rewards
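The agent-environment loop in Figure 1 can be sketched in a few lines. This is a simplified illustration, not Deep Reinforcement Learning proper: it swaps the deep neural network for a plain lookup table (tabular Q-learning), and the five-cell corridor environment, the reward values and the function names (`step`, `train`, `greedy_policy`) are all assumptions made up for the example.

```python
import random

N_STATES = 5        # hypothetical corridor: positions 0..4, goal at position 4
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01  # small cost per move, payoff at the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn action values by interacting with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: current reward plus discounted future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Learned policy: best action for each non-goal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

After `q = train()`, `greedy_policy(q)` gives the action the agent has learned to prefer in each position. Nobody wrote a rule saying "move right"; the preference emerges from the rewards.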
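The Deep Reinforcement Learning factors in Figure 1 include the agent, the environment, the agent's current state, the actions it takes and the rewards it receives.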
For example, today we have a societal policy, or rule, dictating what drivers are supposed to do when they arrive at an intersection at the same time. When two vehicles arrive at a 4-way stop at the same time, facing each other head-to-head, and one intends to turn right while the other intends to turn left, the vehicle turning right has the right of way. The driver turning right should move forward slowly before entering the intersection to signal the turn to other drivers. The driver turning left should wait until the other car has fully passed (see Figure 2).
Figure 2: “The 4 Rules of 4-Way Stops”
However, in a world of autonomous vehicles, those if-then rules to guide safe decisions navigating an intersection just won’t work. The promise of flawless traffic and reduced traffic congestion would give way to a series of frustrated autonomous vehicles starting and stopping at the intersection.
So instead of the old policy for determining to whom to defer when multiple cars arrive at the intersection at the same time, we’d have to develop a new policy that can continuously learn and evolve as the flow and density of traffic change throughout the day and in response to special events and situations (see Figure 3).
Figure 3: Enterprise TV Commercial, “The Future of Transportation”
The autonomous vehicle (agent) must constantly monitor, diagnose and learn in order to actively adapt to the changing environment to maximize future rewards while minimizing costs without human intervention.
If an autonomous device or vehicle can gain information about the environment through interaction, learn through those interactions and update its operating model without human intervention, then why can’t the policies that support the operations of the business do the same thing? For example:
Using Deep Reinforcement Learning, we can transition from static policies to autonomous policies that learn how to map any given situation (or state) to an action to reach a desired goal or objective without human intervention. These autonomous policies would dynamically learn and update in response to constantly changing environmental factors (such as changes in weather patterns, economic conditions, price of commodities, trade and deficit balances, global GDP growth, student debt levels, fashion trends, Cubs winning the World Series, etc.).
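As a rough sketch of what "dynamically learn and update" might look like, the example below revisits the oil-change rule as a policy that picks a service interval and keeps re-estimating which one pays off best. It is a simple epsilon-greedy bandit with a constant learning rate, not a deep model, and everything in it is an assumption for illustration: the `AdaptivePolicy` class, the "mild" and "harsh" operating regimes, and the reward numbers in `REWARD` are invented, not measured data.

```python
import random

INTERVALS = (3000, 5000, 7500)  # candidate oil-change intervals in miles

# Invented payoffs per regime (illustration only, not real data).
REWARD = {
    "mild":  {3000: 0.2, 5000: 0.5, 7500: 1.0},
    "harsh": {3000: 1.0, 5000: 0.4, 7500: -1.0},
}


class AdaptivePolicy:
    """Epsilon-greedy policy whose value estimates track a changing environment."""

    def __init__(self, actions, alpha=0.2, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # constant step size: recent outcomes weigh more
        self.epsilon = epsilon  # share of decisions spent exploring
        self.rng = random.Random(seed)
        self.value = {a: 0.0 for a in self.actions}

    def best_action(self):
        return max(self.actions, key=lambda a: self.value[a])

    def act(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action()

    def learn(self, action, reward):
        # Move the estimate a fraction alpha toward the observed reward.
        self.value[action] += self.alpha * (reward - self.value[action])


def run(policy, regime, steps):
    """Let the policy act and learn for a number of decisions in one regime."""
    for _ in range(steps):
        action = policy.act()
        policy.learn(action, REWARD[regime][action])
```

Running `run(policy, "mild", 2000)` and then `run(policy, "harsh", 2000)` on the same `AdaptivePolicy(INTERVALS)` instance shifts its preferred interval from the longest to the shortest without anyone editing a threshold. The constant step size is the design choice that matters here: because old observations fade, the policy can unlearn a rule that the environment has made obsolete.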
Do autonomous policies – policies that are constantly learning and updating based upon changing environmental factors – lead to an autonomous business? Is this the modern math of the Autonomous Business?
Autonomous Policies = Identify → Document → Codify → Automate + AI/ML/DL yields Policies that learn and continuously evolve without human intervention
That’s something to consider over a Guinness or three on my next trip to Ireland.