Summary: A major problem with chatbots is that they can only provide information that is already in their knowledge base. Here’s a new approach that makes your chatbot smarter with every question it can’t answer, turning it into a lifelong learner.
If you’ve been keeping up with the explosive growth in chatbots you probably already know that there are two basic architectures:
Rules-based Chatbots represent over 90% of what’s currently available. They are relatively simple and fast to build, with decision-tree or waterfall-like logic structures of predefined queries and responses.
AI Chatbots use deep learning engines to formulate responses. They do not have rigidly defined structures and are able to learn conversational responses after some initial training.
And while the application of NLU in both cases is exceedingly clever, they both have an Achilles heel – they don’t know what they don’t know. That is to say that a chatbot can only respond based on what’s in its knowledge base (KB).
Many chatbot applications can be made very effective with limited content. If you’re building a recommender it can access your full library of the products you offer along with their prices and specifications. If you’re reserving an airline flight, there are a finite number of offerings available between any two cities. But as you expand outward from these limited applications to very sophisticated question answering machines (like Alexa and Siri may one day become) then your knowledge base becomes extremely large, perhaps even encompassing the entire web. Right now that doesn’t work. However, that’s where we want to go.
This isn’t simple search where your chatbot can respond with a long list of things you may have meant. Like IBM’s Watson it needs to respond with the single most correct answer. And that answer better make sense or your users won’t come back.
How Your Chatbot Finds the Right Answer
Matching your user’s query with facts in the knowledge base is complex but reasonably well developed. The procedure is formally known as ‘Context Aware Path Ranking’ or C-PR for short.
That procedure creates both a knowledge graph and a knowledge matrix linking a ‘source entity’ (what your user asked about) with the ‘target entity’ (what’s in the knowledge base) through a logical relationship (like ‘is found in’ or ‘is a member of’ or ‘is commonly used together’). In fact it is expressed as a ‘query triplet’ shown as (s, r, t) where s is the source entity, r the relationship, and t the target entity.
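To make the triplet idea concrete, here is a minimal sketch of a knowledge base stored as (s, r, t) facts with a simple graph on top. It is not the C-PR algorithm itself; it only shows the kind of relation paths between two entities that path-ranking methods work with. The class and method names, and the example facts, are illustrative assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: source entity, relation, target entity."""
    s: str
    r: str
    t: str


class KnowledgeGraph:
    """Toy in-memory store of (s, r, t) facts (hypothetical, for illustration)."""

    def __init__(self):
        self.facts = set()
        # adjacency list: source entity -> list of (relation, target) edges
        self.edges = defaultdict(list)

    def add(self, s, r, t):
        fact = Triple(s, r, t)
        if fact not in self.facts:
            self.facts.add(fact)
            self.edges[s].append((r, t))

    def has(self, s, r, t):
        return Triple(s, r, t) in self.facts

    def paths(self, s, t, max_hops=2):
        """Return every relation path from s to t using at most max_hops edges."""
        found = []
        stack = [(s, [])]
        while stack:
            node, rels = stack.pop()
            if node == t and rels:
                found.append(rels)
                continue
            if len(rels) < max_hops:
                for r, nxt in self.edges.get(node, []):
                    stack.append((nxt, rels + [r]))
        return found


kg = KnowledgeGraph()
kg.add("Obama", "BornIn", "Honolulu")
kg.add("Honolulu", "LocatedIn", "USA")

print(kg.has("Obama", "BornIn", "USA"))  # False: this exact triplet is not stored
print(kg.paths("Obama", "USA"))          # [['BornIn', 'LocatedIn']]
```

The direct fact is missing, but a two-hop path connects the two entities. Paths like this are the evidence a ranking model can use to decide whether a relationship between s and t is likely to hold.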
You can see intuitively that if your version of Alexa is meant to be able to answer every question that could possibly come to mind that the knowledge graph and matrix would be impossibly large. So just as Alexa and Siri are intentionally limited in their domain knowledge, the question remains how to push outward into larger and larger areas of knowledge.
What Do Your Users Really Want to Know
This question determines how big your knowledge base should be. It’s unlikely that your users want to be able to ask absolutely anything based on how you’ve positioned your chatbot. One approach might be to try to define this in advance, that is, to figure out during development just how big that knowledge base needs to be. But history shows that however carefully we plan we always miss something. We’ll include data that’s not important and exclude data our users want to know.
A pretty obvious solution is to take note of what users ask that we can’t answer and add that to the knowledge base. In research this is called Knowledge Base Completion (KBC). These techniques work, but only if all three elements of the query triplet (s, r, t) already exist in your knowledge base and simply haven’t been linked to one another yet. Map them on the knowledge graph and the knowledge matrix and they’re available to your users.
Here’s What’s Really Needed
If your user asks a question or makes a statement in which any of the s, r, or t elements is not present in the knowledge base, the chatbot can’t respond. What’s needed is a system where the chatbot can continuously learn from new questions and incorporate them automatically into the knowledge base.
In other words, anything you say to or ask of your chatbot should make it smarter just by talking to it.
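The difference between the two situations can be sketched in a few lines. The snippet below checks which parts of a query triplet have never been seen in the KB and labels the query accordingly. The function names, labels, and example facts are assumptions made for illustration; they are not taken from the study.

```python
def unknown_parts(query, facts):
    """Return which of 's', 'r', 't' in the query never appear in the KB."""
    entities = {f[0] for f in facts} | {f[2] for f in facts}
    relations = {f[1] for f in facts}
    s, r, t = query
    missing = []
    if s not in entities:
        missing.append("s")
    if r not in relations:
        missing.append("r")
    if t not in entities:
        missing.append("t")
    return missing


def classify(query, facts):
    """Label a query by how much of it the KB already knows about."""
    if query in facts:
        return "known fact"
    if not unknown_parts(query, facts):
        # every element exists somewhere in the KB, only the link is missing
        return "all elements known: KBC can apply"
    return "unknown elements: needs continuous learning"


facts = {
    ("Obama", "CitizenOf", "USA"),
    ("Lincoln", "BornIn", "USA"),
}

print(classify(("Obama", "CitizenOf", "USA"), facts))  # known fact
print(classify(("Obama", "BornIn", "USA"), facts))     # all elements known: KBC can apply
print(unknown_parts(("Obama", "BornIn", "Hawaii"), facts))  # ['t']
```

The last query is the hard case: 'Hawaii' has never been seen, so there is nothing in the KB to link it to.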
Fortunately, three researchers from the University of Illinois at Chicago, Sahisnu Mazumder, Nianzu Ma, and Bing Liu, have just published the results of their work that opens up this possibility.
The study “Towards an Engine for Lifelong Interactive Knowledge Learning in Human-Machine Conversations” is all about a new technique called lifelong interactive learning and inference (LiLi), imitating how humans acquire knowledge and perform inference during an interactive conversation.
Here the ‘S’ (what the user wants to know – an entity) is captured from the input conversation and the ‘R’ (relationship) and the ‘T’ (target entity) are discovered by a clever combination of reinforcement learning and LSTM deep learning models in real time.
LiLi becomes the lifelong learning component that adds to your chatbot’s knowledge base every time a user makes a statement or asks a question that’s not currently in the KB.
How It Works
The problem divides logically into two parts. If the user makes a statement (e.g. Obama was born in the USA) the system makes a query of the KB (Obama, BornIn, USA) to determine whether this ‘true fact’ is present. If not it is added.
Suppose however that the triplet that is already in the KB is (Obama, CitizenOf, USA) but not the triplet containing the ‘BornIn’ relationship. Then the RL/LSTM program will, over time, cause the system to recognize the high likelihood that ‘CitizenOf’ and ‘BornIn’ are logical equivalents in this context.
The methods by which the entities and relationships are extracted from the conversation are a separate NLU process you can review here.
The second case, however, is the more difficult one. It occurs when the input is a query in which the relationship, one or both of the entities, or all of them are not in the KB. For example, if the user asks “was Obama born in the USA?” how do we proceed if ‘Obama’ or ‘born in’ or ‘USA’ are not already in the KB?
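The two cases can be sketched as a simple router. This is only the control flow, under the assumption that a separate NLU step has already turned the user's turn into a kind ('statement' or 'query') and a triplet. The function name and return strings are hypothetical, and the point where a LiLi-style learner would take over is marked only with a comment.

```python
def handle_turn(kb, kind, triple):
    """Route one parsed user turn. kb is a set of (s, r, t) tuples."""
    if kind == "statement":
        if triple in kb:
            return "already known"
        kb.add(triple)  # the stated 'true fact' is new, so store it
        return "added to KB"
    if kind == "query":
        if triple in kb:
            return "yes"
        # Not answerable by lookup. This is where an interactive
        # learning and inference component would take over.
        return "needs inference"
    raise ValueError(f"unknown turn kind: {kind!r}")


kb = {("Obama", "CitizenOf", "USA")}

print(handle_turn(kb, "statement", ("Obama", "BornIn", "USA")))  # added to KB
print(handle_turn(kb, "query", ("Obama", "BornIn", "USA")))      # yes
print(handle_turn(kb, "query", ("Lincoln", "BornIn", "USA")))    # needs inference
```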
Upon receiving the query with unknown components LiLi executes a series of interleaved steps involving:
Needless to say, deciding which clarifying questions are appropriate and how many times you can go back to the user to ask them are non-trivial issues, and LiLi addresses both.
Here are the actions that are available in LiLi in the typical order in which they would occur.
The Data Science
In brief, the RL model, which uses Q-learning, aims to formulate a strategy that makes the inference task possible. LiLi’s strategy formulation is modeled as a Markov Decision Process. LiLi also uses an LSTM to create a vector representation of each feature.
The system also contains a routine for ‘guessing’ in the event the user is not able to offer a clue or example; this routine is reported to perform significantly better than a coin toss.
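If you haven’t worked with Q-learning before, the core of it is a single update rule applied to a table of state-action values. The sketch below shows that generic rule on a toy problem. The state names, action names, rewards, and hyperparameters are all made up for illustration and are not LiLi’s actual states, actions, or settings.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical actions and states, not the ones defined in the study.
ACTIONS = ["ask_user", "predict"]
Q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# A correct prediction from the 'ready' state earns a reward of 1.
q_update(Q, "ready", "predict", 1.0, "done", ACTIONS)
# Asking the user earns nothing directly, but it leads to the 'ready' state.
q_update(Q, "missing_relation", "ask_user", 0.0, "ready", ACTIONS)

print(Q[("ready", "predict")])  # 0.5
print(round(Q[("missing_relation", "ask_user")], 3))  # 0.225
```

Notice that asking the user picks up value even though it earns no reward itself, because it leads to a state from which a rewarded action is possible. That is how a strategy spanning several steps gets learned.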
See the original study here.
LiLi represents a significant step forward in making our chatbots smarter without artificially inflating the knowledge base beyond what our users really want to know.
About the author: Bill Vorhies is Editorial Director for Data Science Central and has practiced as a data scientist since 2001.