Artificial Intelligence is on the rise! Not as a machine rebellion against its human creators in some distant future, but as a growing trend of using machine-based predictions and decision-making in information technology. AI hype is everywhere: self-driving cars, smart image processing (e.g. Prisma), and, in the communication domain, conversational AI, a.k.a. chatbots.
The chatbot industry is expanding fast, yet the technologies are still young. Conversational bots used to be rather primitive, like the old-school text-based game “I smell a Wumpus”, but they have now evolved into a top-quality business tool. Chatbots offer a new type of simple and friendly interface for browsing information and receiving services. IT experts and industry giants including Google, Microsoft, and Facebook agree that this technology will play a huge role in the future.
To enjoy the marvels of Conversational Artificial Intelligence tools (or chatbots, if you prefer brevity), you must master the basics and understand the typical stack. In this article, we will discuss the instruments you can gear up with, how they resemble and differ from each other, and their pros and cons.
But before we set off on that journey, let’s get a deeper understanding of chatbots and their typology.
Diverse chatbots facilitate a myriad of business tasks from advertising to team building operations, often sharing core common features.
Business matters require professional organizational assistance, but not all of us can afford a secretary to handle the basic tasks. Luckily, chatbots are now here to help. They can be programmed to keep track of our work schedule and remind us about upcoming events. This type of bot is useful because it is very simple at its core and uses fast communication platforms, namely messaging systems, as its interface.
A chatbot may take on a more formidable task: representing a company in its interactions with actual clients. Customer support workflows are mostly predictable and scripted even for human staff, and are therefore easy to implement in a chatbot. The typical bot algorithm is to accept the user’s query, parse it for information, find similar cases in the database, and respond with a prebuilt answer.
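The accept, parse, find, and respond steps can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `KNOWN_CASES` dictionary stands in for a real database, the answers are invented, and similarity is measured with the standard library's `difflib` rather than any particular support platform.

```python
from difflib import SequenceMatcher

# Hypothetical "database" of known support cases and their prebuilt answers.
KNOWN_CASES = {
    "how do i reset my password": "You can reset your password from the login page.",
    "where is my order": "You can track your order under My Orders.",
    "how do i cancel my subscription": "You can cancel any time in Account Settings.",
}

FALLBACK = "Sorry, I did not understand that. A human agent will contact you."


def normalize(query: str) -> str:
    """Parse step: lowercase the query and drop punctuation and extra spaces."""
    kept = "".join(ch for ch in query.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def respond(query: str, threshold: float = 0.6) -> str:
    """Find the most similar known case and return its prebuilt answer."""
    parsed = normalize(query)
    score, best_case = max(
        (SequenceMatcher(None, parsed, case).ratio(), case) for case in KNOWN_CASES
    )
    # Below the (assumed) threshold the bot admits it has no matching case.
    return KNOWN_CASES[best_case] if score >= threshold else FALLBACK


print(respond("How do I reset my password?"))
```

A production bot would replace the string similarity with a proper search index or an NLP service, but the shape of the workflow stays the same.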
Using bots to support a team of developers is now extremely popular, for several reasons. Such a bot lives in the development environment and is thus constantly under the scrutinizing gaze of software engineers, whose quality requirements are much higher. However, it handles a very narrow set of tasks and accordingly does not require the complexity of commercial bots. These bots are usually simple scorekeepers, sentinel bots that guard the development servers and report commit information, simple schedulers, and so on.
The publisher type of bot is attracting more and more interest every day. Many major news sources (WSJ, NYT), as well as technology outlets (TechCrunch, MIT Technology Review), share content in the convenient form of brief text messages via major platforms like Facebook Messenger. The principle behind this bot is pretty simple: it gathers subscription information from the user, schedules the delivery of relevant news, and handles other user requests (e.g. unsubscribe, change the subscription topic, explore).
Entertainment bots are still rare and serve a particular purpose: managing reservations of event, cinema, and theater tickets in a dialogue-style workflow. Some bots can also provide the full-fledged immersive experience of an entertainment website via a messenger. For example, the Fandango Facebook bot allows users to watch new movie trailers, read reviews, and find movie theaters nearby. Cozy!
Yet another popular and fast-growing use case for bots is travel assistance. Here, the customer-oriented chatbot helps people with the sometimes strenuous work of selecting the optimal mode of transportation and turns the tedious workflow of form completion into a casual chat in a messenger app. A travel chatbot can not only retrieve and confirm booking information, but also notify the customer about times such as the start of check-in and boarding, send updates on the flight status, and gather valuable feedback from customers.
Ultimately, a chatbot is a program designed to communicate with a human user through ordinary conversation in text form (on chat platforms). It waits for the user to say something and answers as programmed. These are the bare bones of a chatbot, with a simple algorithm on the surface: accept and interpret the input, then provide a relevant response as the output.
However, chatbots are a bit more complicated than that, since they now possess the power of context, either local (persistent within one conversation) or global (persistent across many dialogues and extending beyond the linguistic context, e.g. a pizza-ordering bot that keeps track of your current orders, location, timezone, etc.). While the former is usually saved in temporary storage such as cookies or sessions, the latter is stored in databases or accessed in third-party services via APIs.
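To make the distinction tangible, here is a minimal sketch of the two kinds of context. Everything in it is hypothetical: `ContextStore` and its method names are ours, the `sessions` dictionary stands in for temporary per-conversation memory, and the `profiles` dictionary stands in for a database or an external service.

```python
from dataclasses import dataclass, field


@dataclass
class ContextStore:
    """Hypothetical store that keeps local and global context apart."""

    sessions: dict = field(default_factory=dict)  # local: one entry per conversation
    profiles: dict = field(default_factory=dict)  # global: one entry per user

    def start_conversation(self, conversation_id: str) -> None:
        self.sessions[conversation_id] = {}

    def end_conversation(self, conversation_id: str) -> None:
        # Local context is thrown away when the conversation ends.
        self.sessions.pop(conversation_id, None)

    def remember_local(self, conversation_id: str, key: str, value) -> None:
        self.sessions[conversation_id][key] = value

    def remember_global(self, user_id: str, key: str, value) -> None:
        self.profiles.setdefault(user_id, {})[key] = value

    def lookup(self, conversation_id: str, user_id: str, key: str):
        """Prefer the local value and fall back to the global one."""
        local = self.sessions.get(conversation_id, {})
        if key in local:
            return local[key]
        return self.profiles.get(user_id, {}).get(key)


store = ContextStore()
store.start_conversation("chat-1")
store.remember_local("chat-1", "pizza", "margherita")
store.remember_global("user-1", "address", "1 Main St")
store.end_conversation("chat-1")
# The pizza choice is gone with the conversation; the address survives.
print(store.lookup("chat-1", "user-1", "pizza"), store.lookup("chat-1", "user-1", "address"))
```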
Having introduced the concept of context, we also threw in some web application terminology (cookies, sessions, databases), which hints at what chatbots look like today. Chatbots share many traits with web applications that serve pages online: they similarly accept requests and respond to them, and they use many of the same standard tools, such as databases. So in a sense, chatbots are web applications.
Accordingly, chatbots become a new type of interface to existing information and services. This interface is compact, easily accessible, and very simple. It also extends your service far beyond its home (your website) to a variety of platforms, giving users access to it without the detour that advertising and marketing used to require. Before, the user saw an ad in a Messenger chat, followed the link to the advertised website, and ordered the product there; now, the user sees the ad in the Messenger chat and orders the product right in the conversation.
The internal work of chatbots seems simple on the surface, yet, in practice, it is not so easy.
The bot must first understand what the user says. There are two main options here: pattern matching of the user input, and classification of intents with Natural Language Processing (NLP). The former is fairly simple and straightforward to use, but rather hard to maintain at a larger scale with flexible inputs. The latter relies on machine learning to interpret the inputs and is harder to implement (at least without the help of platforms that have already applied the technique). It requires a set of examples in order to classify possible intents and identify the purpose of a particular input from a range of possibilities.
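To illustrate the first option, here is a minimal pattern-matching sketch using Python regular expressions. The intent names and patterns are made up for the example. It also hints at the maintenance problem: a perfectly reasonable phrasing that nobody anticipated simply falls through.

```python
import re

# Hypothetical patterns: each intent name maps to one regular expression.
PATTERNS = {
    "greeting": re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE),
    "weather_inquery": re.compile(r"\bweather\b", re.IGNORECASE),
    "order_pizza": re.compile(r"\b(order|want|get)\b.*\bpizza\b", re.IGNORECASE),
}


def match_intent(text: str):
    """Return the first intent whose pattern matches, or None."""
    for intent, pattern in PATTERNS.items():
        if pattern.search(text):
            return intent
    return None


print(match_intent("What is the weather today?"))  # matched by the weather pattern
print(match_intent("I want a large pizza"))        # matched by the pizza pattern
print(match_intent("I would like a pizza"))        # no pattern covers "would like"
```

Every new phrasing means another pattern to write and test, which is exactly why machine-learning classification becomes attractive as the bot grows.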
Luckily, there are platforms that implement this logic, so you can use their services and need not worry about every aspect of it. However, you do need to be familiar with the main NLP concepts and their essence:
Entities are mappings of natural-language word combinations in human discourse (spoken or written) to standard values that convey their unambiguous meaning. They are much like extracted variables: for example, a DateTime given as “Christmas” will resolve to 2017-12-25.
Intents, by contrast, are general traits that map the user’s message as a whole to the corresponding bot action (prediction workflow). For example, the phrase “What is the weather today?” will map to the ‘weather_inquery’ intent by its entire wording, not by some particular part of it.
Actions are the steps that the bot can perform in response to the corresponding intent. These are usually conventional functions, which may take optional parameters with detailed information (context) from the caller.
Contexts vary depending on the platform and do not have a strict form or structure. They are most commonly represented as key/value mappings. They keep track of the current values of entities and help distinguish between the meanings/intents of phrases.
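The way these four concepts fit together can be sketched as follows. We assume that some NLP service has already turned the user's message into a `parsed` dictionary with an intent and entities. The dictionary layout, the function names, and the sample values are our own illustration, not the output format of any specific platform.

```python
from datetime import date


def get_weather(context: dict, entities: dict) -> str:
    """Action: an ordinary function that receives the context and the entities."""
    day = entities.get("datetime", date.today().isoformat())
    city = context.get("city", "your area")
    return f"Looking up the weather for {city} on {day}."


def fallback(context: dict, entities: dict) -> str:
    """Action used when no registered intent matches."""
    return "Sorry, I cannot help with that yet."


# Intent name -> action. The intent name is the one used in the text above.
ACTIONS = {"weather_inquery": get_weather}


def handle(parsed: dict, context: dict) -> str:
    """Dispatch the parsed message to the action registered for its intent."""
    action = ACTIONS.get(parsed["intent"], fallback)
    return action(context, parsed["entities"])


# "What is the weather at Christmas?" after hypothetical NLP processing:
parsed = {"intent": "weather_inquery", "entities": {"datetime": "2017-12-25"}}
context = {"city": "London"}  # key/value context kept by the bot
print(handle(parsed, context))
```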
So, chatbots are a kind of web interface, but it is important to remember that they are also an application of artificial intelligence, which brings its own natural taxonomy.
Having introduced the ways a bot comprehends language, let’s take a look at a typology of bots with regard to their purpose and responses.
First of all, we can differentiate types of conversational AI based on the sphere of operation (whether the bot is strictly specialized in one domain, e.g. a weather bot or a pizza bot, or is a general conversationalist) and on the way it computes the response from the user’s input (whether it retrieves a predefined response or generates one that corresponds to the input).
For retrieval-based responses, it is important to distinguish between static and dynamic ones. The former is the simplest, much like filling in a template, where every input has a corresponding answer. The latter works like a knowledge base, which returns a list of possible responses scored by relevance.
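The difference is easy to see in code. The sketch below is an assumption-laden toy: the responses and the knowledge base are invented, and relevance is scored with simple word overlap (the Jaccard index), where a real system would use something far more robust.

```python
# Static retrieval: every known input has exactly one prepared answer.
STATIC_RESPONSES = {"hello": "Hi there!", "bye": "Goodbye!"}

# Dynamic retrieval: a small, hypothetical knowledge base to rank against.
KNOWLEDGE_BASE = [
    "Our pizzeria is open from 10 am to 11 pm.",
    "Delivery usually takes about 40 minutes.",
    "We accept cash and cards on delivery.",
]


def static_response(text: str):
    """Return the prepared answer for an exact (normalized) input, or None."""
    return STATIC_RESPONSES.get(text.strip().lower())


def tokens(text: str) -> set:
    """Split text into a set of lowercase words without punctuation."""
    return {word.strip(".,!?").lower() for word in text.split()}


def dynamic_responses(text: str) -> list:
    """Return every candidate answer with a relevance score, best first."""
    query = tokens(text)
    scored = []
    for answer in KNOWLEDGE_BASE:
        candidate = tokens(answer)
        # Jaccard index: shared words divided by all distinct words.
        scored.append((answer, len(query & candidate) / len(query | candidate)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(static_response("Hello"))
for answer, score in dynamic_responses("How many minutes does delivery take?"):
    print(f"{score:.2f}  {answer}")
```

With a ranked list in hand, the bot can return the top answer, or ask a clarifying question when the best score is too low.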
With a closed-domain chatbot, you strive to solve a finite communication problem: make a hotel, restaurant, or flight reservation, order pizza, buy shoes, etc. Thus, the range of inputs is clearly limited, and we do not expect the user to talk about politics, psychology, or philosophy with a pizza-ordering bot.
Open-domain bots, by contrast, focus mainly on the conversation with the user itself. Such a bot does not seek to understand every aspect of what the user says; it does not extract entities and intents, nor does it need to keep track of the context. It only aims to imitate real-life conversation. Its main purpose is entertaining or answering general FAQ-style questions.
Once you understand the basics, it is time to build; first, however, you need to decide which platform or tool to use.
If you want a jumpstart but are not yet ready to write code, we recommend starting with non-programming chatbot builders. They are aimed at nontechnical users and are quite easy to grasp and use. You do not get into many technical details, but rather work with pure concepts (though there is still some learning curve). They are ideal for building simple bots but are not suitable for complex commercial purposes. The major letdown is that they have few or no NLP features (and are thus not good for complex bots). Some notable platforms to look at include Chatfuel, ManyChat, Octane AI, and Massively AI.
For more serious development, we would like to distinguish between bot frameworks and AI services.
Frameworks are an abstraction of the generic functionality of chatbot workflows, packaged in a convenient way. Chatbot frameworks are much like any other software frameworks (e.g. web application frameworks): they provide us with tools and utilities. They are usually implemented for a particular programming language. In addition, some bot frameworks also offer hosted, interactive development environments that make creating bots even easier.
AI services are independent, cloud-hosted platforms, often exposing a GUI for interactive creation of chatbot logic, featuring machine-learning-powered NLP capabilities, and enabling communication via a RESTful API.
Another emerging option is to incorporate a chatbot and an Interactive Voice Response (IVR) system right into web service building kits or frameworks. One of the latest developments in this area comes from Aspect and their Customer Experience Platform (CXP) solution for building web services. They made it easy to bring up data-backed sites and equip them with text and voice bot interfaces. More on this approach can be found here.
As we mentioned, frameworks are libraries for particular programming languages. We will look at three major ones: Botkit for node.js, Microsoft Bot Framework with the Bot Builder SDK for .NET (also available for node.js, but we will focus on .NET, in particular C#), and Rasa NLU for Python.
Let’s start with Botkit. It is designed to help busy people build bots for their needs quickly and easily, without having to dig deeply into the gears that turn under the hood. It provides a unified interface for sending and receiving messages. Initially intended for Slack, it has since been extended to support connections to various messaging platforms. The framework has intuitive workflows organized in a clear and concise manner; it is well documented and provides an abundance of live chatbot examples for you to explore, so it is really easy to get started with and to keep using.
It is important to note that the framework has no NLP capabilities of its own, but this can be remedied by connecting existing or custom-built NLP services via middleware.
The library consists of core functionality and connectors that support different platforms. The core functionality is implemented in the form of event listeners, e.g. `.on('message_received', ...)`, `.hears(...)`, etc. The implementation of a connector varies depending on the platform it serves.
It is super easy to start building with Botkit. First, install Botkit via npm: `npm install --save botkit`. Then create a file, put your code there (an example of a basic console bot can be found here), and run it with node.js: `node my_bot.js`.
Microsoft, eager to keep up with the emerging trend, gave the world a new vision of chatbot construction. Microsoft Bot Framework is a comprehensive and easy-to-use SDK composed of two main components: the Framework itself, i.e. the Bot Connector SDK that is responsible for integration and the basics of bot logic, and LUIS.ai, which is responsible for Natural Language Understanding and gives bots a human-like sense of language.
Although LUIS.ai is featured as a component, the Framework may be used without it and is really impressive on its own. The tools are robust, and the possibilities for integration are almost limitless (Messenger, Skype, etc.). The development environment features a great tool for simple, interactive testing. You may also publish the chatbot you have built: send it for review and, if it passes, it will appear in the Bot Directory (https://bots.botframework.com/). That is one more point in Microsoft’s favor: bots are easy to share.
Rasa NLU is not quite a framework designed explicitly for building chatbots; however, it is one of the solutions that can power their back end. While Botkit and Microsoft Bot Framework connect to messengers, Rasa NLU is similar to the NLP services, but provides the processing power on premises. Additionally, Rasa has a Python interface for integrating the entity extractors directly into applications written in Python. It can also run as a service alongside other frameworks, exposing REST API endpoints.
Please see the table below for a brief comparison of pros and cons of these tools.
AI services, as we mentioned before, are cloud-hosted solutions for NLP needs and for building smart bots that can predict the flow of complex conversations. They provide a UI for constructing prediction models and for training machine-learning models that understand language entities.
Let’s look at the top players in this circle.
Wit.ai is a platform that, according to its website, makes it easy for developers to build applications that you can talk or text to. It was recently bought by Facebook and became the foundation for building Facebook’s Bot API.
Wit.ai makes it easy to define a bot’s behavior, abstracted away from the inner logic, via registrable bot actions implemented in the language of your choice.
The key concept for defining behaviors in wit.ai is the so-called “Stories”. Each story represents the basic skeleton of a possible dialogue through one example of it. A “story” is a grouping of related intents, while the intent itself is a user-defined entity of the trait type that does not define the entire flow.
The Machine Learning model of NLP is trained by examples, which is really great. Just show wit.ai what can be expected, and when the user sends similar requests, it will respond accordingly.
Wit.ai has a powerful mechanism for understanding language entities. Another great feature is assigning a role to the entities, which greatly helps with processing on the server-side.
As for interaction with the server side, wit.ai implements “Bot sends” commands and offers webhook integration for custom bot actions, such as API calls.
Api.ai is yet another platform that gives you the ability to build bots with NLP support.
Unlike wit.ai, it relies heavily on intents for prediction. In fact, the two main concepts of api.ai are intents and contexts (wit.ai also has context, but it is used in a different scope). An intent links the user’s request with the corresponding action. A context, represented by a string value, distinguishes between requests that deviate only slightly from an intent.
The basic workflow of api.ai is different from that of wit.ai. When api.ai receives a request from the user, it first matches it with an intent (if no intent is matched, the default intent is implied) and then calls the corresponding action. Intent matching may be restricted by listing the contexts that must be present for the intent to match (a match can, in turn, produce or delete contexts), which creates a workflow like that of an application with different states.
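The state-machine flavor of this workflow is easier to see in a toy model. To be clear, the sketch below does not use the api.ai API or its data format. The `Intent` class, the keyword matching, and all names are our own simplification of the idea that contexts gate which intents can match and that a match can add or remove contexts.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Hypothetical intent: matched by keywords, gated by required contexts."""

    name: str
    keywords: set
    required_contexts: set = field(default_factory=set)
    output_contexts: set = field(default_factory=set)
    removed_contexts: set = field(default_factory=set)


INTENTS = [
    Intent("order_pizza", {"pizza"}, output_contexts={"ordering"}),
    Intent(
        "confirm_order",
        {"yes", "confirm"},
        required_contexts={"ordering"},
        removed_contexts={"ordering"},
    ),
]
DEFAULT_INTENT = Intent("default_fallback", set())


def match(text: str, active_contexts: set):
    """Return the matched intent name and the updated set of contexts."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    for intent in INTENTS:
        # An intent is a candidate only if all of its required contexts are active.
        if intent.required_contexts <= active_contexts and words & intent.keywords:
            updated = (active_contexts - intent.removed_contexts) | intent.output_contexts
            return intent.name, updated
    # Nothing matched: the default intent is implied and contexts stay as they were.
    return DEFAULT_INTENT.name, set(active_contexts)


contexts = set()
for message in ["Yes", "I want a pizza", "Yes, confirm"]:
    intent_name, contexts = match(message, contexts)
    print(message, "->", intent_name, sorted(contexts))
```

Note how the same "Yes" leads to the fallback before an order is started and to a confirmation afterwards: the active contexts play the role of application state.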
Just like wit.ai, api.ai supports intent extraction. Moreover, it implements a slot-filling system out of the box (slot filling is a method of requesting information from the user and extracting it as entities, by keeping a list of the entities already acquired and asking for the missing ones).
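Slot filling itself is a simple idea, and a minimal sketch helps to see what the platform does for you. The slot names and prompts below are invented for a pizza order, and the code is our own illustration rather than api.ai's implementation.

```python
# Hypothetical slots a pizza order needs, each with the question that asks for it.
REQUIRED_SLOTS = {
    "size": "What size would you like?",
    "topping": "Which topping would you like?",
    "address": "Where should we deliver it?",
}


def fill(filled: dict, extracted: dict) -> dict:
    """Merge newly extracted entities into the already acquired slots."""
    merged = dict(filled)
    # Entities that are not slots of this intent are ignored.
    merged.update({k: v for k, v in extracted.items() if k in REQUIRED_SLOTS})
    return merged


def next_prompt(filled: dict):
    """Return the question for the first missing slot, or None when all are filled."""
    for slot, question in REQUIRED_SLOTS.items():
        if slot not in filled:
            return question
    return None


slots = fill({}, {"size": "large"})
print(next_prompt(slots))  # the size is known, so the bot asks for the topping
slots = fill(slots, {"topping": "ham", "address": "1 Main St"})
print(next_prompt(slots))  # nothing is missing, so the order can be placed
```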
The server-side logic is likewise implemented with webhooks: api.ai basically calls your services to get the response. An important note here: the server-side code can modify the contexts and thus affect the flow of prediction.
LUIS may be called a relative newcomer to the game of AI services, as it was introduced to the world at the Microsoft Build 2016 event.
Like its competitors, LUIS provides an entity recognition and training system, featuring a hierarchy of entities that subdivides their meaning (like roles in wit.ai).
LUIS relies on intents in the prediction of the actions to be taken and uses the same logic as api.ai.
Training is also done through the UI, which gives flexibility in how you train. The logs of user requests conveniently let you review and correct the interpretations in order to train the model further.
LUIS also features action fulfillment support, which gathers the needed intents and contexts to perform chains of actions. It is still in beta and allows only simple testing. Another beta feature that deserves attention is dialogue support, designed to help organize related requests and group the bot’s questions to the user in a concise form.
IBM Watson is a cognitive cloud service that enables a multitude of operations, including natural language processing, speech recognition, sentiment analysis, conversation, and more. You may (or may not) remember IBM Watson as the cognitive supercomputer system that beat human champions on the quiz show Jeopardy! in 2011. Well, you are right to think that the two are connected: IBM brought the power of that supercomputer into the cloud, creating a broad platform to help master countless tasks.
The downside, though, is that the sheer number of features may be puzzling, and you will likely spend a lot of time figuring out what to use. Even after that, you will most probably have to invest a great deal of resources in learning the technology in order to use it.
It is also quite expensive (around 2 cents per API call) and thus is not a very good beginner’s tool. It is better to consider IBM Watson as a working solution if its use case in the company is broad and clearly defined.
A summary of the key information about these tools is provided in the following table.
Chatbots are on the rise and are evolving to be more user-oriented, integrating with other existing technologies. A deep understanding of their essence, functionality, and operation helps build efficient bots that will boost the marketing, advertising, and overall consumer experience of your product.
The variety of platforms for chatbot creation is both astonishing and daunting, but each is developed for different use cases and plays a particular role in bot development. As Deep Learning techniques advance, we expect conversational AIs to adopt them in the near future, making a huge leap towards passing the Turing test.