Broadly Different Kinds of Machine Learning Used in Practice in Artificial Intelligence
Historically, there have been several approaches to machine learning for AI, such as supervised learning, unsupervised learning, reinforcement learning, case-based reasoning, inductive logic programming and experience-based generalisation, and several waves of machine learning have been applied to different AI problems. Of these, the 3 most important categories of machine learning in practical or business use today are:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
Let’s look at each of them separately, with a brief summary of each, so that we have an idea of what we’re talking about.
Supervised learning is today the most mature and, in some sense, probably the easiest form of machine learning. The idea here is that you have historical data with some notion of an output variable. What is the output variable? It is the value you want to predict: the historical data presented to you pairs combinations of several input variables with the corresponding output values, and based on that you try to come up with a function which is able to predict an output given any input. So, the key idea is that the historical data is labeled. Labeled means that there is a specific output value for every row of data presented to the algorithm. In that sense, the limitation of supervised learning is that it can only work when you have clearly labeled data, with input-output values for all the historical data. Once you have that, you can use it to infer a function between input and output, and that function can then be used to guess the corresponding output for any new input that comes in. Whether the output is discrete or continuous does not matter. Both are supervised learning.
Specifically, if the output variable is discrete, the task is called CLASSIFICATION, and if it is continuous, it is called REGRESSION. In classification, the learned function takes a new input and assigns it to one of the possible discrete output values; in regression, it takes the input values and gives a corresponding continuous output value. An example of the discrete case is a SPAM CLASSIFIER, which takes input data and classifies it as spam or non-spam. An example of the continuous case is stock price prediction, where you look at a lot of historical data on stock prices under varied conditions, learn a function from it, and then use that function to predict a numeric stock price in a new scenario with different conditions.
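To make the two cases concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data, the feature choice (counts of suspicious words and links) and the names `fit_line` and `predict_label` are hypothetical, and a least-squares line and a nearest-neighbour lookup are just two simple stand-ins for "inferring a function from labeled data".

```python
# Hypothetical labeled history for REGRESSION: (input, continuous output).
price_history = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

# Hypothetical labeled history for CLASSIFICATION:
# ((suspicious_word_count, link_count), discrete label).
mail_history = [((0, 0), "not spam"), ((1, 0), "not spam"),
                ((5, 3), "spam"), ((7, 4), "spam")]


def fit_line(rows):
    """Infer y = slope * x + intercept from labeled rows (least squares)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    variance = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def predict_label(features, rows=mail_history):
    """Return the label of the closest labeled row (1-nearest neighbour)."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = min(rows, key=lambda row: squared_distance(row[0], features))
    return nearest[1]


predict_value = fit_line(price_history)
print(predict_value(5.0))      # continuous output for an unseen input
print(predict_label((6, 3)))   # discrete output for an unseen input
```

In both cases the labeled rows do all the work: without the output column there would be nothing to fit the line to and no labels to copy from the nearest neighbour.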
So, in supervised learning, you have the luxury of having labeled historical input and output data.
Unsupervised learning DOES NOT have the luxury of labeled historical input-output data. Instead, all it has is a whole bunch of RAW INPUT DATA. So what does unsupervised learning give you?
It allows us to identify patterns in the historical input data and to draw interesting insights from them. There is no explicit equation or function that reflects a relationship between an input and an output. The output here is absent, and all you are asking is whether there is a pattern visible in the unlabeled set of inputs.
An example: suppose that, in a large set of supermarket transactions, you want to identify which 2 items are often bought together. There is no output here. We are only asking which collections of items are frequently bought together, so it is a matter of extracting items that commonly occur together. No labeling is involved; there are only explicit inputs and no output. All the inputs are run through an unsupervised algorithm and the pattern is extracted.
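The supermarket example can be sketched in a few lines of Python. The baskets below are made up, and `frequent_pairs` and its `min_count` threshold are hypothetical names chosen for this illustration; counting co-occurring pairs is only the simplest form of this kind of pattern extraction.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: raw inputs only, no output column anywhere.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
    {"milk", "bread"},
]


def frequent_pairs(baskets, min_count=2):
    """Count how often each pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    # Keep only the pairs seen at least min_count times, most frequent first.
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


for pair, n in frequent_pairs(transactions):
    print(pair, n)
```

Notice that the choice of `min_count` is left to whoever runs the analysis. Nothing in the data says which threshold is "correct", which is part of why judging unsupervised results is harder.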
The beauty of unsupervised learning is that it lends itself to numerous kinds of patterns. The problem is that, because of this diverse nature, there is no one notion of what counts as a pattern: what is a meaningful pattern in one context could mean something else in another. So, there is no standardized notion of what a good unsupervised algorithm is. That’s why unsupervised problems are tougher and more difficult to deal with.
Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.
Typically, you’re required to reach some goal state, and there is a whole series of steps between the start position and the goal, which is the end position. So, there is some notion of a start state, some notion of a goal state, and multiple states in the middle, and at each state you can take multiple actions. Each action has a corresponding reward or punishment. In other words, we’re talking about a collection of states, with a start state, possibly more than one goal state and the states in between, where each state maps its possible actions to a corresponding reward or punishment. Ultimately, what reinforcement learning does is allow a machine or an agent to move from one state to another by taking an action at each step. As it goes along, it collects feedback based on experience, whether it is rewarded or punished. Based on that, it optimizes the steps and the path taken, and it incrementally increases its knowledge of which path is probably better in terms of reward, such as reaching the goal faster or finishing a task faster.
So, in that sense, reinforcement learning is dynamic and state dependent, and it constantly keeps updating its knowledge of rewards and punishments as it learns from experience. History may not be there at the start, but it builds up through the actions, states, rewards and punishments at each step. A classic example of this is MAZE LEARNING, where an agent tries to get from a start point to an end point past various obstacles. Running into an obstacle counts as a punishment, and the agent retrains itself. In the future, it remembers this and tries to go through paths which do not lead to a block. So, you’re trying to reach the goal faster by remembering the moves which do not lead you to the goal and instead take you to a hindrance, and that knowledge keeps accumulating. Reinforcement learning continually updates this knowledge of rewards and punishments, producing a system which is able to learn from experience and become better and better at reaching the goal.
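As a rough illustration of maze learning, here is a small sketch using tabular Q-learning, which is one common reinforcement learning technique and not the only one. The 3x3 maze layout, the reward values (+10 for the goal, -1 for bumping into an obstacle or the edge, -0.1 per ordinary move) and the hyperparameters are all assumptions picked for this example.

```python
import random

# Hypothetical 3x3 maze: S = start, G = goal, # = obstacle, . = free cell.
MAZE = ["S.#",
        ".#.",
        "..G"]
START, GOAL = (0, 0), (2, 2)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    blocked = not (0 <= row < 3 and 0 <= col < 3) or MAZE[row][col] == "#"
    if blocked:
        return state, -1.0, False      # punishment: hit an obstacle or the edge
    if (row, col) == GOAL:
        return GOAL, 10.0, True        # reward: reached the goal
    return (row, col), -0.1, False     # assumed small cost for an ordinary move


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Build up a table of (state, action) -> estimated long-term reward."""
    rng = random.Random(seed)
    q = {}  # starts empty: no history at the beginning
    for _ in range(episodes):
        state, done = START, False
        while not done:
            if rng.random() < epsilon:   # occasionally explore a random action
                action = rng.choice(list(ACTIONS))
            else:                        # otherwise use the best known action
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            # Q-learning update: nudge the estimate toward reward + future value.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q


def greedy_path(q, max_steps=20):
    """Follow the best known action from START until GOAL (or give up)."""
    path, state = [START], START
    while state != GOAL and len(path) <= max_steps:
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, _, _ = step(state, action)
        path.append(state)
    return path


q_table = train()
print(greedy_path(q_table))
```

The table `q_table` plays the role of the accumulated history described above: it is empty at the start and fills in as the agent experiences rewards and punishments, until following the best known action at each state leads around the obstacles to the goal.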
So, these 3 broad categories form the basis of modern AI systems, in which machine learning is now firmly entrenched.