Among the many decisions you’ll have to make when building a predictive model is whether your business problem is a classification task or an approximation task. It’s an important decision because it determines which group of methods you use to create the model: classification (decision trees, Naive Bayes) or approximation (regression tree,…Continue
Added by Algolytics on June 2, 2021 at 1:00am — No Comments
“If (there) was one thing all people took for granted, (it) was conviction that if you feed honest figures into a computer, honest figures (will) come out. Never doubted it myself till I met a computer with a sense of humor.”
― Robert A. Heinlein, The Moon is a Harsh Mistress
This post is the first in a series of articles in which we will explain what Machine Learning is. You don’t have to have formal training or…Continue
A popular phrase tossed around when we talk about statistical data is “there is correlation between variables”. However, many people wrongly treat this as equivalent to “there is causation between variables”. It’s important to explain the distinction: correlation means that once we know how one variable changes, we can make reasonable deductions about how other variables change. There are several variants of correlation:
Classification is one of the most typical tasks in machine learning. It may seem that evaluating the effectiveness of a classification model is easy. Let’s assume that we have a model which, based on historical data, predicts whether a client will pay back credit obligations. We evaluate 100 bank customers and our model is correct in 93 cases. That may appear to be a good result – but is it really? Should we consider a model with 93% accuracy adequate?
It depends. Today, we…Continue
Added by Algolytics on November 13, 2016 at 4:30am — No Comments
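To illustrate why the answer to the 93% question above is "it depends", here is a minimal Python sketch. It is not taken from the original post: the 95/5 split between repaying and defaulting customers, the prediction lists, and the helper name accuracy are all assumptions made up for illustration. It compares the model against a trivial baseline that predicts "will repay" for everyone.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


# Hypothetical data (an assumption, not from the post):
# 1 = customer repaid, 0 = customer defaulted.
actual = [1] * 95 + [0] * 5

# Hypothetical model: right about 93 repaying customers,
# wrong about 2 repaying customers and about all 5 defaulters.
model_predictions = [1] * 93 + [0] * 2 + [1] * 5

# Trivial baseline: predict "will repay" for every customer.
baseline_predictions = [1] * 100

model_accuracy = accuracy(actual, model_predictions)
baseline_accuracy = accuracy(actual, baseline_predictions)

print(f"model accuracy:    {model_accuracy:.2f}")
print(f"baseline accuracy: {baseline_accuracy:.2f}")
```

Under these assumed numbers the model is correct 93 times out of 100, yet the baseline that ignores the data entirely is correct 95 times, so accuracy alone does not show whether the model is useful.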
In the previous post of our Understanding machine learning series, we showed how machines learn through multiple experiences. We also explained how, in some cases, human beings are much better at interpreting data than machines. In many tasks machines still can’t replace humans, who understand the surrounding reality better and can make more accurate decisions.
Machines can be given a…Continue
Added by Algolytics on October 13, 2016 at 4:30am — No Comments
Added by Algolytics on October 21, 2015 at 5:29am — No Comments
Data is everywhere. We generate data when using an ATM, browsing the Internet, calling our friends, buying shoes in our favourite e-shop or posting on Facebook. Companies collect this data en masse in order to make more informed business decisions, such as:
In this last part of the tutorial we will discuss the LIFT curve.
A lift chart shows the gain from applying a classification model, compared with not applying it (i.e. using a random classifier), for a given section of data.
Two simple examples are shown below.…Continue
Added by Algolytics on August 11, 2015 at 1:00am — No Comments
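The lift idea described in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name lift_at, the scores, and the labels are hypothetical and do not come from the original tutorial. Lift is computed here as the positive rate among the top-scored fraction of records divided by the overall positive rate, which is what a random selection would achieve on average.

```python
def lift_at(scores, labels, fraction):
    """Lift for the top `fraction` of records ranked by model score.

    Lift = positive rate in the selected segment / overall positive rate.
    """
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no positive labels")

    # Rank records from the highest to the lowest model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    return top_rate / overall_rate


# Hypothetical scores and outcomes for 10 customers (1 = responded).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift_20 = lift_at(scores, labels, 0.2)
lift_50 = lift_at(scores, labels, 0.5)
lift_100 = lift_at(scores, labels, 1.0)

print(f"lift at 20%:  {lift_20:.2f}")
print(f"lift at 50%:  {lift_50:.2f}")
print(f"lift at 100%: {lift_100:.2f}")
```

A lift above 1 for a segment means the model concentrates positives there better than random selection would; at 100% of the data the lift is always 1, because the whole population is selected either way.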
Information about provided services, customers and transactions can be stored in different database systems and data warehouses, depending on the way in which a company operates.
Due to such arrangements, even the simplest analysis or report may require a significant amount of time, as well as in-depth knowledge of the database systems and their availability.
For an analyst this situation is frequently the source of difficulties – lack of required…Continue
Added by Algolytics on July 24, 2015 at 6:00pm — No Comments
In the previous parts of our tutorial we discussed:…Continue
Added by Algolytics on July 1, 2015 at 5:30am — No Comments
In the last part of the tutorial we introduced quantitative indicators of classification model quality. In the next two parts we will take a closer look at a couple of graphical indicators. The first one is called the Confusion Matrix (the name “Contingency Table” is also used).
What is a Confusion Matrix?
A Confusion Matrix is an N x N matrix, in which rows correspond…Continue
Added by Algolytics on June 16, 2015 at 12:30pm — No Comments
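As a rough illustration of the N x N structure mentioned in the excerpt above, the following Python sketch counts (actual, predicted) pairs. The excerpt is cut off before it states what rows and columns stand for, so the orientation used here (rows = actual classes, columns = predicted classes) is an assumption, as are the function name confusion_matrix and the example labels.

```python
def confusion_matrix(actual, predicted, classes):
    """Build an N x N count matrix for N classes.

    Assumed orientation (not confirmed by the excerpt):
    rows are actual classes, columns are predicted classes.
    """
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical labels for six customers.
classes = ["good", "bad"]
actual = ["good", "good", "good", "bad", "bad", "good"]
predicted = ["good", "bad", "good", "bad", "good", "good"]

matrix = confusion_matrix(actual, predicted, classes)

# Print one row per actual class.
for label, row in zip(classes, matrix):
    print(f"actual {label:>4}: {row}")
```

The diagonal cells count correct predictions and the off-diagonal cells count each kind of error separately, which is the information a single accuracy figure hides.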
In the last part of the tutorial we introduced the basic quantitative model quality indicators. Let us recall them now:
Added by Algolytics on May 23, 2015 at 6:00pm — No Comments
Classification is the process of assigning every object from a collection to exactly one class from a known set of classes.
Examples of classification tasks are:
In the previous post we presented a few methods of data analysis that are used to identify customer needs and preferences and to predict customer behavior. Such knowledge helps build better marketing and sales offers that meet specific customer expectations.
In today’s article we present further examples of Data Mining methods that can be applied in daily business operations.…Continue
Added by Algolytics on May 2, 2015 at 5:30pm — No Comments
The key asset of any company is its customers. It is therefore very important to identify their needs and preferences and to know the factors that affect their behavior. Collected customer data makes it possible to predict customer behavior and to create appropriate marketing offers, sales plans, and retention programs that match customers’ needs.
Data mining tools are used to create models that predict customer behavior by using historical data. These methods can be…Continue
Added by Algolytics on April 23, 2015 at 5:00pm — No Comments