I have been working on a San Diego water quality data project:
Here are the data sets:
Regrettably, my complete work does not fit into a blog post (or even a few posts) because of a post…Continue
Added by Maiia Bakhova on November 6, 2018 at 11:46am — No Comments
Neural networks are considered complicated, and they are almost always explained in terms of neurons and brain function. But we do not need to learn how the brain works to understand the structure of neural networks and how they operate. We can look at something people encounter in everyday life more…Continue
This article attempts to detect dependent variables with a non-linear method.
I'm going to apply a method for checking variable dependency which was introduced in my previous post. Because the "dependency" I get with this rule is not true dependency as defined in probability theory, I will call such variables practically dependent at a confidence level…Continue
Added by Maiia Bakhova on November 2, 2016 at 11:30am — No Comments
In this post I will sometimes use the term “variable” for “feature” (“predictor”) or “outcome” (“predicted value”).
The question of variable dependencies in a particular data set is quite important, because answering it can help reduce the number of predictors used in a model. It can also tell us which feature is not helpful for building a model, although that feature can still be used to engineer another predictor. For example, sometimes it is better to compute speed than to use distance values. In…Continue
Added by Maiia Bakhova on September 6, 2016 at 1:07pm — No Comments
The bagged trees algorithm is a commonly used classification method. By resampling our data and growing a tree on each resampled data set, we can get an aggregated vote for the predicted class. In this blog post I will demonstrate how bagged trees work, visualizing each step.…Continue
Added by Maiia Bakhova on May 18, 2016 at 2:12pm — No Comments
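The resample-and-vote idea from the bagged trees teaser above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from the post: the toy data, the function names (`fit_stump`, `bagged_stumps`, `predict`) and the use of depth-1 trees ("stumps") on a single feature are all made up here to keep the example short.

```python
import random
from collections import Counter


def fit_stump(points, labels):
    """Fit a depth-1 tree on one feature: pick the threshold and the
    left/right class assignment with the fewest training errors."""
    best = None
    for threshold in sorted(set(points)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum(
                (left if x <= threshold else right) != y
                for x, y in zip(points, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right


def bagged_stumps(points, labels, n_trees=25, seed=0):
    """Grow one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(points)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        trees.append(
            fit_stump([points[i] for i in idx], [labels[i] for i in idx])
        )
    return trees


def predict(trees, x):
    """Aggregate the individual tree predictions by majority vote."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: one feature, two well-separated classes.
xs = [0.5, 1.0, 1.5, 2.0, 6.0, 6.5, 7.0, 7.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

trees = bagged_stumps(xs, ys)
print(predict(trees, 0.0), predict(trees, 9.0))
```

Each stump sees a slightly different resample, so individual thresholds vary; the majority vote smooths out that variation, which is the point of bagging.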
Choosing features to improve the performance of a particular algorithm is a difficult question. Currently there is PCA, which is difficult to understand (although it can be used out of the box), requires centering and scaling of features, and is not easy to interpret. In addition, it does not allow us to improve prediction performance for a particular outcome (if its accuracy is lower than for the others, or if it is of particular importance). My method makes it possible to use features without preprocessing.…Continue
Added by Maiia Bakhova on May 5, 2016 at 11:30am — No Comments
There are many ways to choose features from given data, and it is always a challenge to pick the ones with which a particular algorithm will work best. Here I will consider data from monitoring the performance of physical exercises with wearable accelerometers, for example, wrist bands.
The data for this project come from this source: http://groupware.les.inf.puc-rio.br/har.
In this project, researchers used data from…Continue