
Frank Raulf's Blog (9)

Introduction to Gradient Descent

The gradient descent approach is used in many algorithms to minimize loss functions. In this introduction we will see exactly how gradient descent works and point out some of its special features, guided by a practical example.…
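As a minimal sketch of the idea (not the example from the post itself), the loop below minimizes a hypothetical one-dimensional loss f(x) = (x - 3)^2 by repeatedly stepping against its gradient. The function name `gradient_descent`, the learning rate and the number of steps are assumptions chosen for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Take n_steps steps against the gradient, starting from `start`."""
    x = start
    for _ in range(n_steps):
        # Move in the direction of steepest descent, scaled by the learning rate.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical loss f(x) = (x - 3)^2 with gradient f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(minimum)  # very close to 3, the minimizer of the loss
```

With this loss, each step shrinks the distance to the minimum by a constant factor of 0.8; a learning rate above 1.0 would make the iterates diverge instead.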

Continue

Added by Frank Raulf on June 24, 2020 at 7:00am — No Comments

How to make time-data cyclical for prediction?

There are many ways to deal with time data. Sometimes one can use it as a time series to take possible trends into account. Sometimes this is not possible because the time values cannot be arranged in a sequence, for example if a dataset spanning several months contains only weekdays (1 to 7). In this case one could use one-hot encoding. However, for minutes or seconds of a day, one-hot encoding might lead to high complexity. Another approach is to make time cyclical. This approach leads to a…
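One common way to make a time value cyclical is to map it onto the unit circle with a sine and a cosine component. The sketch below assumes that this is the kind of encoding meant; the helper names `encode_cyclical` and `distance` are hypothetical.

```python
import math


def encode_cyclical(value, period):
    """Map a value in [0, period) to a point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Minutes of a day: 23:59 and 00:00 end up close together,
# while 00:00 and 12:00 are as far apart as possible.
midnight = encode_cyclical(0, 1440)
just_before_midnight = encode_cyclical(1439, 1440)
noon = encode_cyclical(720, 1440)

print(distance(midnight, just_before_midnight))
print(distance(midnight, noon))
```

Two features replace the 1,440 columns that one-hot encoding of minutes would need, and the end of the cycle stays adjacent to its start.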

Continue

Added by Frank Raulf on January 26, 2020 at 4:00am — No Comments

Setting the Cutoff Criterion for Probabilistic Models

In decision making, human perception tends to sort probabilities into above and below 50%, which is plausible. For most probabilistic models, in contrast, this is not the case at all. Frequently, the resulting probabilities are neither normally distributed between zero and one with a mean of 0.5 nor correct in terms of absolute values. This issue often accompanies a minority class in the underlying dataset.
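To illustrate why 0.5 can be a poor cutoff, the sketch below picks the cutoff that maximizes the true positive rate minus the false positive rate (Youden's J statistic). This criterion is an assumption chosen for illustration and not necessarily the one used in the post; the data and the function name `best_cutoff` are made up.

```python
def best_cutoff(probabilities, labels):
    """Return the cutoff that maximizes TPR - FPR (Youden's J) on the data."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best, best_j = 0.5, -1.0
    for cutoff in sorted(set(probabilities)):
        pairs = list(zip(probabilities, labels))
        tp = sum(1 for p, y in pairs if p >= cutoff and y == 1)
        fp = sum(1 for p, y in pairs if p >= cutoff and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best, best_j = cutoff, j
    return best


# Made-up scores with a minority class: no probability reaches 0.5,
# so a 0.5 cutoff would never predict the positive class.
probs = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.30, 0.35]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
print(best_cutoff(probs, labels))
```

In practice the cutoff should be chosen on validation data and, where misclassification costs differ, weighted by those costs.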

For example, if the result of a…

Continue

Added by Frank Raulf on January 4, 2020 at 3:00am — No Comments

Naive Bayes Classifier using Kernel Density Estimation (with example)

Bayesian inference is the reallocation of credibility across possibilities [Kruschke 2015]. This means that a Bayesian statistician has an “a priori” opinion regarding the probabilities of an event:

p(d)   (1)

By observing new data x, the statistician will adjust his opinions to get the "a posteriori" probabilities.

p(d|x)   (2)

The conditional probability of an event d given x is the share of the joint…
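The step from the prior (1) to the posterior (2) can be sketched for a discrete event d using p(d|x) = p(d, x) / p(x). The class names and numbers below are made up for illustration, and the function name `posterior` is hypothetical.

```python
def posterior(prior, likelihood):
    """Compute p(d|x) from the prior p(d) and the likelihood p(x|d)."""
    # Joint probability p(d, x) = p(d) * p(x|d) for each possible d.
    joint = {d: prior[d] * likelihood[d] for d in prior}
    # Marginal probability p(x), the sum of the joint over all d.
    evidence = sum(joint.values())
    return {d: joint[d] / evidence for d in joint}


prior = {"spam": 0.2, "ham": 0.8}        # p(d), made-up values
likelihood = {"spam": 0.6, "ham": 0.1}   # p(x|d), made-up values
print(posterior(prior, likelihood))
```

Observing x shifts the credibility of "spam" from 0.2 to about 0.6, because x is six times more likely under "spam" than under "ham".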

Continue

Added by Frank Raulf on January 3, 2020 at 4:30am — No Comments

Which one is faster in multiprocessing, R or Python?

This post is the third in a series about loops in R and Python.

The first one was Different kinds of loops in R. The recommendation…

Continue

Added by Frank Raulf on December 19, 2019 at 9:00am — 2 Comments

Omitted Variables in Linear Regressions

The importance of completeness in linear regressions is an often-discussed issue. If relevant variables are left out, the coefficient estimates might be inconsistent.

But why on earth?! 

Assuming a linear complete model of the form:

z = a + bx + cy + ε.

Here z is the dependent variable, x and y are the independent variables, and ε is the error term.

Now we drop y to check…
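A small simulation can make the effect visible. The sketch below assumes that y is correlated with x (here y = 0.5x + noise) and that a = 1, b = 2, c = 3; these values, the sample size and the helper `ols_slope` are all assumptions for illustration.

```python
import random


def ols_slope(xs, zs):
    """Slope of a simple least-squares regression of zs on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    cov = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


random.seed(0)
a, b, c = 1.0, 2.0, 3.0
n = 20_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [0.5 * x + random.gauss(0, 1) for x in xs]  # y is correlated with x
zs = [a + b * x + c * y + random.gauss(0, 1) for x, y in zip(xs, ys)]

# Regress z on x alone, i.e. with y omitted from the model.
slope = ols_slope(xs, zs)
print(slope)  # near b + c * 0.5 = 3.5 rather than the true b = 2
```

The estimate absorbs the part of y that moves with x: it converges to b + c · Cov(x, y) / Var(x), so the bias vanishes only if x and y are uncorrelated or c is zero.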

Continue

Added by Frank Raulf on November 13, 2019 at 2:00am — No Comments

Loop-Runtime Comparison R, RCPP, Python

The positive reactions to my last post, “Different kinds of loops in R”, led me to compare different versions of loops in R and RCPP (the C++ integration for R). To see the bigger picture, I also include the Python for-loop. The comparison focuses on the runtime of non-costly tasks with different numbers of iterations. For comparison purposes I create vectors of the form (R syntax):

Vector <- 1:k

k = 1,000; 100,000; 1,000,000
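On the Python side, a timing harness for the three vector lengths might look like the sketch below. The loop body (summing the elements) is a placeholder assumption and not the task measured in the post; the function name `time_for_loop` is hypothetical.

```python
import time


def time_for_loop(k):
    """Time a plain for-loop over the values 1..k; returns (total, seconds)."""
    vector = range(1, k + 1)  # Python counterpart of 1:k in R
    total = 0
    start = time.perf_counter()
    for value in vector:
        total += value  # placeholder for a non-costly task
    elapsed = time.perf_counter() - start
    return total, elapsed


for k in (1_000, 100_000, 1_000_000):
    total, elapsed = time_for_loop(k)
    print(f"k = {k:>9,}: {elapsed:.4f} s")
```

Absolute timings depend on the machine and the interpreter version, so only the relative growth across the three values of k is meaningful.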

 

The task is to…

Continue

Added by Frank Raulf on September 1, 2019 at 4:30am — 1 Comment

Different kinds of loops in R.

Normally, it is better to avoid loops in R. But for highly individual tasks, vectorization is not always possible. Hence, a loop is needed, provided the problem is decomposable.

Which different kinds of loops exist in R and which one to use in which situation?

In most programming languages, for- and while-loops (sometimes until-loops) exist. In R, these loops are sequential and not that fast.

for(i in…

Continue

Added by Frank Raulf on August 12, 2019 at 12:30am — No Comments

Data Quality Maintenance

Continue

Added by Frank Raulf on August 2, 2019 at 12:00am — No Comments
