Linear regression is arguably one of the most widely used techniques in the data science world. But, a comprehensive understanding of this technique is not universal and it is at a level that is less than desired.
First, a little history, the term regression was first used by Sir Francis Galton, a 19th century polymath. Galton was a pioneer in application of statistical methods in many branches of science, he studied the relative sizes of parents and their offsprings in various species of plants and animals. During this study he observed that a larger than average parent tends to produce a larger than average child, but the child is likely to be less large than the parent in terms of its relative position in its own generation. Galton termed this phenomenon “a regression towards mediocrity”.
Linear regression is an approach to model a relationship between a dependent variable (y) and one or more independent variables (x). In this paper, I discuss