Since the emergence of Big Data, we have witnessed the rise of the **key-value pair**. In data science, we can explore the relationship between two such quantities by treating them as variables X and Y. Regression, in its most basic form, describes how two variables X and Y relate to each other. These variables are:

**Independent** Variables & **Dependent** Variables

Let us take the behavior of users of a financial institution. We take hypothetical data (a random sample) of 6 users visiting one specific website of a Line of Business (LOB) during one specific hour.

| User | Visits |
|------|--------|
| 1 | 5 |
| 2 | 17 |
| 3 | 11 |
| 4 | 8 |
| 5 | 14 |
| 6 | 5 |

Now suppose we need to predict the behavior of another user, User #7, in visiting the same website. With only the visit counts available, we use the statistical measure called the "Mean": add up the visits of the six randomly selected users and divide the total by the number of users, which gives Mean (Visits) = 60/6 = 10. This is our best estimate of how many times User #7 will visit the same site. This can also be considered Internal LOB Forensics of User Behavior. The mean is a measure of central tendency; how far the observations spread around it is a measure of variability. Let us now find the distance between each data point and the best-fit value of 10 that we got from the mean, that is, each user's deviation (Visit - Mean):

**Residuals (Error)**: -5, 7, 1, -2, 4, -5 (the negative values add up to -5 - 2 - 5 = -12 and the positive values add up to 7 + 1 + 4 = 12)
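The mean and the residuals can be reproduced in a few lines of Python. This is a minimal sketch using the six visit counts from the sample; the variable names are illustrative and are not part of the original example.

```python
# Visit counts for the six sampled users (from the sample above).
visits = [5, 17, 11, 8, 14, 5]

# Mean number of visits: the best single-number estimate for User #7.
mean_visits = sum(visits) / len(visits)

# Residual (error) for each user: observed visits minus the mean.
residuals = [v - mean_visits for v in visits]

print(mean_visits)     # 10.0
print(residuals)       # [-5.0, 7.0, 1.0, -2.0, 4.0, -5.0]
print(sum(residuals))  # 0.0
```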

The residuals therefore add up to -12 + 12 = 0. Residuals measured from the mean always cancel out in this way, which is why the mean is a balanced estimate of the next user's visits to the website we sampled. Let us now compute the Sum of Squared Residuals (or Errors), which is 25 + 49 + 1 + 4 + 16 + 25 = 120. This entire example is based on one dependent variable only, which is the number of visits to one specific website by some users within a Line of Business. This predictive analytics discussion has used website usage to introduce the starting point of **Simple Linear Regression**. We can certainly explore more if we know the time users have spent on that website or the number of pages they visited. In that case, we have both independent and dependent variables to work with, and we can make a better prediction.
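The Sum of Squared Residuals can be checked the same way. The sketch below uses a hypothetical helper, `sum_squared_residuals`, which is not from the original example. It also compares the mean against two nearby guesses to suggest why the mean is the best single-value estimate under this measure.

```python
# Visit counts for the six sampled users.
visits = [5, 17, 11, 8, 14, 5]

def sum_squared_residuals(data, estimate):
    """Sum of squared differences between each observation and an estimate."""
    return sum((value - estimate) ** 2 for value in data)

mean_visits = sum(visits) / len(visits)

print(sum_squared_residuals(visits, mean_visits))  # 120.0
print(sum_squared_residuals(visits, 9))            # 126
print(sum_squared_residuals(visits, 11))           # 126
```

Any estimate other than the mean gives a larger sum of squared residuals for this sample, which is the sense in which 10 is the best fit.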

Linear Regression is a continuation of **Correlation** and **ANOVA**. With correlation we work with two variables, the X and Y discussed in this article, and plot the data as points against X and Y on a graph. We then explore the relationship between these plotted points. We can also say that the value of one variable is a function of the other variable. It can be shown as:

**y = f(x)** (the value of y is a function of x)

The value of the dependent variable **y** depends on the value of the independent variable **x**.
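To make y = f(x) concrete, here is a minimal sketch of a simple linear regression fitted by least squares. The `minutes` values are hypothetical: they are invented for illustration, are not part of the sample, and were deliberately chosen so that the points lie exactly on a straight line and the result is easy to verify by hand. The helper names `fit_line` and `pearson_r` are also assumptions.

```python
import math

# HYPOTHETICAL independent variable x: minutes each user spent on the site.
# Invented for illustration; chosen so the points fall exactly on a line.
minutes = [2.0, 8.0, 5.0, 3.5, 6.5, 2.0]
# Dependent variable y: visit counts from the sample.
visits = [5, 17, 11, 8, 14, 5]

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_mean - slope * x_mean

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

slope, intercept = fit_line(minutes, visits)
print(slope, intercept)            # 2.0 1.0
print(pearson_r(minutes, visits))  # 1.0

# Predicted visits for a hypothetical User #7 who spends 4 minutes on the site.
print(intercept + slope * 4.0)     # 9.0
```

Real data would not fall exactly on a line, so the correlation would be below 1 and the fitted line would leave nonzero residuals. The point of the sketch is only that, once an independent variable is available, the prediction for User #7 can depend on x instead of being the same mean value for everyone.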

It is hoped that this article sheds some light on the basic use of investigative forensics within a department or a Line of Business of an organization that may be looking at its internal users' behavior in order to serve clients through a single resource.

*Originally posted on LinkedIn*

© 2019 Data Science Central ®
