All Blog Posts Tagged 'Regression' (17)

A short introduction to Log Models

Why do we take logs of variables in Regression analysis?

We should remember that a regression equation has two parts:

i) The Dependent variable (Predictand)

ii) The Independent variables (Predictors), which can be one or more and can be of different types (Categorical or Continuous).

The nature of the regression that we should run depends on the type of Dependent variable that we are dealing with in our model. For example, if the dependent variable is Continuous…

Added by Sibashis Chakraborty on October 20, 2019 at 8:57am — No Comments

How Can Python Help Solve Machine Learning Challenges?

Summary: Python’s open-source, high-level nature and its comprehensive libraries make it a strong fit for solving many real-life ML challenges.

The increasing popularity and accessibility of Artificial Intelligence solutions is rapidly reshaping many industries, from healthcare through finance to aviation. Although the application of the latest technologies has always been an essential consideration for companies striving to get…

Added by Łukasz Grzybowski on July 23, 2019 at 1:30am — No Comments

Credit Risk Prediction Using Artificial Neural Network Algorithm

1 Introduction

Credit risk, or credit default, indicates the probability of non-repayment of bank financial services that have been given to customers. Credit risk has always been an extensively studied area in bank lending decisions. It plays a crucial role for banks and financial institutions, especially commercial banks, and it is always difficult to interpret and manage. Due to advancements in technology, banks have managed to reduce costs, in order to…

Added by Shruti Goyal on March 14, 2018 at 11:30am — 5 Comments

Machine Learning Techniques based paper one of the two Risk Quant Europe 2018 Call for Paper Winners

Each year, the Risk Quant Europe Conference, which is well attended by practitioners from banking, asset management, and insurance as well as academics from Europe, selects two papers to be presented at its annual conference.

For 2018, we are fortunate that our paper is one of the two winning papers selected by the Advisory Board for the conference to be held in London. Please feel free to check out our paper titled CDS Rate Construction Methods by Machine Learning…

Added by Zhongmin Luo on February 24, 2018 at 2:00am — No Comments

Optimization techniques: Finding maxima and minima

In the last post, we talked about how to estimate the coefficients or weights of linear regression. We estimated the weights that give the minimum error. Essentially, it is an optimization problem where we have to find the minimum error (cost) and the corresponding coefficients. In a way, all supervised learning algorithms have optimization at their crux, where…

Added by Jobil Louis on January 2, 2018 at 3:30pm — No Comments
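
The excerpt above frames coefficient estimation as minimizing an error (cost). As a minimal sketch of that idea, and not the method used in the post itself, the following fits a one-predictor line by gradient descent on the mean squared error. The function name, learning rate, and step count are illustrative assumptions.

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Estimate intercept b0 and slope b1 by minimizing mean squared error.

    Hypothetical helper for illustration; learning_rate and steps are
    assumed values that happen to suit small, well-scaled inputs.
    """
    n = len(xs)
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        # Residuals of the current line: prediction minus observation.
        residuals = [(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE = (1/n) * sum(residual^2).
        grad_b0 = (2.0 / n) * sum(residuals)
        grad_b1 = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        # Step against the gradient to reduce the cost.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


def mean_squared_error(xs, ys, b0, b1):
    """Cost of the line y = b0 + b1 * x on the given data."""
    return sum(((b0 + b1 * x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

For data generated from an exact line, such as `xs = [0, 1, 2, 3, 4]` and `ys = [1, 3, 5, 7, 9]`, the estimates should approach intercept 1 and slope 2 as the cost approaches zero.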

A Guide for Applying Machine Learning Techniques in Finance

Does this sound familiar to you? In order to get an idea of how to choose a parameter for a given classifier, you have to cross-reference a number of papers or books, which often turn out to present competing arguments for or against a certain parameterization choice but with few applications to real-world problems.

For example, you may find a few papers discussing optimal selection of K in…

Added by Zhongmin Luo on June 5, 2017 at 7:30pm — 6 Comments

Choice of K in K-fold Cross Validation for Classification in Financial Market

Cross Validation is often used as a tool for model selection across classifiers. As discussed in detail in the following paper https://ssrn.com/abstract=2967184, Cross Validation is typically performed in the following steps:

  • Step 1: Divide the original sample into K subsamples; each subsample typically has equal sample size and is referred to as one fold, altogether,…

Added by Zhongmin Luo on June 2, 2017 at 7:00pm — No Comments
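
Step 1 in the excerpt above, dividing the sample into K folds of roughly equal size, can be sketched as follows. The excerpt is truncated after Step 1, so the pairing of training and held-out folds shown here follows the standard K-fold procedure rather than the paper's own wording. The function names and the fixed seed are illustrative assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and divide them into k folds.

    Fold sizes differ by at most one when n_samples is not a multiple of k.
    The seed is an assumed default so that the split is reproducible.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds each receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(indices[start:start + size])
        start += size
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out each fold once."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each sample index appears in exactly one held-out fold, so every observation is used for testing once and for training K - 1 times.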

Parameter Selection in Classification for Financial Market

In practice, we often have to make parameterization choices for a given classifier in order to achieve optimal classification performance. To name a few examples:

  • Neural Network: e.g., the optimal choice of Activation Functions, number of hidden units
  • Support Vector Machine: e.g., the optimal choice of Kernel Functions
  • Ensemble: e.g., the number of Learning Cycles for Bagging
  • Discriminant Analysis: e.g., Linear/Quadratic; regularization…

Added by Zhongmin Luo on May 29, 2017 at 12:49am — No Comments

Apply Machine Learning Techniques to Problems in Financial Market

Past literature shows that comparisons of classifier performance are specific to the types of datasets used (e.g., pharmaceutical industry data); i.e., some classifiers may perform better in some contexts than others. A paper titled CDS Rate Construction Methods by Machine Learning Techniques conducts the performance comparison exclusively in the context of the financial market by applying a wide range of classifiers to provide a solution to the so-called Shortage of…

Added by Zhongmin Luo on May 23, 2017 at 1:30am — No Comments

“Multicollinearity” a Problem or an Opportunity?

Multicollinearity (collinearity) is not a new term, especially when dealing with multiple regression models. This phenomenon, which arises when modeling the relationship between one response variable and a set of predictor variables, also extends to models such as classification and regression trees as well as neural networks. Collinearity is notorious for inflating the variance of at least one estimated regression coefficient, which can cause the model to predict erroneously and in a business setup it can have an…

Added by Sunil Kappal on March 6, 2017 at 10:00am — No Comments

Deducer Tutorial: Creating Linear Model using R Deducer Package

The linear model, better known as linear regression, is one of the most common and flexible analysis frameworks for identifying the relationship between two or more variables. The widely used linear model is represented by drawing the best-fit line through a series of data points on a scatter plot.

For any budding business analyst, this should be the starting point for understanding how a model works at the very core of its design.

Selecting the Variables in Deducer…

Added by Sunil Kappal on February 28, 2017 at 7:00am — No Comments

Useful R Packages that Aligns with The CRISP DM Methodology

As we all know, CRISP DM, which stands for Cross Industry Standard Process for Data Mining, is a process model that outlines the most common approach to tackling data-driven problems. Per the poll conducted by KDnuggets in 2014, this was and “is” one of the most popular and widely used methodologies. This method of gleaning insights from data is very dear to industry experts and data miners.

As the title suggests, I will align some of the most useful R packages with this most popular and…

Added by Sunil Kappal on February 6, 2017 at 8:00am — 1 Comment

Fraud analysis using speech analytics and Monte Carlo

As per the largest market research firm, MarketsandMarkets, the speech analytics industry will grow to USD 1.60 billion by 2020, at a Compound Annual Growth Rate (CAGR) of 22% from 2015 to 2020. Today the omnichannel world consists of voice, email, chat, social channels, and surveys, and each channel has its own importance.

Therefore, it becomes impossible for any customer-centric organization to ignore the information that can be glean…

Added by Sunil Kappal on January 25, 2017 at 8:00am — 3 Comments

HealthCare Industry’s Savior: Using the Benford's Law (The Law of First Digit) to Debunk the Fraudsters

As the world becomes more tech-savvy, advancements in information technology, especially in the healthcare industry, have opened up areas in data mining and machine learning. Within data mining, one technique that has gained a lot of popularity, as well as skepticism, among auditors and fraud detectives is Benford’s Law, or “The Law of First Digit.”

In the past, some researchers in Canada used the Benford’s Law distribution to detect anomalies within the claims…

Added by Sunil Kappal on January 16, 2017 at 5:12am — 5 Comments
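
Benford's Law, mentioned in the excerpt above, states that the leading digit d occurs with probability log10(1 + 1/d). The sketch below computes that expected distribution and the observed first-digit frequencies of a list of amounts. It does not reproduce the analysis in the post; the function names are illustrative assumptions, and how large a deviation counts as an anomaly is left to the analyst.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Return the leading non-zero digit of a non-zero number."""
    # repr() places the significant digits first (e.g. '0.00042', '1e-07'),
    # so the first character in 1-9 is the leading digit.
    for ch in repr(abs(value)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("value has no non-zero digit")


def first_digit_frequencies(values):
    """Observed share of each leading digit, ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}
```

Comparing `first_digit_frequencies(amounts)` with `benford_expected()` digit by digit shows where a dataset departs from the expected pattern.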

How to create a Best-Fitting regression model?

The Best Subset Regression method can be used to create a best-fitting regression model. This model-building technique helps to identify which predictor (independent) variables should be included in a multiple linear regression (MLR) model.

This method consists of scrutinizing all of the models created from all possible combinations of predictor variables. This technique uses the R-squared value to check for the best model. Considering the level of complexity involved in creating…

Added by Sunil Kappal on December 29, 2016 at 8:00am — 7 Comments
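
The best subset idea described above, fitting a model for every combination of predictors and comparing R-squared values, can be sketched in plain Python as follows. This is an illustration under stated assumptions and not the tool used in the post: the helper names are hypothetical, the fit uses ordinary least squares via the normal equations, and the predictors are assumed not to be perfectly collinear. Because R-squared never decreases when a predictor is added, the sketch reports the best subset for each subset size rather than a single overall winner.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """Fit OLS with an intercept on the given columns and return R-squared."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def best_subsets(predictors, y):
    """Return the highest-R-squared subset of predictor names for each size."""
    names = sorted(predictors)
    results = []
    for size in range(1, len(names) + 1):
        scored = [
            (subset, r_squared([predictors[name] for name in subset], y))
            for subset in combinations(names, size)
        ]
        results.append(max(scored, key=lambda item: item[1]))
    return results
```

With p candidate predictors there are 2^p - 1 non-empty subsets, so this exhaustive search is only practical for a small number of variables.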

Learn the Concept of linearity in Regression Models

This tutorial covers the basics of linear regression by discussing in depth the concept of linearity and which type of linearity is desirable.

What is the meaning of the term Linear?

In Linear Regression, the term linear is understood in two ways:

  1. Linearity in variables
  2. Linearity in parameters…



Added by Shantanu Deo on March 16, 2016 at 4:30am — No Comments
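
To illustrate the two senses of linearity listed above: a model such as y = b0 + b1 * x^2 is not linear in the variable x, but it is linear in the parameters b0 and b1, so ordinary least squares still applies once x^2 is treated as the predictor. The sketch below assumes that specific model for illustration; the function names are hypothetical and the example is not taken from the tutorial.

```python
def simple_ols(zs, ys):
    """Closed-form least-squares intercept and slope for y = b0 + b1 * z."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    slope = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / sum(
        (z - mean_z) ** 2 for z in zs
    )
    intercept = mean_y - slope * mean_z
    return intercept, slope


def fit_squared_term(xs, ys):
    """Fit y = b0 + b1 * x**2, which is linear in b0 and b1 but not in x."""
    # Transform the variable first; the estimation step is unchanged.
    return simple_ols([x ** 2 for x in xs], ys)
```

For data generated as `ys = [2 + 3 * x ** 2 for x in xs]`, `fit_squared_term(xs, ys)` should recover an intercept near 2 and a coefficient near 3.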

Simple Regression use in Big Data

We have witnessed the rise of the key-value pair since the emergence of Big Data. We can certainly explore the relationship between two such variables, X and Y, using data science. At a basic level, regression also gives a depiction of two variables, X and Y, to work with. These variables are:

Independent Variables & Dependent Variables

Let us take behavior of users of a…

Added by Atif Farid Mohammad on May 25, 2015 at 6:00am — No Comments

© 2019   Data Science Central ®