Rohit Walimbe
  • Male
  • Pune
  • India

Rohit Walimbe's Page

Latest Activity

Prasanth liked Rohit Walimbe's blog post Is it ‘always’ necessary to treat outliers in a machine learning model?
Nov 2
Haizea Rumayor Lazkano commented on Rohit Walimbe's blog post Overview and Classification of Machine Learning Problems
"Thank you Rohit, very interesting. With your permission I have reordered it and completed it with some new links. It seems to me that it is better understood this way. Regards, Overview and Classification of Machine Learning Problems.xlsx"
Aug 21
Katerina St liked Rohit Walimbe's blog post Overview and Classification of Machine Learning Problems
Aug 6
Karthik Dulam liked Rohit Walimbe's blog post Handling imbalanced dataset in supervised learning using family of SMOTE algorithm.
Apr 12
Rohit Walimbe posted a blog post

Is it ‘always’ necessary to treat outliers in a machine learning model?

Outliers are one of those issues we come across almost every day in machine learning modelling. Wikipedia defines an outlier as “an observation point that is distant from other observations.” That means some minority cases in the data set are different from the majority of the data. I would like to classify outlier data into two main categories: Non-Natural and Natural. Non-natural outliers are those caused by measurement errors, wrong data collection or wrong data entry. While…
Apr 11
Rohit Walimbe's blog post was featured

Is it ‘always’ necessary to treat outliers in a machine learning model?

Apr 11
ANISH XAVIER liked Rohit Walimbe's blog post Handling imbalanced dataset in supervised learning using family of SMOTE algorithm.
Dec 12, 2017
Rohit Walimbe posted blog posts
Apr 26, 2017
Rohit Walimbe's blog post was featured

Handling imbalanced dataset in supervised learning using family of SMOTE algorithm.

Suppose you are working on a machine learning classification problem. You get an accuracy of 98% and you are very happy. But that happiness doesn’t last long when you look at the confusion matrix and realize that the majority class makes up 98% of the total data and every example has been classified as the majority class. Welcome to the real world of imbalanced data sets! Some well-known examples of imbalanced data sets are: 1 - Fraud detection: where the number of fraud cases could be much…
Apr 25, 2017
Rohit Walimbe posted a blog post

Avoiding Look Ahead Bias in Time Series Modelling

Any time series classification or regression forecast involves predicting Y at time 't+n' given the X and Y information available up to time 't'. Obviously, no data scientist or statistician should deploy such a system without backtesting and validating the model's performance on historical data. Using actual future information in the training data, which can be termed "Look Ahead Bias", is probably the gravest mistake a data scientist can make. Even the sentence “we cannot make use of future data in…
Apr 21, 2017
Rohit Walimbe's blog post was featured

Avoiding Look Ahead Bias in Time Series Modelling

Apr 21, 2017

Profile Information

Short Bio
Experienced Data Scientist and Quant with a demonstrated history of working in various domains such as BFSI, Manufacturing, Retail and Risk. Strong knowledge of Machine Learning, Predictive Analytics, Network Theory, Time Series Analysis, Trading Systems, Derivative Pricing and Financial Mathematics. Skilled in R, Python, Matlab, VBA and SQL.
My Web Site Or LinkedIn Profile
http://www.linkedin.com/in/rohit-walimbe-36309b15
Professional Status
Manager
Years of Experience:
6
Your Company:
Tata Consultancy Services
Industry:
IT /Consultancy
Your Job Title:
Assistant Manager
Interests:
Finding a new position, Networking

Rohit Walimbe's Blog

Is it ‘always’ necessary to treat outliers in a machine learning model?

Posted on April 9, 2018 at 2:30am 0 Comments

Outliers are one of those issues we come across almost every day in machine learning modelling. Wikipedia defines an outlier as “an observation point that is distant from other observations.” That means some minority cases in the data set are different from the majority of the data. I would like to classify outlier data into two main categories: Non-Natural and Natural.

Non-natural outliers are those caused by measurement errors,…

Continue

Handling imbalanced dataset in supervised learning using family of SMOTE algorithm.

Posted on April 24, 2017 at 10:00pm 0 Comments

Suppose you are working on a machine learning classification problem. You get an accuracy of 98% and you are very happy. But that happiness doesn’t last long when you look at the confusion matrix and realize that the majority class makes up 98% of the total data and every example has been classified as the majority class. Welcome to the real world of imbalanced data sets!…

Continue
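
The 98% accuracy trap described in this excerpt can be reproduced with a few lines of plain Python. This is only an illustrative sketch: the 980/20 class split, the always-predict-majority "model", and the helper names `accuracy` and `recall` are assumptions made for the example, not something taken from the blog post.

```python
# Hypothetical toy labels: 980 majority-class (0) and 20 minority-class (1) examples.
y_true = [0] * 980 + [1] * 20

# A degenerate "model" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    actual_pos = sum(t == positive for t in y_true)
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    return true_pos / actual_pos if actual_pos else 0.0


acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)
print(f"accuracy = {acc:.2f}, minority recall = {minority_recall:.2f}")
# prints: accuracy = 0.98, minority recall = 0.00
```

The accuracy looks excellent while not a single minority case is found, which is why a per-class metric such as recall is worth checking alongside accuracy on imbalanced data.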

Avoiding Look Ahead Bias in Time Series Modelling

Posted on April 21, 2017 at 6:00am 0 Comments

Any time series classification or regression forecast involves predicting Y at time 't+n' given the X and Y information available up to time 't'. Obviously, no data scientist or statistician should deploy such a system without backtesting and validating the model's performance on historical data. Using actual future information in the training data, which can be termed "Look Ahead Bias", is probably the gravest mistake a data scientist can make. Even the sentence “we cannot make use of future…

Continue
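
One common way to keep future information out of the training data during backtesting is a walk-forward split, where each test window comes strictly after the data used for training. The sketch below is a minimal illustration under stated assumptions: observations are assumed to be sorted by time and indexed 0..n-1, and the function name `walk_forward_splits` and its parameters are hypothetical, not taken from the blog post.

```python
def walk_forward_splits(n_samples, initial_train, horizon):
    """Yield (train_idx, test_idx) pairs in which every training index
    precedes every test index, so no future observation is used for training.

    Assumes observations are already sorted by time (index 0 is the oldest).
    """
    start = initial_train
    while start + horizon <= n_samples:
        train_idx = list(range(0, start))                # everything before the test window
        test_idx = list(range(start, start + horizon))   # the next `horizon` points
        yield train_idx, test_idx
        start += horizon                                 # expand the training window


splits = list(walk_forward_splits(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print(f"train on t=0..{train_idx[-1]}, test on t={test_idx}")
```

A random shuffle before splitting would break this ordering and let later observations leak into training, which is the kind of look-ahead the excerpt warns about.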


© 2018   Data Science Central ®