Fabrice JOURDAN
  • Male
  • PARIS
  • France

Fabrice JOURDAN's Discussions

How to check/optimize cross validation with randomforest on imbalanced classes ?

Started this discussion. Last reply by Vincent Granville May 28, 2018. 3 Replies

Hi everybody, here's a summary of my study, followed by a few questions on random forest. Population: 3300 observations, minority class 150 observations (~4%). Predictors: ~70, just 1 numerical, all others…

RandomForest for imbalanced classes

Started this discussion. Last reply by Fabrice JOURDAN May 14, 2018. 2 Replies



Fabrice JOURDAN's Page

Latest Activity

Tim Matteson liked Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
May 31, 2018
Vincent Granville replied to Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
"Yes 10 predictors are OK, but the data set seems a bit small, so the risk of over-fitting is higher than with (say) 50,000 observations."
May 28, 2018
Fabrice JOURDAN replied to Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
"Thanks Vincent, I understand about over-sampling; I'll see what to do. Why did you say "I would use less than 5 predictors"? Is it relative to the 150 observations in the minority class? Let's say 30 observations per predictor?…"
May 28, 2018
Vincent Granville replied to Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
"Your data set is a bit small. The classic solution is to over-sample under-represented classes. I've been doing it routinely but on data sets with 50+ million observations, where the class "fraud" (versus "non fraud")…"
May 27, 2018
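
For readers following the thread, the over-sampling idea mentioned in the reply above can be sketched roughly as follows. This is only a minimal illustration in plain Python, not the method either poster used: the function name `oversample_minority`, the `target_ratio` parameter and the toy data are all assumptions made for the example. The counts (150 minority, 3150 majority) are taken from the discussion summary.

```python
import random


def oversample_minority(rows, labels, minority_label, target_ratio=1.0, seed=0):
    """Append randomly chosen duplicates of minority rows until the minority
    count reaches int(target_ratio * majority count). Hypothetical helper."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority:
        raise ValueError("no minority samples to draw from")
    n_needed = max(0, int(target_ratio * len(majority)) - len(minority))
    # Sample with replacement from the existing minority rows.
    extra = [rng.choice(minority) for _ in range(n_needed)]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data with the class sizes quoted in the discussion (assumed layout).
rows = list(range(3300))
labels = [1] * 150 + [0] * 3150
new_rows, new_labels = oversample_minority(rows, labels, minority_label=1)
```

If this is combined with cross-validation, the duplication should be applied to the training part of each fold only. Over-sampling before splitting puts copies of the same minority row in both the training and validation parts, which makes the validation scores look better than they are.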
Fabrice JOURDAN's discussion was featured

How to check/optimize cross validation with randomforest on imbalanced classes ?

Hi everybody, here's a summary of my study, followed by a few questions on random forest.
Population: 3300 observations, minority class 150 observations (~4%).
Predictors: ~70, just 1 numerical, all others are boolean.
I use feature selection in order to reduce the number of predictors: I remove predictors with the lowest variance and the lowest correlation with my target variable, and I also use a t-test (mean difference between the 2 classes).
I keep around 20 predictors for 150 observations in my signal.
NB: I didn't use yet…
May 25, 2018
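
As an illustration of the "remove predictors with lowest variance" step described in the post, here is a small sketch in plain Python. It assumes the boolean predictors are stored as lists of 0/1 values keyed by name; the names `boolean_variance`, `drop_low_variance` and the cutoff `min_variance` are hypothetical and not taken from the original study.

```python
def boolean_variance(column):
    """Population variance of a 0/1 column, equal to p * (1 - p)
    where p is the share of ones."""
    p = sum(column) / len(column)
    return p * (1 - p)


def drop_low_variance(columns, min_variance):
    """Keep only the named columns whose variance is at least min_variance."""
    return {name: col for name, col in columns.items()
            if boolean_variance(col) >= min_variance}


# Toy predictors (assumed data): "a" is almost constant, "c" is constant.
columns = {
    "a": [0] * 99 + [1],
    "b": [0, 1] * 50,
    "c": [1] * 100,
}
kept = drop_low_variance(columns, min_variance=0.05)
```

One caveat with a 4% minority class: a boolean predictor that is rarely 1 has low variance by construction, yet it may still line up with the rare class, so a pure variance cutoff is probably best checked against the target before dropping anything.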
Fabrice JOURDAN posted a discussion

How to check/optimize cross validation with randomforest on imbalanced classes ?

May 25, 2018
Tim Matteson liked Fabrice JOURDAN's discussion RandomForest for imbalanced classes
May 22, 2018
Fabrice JOURDAN replied to Fabrice JOURDAN's discussion RandomForest for imbalanced classes
""F1-score": I determine the F1-score during "parameter tuning" of the random forest. For each set of parameters (a few hundred), I determine the threshold which gives me the best F1-score (mostly between 0.09 and 0.13). So I do not use display…"
May 14, 2018
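
The threshold search described in the reply above could look something like the sketch below. It is written in plain Python under stated assumptions: `y_prob` stands for the predicted probability of the minority class for each observation, and the function names, candidate thresholds and toy data are invented for illustration rather than taken from the poster's code.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1-score of the positive class when predicting 1 for prob >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(y_true, y_prob, thresholds):
    """Return (threshold, f1) with the highest F1; on ties, the first one listed."""
    best = max(thresholds, key=lambda t: f1_at_threshold(y_true, y_prob, t))
    return best, f1_at_threshold(y_true, y_prob, best)


# Toy labels and probabilities (assumed data, not from the study).
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.11, 0.12, 0.20, 0.30]
threshold, f1 = best_f1_threshold(y_true, y_prob, [0.05, 0.09, 0.11, 0.13, 0.5])
```

A threshold picked this way is tuned to the data it was scored on, so the matching F1-score is optimistic. Choosing the threshold on the training folds and reporting F1 on held-out folds gives a fairer estimate.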
Danylo Zherebetskyy replied to Fabrice JOURDAN's discussion RandomForest for imbalanced classes
"These are some questions that, hopefully, may help to move on: - for the F1-score, what is the probability threshold for the classification? Is it the standard 0.5, or did you determine it from AUROC curves? - since there is one continuous feature, the trees…"
May 14, 2018
Fabrice JOURDAN's discussion was featured

RandomForest for imbalanced classes

May 9, 2018
Fabrice JOURDAN posted a discussion

RandomForest for imbalanced classes

May 9, 2018

Profile Information

Professional Status
Consultant
Years of Experience:
15
Your Job Title:
Consultant
How did you find out about DataScienceCentral?
Internet
Interests:
Finding a new position, Networking, New venture


© 2019 Data Science Central ®
