Fabrice JOURDAN
  • Male
  • PARIS
  • France

Fabrice JOURDAN's Discussions

How to check/optimize cross validation with randomforest on imbalanced classes ?

Started this discussion. Last reply by Vincent Granville May 28. 3 Replies

Hi everybody, here's a summary of my study, followed by a few questions on random forest. Population: 3300 observations, minority class 150 observations (~4%). Predictors: ~70, just 1 numerical, all others…

RandomForest for imbalanced classes

Started this discussion. Last reply by Fabrice JOURDAN May 14. 2 Replies



Fabrice JOURDAN's Page

Latest Activity

Tim Matteson liked Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
May 31
Vincent Granville replied to Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
"Yes 10 predictors are OK, but the data set seems a bit small, so the risk of over-fitting is higher than with (say) 50,000 observations."
May 28
Fabrice JOURDAN replied to Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
"Thanks Vincent, I understand about over-sampling; I'll see what to do. Why did you say "I would use less than 5 predictors"? Is it relative to the 150 observations in the minority class? Let's say 30 observations per predictor?…"
May 28
Vincent Granville replied to Fabrice JOURDAN's discussion How to check/optimize cross validation with randomforest on imbalanced classes ?
"Your data set is a bit small. The classic solution is to over-sample under-represented classes. I've been doing it routinely but on data sets with 50+ million observations, where the class "fraud" (versus "non fraud")…"
May 27
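The over-sampling Vincent mentions can be sketched roughly as below. This is a minimal illustration, not the method used in the thread: the helper name `oversample_minority`, the `target_ratio` parameter and the toy data are all assumptions made for the example. It simply duplicates randomly chosen minority rows until the minority class reaches a chosen fraction of the majority class.

```python
import random
from collections import Counter

def oversample_minority(rows, labels, minority_label, target_ratio=0.5, seed=0):
    """Duplicate random minority rows until minority/majority >= target_ratio.

    Hypothetical helper for illustration; target_ratio is an assumed knob.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority_label]
    if not minority_idx:
        raise ValueError("no minority rows to sample from")
    n_majority = len(labels) - len(minority_idx)
    # Number of extra minority copies needed to reach the target ratio.
    n_needed = max(0, int(target_ratio * n_majority) - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(n_needed)]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Toy data with the proportions quoted in the discussion:
# 3300 rows, 150 of them in the minority class.
labels = [1] * 150 + [0] * 3150
rows = [[i] for i in range(len(labels))]
new_rows, new_labels = oversample_minority(rows, labels, minority_label=1)
print(Counter(new_labels))
```

When this is combined with cross validation, the duplication should be applied to the training part of each fold only; otherwise copies of the same minority row can land in both the training and the validation part and inflate the score.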
Fabrice JOURDAN's discussion was featured

How to check/optimize cross validation with randomforest on imbalanced classes ?

Hi everybody, here's a summary of my study, followed by a few questions on random forest.
Population: 3300 observations, minority class 150 observations (~4%).
Predictors: ~70, just 1 numerical, all others are boolean.
I use feature selection to reduce the number of predictors.
I remove the predictors with the lowest variance and the lowest correlation with my target variable; I also use a t-test (mean difference between the 2 classes).
I keep around 20 predictors for 150 observations in my signal.
NB: I didn't yet use…
May 25
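The filter-style feature selection described in this post (drop low-variance predictors, then keep those whose class means differ according to a t-test) could look something like the sketch below. It is only an illustration under stated assumptions: the function names, the cut-offs `min_variance` and `min_abs_t`, the use of Welch's t-statistic, and the toy columns are all hypothetical and not taken from the thread.

```python
import math
from statistics import mean, pvariance, variance

def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return 0.0 if se == 0 else (mean(a) - mean(b)) / se

def filter_predictors(columns, target, min_variance=0.01, min_abs_t=2.0):
    """Keep predictors with enough variance and a clear class-mean difference.

    Both thresholds are assumed values chosen for the example.
    """
    kept = []
    for name, values in columns.items():
        if pvariance(values) < min_variance:
            continue  # near-constant predictor carries almost no information
        pos = [v for v, y in zip(values, target) if y == 1]
        neg = [v for v, y in zip(values, target) if y == 0]
        if abs(welch_t(pos, neg)) >= min_abs_t:
            kept.append(name)
    return kept

# Toy boolean predictors: 10 minority rows followed by 10 majority rows.
target = [1] * 10 + [0] * 10
columns = {
    "constant": [0] * 20,
    "informative": [1] * 9 + [0] + [0] * 9 + [1],
    "noise": [0, 1] * 10,
}
selected = filter_predictors(columns, target)
print(selected)
```

Because the selection looks at the target variable, it is safer to rerun it inside each cross-validation fold on the training part only; selecting on the full data set first tends to make the cross-validated score look better than it is.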
Fabrice JOURDAN posted a discussion

How to check/optimize cross validation with randomforest on imbalanced classes ?

May 25
Tim Matteson liked Fabrice JOURDAN's discussion RandomForest for imbalanced classes
May 22
Fabrice JOURDAN replied to Fabrice JOURDAN's discussion RandomForest for imbalanced classes
""F1-score": I determine the F1-score during "Parameters tuning" of RandomForest. For each set of parameters (a few hundred) I determine the threshold which gives me the best F1-score (mostly between 0.09 and 0.13). So I do not use display…"
May 14
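The threshold search described in this reply, picking for one parameter set the probability cut-off that gives the best F1-score, might be sketched as follows. The function names and the toy scores are assumptions made for the example; in practice the scores would be the predicted minority-class probabilities of the forest on held-out data.

```python
def f1_at_threshold(y_true, scores, threshold):
    """F1-score when every score >= threshold is predicted as class 1."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom

def best_f1_threshold(y_true, scores):
    """Try each distinct score as a cut-off and return (best_f1, threshold).

    Ties on F1 are resolved in favour of the larger threshold.
    """
    candidates = sorted(set(scores))
    return max((f1_at_threshold(y_true, scores, t), t) for t in candidates)

# Toy held-out labels and predicted probabilities (hypothetical values).
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.11, 0.12, 0.20, 0.30]
best_f1, best_t = best_f1_threshold(y_true, scores)
print(best_f1, best_t)
```

A threshold chosen this way is itself a tuned parameter, so the F1-score it reports is optimistic unless the threshold is picked on one split and evaluated on another.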
Danylo Zherebetskyy replied to Fabrice JOURDAN's discussion RandomForest for imbalanced classes
"These are some questions that, hopefully, may help to move on: - for f1-score, what is the probability threshold for the classification? Is it the standard 0.5, or did you determine it from AUROC curves? - since there is one continuous feature, the trees…"
May 14
Fabrice JOURDAN's discussion was featured

RandomForest for imbalanced classes

May 9
Fabrice JOURDAN posted a discussion

RandomForest for imbalanced classes

May 9

Profile Information

Professional Status
Consultant
Years of Experience:
15
Your Job Title:
Consultant
How did you find out about DataScienceCentral?
Internet
Interests:
Finding a new position, Networking, New venture

© 2018 Data Science Central ®