
All Blog Posts Tagged 'xgboost' (5)

Detecting Bias with SHAP

Originally published on the Databricks blog with accompanying notebook.

Stack Overflow’s annual developer survey concluded earlier this year, and they have graciously published the (anonymized) 2019 results for analysis. The results are a rich view into the experience of software developers around the world — what’s…
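The full post uses the shap library on an XGBoost model trained on the survey data. As a self-contained sketch of the quantity SHAP estimates (not the post's code), here is brute-force exact Shapley attribution for a tiny hypothetical model; the "skill" and "group" features and all values are invented for illustration:

```python
import math
from itertools import combinations

import numpy as np

def shapley_values(f, x, background):
    """Exact Shapley attribution of f(x) relative to a background dataset.
    Features outside a coalition are replaced by their background mean."""
    n = len(x)
    base = background.mean(axis=0)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            # classic Shapley weight for coalitions of size k
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for S in combinations(others, k):
                x_without = base.copy()
                x_without[list(S)] = x[list(S)]
                x_with = x_without.copy()
                x_with[i] = x[i]
                phi[i] += weight * (f(x_with) - f(x_without))
    return phi

# hypothetical model: outcome depends on skill (x0) and, undesirably, on group (x1)
f = lambda v: 3.0 * v[0] - 2.0 * v[1]
background = np.array([[0.0, 0.0], [2.0, 2.0]])  # background mean is (1, 1)
phi = shapley_values(f, np.array([2.0, 0.0]), background)
# a nonzero phi[1] flags that the group feature is driving this prediction
```

Real SHAP libraries approximate these values efficiently for tree ensembles; the brute force above is exponential in the number of features and only illustrates the definition.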


Added by Sean Owen on August 8, 2019 at 8:00am

Streamlining Predictive Modeling Workflow with Sagemaker and Essentia

Ecommerce sites generate tons of web server log data, which can provide valuable insights through analysis. For example, if we know which users are more likely to buy a product, we can perform targeted marketing, improve relevant product placement on our site, and lift conversion rates. However, raw web logs are often enormous and messy, so preparing the data to train a predictive model is time-consuming for data scientists.…
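The full post builds this pipeline with Essentia and SageMaker; as a library-free sketch of the general idea (the log format, URL paths, and feature names here are assumptions, not the post's pipeline), raw access-log lines can be aggregated into per-visitor training features:

```python
import re
from collections import defaultdict

# common-log-format pattern (an assumption; real logs vary)
LOG_RE = re.compile(r'(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) (?:\d+|-)')

def features_by_visitor(lines):
    """Aggregate raw access-log lines into per-visitor model features."""
    feats = defaultdict(lambda: {"requests": 0, "product_views": 0, "purchases": 0})
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip malformed lines instead of failing the whole job
        ip, method, path, status = m.groups()
        row = feats[ip]
        row["requests"] += 1
        if path.startswith("/product/"):          # hypothetical URL scheme
            row["product_views"] += 1
        if path == "/checkout" and status == "200":
            row["purchases"] += 1
    return dict(feats)

logs = [
    '1.2.3.4 - - [18/Jul/2019:14:00:00 +0000] "GET /product/42 HTTP/1.1" 200 512',
    '1.2.3.4 - - [18/Jul/2019:14:01:00 +0000] "POST /checkout HTTP/1.1" 200 128',
    '5.6.7.8 - - [18/Jul/2019:14:02:00 +0000] "GET /about HTTP/1.1" 200 256',
]
features = features_by_visitor(logs)
```

The resulting per-visitor table (counts plus a purchase label) is the kind of tidy input a predictive model can be trained on.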


Added by Ayumi Owada on July 18, 2019 at 2:00pm

Using XGBoost for time series prediction tasks

Recently, Kaggle master Kazanova, along with some of his friends, released a "How to win a data science competition" Coursera course. The course includes a final project, which is itself a time series prediction problem. Here I will describe how I got a top-10 position as of this writing…
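A common first step when applying a tree booster like XGBoost to a series (the post's exact feature engineering is in the full article) is reframing it as supervised learning with lag features; a minimal sketch:

```python
import numpy as np

def make_lagged(series, n_lags):
    """Frame a 1-D series as supervised learning: each row of X holds the
    n_lags previous values (oldest first) and y holds the current value."""
    series = np.asarray(series, dtype=float)
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

X, y = make_lagged([1, 2, 3, 4, 5, 6], n_lags=3)
# X rows: [1,2,3], [2,3,4], [3,4,5]; y: [4, 5, 6]
# any regressor can then be trained, e.g. xgboost.XGBRegressor().fit(X, y),
# with a time-ordered (not shuffled) train/validation split
```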


Added by Rahul Agarwal on December 18, 2018 at 9:30am

Using RuleFit Ensemble Models Is About to Become Very Important

Summary: The drive toward transparency and explainability in our modeling seems unstoppable. Up to now that has meant sacrificing accuracy for interpretability. However, the ensemble method known as RuleFit may be the answer, offering both explainability and accuracy that meets or exceeds Random Forest.

 

If you’re like me and not doing modeling in a highly regulated industry like mortgage finance or insurance, then when you produce a model, you are…
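RuleFit's recipe (grow a tree ensemble, treat each leaf as a binary rule, then fit a sparse linear model over those rules) can be sketched with scikit-learn alone. This is a simplified illustration on made-up data, not Friedman and Popescu's full algorithm, which also rescales rules and keeps linear terms:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 2.0 * (X[:, 0] > 0) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)

# Step 1: shallow trees; every leaf defines a candidate rule.
gb = GradientBoostingRegressor(n_estimators=20, max_depth=2, random_state=0)
gb.fit(X, y)

# Step 2: one-hot encode which leaf each sample lands in, per tree.
leaves = gb.apply(X).reshape(len(X), -1)   # (n_samples, n_trees)
rule_cols = []
for j in range(leaves.shape[1]):
    ids = np.unique(leaves[:, j])
    rule_cols.append((leaves[:, j][:, None] == ids[None, :]).astype(float))
rules = np.hstack(rule_cols)

# Step 3: an L1-penalised linear fit keeps only a few interpretable rules.
lin = Lasso(alpha=0.01).fit(rules, y)
n_active = int(np.count_nonzero(lin.coef_))
```

Each surviving coefficient attaches a readable weight to one leaf's if-then conditions, which is where the method's interpretability comes from.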


Added by William Vorhies on June 27, 2017 at 10:02am

Want to Win Competitions? Pay Attention to Your Ensembles.

Summary: Want to win a Kaggle competition, or at least get a respectable place on the leaderboard? These days it’s all about ensembles, and for a lot of practitioners that means reaching for random forests. Random forests have indeed been very successful, but it’s worth remembering that there are three different categories of ensembles and some important hyperparameter tuning issues within each. Here’s a brief review.

 …
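The three categories are conventionally bagging, boosting, and stacking. A minimal scikit-learn comparison on synthetic data (the dataset and model settings here are illustrative assumptions, not the article's benchmark):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    # bagging: deep trees on bootstrap samples, predictions averaged
    "bagging (random forest)": RandomForestClassifier(n_estimators=50, random_state=0),
    # boosting: shallow trees added sequentially to correct prior errors
    "boosting": GradientBoostingClassifier(random_state=0),
    # stacking: a meta-learner combines the base models' out-of-fold predictions
    "stacking": StackingClassifier(
        estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
                    ("gb", GradientBoostingClassifier(random_state=0))],
        final_estimator=LogisticRegression()),
}
scores = {name: cross_val_score(m, X, y, cv=3).mean() for name, m in models.items()}
```

Each category also brings its own tuning knobs: tree depth and number of trees for bagging, learning rate and number of rounds for boosting, and the choice of base and meta learners for stacking.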




Added by William Vorhies on May 25, 2016 at 7:30am


© 2019   Data Science Central ®