Stack Overflow’s annual developer survey concluded earlier this year, and they have graciously published the (anonymized) 2019 results for analysis. They’re a rich view into the experience of software developers around the world: what’s…
Added by Sean Owen on August 8, 2019 at 8:00am — No Comments
Ecommerce sites generate tons of web server log data, which can provide valuable insights through analysis. For example, if we know which users are more likely to buy a product, we can perform targeted marketing, improve relevant product placement on our site, and lift conversion rates. However, raw web logs are often enormous and messy, so preparing the data to train a predictive model is time-consuming for data scientists.…
Added by Ayumi Owada on July 18, 2019 at 2:00pm — No Comments
Recently, Kaggle master Kazanova, along with some of his friends, released a "How to win a data science competition" Coursera course. The course included a final project, which was a time series prediction problem. Here I will describe how I got a top 10 position as of writing this…
Added by Rahul Agarwal on December 18, 2018 at 9:30am — No Comments
Summary: The drive toward transparency and explainability in our modeling seems unstoppable. Up to now, that has meant sacrificing accuracy for interpretability. However, the ensemble method known as RuleFit may be the answer, offering explainability together with accuracy that meets or exceeds Random Forest.
If you’re like me and not doing modeling in a highly regulated industry like mortgage finance or insurance, then when you produce a model, you are…
Added by William Vorhies on June 27, 2017 at 10:02am — No Comments
Summary: Want to win a Kaggle competition, or at least get a respectable place on the leaderboard? These days it’s all about ensembles, and for a lot of practitioners that means reaching for random forests. Random forests have indeed been very successful, but it’s worth remembering that there are three different categories of ensembles and some important hyperparameter tuning issues within each. Here’s a brief review.