Implementing some of the pillars of an automated machine learning pipeline, such as (i) Automated data preparation, (ii) Feature engineering, (iii) Model building in a classification context, including techniques such as (a) Regularised regression, (b) Logistic regression, (c) Random Forest, (d) Decision tree and (e) Extreme Gradient Boosting (xgboost), and finally (iv) Model explanation (using… Continue
Added by Dayananda U on June 15, 2020 at 5:30am —
As an academic discipline, data science has a rate of maturation that should be measured in light years. Although it's really only about 10 years old as a field of study – with the first Ph.D. program in the country emerging just four years ago – most major universities across the world have integrated data science into their portfolio of degree options. Universities… Continue
Added by Jennifer Lewis Priestley on February 12, 2019 at 10:52am —
This article has 4 sections:
- Introduction: Introduction to the gambler's ruin problem
- Methodology: How the simulation will be carried out
- Pseudocode: Summary of the Python Code
- Theory: A summary of the theory
The full code and theory can be found here: …
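As a rough illustration of the kind of simulation described above (a minimal sketch, not the author's code), the following assumes a gambler who bets one unit per round at a fixed win probability and stops on reaching either zero or a target stake. The function name `simulate_ruin` and all parameter values are hypothetical.

```python
import random

def simulate_ruin(start, target, p_win=0.5, trials=20_000, seed=0):
    """Estimate the probability of reaching `target` before going broke."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = 0
    for _ in range(trials):
        stake = start
        # Bet one unit per round until the gambler is ruined or hits the target.
        while 0 < stake < target:
            stake += 1 if rng.random() < p_win else -1
        if stake == target:
            wins += 1
    return wins / trials

estimate = simulate_ruin(start=3, target=10)
exact = 3 / 10  # for a fair game, P(reach target) = start / target
print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

For a fair game the simulated estimate should land close to the known result start / target; for an unfair game (`p_win` different from 0.5) the closed-form answer is different and is not computed here.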
Added by Tansel Arif on February 5, 2019 at 4:55am —
This article gives an intuitive explanation of Degrees of Freedom and of how Degrees of Freedom affect Sudoku.
A lot of aspiring Data Scientists take courses on statistics and get befuddled by the concept of Degrees of Freedom. Some memorize it by rote as ‘n-1’.
But there is an intuitive reason why it is ‘n-1’.…
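One common way to see the ‘n-1’ (an illustration assumed here, not necessarily the argument the article makes) is that once the mean of n values is fixed, only n-1 of them can vary freely; the last one is forced. The helper name `last_value` and the numbers are hypothetical.

```python
def last_value(free_values, mean):
    """Return the one value that is forced once the mean and the other n-1 values are fixed."""
    n = len(free_values) + 1
    # The n values must sum to n * mean, so the last one has no freedom left.
    return n * mean - sum(free_values)

free = [4, 7, 1]                 # three values chosen freely
forced = last_value(free, mean=5)
print(forced)                    # 8, since 4 + 7 + 1 + 8 = 20 = 4 * 5
```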
Added by Venkat Raman on January 30, 2019 at 1:51am —
October is historically the most volatile month for stocks, but is this a persistent signal or just noise in the data?
Stocks, Significance Testing & p-Hacking. Follow me on Twitter (twitter.com/pdquant) for more. Over the past 32 years, October has been the most volatile month on average for the S&P500 and December the least. In this article we will use simulation to assess the… Continue
Added by Patrick David on January 18, 2019 at 5:30am —
Imagine you are a company selling a fast-moving consumer good in the market.
Let’s assume that the customer would follow the given journey to make the final purchase: These are the states at which the customer would be at any point in the purchase journey.
Now, how to find out in which state the customers would be after 6… Continue
Added by Ridhima Kumar on January 8, 2019 at 12:00am —
Imagine that one day you see people queuing up in front of Bank A. You ask the staff at the counter and are told that the bank is offering anyone (regardless of credit history) a loan of $100,000 at a fixed annual rate of 2%. You then look around: Bank B next door offers a 1-year term deposit at a fixed annual rate of 3% for the same amount ($100,000). After 5 minutes' waiting, you sign for the loan from Bank… Continue
Added by Zhongmin Luo on November 21, 2018 at 2:30pm —
In today’s world of quick results and insights, few people want to spend time understanding the core concepts behind statistical terms while performing routine analysis. One term that is talked about a lot, but whose mechanics are poorly understood, is the R-squared statistic, a.k.a. the coefficient of determination. This statistic measures how close the data are to the fitted regression line.
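As a minimal sketch of the mechanics, R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values below are hypothetical, chosen only for illustration.

```python
def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    # Residual sum of squares: how far the data are from the fitted values.
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    # Total sum of squares: how far the data are from their own mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

y = [2.0, 4.1, 5.9, 8.2]        # observed values (made up)
y_pred = [2.0, 4.0, 6.0, 8.0]   # values on a fitted line (made up)
print(round(r_squared(y, y_pred), 4))
```

A perfect fit gives 1, and a model that always predicts the mean of the data gives 0.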
It is also worth mentioning that by squaring the… Continue
Added by Sunil Kappal on July 11, 2018 at 3:35am —
Statistical Analysis is a way of collecting, presenting and exploring large amounts of data in order to discover underlying patterns and trends. It can be especially useful in banking, manufacturing or retail where knowing the future patterns might greatly benefit the businesses. Not without reason, it resembles and can cooperate with blockchain - a new tech… Continue
Added by James Mason on April 18, 2018 at 2:30pm —
What a weird question. That’s what you would have thought after reading the headline. Perhaps you thought the word “NOT” was accidental.
Hmm, for the past few years many of us have come across articles like
- “Top 10 Machine Learning Algorithms every Data Scientist …
Added by Venkat Raman on February 6, 2018 at 12:30am —
Many blogs and articles have been written on how to become a Data Scientist. The list normally goes like this:
- Study descriptive statistics, hypothesis testing, probability
- Learn types of Machine learning algorithms – Supervised, Unsupervised
- Learn Python, R, SAS, SQL
- Apply machine learning techniques using Python, R, SAS
- Learn Data Visualization
While there is nothing wrong with the path illustrated above, it is not the… Continue
Added by Venkat Raman on January 11, 2018 at 12:30am —
Remember the … Continue
Added by Peter Bruce on January 2, 2018 at 9:00am —
It sounds almost absurd, but that could be one factor behind the so-called “reproducibility crisis”
It sounds almost heretical to ask the question: Is there too much scientific research?… Continue
Added by Peter Bruce on November 27, 2017 at 3:00pm —
What’s the first thing that comes to mind when you hear the following phrases?
- Artificial grass
- Artificial sweeteners
- Artificial flavors
- Artificial plants
- Artificial flowers
- Artificial diamonds and jewelry
- Artificial (fake) news
These phrases probably evoke thoughts such as “fake,” “not real,” or even “shabby.” Artificial is such a harsh adjective.…
Added by Bill Schmarzo on October 30, 2017 at 6:30pm —
A data scientist needs to be critical and always on the lookout for something that others miss. So here is some advice that you can apply in your day-to-day data science work to get better at it:
1. Beware of the Clean Data Syndrome
You need to ask yourself questions even before you start working on the data. **Does this data make sense?** Falsely assuming that the data is clean could lead you toward the wrong hypotheses. Apart from that, you can discern a… Continue
Added by Rahul Agarwal on March 29, 2017 at 10:00am —
Essentially, good hypotheses lead decision-makers like you to new and better ways to achieve your business goals. When you need to make decisions such as how much you should spend on advertising or what effect a price increase will have on your customer base,… Continue
Added by Vinay Babu on January 27, 2017 at 6:00pm —
Ever wonder how you can create meaningful insights with your data? Tired of being asked what's your big data strategy, are you doing predictive analytics, or how you are making use of the latest technologies? There is a way for you to get what you need with Data Science. The key to new knowledge is to make sure you use Data Science in your organization.…
Added by Damian Mingle on January 4, 2017 at 5:30am —
It is not only about understanding statistics; it is also about applying the correct statistical approach or method. In this brief article I will showcase some common statistical blunders that we generally make and how to avoid them.
To make this information simple and consumable I have divided these errors into two parts:
- Data Visualization Errors (Erroneous Graphs)
- Statistical Blunders Galore (pun intended)
Added by Sunil Kappal on January 3, 2017 at 7:00am —
In the wake of ZMOT (Zero Moment of Truth), it becomes pivotal for any product company to choose the most appropriate advertisement channel for promoting its products. This not only helps organizations maximize their chances of creating the best first impression but also helps them to be discovered by today’s tech-savvy consumers.
Today we will talk about a very… Continue
Added by Sunil Kappal on December 30, 2016 at 8:30am —
At Grakn Labs we love technology. Here is our December 15th edition by Filipe Pinto Teixeira, where he looked back at Predictive Analytics.
Added by Raphaelle Rf on December 23, 2016 at 6:00am —