February 2018 Blog Posts (98)

The Concept of Datafication; Definition & Examples

This article was written by Margarita Shilova.

Datafication is a buzzword of the last several years that is used actively across the Big Data industry. Honestly, if you search for the term ‘datafication’ on the internet, you probably won’t find much relevant information about it, yet it is a word we are…


Added by Amelia Matteson on February 6, 2018 at 6:30pm — No Comments

Curve Fitting using Linear and Nonlinear Regression

This article was written by Jim Frost.

In regression analysis, curve fitting is the process of specifying the model that provides the best fit to the specific curves in your dataset. Curved relationships between variables are not as straightforward to fit and interpret as linear relationships.…


Added by Amelia Matteson on February 6, 2018 at 6:00pm — 1 Comment

Why is it so hard to train data scientists?

Some time ago I met a colleague who expressed her disappointment with two data scientists she had hired. These were the first employees with a data science degree hired by that company, and they apparently did not meet the high expectations. She felt that in some cases the data scientists did not do any work she could not have done without them, and in other cases they did not provide very useful insights.


I do not have specific information about the training and background of these two data…


Added by Lior Shamir on February 6, 2018 at 4:30pm — 9 Comments

Most Data Science Job Ads are Targeted to Young People - And How to Get a Job at Facebook

If you are on Facebook, you've probably seen their ads to attract candidates to data science positions. You've probably seen ads featuring many young women working at Facebook in data science roles, and while I think this is great, it is a misrepresentation of their workforce (70% are males, in engineering positions).

What is surprising, though, is that you would think these ads would be targeted to people who can relate to the picture featured in the ad. Yet, old white males see the…


Added by Vincent Granville on February 6, 2018 at 1:30pm — 1 Comment

Cliff Notes for Managing the Data Science Function

Summary: An increasing number of larger companies have truly embraced advanced analytics and deploy fairly large numbers of data scientists. Many of these same companies are the ones beginning to ask about using AI. Here are some observations and tips on the problems and opportunities associated with managing a larger data science function.



Added by William Vorhies on February 6, 2018 at 8:09am — 3 Comments

TimeSeries.OBeu v1.2.2 release on CRAN

We are very pleased to announce TimeSeries.OBeu v1.2.2 on CRAN!

TimeSeries.OBeu is used on a data mining tool platform with OpenCPU integration of R and JavaScript to estimate and return the needed…


Added by Kleanthis Koupidis on February 6, 2018 at 3:00am — No Comments

So, How Many ML Models You Have NOT Built?

What a weird question. That’s what you would have thought after reading the headline. Perhaps you thought the word “NOT” was accidental.

Hmm, for the past few years many of us have come across articles like

  • “Top 10 Machine Learning Algorithms every Data Scientist …

Added by Venkat Raman on February 6, 2018 at 12:30am — 3 Comments

Weekly Digest, February 5

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

  • Enhance your data analysis skills with an online degree from Penn State World Campus. Our programs focus on data and data management. Learn to collect, classify,…

Added by Vincent Granville on February 4, 2018 at 2:00pm — No Comments

AI’s Impact on Retail: Examples of Walmart and Amazon

Artificial Intelligence or AI is expected to be in major demand by retail consumers due to its ability to make interactions in retail as flawless and seamless as possible. Many of us do realize the potential of AI and all that it is capable of, along with the support of Machine Learning or ML, but don’t realize…


Added by Ronald van Loon on February 4, 2018 at 1:30pm — No Comments

Are the Digits of Pi Truly Random? - Updated with Blockchain Application and More

This article covers far more than the title suggests. It is written in simple English and accessible to quantitative professionals from a variety of backgrounds. Deep mathematical and data science research (including a result about the randomness of Pi, which is just a particular case) is presented here, without using arcane terminology or complicated equations.

The topic discussed here, under a unified framework, is at the intersection of mathematics, probability theory, chaotic…


Added by Vincent Granville on February 4, 2018 at 9:30am — 6 Comments

Handbook of Research on Predictive Modeling and Optimization Methods in Science and Engineering


The disciplines of science and engineering rely heavily on the forecasting of prospective constraints for concepts that have not yet been proven to exist, especially in areas such as artificial intelligence. Obtaining quality solutions to the problems presented becomes increasingly difficult due to the number of steps required to sift through the possible solutions, and the ability to solve such problems relies on the recognition of…


Added by Sanjiban Sekhar Roy on February 3, 2018 at 9:30am — No Comments

Hottest Research Projects

As a potential Ph.D. candidate in data analytics and an AI enthusiast and architect, I wondered why I cannot simply ask an AI the question "Which research topic would you recommend I dig into?" That is a bit vague, though, so I decided to be more specific in order to simplify the problem for the AI: which research topic is getting the most grant funding?

I think an AI should at least be able to answer this question if you give it a data source that it can parse/read from…


Added by Emad Mohamed on February 2, 2018 at 4:00pm — No Comments

The one critical skill many data scientists are missing

This article was written by Emma Walker. Emma is a data scientist at Qriously.

When I started to learn about data science and consider it as a career choice, there was a diagram that I came across regularly and still come across today, in articles and even textbooks aimed at introducing and educating the world about the “sexiest job of the 21st century.” First created by Drew Conway, it illustrates the three broad skill groups you need to be a data scientist.…


Added by Emmanuelle Rieuf on February 2, 2018 at 9:00am — 2 Comments

Paradox Regarding Random (Normal) Numbers

I am investigating whether some numbers like Pi or SQRT(2) are normal in base 2 or 10, that is, whether any sequence of digits appears with the expected frequency in their digit representation. Actually, I am even more interested in the nested square root representation (see here) where "digits" are either 0 (with probability 0.43), 1 (with probability 0.30) or 2 (with…


Added by Vincent Granville on February 2, 2018 at 7:30am — No Comments
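
As a rough illustration of the frequency check described in the excerpt above, the following Python sketch counts how often each single digit appears in the first 10,000 base-10 digits of SQRT(2). It is not the author's method: the function names `sqrt2_digits` and `digit_frequencies` are hypothetical, and the sample size is an arbitrary assumption. Single-digit frequencies near 0.1 are only a necessary condition for normality, not a proof of it, and the sketch does not cover the nested square root representation.

```python
from collections import Counter
from math import isqrt


def sqrt2_digits(n: int) -> str:
    """Return the first n base-10 digits of sqrt(2) after the decimal point."""
    # isqrt(2 * 10**(2n)) equals floor(sqrt(2) * 10**n), computed exactly
    # with integer arithmetic, so no floating-point rounding is involved.
    scaled = isqrt(2 * 10 ** (2 * n))
    return str(scaled)[1:]  # drop the leading integer part "1"


def digit_frequencies(digits: str) -> dict[str, float]:
    """Return the observed relative frequency of each digit 0-9."""
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in "0123456789"}


# Hypothetical sample size; a larger n gives tighter frequency estimates.
digits = sqrt2_digits(10_000)
freqs = digit_frequencies(digits)

for digit, freq in freqs.items():
    # For a number that is normal in base 10, each value should be close to 0.1.
    print(digit, round(freq, 4))
```

The same idea extends to blocks of two or more digits by counting overlapping substrings, where the expected frequency of each block of length k would be 10^-k.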

Apache Spark with Scala- Learning Path Decoded

When you are going back and forth on learning Big Data and Apache Spark, most of the resources on the Internet will point you towards Scala. The programming language has earned credits for itself lately and is the most talked about language in the Big Data arena.

The number of job postings demanding Scala as a skill has grown exponentially on the job search platform Indeed since 2015. But learning a new programming…


Added by Samual Alister on February 2, 2018 at 1:30am — 3 Comments

Why 80% U.S Companies Will Be Shifted Towards Cloud Based Apps

According to the research team at IBM, about 80% of the companies in the U.S. today want to use more cloud managed services. There are two main reasons for this.…


Added by Elianna Hyde on February 1, 2018 at 7:00pm — No Comments

Thursday News: Feature Reduction, Causation, Math Challenges, Qualitative Data

Here is our selection of featured articles and resources posted since Monday:


Added by Vincent Granville on February 1, 2018 at 10:00am — No Comments

Using SQL to Estimate Customer Lifetime Value (LTV) without Machine Learning

The Statsbot team estimated LTV 592 times for different clients and business models. 

Customer lifetime value, or LTV, is the amount of money that a customer will spend with your business in their “lifetime,” or at least, in the portion of it that they spend in a relationship with you. It’s an important indicator of how much you can spend on acquiring new customers. For example, your customer acquisition cost (CAC) is $150, and LTV is…


Added by Luba Belokon on February 1, 2018 at 8:30am — No Comments
