All Blog Posts (6,674)

Math for Data Science in One Picture: What do you REALLY need to study?

In my last blog post, I covered the statistics you need to know for data science. But of course, stats isn't the only math-related knowledge you need. Rather than offer my own biased opinion about the importance of this subject vs. that one, I performed a meta-analysis of popular opinion to see what data scientists and educators are saying (see…


Added by Stephanie Glen on December 14, 2019 at 7:17am — No Comments

Thursday News, December 12

Here is our selection of featured articles and technical resources posted since Monday:

Technical Resources


Added by Vincent Granville on December 12, 2019 at 12:00pm — No Comments

The Rise of Fake News: A Machine Learning Challenge

Guest blog by Faruqui Ismail and NookaRaju Garimella.

Reporters with various forms of "fake news" from an 1894 illustration by Frederick Burr…


Added by Vincent Granville on December 12, 2019 at 10:31am — No Comments

2020 Trends, Predictions and Challenges for Data Management and Privacy

As we move into 2020, data management will continue to advance and develop efficiencies that make the job of having data ready for business purposes faster and more reliable than ever. While data management is a diverse field in its practices, four trends will be at the forefront in 2020:

  • Data Orchestration – The uniting of data integration, API integration, and data movement to support DataOps techniques. This involves combining multiple…

Added by Todd Wright on December 12, 2019 at 6:00am — No Comments

Performance evaluation of cloud computing platforms for Machine Learning

A use case on Logistic regression training

Over the last few years there have been several efforts to build more powerful computing platforms to face the challenges imposed by emerging applications like machine learning. General-purpose CPUs have been developed with specialized ML modules, GPUs and FPGAs with specialized engines are…


Added by Chris Kachris on December 12, 2019 at 4:30am — No Comments

Web Scraping with a Headless Browser: A Puppeteer Tutorial

Web development has moved at a tremendous pace in the last decade, with many frameworks emerging for both backend and frontend development. Websites have become smarter, and so have the underlying frameworks used to develop them. These advancements have also driven the development of the browsers themselves. Most browsers are now available in a “headless” version, where a user can interact with a website without any UI. You…


Added by Sandra Moraes on December 11, 2019 at 8:00pm — No Comments

Optimal Binning for Scoring Modeling (R Package)

What is Binning?

Binning is the term used in scoring modeling for what is also known in Machine Learning as Discretization, the process of transforming a continuous characteristic into a finite number of intervals (the bins), which allows for a better understanding of its distribution and its relationship with a binary variable. The bins generated by this process will eventually become the attributes of a…
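To make the idea of discretization concrete, here is a minimal Python sketch. It is not the optimal binning algorithm of the R package the post describes; it assumes simple equal-width intervals, and the function names and the toy `ages` / `defaulted` data are hypothetical, chosen only for illustration. It shows how a continuous characteristic is mapped to a finite number of bins and how each bin relates to a binary variable.

```python
from collections import defaultdict


def equal_width_bins(values, n_bins):
    """Return the bin index (0 .. n_bins - 1) of each value.

    Assumption: equal-width intervals between min and max. An optimal
    binning procedure would choose the cut points differently.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into one bin.
        return [0 for _ in values]
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def event_rate_by_bin(values, targets, n_bins):
    """Share of positive outcomes (target == 1) inside each bin."""
    bins = equal_width_bins(values, n_bins)
    counts = defaultdict(int)
    events = defaultdict(int)
    for b, t in zip(bins, targets):
        counts[b] += 1
        events[b] += t
    return {b: events[b] / counts[b] for b in sorted(counts)}


# Hypothetical toy data: a continuous characteristic and a binary outcome.
ages = [22, 25, 31, 38, 44, 47, 53, 59, 64, 70]
defaulted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

rates = event_rate_by_bin(ages, defaulted, n_bins=3)
print(rates)
```

Each key of `rates` is a bin index and each value is the event rate in that bin, which is the kind of per-bin summary a scoring model would turn into attributes.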


Added by Vincent Granville on December 11, 2019 at 11:12am — No Comments

The Rise of Fake News. A Machine Learning challenge!

By Faruqui Ismail and NookaRaju Garimella

Reporters with various forms of "fake news" from an 1894 illustration by Frederick Burr Opper


We’ve always pictured the rise of artificial intelligence as…


Added by Faruqui Ismail on December 11, 2019 at 8:00am — No Comments

Machine Learning Market is Rising Due to Rapid Increase in Unstructured Data

The Global Machine Learning Market is expected to expand at 42.08% CAGR during the forecast period 2018–2024. Machine learning is a branch of artificial intelligence (AI) that uses statistical techniques for analytical model building, giving computers the ability to learn from data instead of being…
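As a rough illustration of what a compound annual growth rate implies, the sketch below computes the total growth factor from a CAGR and recovers the CAGR from start and end values. The excerpt gives no market size, so the code works with a unitless factor; treating 2018–2024 as six compounding years is an assumption, and the function names are hypothetical.

```python
def compound_growth_factor(cagr: float, years: int) -> float:
    """Total growth multiple after compounding `cagr` for `years` years."""
    return (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Inverse relation: the constant yearly rate linking two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Assumption: the 2018-2024 forecast period spans six compounding years.
YEARS = 6
factor = compound_growth_factor(0.4208, YEARS)
print(f"Growth multiple over {YEARS} years: {factor:.2f}x")
print(f"Recovered CAGR: {implied_cagr(1.0, factor, YEARS):.4f}")
```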


Added by Ehtesham Peerzade on December 11, 2019 at 2:30am — No Comments

It Is Never Too Late To Learn!

The article by Stephanie Glen in the November 30 DSC Newsletter is spot on! I am a 77-year-old Data Scientist, and I have done my best work since I “retired” in 2009. Since then, I have published 3 books on Data Science topics with Academic Press, and a 4th book is in press at Cambridge University Press. I began teaching Data Science at the University of California at Irvine in 2012. All of my students are international (in an international program at UCI), and almost all of them…


Added by Robert Nisbet on December 10, 2019 at 6:14pm — No Comments

Make Crucial Predictions as Data Comes

If you walk the hottest IT streets these days, you have likely heard about Streaming Machine Learning, i.e., moving AI toward streaming scenarios and exploiting real-time capabilities along with new Artificial Intelligence techniques. You will also notice the lack of research on this topic, despite the growing interest in it.

If we investigate a little deeper, we realize that…


Added by Valeria on December 10, 2019 at 7:30am — No Comments

Why Event Stream Processing Is Leading the New Big Data Era

Big Data is probably one of the most misused words of the last decade. It was widely promoted, discussed, and spread around by business managers, technical experts, and experienced academics. Slogans like “Data is the new oil” were widely accepted as unquestionable truth.

These beliefs pushed  technologies forward. Its stack, formerly developed by Yahoo! and now owned by the Apache Software Foundation, was recognized as “The” Big Data…


Added by Valeria on December 10, 2019 at 7:21am — No Comments

Deep Analytics: Risk Management with AI

We first provide a mini-tutorial on Adjoint Algorithmic Differentiation (AAD) (also known as back-propagation in machine learning). We then illustrate how neural networks may be used to compute dynamic values and risks of trading books, with applications to risk management of derivatives, valuation adjustments (XVA), counterparty credit risk, FRTB and SIMM margin valuation adjustments (MVA). We also describe new techniques to substantially improve deep learning on simulated data, and…
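For readers new to AAD, the following is a minimal sketch of reverse-mode (adjoint) differentiation in plain Python. It is not the authors' implementation; the `Var` class and the `var_exp` and `backward` helpers are hypothetical names introduced only for illustration. Each operation records its inputs and local derivatives, and a single backward sweep accumulates adjoints from the output to the inputs.

```python
import math


class Var:
    """A value that remembers how it was computed (a tiny 'tape' node)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.adjoint = 0.0      # d(output) / d(this node), set by backward()

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def var_exp(x):
    """exp(x) for a Var; the local derivative is exp(x) itself."""
    v = math.exp(x.value)
    return Var(v, ((x, v),))


def backward(output):
    """Propagate adjoints from the output back to every input."""
    order, seen = [], set()

    def visit(node):
        # Depth-first post-order: parents are listed before their children.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.adjoint = 1.0
    # Reverse order guarantees a node's adjoint is complete before it is used.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.adjoint += local * node.adjoint


# f(x, y) = x * y + exp(x), so df/dx = y + exp(x) and df/dy = x.
x, y = Var(2.0), Var(3.0)
f = x * y + var_exp(x)
backward(f)
print(f.value, x.adjoint, y.adjoint)
```

One backward sweep yields the derivatives with respect to all inputs at once, which is why the adjoint approach is attractive when a single output depends on many inputs.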


Added by Antoine Savine on December 10, 2019 at 1:30am — No Comments

Fun with maps: Part 2

Last time we created a beautiful map with a lot of features, see here. This time I will show you how to customize different things. I use the same data.

map1 = folium.Map(location=[10, 20], zoom_start=2, tiles='{z}/{x}/{y}.png', attr="Dr.Katharina Glass")

Let’s start with marker.…


Added by Dr. Katharina Glass on December 9, 2019 at 10:00pm — No Comments

CPU Vendors Compete Over Memory Bandwidth to Achieve Leadership in Real-World Application Performance

By Rob Farber

Now is a great time to be procuring systems, as vendors are finally addressing the memory bandwidth bottleneck. Succinctly, memory performance dominates the performance envelope of modern devices, be they CPUs or GPUs. [i] It does not matter if…


Added by Rob Farber on December 9, 2019 at 10:00am — No Comments

Statistics for Data Science in One Picture

There's no doubt about it, probability and statistics is an enormous field, encompassing topics from the familiar (like the average) to the complex (…


Added by Stephanie Glen on December 9, 2019 at 7:48am — No Comments

Self Aware Streaming


1. Problem Statement

By processing data in motion, real-time/stream processing enables you to gain insight into your business and make vital decisions.

Challenges in stream processing:

  • Over-provisioning of resources for…

Added by Daljeet Kaur on December 9, 2019 at 2:30am — No Comments

Android App Development: Tips for AI Integration

Android app development has gone through a plethora of ground-breaking changes since it first emerged on the scene. So has technology in general, and this persistent evolution has produced several pioneering tools that now play a crucial role in the world. Take artificial intelligence, for example. AI, which combines human-like aptitude with machine learning, has now firmly established itself as a critical technology for every industry in the world. And…


Added by Ryan Williamson on December 8, 2019 at 10:39pm — No Comments

Data Cleaning and Wrangling With R

Originally posted by Michael Grogan.

One of the big issues when working with data in any context is cleaning and merging datasets. You will often find yourself having to collate data across multiple files, and will need to rely on R to carry out functions that you would normally carry out using commands like VLOOKUP in Excel.
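The post itself works in R; purely to illustrate the VLOOKUP idea, here is a minimal Python sketch of a lookup-style merge between two small tables. The function name `vlookup_merge` and the toy `sales` / `customers` rows are hypothetical. Like VLOOKUP, it takes the first matching row in the lookup table and leaves a default where no match exists.

```python
def vlookup_merge(left_rows, right_rows, key, default=None):
    """Attach the columns of `right_rows` to `left_rows`, matched on `key`.

    Rows are plain dicts. Left rows with no match get `default` values,
    which mirrors a left join.
    """
    lookup = {}
    for row in right_rows:
        # Keep the first row seen for each key, as VLOOKUP does.
        lookup.setdefault(row[key], row)

    right_cols = [c for c in (right_rows[0] if right_rows else {}) if c != key]

    merged = []
    for row in left_rows:
        match = lookup.get(row[key])
        extra = {c: (match[c] if match else default) for c in right_cols}
        merged.append({**row, **extra})
    return merged


# Hypothetical toy data standing in for two separate files.
sales = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 75.5},
    {"customer_id": 4, "amount": 10.0},
]
customers = [
    {"customer_id": 1, "region": "North"},
    {"customer_id": 2, "region": "South"},
]

merged = vlookup_merge(sales, customers, key="customer_id")
for row in merged:
    print(row)
```

The unmatched `customer_id` 4 keeps its row with `region` set to `None`, which makes missing lookups easy to spot during cleaning.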



Added by Vincent Granville on December 8, 2019 at 7:50pm — No Comments

What are the Core Skills needed in Data Science?

“Data are becoming the new raw material of business.” -Craig Mundie, Senior Advisor to the CEO at Microsoft

In a fast-paced, technology-driven world, data becomes ‘the new oil’, flowing like a bloodline into every business decision and strategy, such as launching a new product, expanding a new line of assembly, improving the…


Added by Ariane Rose Reyes on December 8, 2019 at 4:30pm — No Comments


© 2019 Data Science Central ®