Big data refers to massive, highly disorganized data that is very difficult to analyze using conventional relational databases, and to the process of making sense of it. Drawn from various sources, it is delivered in several ways. This data, which is invaluable in today’s globalized world, is growing exponentially.
Because it affects every aspect of human life, companies can tap it to serve their customers better and move up the value chain. IBM estimates that…Continue
Added by Shravani Reddy on July 18, 2018 at 12:36am — No Comments
Spark’s primary abstraction is a distributed collection of items called a Resilient Distributed Dataset (RDD). It is a fault-tolerant collection of elements that can be operated on in parallel. RDDs can be created from Hadoop InputFormats (such as HDFS files) or by transforming other RDDs.
Added by Igor Bobriakov on July 17, 2018 at 11:07pm — No Comments
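To make the RDD teaser above a little more concrete, here is a minimal single-process sketch in plain Python. It is not Spark's API: `ToyRDD`, `from_list`, and the partitioning scheme are hypothetical names invented for illustration. It only shows two ideas from the description: a dataset is split into partitions, and a new dataset is defined by transforming an existing one. Because each dataset remembers how it was derived, its contents can be rebuilt on demand, which is the general idea behind recovering lost data by recomputation.

```python
from typing import Callable, List


class ToyRDD:
    """Hypothetical stand-in for an RDD; runs in one process, not on a cluster."""

    def __init__(self, build_partitions: Callable[[], List[list]]):
        # The recipe for (re)building the partitions, kept instead of the data.
        self._build_partitions = build_partitions

    @classmethod
    def from_list(cls, items, num_partitions: int = 2) -> "ToyRDD":
        data = list(items)
        # Round-robin split of the source data into partitions.
        return cls(lambda: [data[i::num_partitions] for i in range(num_partitions)])

    def map(self, fn) -> "ToyRDD":
        # Transforming a ToyRDD yields a new ToyRDD; nothing is computed yet.
        return ToyRDD(lambda: [[fn(x) for x in part] for part in self._build_partitions()])

    def filter(self, pred) -> "ToyRDD":
        return ToyRDD(lambda: [[x for x in part if pred(x)] for part in self._build_partitions()])

    def collect(self) -> list:
        # Runs the whole chain of recorded transformations and gathers the results.
        return [x for part in self._build_partitions() for x in part]


numbers = ToyRDD.from_list(range(1, 7))
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(sorted(even_squares.collect()))
```

In real Spark the partitions live on different machines and the transformations run in parallel; the sketch keeps only the shape of the programming model.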
Hello, with this article I'm starting a series of articles about full-featured C++ machine learning frameworks. This article covers how to use the Shogun library to solve a classification problem. Shogun is an open-source machine learning library that offers a wide range of machine learning algorithms. From my point of view it's not very popular among professionals, but it has a lot of fans among enthusiasts and…Continue
Added by Kyrylo Kolodiazhnyi on July 17, 2018 at 11:17am — No Comments
Summary: Getting an AI startup to scale for an IPO is currently elusive. Several different strategies are being discussed around the industry and here we talk about the horizontal strategy and the increasingly favored vertical strategy.
Added by William Vorhies on July 17, 2018 at 7:00am — No Comments
The pandas DataFrame makes life a lot easier if you work with data. A lot of data comes in CSV format, and reading a CSV file into a pandas DataFrame is quite easy with read_csv. In the video below we will learn the basics of loading a CSV file into a pandas DataFrame object. …Continue
Added by Erik Marsja on July 17, 2018 at 6:00am — No Comments
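As a small illustration of the read_csv teaser above, the sketch below loads CSV data into a DataFrame. The column names and values are made up for the example, and an in-memory string stands in for a file on disk so the snippet is self-contained; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Made-up sample data; a real workflow would read from a file path instead.
csv_text = "name,age\nAda,36\nLinus,28\n"

# read_csv accepts a path or any file-like object; the first row becomes the header.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (rows, columns)
print(list(df.columns))  # column names taken from the header row
print(df["age"].mean())  # numeric columns are parsed as numbers
```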
Natural language processing (NLP) is becoming very popular, a trend that is especially noticeable against the backdrop of advances in deep learning. NLP is a field of artificial intelligence aimed at understanding and extracting important information from text and at further training based on text data. The main tasks include speech…Continue
The age of technology is well past its nascent stage and has grown exponentially during the last decade. In my role, travelling to events and meeting with thought leaders, I stay aware of the developments made across different technological fronts.…Continue
Added by Ronald van Loon on July 17, 2018 at 2:45am — No Comments
Autonomous cars are racing down the highway at speeds exceeding 100 MPH when suddenly a car a half-mile ahead blows out a tire sending dangerous debris across 3 lanes of traffic. Instead of relying upon sending this urgent, time-critical distress information to the world via the cloud, the cars on that particular section of the highway use peer-to-peer, immutable communications to inform all vehicles in the area of the danger so that they can slow down and move to unobstructed…Continue
My nephew's a very impressive young man. Five years ago, he received a PhD in Biochemistry/Molecular Biology from a prestigious university, earning numerous teaching and research awards along the way. He then took a faculty…
One of the most difficult and most critical parts of implementing data science in business is quantifying the return on investment (ROI). In this article, we highlight three reasons you need to learn the Expected Value Framework, a framework that connects a machine learning classification model to ROI. …Continue
Added by Matt Dancho on July 16, 2018 at 9:42am — No Comments
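The teaser above does not spell out the Expected Value Framework, so the following is only a rough sketch of the usual expected-value idea for a classifier: weight the monetary value of each outcome (true positive, false positive, false negative, true negative) by how often it occurs. The confusion-matrix counts and dollar values are invented for illustration, and `expected_value` is a hypothetical helper, not something taken from the article.

```python
from typing import Dict


def expected_value(counts: Dict[str, int], values: Dict[str, float]) -> float:
    """Average monetary value per scored case, given outcome counts and values."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Sum (count * value) over all outcomes, then divide by the number of cases.
    return sum(counts[k] * values[k] for k in counts) / total


# Invented confusion-matrix counts for 100 scored customers.
counts = {"TP": 30, "FP": 10, "FN": 5, "TN": 55}
# Invented value per outcome: benefits are positive, costs are negative.
values = {"TP": 100.0, "FP": -20.0, "FN": -50.0, "TN": 0.0}

ev = expected_value(counts, values)
print(ev)  # 25.5 per customer under these assumed numbers
```

Comparing this per-case figure across candidate models or decision thresholds is one way to tie classifier performance to a business outcome.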
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around. Instead of transferring large and sensitive data over the network or losing accuracy by training ML models on sample CSV files, you can have your R/Python code execute within your database. You can work in Jupyter Notebooks,…Continue
Added by Kyle Weller on July 16, 2018 at 8:31am — No Comments
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.
Featured Resources and Technical Contributions …Continue
2018 is set to be the year data finally delivers for both businesses and consumers. Alex Comyn, chief strategy officer at Amaze, explores 8 key trends that are set to affect brands next year.
Throughout 2017 we’ve seen the use of data starting to come together to assist both consumers and businesses,…Continue
Added by tom on July 14, 2018 at 7:50pm — No Comments
Reading scientific papers critically is essential for data scientists working in some areas, especially those working in health. With that in mind, here are some key considerations in reading scientific (peer-reviewed, grey literature) papers:
Theory: Is the theory sound? Are there theoretical issues in the design that cause…Continue
Added by Howard Friedman on July 13, 2018 at 9:11am — No Comments
Added by Ziyad Nazem on July 13, 2018 at 8:08am — No Comments
Machine learning in finance may work magic, even though there is no magic behind it (well, maybe just a little bit). Still, the success of a machine learning project depends more on building efficient infrastructure, collecting suitable datasets, and applying the right algorithms.
Machine learning is making significant inroads in the financial services industry. Let’s see why financial companies should care, what solutions they can implement with AI and machine learning, and how exactly…Continue
Added by Tetiana Boichenko on July 13, 2018 at 4:12am — No Comments
Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. It was originally developed at UC Berkeley in 2009; Databricks was founded later, in 2013, by the creators of Spark.
The Spark engine runs in a variety of…Continue
Added by Igor Bobriakov on July 13, 2018 at 2:33am — No Comments
Even the best-trained soldiers can’t fulfill their mission empty-handed. Data scientists have their own weapons: machine learning (ML) software. There is already a cornucopia of articles listing reliable machine learning tools with in-depth descriptions of their functionality. Our goal, however, was to get the feedback of industry experts.
And that’s why we interviewed data science practitioners (gurus, really) about the useful tools they…Continue
Added by Kateryna Lytvynova on July 13, 2018 at 2:00am — No Comments
Big data is the present and the future in the world of technological innovation. Big data is a collection of small details and information points held in varying silos. It is a powerful technology that can bring about a significant digital transformation through its use of data.…Continue
Added by imranali on July 12, 2018 at 8:04pm — No Comments
Here is our selection of articles and technical contributions featured since Monday:
Added by Vincent Granville on July 12, 2018 at 9:00am — No Comments