Igor Bobriakov's Blog (44)

Intro to Data Science for Managers [Mindmap]

Data science has become an integral part of many modern projects and businesses, with an increasing number of decisions now based on data analysis. The data science industry is experiencing an acute …

Added by Igor Bobriakov on November 14, 2018 at 7:30am — 4 Comments

U.S. Startups Locations Analysis

Data analysis can give your business a competitive advantage by helping you better understand your product, customers, and competitors. Data visualization is an integral part of that analysis: it surfaces valuable information and helps readers comprehend and interpret it correctly.

Today we…

Added by Igor Bobriakov on September 3, 2018 at 9:57pm — No Comments

Hadoop 3: Comparison with Hadoop 2 and Spark

The release of Hadoop 3 in December 2017 marked the beginning of a new era for data science. The Hadoop framework is at the core of the entire Hadoop ecosystem, and various other libraries strongly depend on it.

In this article, we will discuss the major changes in Hadoop 3 when compared to…

Added by Igor Bobriakov on August 27, 2018 at 12:30am — 1 Comment

Top 7 Data Science Use Cases in Healthcare

Medicine and healthcare form a promising industry for implementing data science solutions. Data analytics is moving medical science to a whole new level, from computerizing medical records to drug discovery and genetic disease exploration. And this is just the beginning.…

Added by Igor Bobriakov on August 22, 2018 at 10:30am — No Comments

Top 9 Data Science Use Cases in Banking

Using data science in the banking industry is more than a trend; it has become a necessity for keeping up with the competition. Banks have to realize that big data technologies can help them focus their resources efficiently, make smarter decisions, and improve performance.

Here is a list of…

Added by Igor Bobriakov on August 20, 2018 at 10:00am — No Comments

Practical Apache Spark in 10 minutes. Part 6 - GraphX

In our last post, we explained the basics of streaming with Spark. Today, we want to talk about graphs and explore …

Added by Igor Bobriakov on August 17, 2018 at 1:00am — No Comments

Comparison of the Top Cloud APIs for Computer Vision

What is computer vision?

Nowadays, computer vision (CV) is one of the most widely used areas of machine learning. The main task of computer vision is to understand the contents of an image. It is used in almost all spheres of…

Added by Igor Bobriakov on August 16, 2018 at 1:02am — No Comments

Comparison of the Most Useful Text Processing APIs

Text processing is developing rapidly, and several big companies offer products that handle a diverse range of text processing tasks. If you need to do some text processing, there are two options available. The first is to develop the entire system on your own from scratch. This approach proves to be very time and…

Added by Igor Bobriakov on August 10, 2018 at 2:57am — 2 Comments

A Comparative Analysis of Top 6 BI and Data Visualization Tools in 2018

Nowadays, there is a huge list of powerful visualization tools to help you illustrate your ideas,…

Added by Igor Bobriakov on August 9, 2018 at 12:30am — 1 Comment

Top 15 Scala Libraries for Data Science in 2018

In our previous articles, we have discussed the top Python libraries for data science. This time we will focus on Scala, which has recently become another prominent language for data scientists. It has gained popularity mostly due to the rise of Spark,…

Added by Igor Bobriakov on August 7, 2018 at 12:16am — No Comments

Practical Apache Spark in 10 minutes. Part 5 - Streaming

Spark is a powerful tool that can be applied to many interesting problems, some of which we discussed in our previous posts. Today we will consider another important application: streaming. Streaming data is data that arrives continuously as small records from different sources. There are many use cases for streaming technology…

Added by Igor Bobriakov on July 30, 2018 at 3:53am — No Comments

Top 10 Data Science Use Cases in Retail

Nowadays, data is a powerful driving force in industry. Big companies across diverse sectors seek to capture the value that data offers.

Thus, data has become very important for anyone who wants to make profitable business decisions. Moreover, a…

Added by Igor Bobriakov on July 26, 2018 at 8:00am — No Comments

Practical Apache Spark in 10 minutes. Part 4 - MLlib

The vast possibilities of artificial intelligence are attracting increasing interest in modern information technology. One of its most promising and fastest-evolving directions is machine learning (ML), which is becoming an essential part of many aspects of our lives. ML has found successful applications in Natural Language Processing, Face…

Added by Igor Bobriakov on July 24, 2018 at 10:12pm — No Comments

Practical Apache Spark in 10 minutes. Part 3 - DataFrames and SQL

Spark SQL is the part of the Apache Spark big data framework designed for processing structured and semi-structured data. It provides a DataFrame API that simplifies and accelerates data manipulation. A DataFrame is a special type of object, conceptually similar to a table in a relational database. It represents a distributed collection…

Added by Igor Bobriakov on July 18, 2018 at 10:01pm — No Comments

Practical Apache Spark in 10 minutes. Part 2 - RDD

Spark’s primary abstraction is a distributed collection of items called a Resilient Distributed Dataset (RDD). It is a fault-tolerant collection of elements that can be operated on in parallel. RDDs can be created from Hadoop InputFormats (such as HDFS files) or by transforming other RDDs.

Creating…

Added by Igor Bobriakov on July 17, 2018 at 11:07pm — No Comments

Comparison of Top 6 Python NLP Libraries

Natural language processing (NLP) is becoming very popular, a trend that is especially noticeable against the backdrop of advances in deep learning. NLP is a field of artificial intelligence aimed at understanding and extracting important information from text and further training based on text data. The main tasks include speech…

Added by Igor Bobriakov on July 17, 2018 at 3:00am — 2 Comments

Practical Apache Spark in 10 minutes. Part 1 - Ubuntu installation

Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. It was originally developed at UC Berkeley in 2009; Databricks was founded later, in 2013, by the creators of Spark.

The Spark engine runs in a variety of…

Added by Igor Bobriakov on July 13, 2018 at 2:33am — No Comments

Top 10 Data Science Use Cases in Insurance

The insurance industry is regarded as one of the most competitive and least predictable business spheres. It is inherently tied to risk and has therefore always depended on statistics. Nowadays, data science has changed this dependence forever.

Now, insurance companies have a wider range of…

Added by Igor Bobriakov on July 11, 2018 at 11:18pm — 1 Comment

Installation and running Ubuntu Virtual Box

Oracle VM VirtualBox is a suite of applications, system services, and drivers that emulates computer hardware inside the operating system where VirtualBox is installed. Almost any operating system can be installed on a virtual machine. For example, on a physical computer running Windows, you can create a virtual machine running Linux and use both operating systems simultaneously. That is what we will do in this article.…

Added by Igor Bobriakov on July 6, 2018 at 3:28am — No Comments
