Hadoop - MapReduce in an easy way
In the previous blog, we discussed HDFS, one of the main components of Hadoop. I highly recommend going through that blog before moving on to MapReduce. This blog will introduce you to MapReduce, which is…
Added by Aafrin Dabhoiwala on September 2, 2018 at 8:30am —
This blog gives a brief introduction to Hadoop for those who know next to nothing about this technology. Big Data is at the foundation of all the megatrends that are happening today, from social to the cloud to mobile devices to gaming. This blog will help build the foundation for taking the next step in learning this interesting technology. Let's get started:
1. What's Big Data?
Ever since…
Added by Aafrin Dabhoiwala on August 26, 2018 at 12:30pm —
With the rise of IoT (Internet of Things) devices, being able to analyze and visualize live streams of data is becoming more and more important. For example, you could have sensors like thermometers in machines, or portable medical devices like pacemakers, continuously streaming data to a streaming service like Kafka. PixieDust makes it easier to work with live data inside Jupyter Notebooks by providing simple integration APIs to both the PixieApp…
Added by Packt Publishing on August 16, 2018 at 1:30am —
Technology changes on a daily basis, transforming the rules like never before. Something that looked modern and fresh yesterday can appear dated seemingly overnight, and trends once dismissed as irrevocably passé can unexpectedly cycle back into vogue.
Over the past few years, AI, Big Data, and other disruptive technologies have already transformed many industries, but one notably positive change is in the travel sector. Maybe because travel industries…
Added by Nishtha Singh on August 13, 2018 at 3:04am —
Many predict, and warn, that the Artificial Intelligence (AI) Revolution will change the world – and possibly the very essence of mankind. But society-changing revolutions are not new. History is full of such revolutions. What can we learn from those previous revolutions that might provide an indication as to how this AI revolution might play out?
We will examine two other revolutions – the Industrial Revolution and the Information Revolution…
Added by Bill Schmarzo on June 26, 2018 at 3:30pm —
The performance of a neural network improves with an increasing volume of training data. With more and more devices generating data that can potentially be used for training and model generation, models are getting better at generalizing over the stochastic environment and handling complex tasks. However, with more data and more complex structures for deep neural networks, the computational requirements increase.
Even though we have started leveraging GPUs for deep neural network…
Added by Packt Publishing on June 13, 2018 at 12:30am —
Businesses are growing more digitized today. As this happens, cybersecurity threats are rising as well. Companies are at increasing risk, which is why they need help from big data analysis. In fact, KuppingerCole conducted a study entitled “Big Data and Information Security.” The study looks in depth at current deployment levels and the benefits of big data…
Added by Evan Morris on May 15, 2018 at 8:30pm —
I presented a new talk at "Codemotion Amsterdam 2018" this week. I discussed the relationship between Apache Kafka and Machine Learning, and how to build a Machine Learning infrastructure for extreme scale.
Long version of the title:
"Deep Learning at Extreme Scale (in the Cloud)
with the Apache Kafka Open Source Ecosystem - How to Build a Machine Learning Infrastructure with Kafka, Connect, Streams, KSQL, etc."
As always, I want to share the slide deck. The talk was…
Added by Kai Waehner on May 8, 2018 at 9:30pm —
After reviewing 8 great ETL tools for fast-growing startups, we got a request to tell you more about open source solutions. There are many open source ETL tools and frameworks, but most of them require writing code.…
Added by Luba Belokon on April 26, 2018 at 2:30am —
New reforms under the General Data Protection Regulation (GDPR) started as an attempt to standardize data protection regulations in 2012. The European Union intends to make Europe “fit for the digital age.” It took four years to finalize the agreements and…
Added by Ronald van Loon on March 26, 2018 at 10:30pm —
I was recently asked to conduct a 2-hour workshop for the State of California Senior Legislators on the topic of “Big Data, Artificial Intelligence and Privacy.” Honored by the privilege of offering my perspective on these critical topics, I shared with my home-state legislators how significant opportunities await the state. I reviewed the once-in-a-generation opportunities awaiting the great State of California (“the State”), where decision makers could vastly…
Added by Bill Schmarzo on February 13, 2018 at 5:30am —
Today I'm writing this post to explain how to perform geographic analysis and answer questions like: Which is the richest area in my city? How many people live in a given neighborhood?
You can do it by combining shapefiles with an Excel spreadsheet. Let's work through it together...
First of all, we need to install a Geographic Information System (GIS). I recommend QGIS, a free and open source GIS.
Added by Thiago Buselato Maurício on February 11, 2018 at 9:30am —
Organizations looking for justification to move beyond legacy reporting should review this little ditty from the healthcare industry:
The Institute of Medicine (IOM) estimates that the United States loses $750 billion annually to medical fraud, inefficiencies, and other siphons in the healthcare system…
Added by Bill Schmarzo on January 26, 2018 at 1:30pm —
“Big Data is dead.” “Big Data is passé.”
“We no longer need Big Data; we need Machine Learning now.”
As we end 2017 and look forward to big (data) things in 2018, the most important lesson of 2017 – in fact, maybe the most important lesson going forward – is that Big Data is NOT a thing. Big Data isn’t about the volume, variety or velocity of data any more than car…
Added by Bill Schmarzo on January 20, 2018 at 5:30am —
I was recently a guest lecturer at the University of California Berkeley Extension in San Francisco. On a lovely Saturday afternoon, the classroom was crowded with students of all ages learning the tools of the modern economy. The craftspeople of the “Analytics Revolution” were busy learning new skills and tools that will prepare them for this Brave New World of analytics. I was blown away by their dedication!
As we teach the next generation, it’s important…
Added by Bill Schmarzo on January 19, 2018 at 5:00am —
Most organizations’ IoT strategy looks like a game of ‘Twister’ with progress across important IoT capabilities such as architecture, technology, data, analytics and governance; variables comprising a series of random investments and decisions.…
Added by Bill Schmarzo on January 13, 2018 at 5:00am —
Social media provide a low-cost alternative source for public health surveillance, and health-related classification plays an important role in identifying useful information. We summarize recent classification methods that use social media in public health. These methods rely on the bag-of-words (BOW) model and have difficulty grasping the semantic meaning of texts. Unlike these methods, we present a word-embedding-based clustering method. Word embedding is one of the strongest trends in Natural…
Added by CD on January 4, 2018 at 10:30am —
Telling apart data scientists, data engineers, software engineers, and statisticians can be confusing and complicated. While all of them work with data in some way, there are underlying differences in the work they do and manage.
The growth of data and its usage across…
Added by Ronald van Loon on December 19, 2017 at 1:00am —
These days, it seems like every city is trying to implement “smart” initiatives. Take Singapore for example, now known for the most extensive effort to collect data on citizens’ daily living habits/routines ever attempted by a municipality. Even Bill Gates has pumped millions of dollars into helping Phoenix in their smart city efforts.
But what do we mean when we talk about a “smart city”? Is it the better use of resources within the city center, or the fact that all resources being…
Added by Ronald van Loon on December 8, 2017 at 1:30am —
I recently had another client conversation about optimizing their data warehouse and Business Intelligence (BI) environment. The client had lots of pride in their existing data warehouse and business intelligence accomplishments, and rightfully so. The heart of the conversation was about taking costs out of their reporting environments by consolidating runaway data marts and “spreadmarts,” and improving business analyst BI…
Added by Bill Schmarzo on November 18, 2017 at 7:00am —