I want to know more about the Boltzmann and MCMC techniques from a very basic level, in layman's terms. Can someone guide me?
Added by Malay Kapoor on May 24, 2016 at 8:13am — No Comments
As a data scientist, your job doesn’t always make sense to others. Ever tried explaining what you do to your parents? They may nod their heads, but their eyes scream confusion.
Well, aside from possibly stifling job-related conversations, this isn’t a big deal. However, when it comes to explaining what you do to potential clients, who happen to be just as technology-averse, it’s a major issue.
Here are some helpful tips for explaining exactly what you do to…Continue
Added by Larry Alton on May 24, 2016 at 7:30am — No Comments
Since its inception in 2008, the global Hadoop market has grown at a tremendous pace. Valued at US$1.5 billion in 2012, the market is estimated to grow at a CAGR of 54.7% from 2012 to 2018 and could reach a net worth of US$20.9 billion by the end of 2018. With the massive amount of data generated every day across major industries, the global Hadoop market is expected to keep growing significantly in the future as well.
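For readers who want to check the growth arithmetic in this teaser, here is a minimal Python sketch. It assumes six annual compounding periods between 2012 and 2018; the function names are illustrative and not taken from the post. Compounding the quoted rate from the 2012 value lands close to the quoted 2018 figure, and the small gap may come from rounding or a different base-year convention in the underlying report.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value forward at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Constant annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the teaser (US$ billions); six periods is an assumption.
projected_2018 = project_value(1.5, 0.547, 6)
rate = implied_cagr(1.5, 20.9, 6)

print(f"Projected 2018 value: US${projected_2018:.1f} billion")
print(f"CAGR implied by the two quoted values: {rate:.1%}")
```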
Added by Ankit Jain on May 23, 2016 at 11:00pm — No Comments
There has been a lot of activity recently around revenue attribution - marketers want to develop a better understanding of their customer acquisition funnel and be able to measure progress against it. Most of this attention has been focused on the B2C space. However, less work has been done measuring the performance of B2B marketing activities.
Certainly the marketing automation segment is very vibrant with a large number of vendors (both big and small) providing solutions that…Continue
Added by Gregory Thompson on May 23, 2016 at 4:33pm — No Comments
This is one of the first comprehensive machine learning, data science, statistical science, and computer science repositories -- featuring many brand new scalable, big-data algorithms published in the last two years, such as automated cataloging, causation detection, or model-free tests of hypotheses, in addition to the classics. The original title for this project was Handbook of Data Science, but over time, it grew much bigger than a handbook. This is still an ongoing…Continue
Added by Vincent Granville on May 23, 2016 at 2:10pm — No Comments
Data science student project contributed by Brian. Brian took the NYC Data Science Academy 12-week full-time Data Science Bootcamp program from Sept 23 to Dec 18, 2015. The post is based on his first class project (due in the 2nd week of the program).
This project utilizes publicly available data to visualize…Continue
Added by SupStat on May 23, 2016 at 9:00am — No Comments
Big data is a term for data sets so large and complex that, only a few short years ago, they could not be processed with traditional data processing applications. Challenges in big data include capture, search, sharing, storage, transfer, visualization, querying and privacy, among other concerns. Data sets are growing rapidly because there are increasingly more avenues for data including mobile devices, software logs, cameras, microphones, wireless networks,…Continue
Added by Sam Carr on May 22, 2016 at 10:00am — No Comments
Thousands of articles and tutorials have been written about data science and machine learning. Hundreds of books, courses and conferences are available. You could spend months just figuring out what to do to get started, even to understand what data science is about.
In this short contribution, I share what I believe to be the most valuable resources - a small list of top resources and starting points. This will be most valuable to any data practitioner who has very little free…Continue
Added by Vincent Granville on May 20, 2016 at 11:30am — No Comments
Machine Learning today tends to be “open-loop” – collect tons of data offline, process them in batches and generate insights for eventual action. There is an emerging category of ML business use cases that are called “In-Stream Analytics (ISA)”. Here, the data is processed as soon as it arrives and insights are generated quickly. However, action may be taken offline and the effects of the actions are not immediately incorporated back into the learning process. If we did, it is an…Continue
Added by PG Madhavan on May 20, 2016 at 5:30am — No Comments
Today we are really happy to host a post from Ariadni-Karolina Alexiou, or Caroline for short. Caroline is a Data…Continue
Added by George Psistakis on May 20, 2016 at 3:00am — No Comments
At the Data Science Association our members often complain about the major data engineering problem of finding the right tools and programming models to build both robust data processing pipelines and efficient ETL processes for data transformation and integration.…
Added by Michael Walker on May 19, 2016 at 10:00pm — No Comments
Collaborative business intelligence is an environment in which users can communicate and collaborate with each other with ease, sharing information, ideas, and decision making in their communities.
Each and every day, no one holds on to the millions of data items of intellectual property (telephone calls, conversations, and e-mails) in companies and organizations across the world. Using important collaborative software to…Continue
Added by Priyanka Jain on May 19, 2016 at 9:00pm — No Comments
Marketing measurement has long been an arcane field - companies interested in understanding how their marketing programs impacted revenue (or brand value) would hire expensive consultants who labored long and hard to deliver complex models at great cost to help their clients set high level marketing strategies and advertising budgets.
This worked well until the internet came along and changed the game - new digital channels and online marketing techniques were embraced by…Continue
Added by Gregory Thompson on May 19, 2016 at 11:00am — No Comments
The healthcare industry was a pioneer in consistently applying data mining techniques and analytics procedures to identify areas subject to optimization and potential improvements of clinical practice. The research methodology was typically focused on accepting or discarding an initial…Continue
Added by Rafael San Miguel Carrasco on May 19, 2016 at 10:12am — No Comments
As part of the Data Science tutorial series, my previous post covered basic data types in R. I have kept the tutorial very simple so that beginners in R programming can take off immediately.
Please find the online R editor at the end of the post so that you can execute the code on the page itself.
In this section we learn about control structures loops used…
Added by dataperspective on May 18, 2016 at 8:30pm — No Comments
The bagged trees algorithm is a commonly used classification method. By resampling our data and creating trees for the resampled data, we can get an aggregated vote of classification predictions. In this blog post I will demonstrate how bagged trees work, visualizing each step.…Continue
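The resample-then-vote idea in this teaser can be sketched in a few lines of Python. This is an illustration, not the code from the linked post: it uses only the standard library, substitutes one-split decision stumps for full trees to stay short, and all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(points, labels):
    """Fit a one-split tree: choose the feature and threshold with fewest training errors."""
    best = None
    for f in range(len(points[0])):
        for threshold in sorted({p[f] for p in points}):
            left = [y for p, y in zip(points, labels) if p[f] <= threshold]
            right = [y for p, y in zip(points, labels) if p[f] > threshold]
            # Each side predicts its majority label; an empty right side reuses the left label.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, point):
    """Classify one point with a fitted stump."""
    f, threshold, left_label, right_label = stump
    return left_label if point[f] <= threshold else right_label


def bagged_fit(points, labels, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(points)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([points[i] for i in idx], [labels[i] for i in idx]))
    return stumps


def bagged_predict(stumps, point):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(s, point) for s in stumps)
    return votes.most_common(1)[0][0]


# Toy one-feature data set, made up for this example.
points = [(1.0,), (2.0,), (3.0,), (7.0,), (8.0,), (9.0,)]
labels = ["a", "a", "a", "b", "b", "b"]
model = bagged_fit(points, labels, n_trees=25, seed=0)
print(bagged_predict(model, (0.5,)), bagged_predict(model, (9.5,)))
```

Each bootstrap sample yields a slightly different stump, and the majority vote smooths out the occasional stump fitted on an unrepresentative sample.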
Added by Maiia Bakhova on May 18, 2016 at 2:12pm — No Comments
Starred articles are new additions posted between Thursday and Sunday, published in the Monday edition exclusively. The Monday edition has six sections: (1) Featured Resources and Technical Contributions, (2) Featured Articles and Case Studies, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content. The Thursday edition covers articles…Continue
Added by Vincent Granville on May 18, 2016 at 9:30am — No Comments
Here are three useful resources for learning about Data Science:
Added by Ujjwal Karn on May 18, 2016 at 8:59am — No Comments
I created an R package for exploratory data analysis. You can read about it and install it here.
The package contains several tools to perform initial exploratory analysis on any input dataset. It includes custom functions for plotting the data as well as performing different kinds of analyses such as univariate, bivariate and multivariate investigation which is the first step of any…Continue
Added by Ujjwal Karn on May 18, 2016 at 8:30am — No Comments
Recently, I rediscovered a TED Talk by David McCandless, a data journalist, called “The beauty of data visualization.” It’s a great reminder of how charts (though scary to many) can help you tell an actionable story about a topic in a way that bullet points alone usually cannot. If you have not seen the talk, I recommend you take a look for some inspiration about visualizing big…Continue