Featured Blog Posts – May 2015 Archive (61)

Ontology for Data Science

When I returned to university to do a graduate degree, I was interested to discover how certain terms are subject to "intellectual interpretation." A word that I was asked to explain during one of my earliest classes was "ontology." Since this term was absent from my dictionary, I originally confused it with "oncology." I faintly recall that oncology involves the study of tumors. After consulting a few sources, I said that ontology is the study of how things come to exist or into being. I…


Added by Don Philip Faithful on May 30, 2015 at 6:17am — No Comments

Healthcare Industry Finds New Solutions to Big Data Storage Challenges

Hospitals and medical centers have more to gain from big data analytics than perhaps any other industry. But as data sets continue to grow, healthcare facilities are discovering that success in data analytics has more to do with storage methods than with analysis software or techniques. Traditional data silos are hindering the progress of big data in the healthcare industry, and as terabytes turn into petabytes, the most successful hospitals are the ones that are coming up…


Added by Nick Rojas on May 29, 2015 at 1:00pm — 1 Comment

Alchemy and Algorithmic lawyers

Alchemy is a fascinating pseudo-science. Although it has long been discredited as a real science, its fundamental principles still form the basis of many contemporary scientific theories. Alchemy is based on the premise that nothing in the universe is devoid of existential elements, and that these elements can be manipulated and even transmuted into other forms. The fabled Philosopher’s Stone is said to fulfil one of the main objectives of Alchemy, a legendary Alchemical substance said to be capable of turning…


Added by Mkhuseli Mthukwane on May 29, 2015 at 1:30am — 1 Comment

Weekly Digest - June 1

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.



Added by Vincent Granville on May 27, 2015 at 9:00am — No Comments

Simple Regression use in Big Data

We have witnessed the rise of the key-value pair since the emergence of Big Data. We can explore the relationship between two such variables, X and Y, using data science. In its basic form, regression likewise describes the relationship between two variables, X and Y. These variables are:

Independent Variables & Dependent Variables

Let us take behavior of users of a…
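
As a rough illustration of the idea (not the author's own example, which is cut off above), here is a minimal sketch of simple regression on key-value data. The function name `fit_simple_regression` and the sample `pairs` are hypothetical; the key is treated as the independent variable X and the value as the dependent variable Y.

```python
from statistics import mean


def fit_simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical key-value data: key = X (independent), value = Y (dependent)
pairs = {1: 3.0, 2: 5.0, 3: 7.0, 4: 9.0}
slope, intercept = fit_simple_regression(list(pairs.keys()), list(pairs.values()))
print(f"y = {intercept:.1f} + {slope:.1f} * x")
```

The sample data lie exactly on a line, so the fit recovers it; real key-value data would scatter around the fitted line.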


Added by Atif Farid Mohammad on May 25, 2015 at 6:00am — No Comments

How to determine the quality and correctness of classification models? Part 2 - Quantitative quality indicators

Basic quantitative quality indicators

In the last part of the tutorial we introduced the basic qualitative model quality indicators. Let us recall them now:

  • TP – True Positive – the number of observations correctly assigned to the positive class

    Example: the model’s predictions are correct and resigning customers have been…
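
The excerpt is cut off after TP, so the following is only a sketch of how such counts are usually tallied. It assumes the three standard companions of TP (FP, TN, FN) and uses hypothetical labels in which 1 marks the positive class (for instance, a resigning customer).

```python
def confusion_counts(actual, predicted, positive=1):
    """Tally TP, FP, TN and FN for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical labels: 1 = customer resigned, 0 = customer stayed
actual = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```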

Added by Algolytics on May 23, 2015 at 6:00pm — No Comments

Virtual Org and Behaviour by Transaction

In Java programming, there is the idea of a "virtual machine." A virtual machine is a computer system that doesn't exist in real life. Yet programs can be written for it. The code is interpreted by a runtime environment. Through this arrangement, Java programs can operate on different operating systems rather than one exclusively. Depending on one's background, the concept of a "…


Added by Don Philip Faithful on May 23, 2015 at 6:31am — No Comments

Four successful big data / analytics startups in Seattle

These companies gather and process gigantic amounts of data to serve their clients and/or users. They make money by selling summarized, processed, real-time data. They are poised to succeed in the IoT (Internet of Things) revolution, leveraging all sorts of devices and APIs to gather data, and

  • send alerts to users via text messages or other technology
  • sell intelligence extracted from data, to other businesses

It is worth spending some time figuring out…


Added by Mirko Krivanek on May 22, 2015 at 8:00pm — No Comments

How Apple Uses Big Data To Drive Success

Apple’s old slogan was “Think Different” – and while it is now retired, and the ethos may not be as apparent in the company’s products as it once was, it is true for their approach to Big Data.

In some ways, despite being the most profitable tech company in the…


Added by Bernard Marr on May 22, 2015 at 1:30pm — 1 Comment

Big Data: Uncovering The Secrets of Our Universe At CERN

CERN is best known these days as the research organization which operates the Large Hadron Collider – the largest and most complicated science experiment ever undertaken, which aims to explain mysteries behind the creation of the universe.

That’s not where…


Added by Bernard Marr on May 22, 2015 at 1:30pm — 3 Comments

10 Python Machine Learning Projects on GitHub

Here is a list of top Python Machine learning projects on GitHub. A continuously updated list of open source learning projects is available on Pansop.



scikit-learn is a Python…


Added by Pansop on May 21, 2015 at 8:00pm — 2 Comments

Data Integrity: The Rest of the Story Part II

Buzz words are one of my least favorite things, but as buzz words go, I can appreciate the term “Data Lake.” It is one of the few buzz words that communicates a meaning very close to its intended definition. As you might imagine, with the advent of large scale data processing, there would be a need to name the location where lots of data resides, ergo, data lake. I personally prefer to call it a series of redundant commodity servers with Direct-Attached Storage, or hyperscale computing with…


Added by Randall Shane on May 21, 2015 at 3:13pm — 1 Comment

Measuring Information Retrieval Performance Using Extrapolated Precision

This is a brief overview of my paper “Information Retrieval Performance Measurement Using Extrapolated Precision,” which I’ll be presenting on June 8th at the DESI VI workshop at ICAIL 2015.  The paper provides a novel method for extrapolating a precision-recall point to a different level of recall, and…


Added by Bill Dimm on May 21, 2015 at 2:44pm — No Comments

9 Python Analytics Libraries

Python & data analytics go hand in hand. Here is a list of 9 Python data analytics libraries. This list is going to be…


Added by Pansop on May 21, 2015 at 4:30am — No Comments

Weekly Digest - May 25

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.


  • Webinar: Flipping the 80/20 Rule for Analytics - Hear how Teradata helps businesses flip the 80/20 model so they can spend only 20% preparing and organizing data and 80% on the analytics, accelerating time to value.…

Added by Vincent Granville on May 20, 2015 at 5:30pm — No Comments

100 Best Data Science Companies to Work for in 2015

This is an interesting article recently published in Forbes. The author gathered data from Glassdoor.com to rank companies. Glassdoor.com is a website where employees comment on and rate their company, and can even post their job title and salary range. Keep in mind that the author is not a statistician, and his analysis is…


Added by Mirko Krivanek on May 20, 2015 at 10:00am — 2 Comments

What Defines a Big Data Scenario?

Big data is a new marketing term that highlights the ever-increasing and exponential growth of data in every aspect of our lives. The term big data originated from within the open-source community, where there was an effort to develop analytics processes that were faster and more scalable than traditional data warehousing, and could extract value from the vast amounts of unstructured and semistructured data produced daily by web users. Consequently, big data's origins are tied to web data,…


Added by Khosrow Hassibi on May 20, 2015 at 7:51am — No Comments

How Do I Become a Data Scientist? / Data Science Aspects

I asked myself this question a few months ago. Next I thought: What is the definition of Data Science? So the first thing I did was read as many posts on the topic as I could get my hands on and also look up definitions of related topics such as Data Mining and Machine Learning. Looking at the discussions and posts around Data Science it …


Added by Michael Laux on May 20, 2015 at 5:30am — 1 Comment

Machine Learning Resources for Spam Detection

Spam is a kind of messaging where the cost of sending is usually negligible and the receiver and the ISP pay the cost in terms of bandwidth usage.

An example of a manual approach to detecting spam is using knowledge engineering. When you are aware of what is spam and what is not, you can usually filter it by creating a set of rules like,

  • If the subject line of an email contains words ‘Buy viagra’ its…
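
A knowledge-engineering filter of this kind might be sketched as follows. The rule above is truncated, so the sketch assumes a matching subject line is flagged as spam; the names `RULES`, `matched_rules` and `is_spam` are hypothetical.

```python
import re

# Hypothetical hand-written rule set: (rule name, pattern applied to the subject line).
# Assumption: a message is treated as spam when any rule matches.
RULES = [
    ("buy-viagra", re.compile(r"\bbuy\s+viagra\b", re.IGNORECASE)),
]


def matched_rules(subject):
    """Return the names of all rules whose pattern occurs in the subject."""
    return [name for name, pattern in RULES if pattern.search(subject)]


def is_spam(subject):
    """Flag the message when at least one rule matches."""
    return bool(matched_rules(subject))


print(is_spam("Buy Viagra now"))
print(is_spam("Meeting agenda for Monday"))
```

Rules like these are easy to read and audit, but each new spam pattern needs a new hand-written rule.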


Added by Pansop on May 19, 2015 at 1:00am — 1 Comment

Predictive Analytics Demystified

This 30-minute video aims to demystify predictive analytics and present the IBM SPSS predictive analytics portfolio. The contents of the video are as follows:

  • Evolution of Analytics 5:45
  • Why is Predictive Analytics Important? 11:35
  • Demystifying Predictive Analytics 21:30
  • IBM…

Added by Venky Rao on May 18, 2015 at 11:30am — No Comments
