May 2015 Blog Posts (66)

Ontology for Data Science

When I returned to university to do a graduate degree, I was interested to discover how certain terms are subject to "intellectual interpretation." A word that I was asked to explain during one of my earliest classes was "ontology." Since this term was absent from my dictionary, I originally confused it with "oncology." I faintly recall that oncology involves the study of tumors. After consulting a few sources, I said that ontology is the study of how things come to exist or into being. I…

Added by Don Philip Faithful on May 30, 2015 at 6:17am — No Comments

Healthcare Industry Finds New Solutions to Big Data Storage Challenges

Hospitals and medical centers have more to gain from big data analytics than perhaps any other industry. But as data sets continue to grow, healthcare facilities are discovering that success in data analytics has more to do with storage methods than with analysis software or techniques. Traditional data silos are hindering the progress of big data in the healthcare industry, and as terabytes turn into petabytes, the most successful hospitals are the ones that are coming up…

Added by Nick Rojas on May 29, 2015 at 1:00pm — 1 Comment

Machine Learning is Not the Boogie Man! Gates and Musk Are Wrong.

Humanity is going to be okay! The big bad robots are not going to come and get you...

In a recent Reddit AMA session, Bill Gates commented, “First the machines will do a lot of jobs for us and not be super intelligent… A few decades after that though the intelligence is strong enough to be a concern. I agree with Elon Musk and some others…

Added by John Thuma on May 29, 2015 at 7:30am — 2 Comments

Alchemy and Algorithmic lawyers

Alchemy is a fascinating pseudo-science. Although it has long been discredited as a real science, its fundamental principles still form the basis of many contemporary scientific theories. Alchemy is based on the premise that nothing in the universe is devoid of existential elements, and that these elements can be manipulated and even transmuted into other forms. The fabled Philosopher’s Stone is said to fulfil one of the main objectives of Alchemy, a legendary Alchemical substance said to be capable of turning…

Added by Mkhuseli Mthukwane on May 29, 2015 at 1:30am — 1 Comment

Weekly Digest - June 1

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.

Featured

Added by Vincent Granville on May 27, 2015 at 9:00am — No Comments

What clustering method is required for text documents

Let's say a set of documents 'S' has a large set of 'pure' texts.

To all documents in S, I apply a spelling normalisation method, which yields a normalised set S'.

Then I use the chosen method M (which method?) to cluster S, obtaining a clustering result C.

Then I use the same method M to cluster S', obtaining a clustering result C'.

Finally I need to determine whether there are statistically significant differences between C and C'.

Any help in identifying…
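
For the final comparison step, one possible starting point (an assumption here, not something the post specifies) is the adjusted Rand index, which scores how well two clusterings of the same documents agree, independent of how the clusters are named. The sketch below is plain Python; the function name and the toy labels are hypothetical. It measures agreement only; a significance statement would need something extra on top, for example a permutation test.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_c, labels_c_prime):
    """Agreement between two clusterings of the same items.

    Each argument is a list of cluster labels, one per document, in the
    same document order. Returns 1.0 for identical partitions and values
    near 0.0 for agreement no better than chance.
    """
    if len(labels_c) != len(labels_c_prime):
        raise ValueError("both clusterings must label the same documents")

    n = len(labels_c)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two documents: trivially identical

    # Count document pairs placed together in both, in C, and in C'.
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_c, labels_c_prime)).values()
    )
    together_c = sum(comb(c, 2) for c in Counter(labels_c).values())
    together_c_prime = sum(comb(c, 2) for c in Counter(labels_c_prime).values())

    expected = together_c * together_c_prime / total_pairs
    maximum = (together_c + together_c_prime) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. one cluster in both results
    return (together_both - expected) / (maximum - expected)


# Toy example: cluster labels for four documents before and after normalisation.
c = [0, 0, 1, 1]
c_prime = [1, 1, 0, 0]  # same grouping, different label names
score = adjusted_rand_index(c, c_prime)
```

Because the index ignores label names, the same method M can be run independently on S and S' and the two label lists compared directly.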

Added by MUSHTAQ AHMAD on May 25, 2015 at 11:48am — 3 Comments

Simple Regression use in Big Data

Since the emergence of Big Data, we have witnessed the rise of the Key & Value pair. We can explore the relationship between two such variables, X & Y, using Data Science. In its basic form, Regression likewise describes a relationship between two variables, X & Y. These variables are:

Independent Variables & Dependent Variables
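
As a minimal sketch of the idea, assuming made-up numbers rather than any data from the post, a simple regression of a dependent variable Y on an independent variable X can be fitted by ordinary least squares. The function name and sample values are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    xs holds the independent variable, ys the dependent variable.
    Returns (slope, intercept).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (X, Y) pairs, stored the way key-value data might be.
pairs = {1: 3, 2: 5, 3: 7, 4: 9}
slope, intercept = fit_simple_regression(list(pairs.keys()), list(pairs.values()))
```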

Let us take behavior of users of a…

Added by Atif Farid Mohammad on May 25, 2015 at 6:00am — No Comments

How to determine the quality and correctness of classification models? Part 2 - Quantitative quality indicators

Basic quantitative quality indicators

In the last part of the tutorial we introduced the basic qualitative model quality indicators. Let us recall them now:

  • TP – True Positive – the number of observations correctly assigned to the positive class

    Example: the model’s predictions are correct and resigning customers have been…

Added by Algolytics on May 23, 2015 at 6:00pm — No Comments

Virtual Org and Behaviour by Transaction

In Java programming, there is the idea of a "virtual machine." A virtual machine is a computer system that doesn't exist in real life. Yet programs can be written for it. The code is interpreted by a runtime environment. Through this arrangement, Java programs can operate on different operating systems rather than one exclusively. Depending on one's background, the concept of a "…

Added by Don Philip Faithful on May 23, 2015 at 6:31am — No Comments

Four successful big data / analytics startups in Seattle

These companies gather and process gigantic amounts of data to serve their clients and/or users. They make money by selling summarized, processed, real-time data. They are poised to succeed in the IoT (Internet of Things) revolution, leveraging all sorts of devices and APIs to gather data, and

  • send alerts to users via text messages or other technology
  • sell intelligence extracted from data, to other businesses

It is worth spending some time figuring out…

Added by Mirko Krivanek on May 22, 2015 at 8:00pm — No Comments

How Apple Uses Big Data To Drive Success

Apple’s old slogan was “Think Different” – and while it is now retired, and the ethos may not be as apparent in the company’s products as it once was, it is true for their approach to Big Data.

In some…

Added by Bernard Marr on May 22, 2015 at 1:30pm — 1 Comment

Big Data: Uncovering The Secrets of Our Universe At CERN

CERN is best known these days as the research organization which operates the Large Hadron Collider – the largest and most complicated science experiment ever undertaken, which aims to explain mysteries behind the creation of the universe.…

Added by Bernard Marr on May 22, 2015 at 1:30pm — 3 Comments

10 Python Machine Learning Projects on GitHub

Here is a list of top Python Machine learning projects on GitHub. A continuously updated list of open source learning projects is available on Pansop.

 …

Added by Pansop on May 21, 2015 at 8:00pm — 2 Comments

Data Integrity: The Rest of the Story Part II

Buzz words are one of my least favorite things, but as buzz words go, I can appreciate the term “Data Lake.” It is one of the few buzz words that communicates a meaning very close to its intended definition. As you might imagine, with the advent of large scale data processing, there would be a need to name the location where lots of data resides, ergo, data lake. I personally prefer to call it a series of redundant commodity servers with Direct-Attached Storage, or hyperscale computing with…

Added by Randall Shane on May 21, 2015 at 3:13pm — 1 Comment

Measuring Information Retrieval Performance Using Extrapolated Precision

This is a brief overview of my paper “Information Retrieval Performance Measurement Using Extrapolated Precision,” which I’ll be presenting on June 8th at the DESI VI workshop at ICAIL 2015. The paper provides a novel method for extrapolating a precision-recall point to a different level of recall, and…

Added by Bill Dimm on May 21, 2015 at 2:44pm — No Comments

9 Python Analytics Libraries

Python & data analytics go hand in hand. Here is a list of 9 Python data analytics libraries. This list is going to be…

Added by Pansop on May 21, 2015 at 4:30am — No Comments

Weekly Digest - May 25

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.

Announcements

  • Webinar: Flipping the 80/20 Rule for Analytics - Hear how Teradata helps businesses flip the 80/20 model so they can spend only 20% preparing and organizing data and 80% on the analytics, accelerating time to value.…

Added by Vincent Granville on May 20, 2015 at 5:30pm — No Comments

100 Best Data Science Companies to Work for in 2015

This is an interesting article recently published in Forbes. The author gathered data from Glassdoor.com to rank companies. Glassdoor.com is a website where employees comment on and rate their company, and can even post their job title and salary range. Keep in mind that the author is not a statistician, and his analysis is…

Added by Mirko Krivanek on May 20, 2015 at 10:00am — 2 Comments

What Defines a Big Data Scenario?

Big data is a new marketing term that highlights the ever-increasing and exponential growth of data in every aspect of our lives. The term big data originated from within the open-source community, where there was an effort to develop analytics processes that were faster and more scalable than traditional data warehousing, and could extract value from the vast amounts of unstructured and semi-structured data produced daily by web users. Consequently, big data's origins are tied to web data,…

Added by Khosrow Hassibi on May 20, 2015 at 7:51am — No Comments

How Do I Become a Data Scientist? / Data Science Aspects

I asked myself this question a few months ago. Next I thought: What is the definition of Data Science? So the first thing I did was read as many posts on the topic as I could get my hands on and also look up definitions of related topics such as Data Mining and Machine Learning. Looking at the discussions and posts around Data Science it …

Added by Michael Laux on May 20, 2015 at 5:30am — 1 Comment

© 2017 Data Science Central
