Featured Blog Posts – June 2015 Archive (66)

9 Must-Have Skills You Need to Become a Data Scientist

I usually tend to criticize this type of article, but in this case I pretty much agree with BurtchWorks, the author of this article, even though the article is more than 6 months old. Note that BurtchWorks is a recruiting firm that recently posted interesting salary surveys for data…


Added by Mirko Krivanek on June 21, 2015 at 2:30pm — No Comments

Big Data: The Amazing Numbers in 2015

Big data is growing — in fact, the sector is growing so fast and we are producing data so voraciously, that no one can afford to ignore it as a “fad” any more.

And, it’s going to affect all companies, large and small, across all segments of the market — from healthcare to public safety, and retail to wholesale.

Big data is changing the world as we know it.…


Added by Bernard Marr on June 20, 2015 at 6:00am — 2 Comments

Types Of Performance Metrics Everyone Must Differentiate

There are leading and lagging indicators in business. It is important that managers understand the difference between them and ensure they have both types of metrics to get an accurate picture of performance. However, there are also some important issues to watch out for.

All performance measures are, by definition, backward looking. They tell us what has already happened. Current trends focus on merging the data and information from backward-looking metrics or performance indicators…


Added by Bernard Marr on June 20, 2015 at 6:00am — No Comments

Structural Relationships in Data

The first computer program that I encountered mimicking or emulating human interaction through language was called "Eliza." The version that I knew ran on the Commodore PET. It communicated in English. Eliza made comments that made some sense but indicated a lack of understanding of the conversation. If a person mentions "mother," Eliza might…


Added by Don Philip Faithful on June 20, 2015 at 5:06am — No Comments

Spectacular patterns found in rare, big numbers used in encryption schemes

This may lead to a practical, feasible way to factor the product of two very large prime numbers (with thousands of digits), making many security systems vulnerable to new types of attacks: factoring encryption keys en masse.

Let's start with the pictures. The patterns (as well as how to leverage them) are explained below.…


Added by Vincent Granville on June 19, 2015 at 2:00pm — 3 Comments

Can anyone suggest a big dataset for the purpose of regression?

I need a big dataset for the purpose of regression. I need at least 4 million records. I already have the airlines dataset, so I need a dataset other than that one.

I basically want to test a regression model. I have already tested it with an artificial dataset, but I want to try my hand at a real dataset.

Added by Harshvardhan Solanki on June 19, 2015 at 9:56am — No Comments

A Database of Police Killings Since 2013 by Justin Tenuto

A while back, we found an interesting dataset online. The URL, killedbypolice.net, is fairly self-explanatory. It's a community-sourced list of all "police-involved fatalities", started in May of 2013, but the data itself was a bit jumbled and messy. Race and gender identifiers were in the same column, dates were inconsistent, and though most entries had a news story, we felt there was more information we wanted to know. We put the dataset…


Added by Leena Kamath on June 19, 2015 at 8:30am — No Comments

Opensource Robotics Projects List


Added by Pansop on June 19, 2015 at 1:53am — No Comments

Big Data = 3 data issues

There are at least two definitions of Big Data: a broad-sense definition and a strict-sense definition. In the broad sense, Big Data includes all the available data on earth. In the strict sense, Big Data is a term for large datasets that traditional data processing applications cannot handle. We are going to follow the latter approach.

For Davenport [2014], “Big Data refers to data that is too big to fit in a single server, too unstructured to fit…


Added by Luis Cavique on June 18, 2015 at 7:00am — 2 Comments

What Are the Differences Between Quantitative and Qualitative Data Analysis?

Companies are fighting tooth and nail to stay ahead of the competition. Besides deploying aggressive marketing campaigns, they are relying more heavily on research to understand market competition and trends. Changing market dynamics call for a closer look at various aspects connected to the data.

Market research aims to give business organizations information about customers or markets for making informed business decisions. This effort to collect…


Added by Daina Martin on June 17, 2015 at 9:30pm — No Comments

Weekly Digest, June 22

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. The picture of the week is from the contribution marked with a +, where you will find the details.


  • SlideRule's …

Added by Vincent Granville on June 17, 2015 at 10:00am — No Comments

Taking Control of your CRM Data

CRMs are supposed to be used to achieve better efficiency. By investing time in the CRM, sales teams should be able to identify leads, retain existing customers and successfully recruit new clients to the fold.

Your research and development team should be able to use the metrics from the CRM to drive next year’s products, and the marketing team should be able to feed…


Added by Martin Doyle on June 17, 2015 at 5:30am — No Comments

Python NLTK Tools List for Natural Language Processing (NLP)


Added by Pansop on June 16, 2015 at 9:00pm — No Comments

Quality and correctness of classification models. Part 3 – Confusion Matrix

In the last part of the tutorial we introduced quantitative indicators of classification model quality. In the next two parts we will take a closer look at a couple of graphical indicators. The first one is called the Confusion Matrix (the name “Contingency Table” is also used).

What is a Confusion Matrix?

A Confusion Matrix is an N x N matrix in which rows correspond…
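As a rough illustration of the idea (not taken from the original tutorial), a confusion matrix for N classes can be built by counting (actual, predicted) pairs. The excerpt above is cut off before it says what the rows stand for, so this sketch assumes that rows are actual classes and columns are predicted classes; the labels and the data are made up.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Build an N x N matrix as a nested list.

    Assumed convention: row = actual class, column = predicted class.
    """
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical three-class example
labels = ["cat", "dog", "bird"]
actual = ["cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat", "bird"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"{label:>5}: {row}")

# Correct predictions sit on the diagonal, so accuracy is their share of all cases
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(actual)
print(f"accuracy: {accuracy:.3f}")
```

Off-diagonal cells show which classes get mistaken for which, which is what a single accuracy number hides.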


Added by Algolytics on June 16, 2015 at 12:30pm — No Comments

Visualize your Social Media Analytics

In an earlier blog post on Making the Business Case for Text Analytics, I had spoken of the importance of Social Media Analytics, and specifically Text Analytics within the context of Social Media, for big and small businesses. Social Media plays a critical role in today's world in understanding, measuring and influencing the real time perception of your company and/or…


Added by Mark Sharma on June 15, 2015 at 7:05pm — No Comments

How to Avoid a Data Disaster - Infographic

Infographics provided by SupremeSystems.

We reveal some interesting statistics around data loss and also offer some helpful advice about what an effective data backup plan should look like. For example, did you know that this year, 40% of small to medium businesses that manage their own network and use the Internet for more than e-mail will have their network accessed by a hacker? Also, find out what are the main…


Added by Vincent Granville on June 15, 2015 at 5:39pm — 1 Comment

Elastic Is a Great Paradigm When “All You Can Eat Data Consumption” Is the Goal

Guest blog post.

Today, people are no longer looking to reduce their data consumption. If anything, they want more data originating from more sources and with more diversity than anyone could have ever imagined. As we pioneer a world where data can be digested easily, software solutions need to be engineered so they can expand to meet customers' demand. Increasingly, and because of this trend, more and more software…


Added by Vincent Granville on June 15, 2015 at 5:32pm — No Comments

Data Structure Graph - The application of Graph theory to Architecture

How does centrality affect your Architecture?

Some time ago, I was responsible for a data architecture I had mostly inherited. I worked on a number of tweaks to refine the monolithic nature of the main database. It was a time of upheaval in this organization. They had outgrown their legacy Computer Telephony Interface application. It was time to create something new.
A large new application development team was brought in to develop some new…

Added by Doug Needham on June 15, 2015 at 1:00pm — No Comments

Reducing Data Cleansing Time to Get Actionable Insights Faster

Guest blog post by TheKiniGroup, originally posted here.

If you categorized how you spend your time at work every day, on which tasks would you spend the bulk of your time? Most business analysts spend 50 to 80 percent of their time…


Added by Vincent Granville on June 15, 2015 at 11:56am — No Comments

Feature Scaling and Normalization

Very long article posted by Sebastian Raschka in 2014. Here we only provide the table of contents, and a chart showing the results of PCA applied to a wine dataset. A link to the full version is provided below. The article is rather technical and uses Python, including the scikit-learn, numpy, pandas and matplotlib libraries. Interesting for anyone working with scores and looking for normalization, though personally, I don't like PCA (produces meaningless reduced variables and sensitive…
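For readers who want a quick feel for the topic before opening the full article, here is a minimal sketch of two common scaling approaches, z-score standardization and min-max scaling. It is not code from the original article: it uses only the Python standard library rather than scikit-learn, the function names are illustrative, and the values are made up rather than drawn from the wine dataset.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score standardization: rescale to zero mean and unit (population) std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up feature values, for illustration only
alcohol = [12.0, 13.0, 14.0, 15.0, 16.0]

z = standardize(alcohol)
scaled = min_max_scale(alcohol)

print([round(v, 3) for v in z])
print(scaled)
```

Both functions divide by the spread of the data, so a constant feature (zero standard deviation or zero range) would need separate handling in real use.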


Added by Vincent Granville on June 15, 2015 at 8:26am — No Comments
