What is Cosine Similarity?
Cosine Similarity is a measure of similarity between two vectors that calculates the cosine of the angle between them. Similarity ranges from −1 meaning exactly opposite, to 1 meaning exactly the same, with 0 usually indicating independence, and in-between values indicating intermediate similarity or dissimilarity.…Continue
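As a rough illustration of the definition above, here is a minimal sketch in plain Python. The function name `cosine_similarity` and the choice to raise an error on zero-length vectors are assumptions made for this example, not something taken from the original post.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors.

    Hypothetical helper for illustration only.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, opposite direction, and perpendicular vectors.
same = cosine_similarity([1, 2], [2, 4])
opposite = cosine_similarity([1, 2], [-1, -2])
perpendicular = cosine_similarity([1, 0], [0, 1])
```

Vectors pointing the same way score close to 1, opposite vectors close to −1, and perpendicular vectors 0, which matches the range described above.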
I just got back from my vacation in Barcelona, Spain, where I spent about 3 days, then rented a car and drove north through the South of France. My last stop was Nice, France. The trip was a lot of fun, and now I intend to find some data to help bring back great memories (hm... it sounds more geeky than I thought, but anyway).
Barcelona is located in the Catalonia region of Spain, famous for its earthy dry reds as well as Cava, the world's most delicious bubbly drink. I am a big wine fan which…Continue
Added by Tatiana Sorokina on April 12, 2015 at 8:00am — No Comments
The objective of my final project at Metis, from weeks 9 to 12, is to categorize drivers based on their behaviour on the road: their driving style and the types of roads they follow.
The challenge associated with this objective is to uniquely identify a driver (and hence his proper “driving…Continue
First, let's start with an article featuring many great Excel functions, entitled 11 Advanced Excel Tricks That Will Help You Get An Instant Raise At Work. It describes the following Excel functions:
A nice infographic produced by well-known business management consultant and author Bernard Marr. Click on the picture, then click on it once more, to see an easy-to-read version.
Added by Bernard Marr on April 9, 2015 at 7:30pm — No Comments
In a meeting with Airbus last week I found out that their forthcoming A380-1000 – the supersized airliner capable of carrying up to 1,000 passengers – will be equipped with 10,000 sensors in each wing.
The current A350 model has a total of close to 6,000 sensors across the entire plane and generates 2.5 Tb of data per day, while the newer model – expected to take…Continue
I assist enterprises by driving data-driven approaches into their operations, developing market-aware products that learn from data, and encouraging data-smart cultures among the c-suite of executives. I have had the privilege to work with many talented professionals looking to disrupt their…Continue
For a business to remain competitive today, it must be willing to embrace new technologies. Using old or outdated technology can leave a business trailing in the dust of newer businesses that have moved to the forefront of the industry, especially as those businesses reap the benefits that new technology affords them. Of course, this means that one must also be aware of new technologies and how they might benefit the business, which is not always easy to do. In fact, there is a term…Continue
Primed to make a huge entrance in 2015, Data-as-a-Service (DaaS) gives companies real-time data to overcome tough data challenges. DaaS allows companies to generate real-time insights and revenue from Big Data. Companies commonly report feeling overwhelmed by the sheer size of big data, not to mention the processes necessary to use it. This no longer has to be the reality. With DaaS, using big data is no longer a months-long process.
Added by Larisa Bedgood on April 7, 2015 at 12:30pm — No Comments
Organizations are struggling with a fundamental challenge – there’s far more data than they can handle. Sure, there’s a shared vision to analyze structured and unstructured data in support of better decision making, but is this a reality for most companies? The big data tidal wave is transforming the database management industry, employee skill sets, and business strategy as organizations race to unlock meaningful connections between disparate sources of…Continue
Guest blog post.
Big data makes a noteworthy contribution to the usefulness of an application, but its presence can make the design of a clean and usable interface rather difficult. Today, many web applications are built on the platform of big cloud-based data, which leads to the question: how can a designer deliver all the necessary data in an application without making a train-wreck of everything?
Creating a balance between complex data requirements and a simplified…Continue
Added by Vincent Granville on April 7, 2015 at 3:50am — No Comments
Next month marks the 100th anniversary of Babe Ruth’s first home run.
This year, opening day in baseball signals the “closing day” for one of the classic truisms among sports statisticians: the belief that…Continue
Added by Peter Bruce on April 6, 2015 at 10:40am — No Comments
At this point, I suspect a lot of us have heard of the three, four, or even seven V’s of big data. The original three V’s – Volume, Velocity, and Variety – appeared in 2001 when Gartner analyst Doug Laney used them to help identify key dimensions of big data. …Continue
Added by Anne Russell on April 6, 2015 at 8:30am — No Comments
The Unix Philosophy, summarized by Doug McIlroy in 1994:
Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
This is considered by some to be the greatest revelation in the history of computer science, and there’s no debate that this philosophy has been instrumental in the success of Unix and its derivatives. Beyond Unix, it’s easy to see how this philosophy has…Continue
Added by John Hugg on April 6, 2015 at 7:30am — No Comments
The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday.
Added by Vincent Granville on April 5, 2015 at 5:30pm — No Comments
Guest blog post.
Vozag downloaded CRAN data from the R project to understand which projects were the top ones and which had the most discussions. Below is a list of the top 20 packages downloaded in a single day. The full list of the top 100 most downloaded R packages is here.
Guest blog post by Mike Davie.
With the exponential growth of IoT and M2M, data is seeping out of every nook and cranny of our corporate and personal lives. However, harnessing data and turning it into a valuable asset is still in its infancy. In a recent study, IDC estimates that only 5% of data created is actually analyzed.…Continue
Added by Vincent Granville on April 5, 2015 at 2:30pm — No Comments
In this post, we’ll use an unsupervised machine learning technique called k-means clustering to find natural structures in our data. In the other blog posts, we used supervised machine learning techniques like logistic regression and linear regression to predict car prices or …Continue
Added by Peter Chen on April 4, 2015 at 6:00pm — No Comments
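For readers unfamiliar with the k-means clustering named in the post above, here is a minimal sketch of the standard Lloyd's iteration in plain Python. It is not the post author's code: the function names, the toy data, and the choice to pass initial centroids explicitly (rather than picking them at random) are all assumptions made to keep the example small and deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points (tuples of numbers)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, initial_centroids, max_iter=100):
    """Hypothetical helper: alternate assignment and update steps
    until the centroids stop moving or max_iter is reached."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iter):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Toy data (made up for illustration): two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = kmeans(points, initial_centroids=[(0, 0), (10, 10)])
```

In practice the result depends on the starting centroids, so library implementations usually run several random initializations and keep the best one.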
From all indications, 2015 is well on its way to becoming the year of cloud computing. The feverish pace of activity among key players, on the one hand, and the data and observations of industry pundits, on the other, affirm this. There are apparently a handful of reasons to keep IT industry leaders awake at night.
For starters, per , 2014 revenues for cloud services grew by 60 percent. The global cloud computing market, per Forrester, is expected to grow to over $191 billion by 2020. IDC…Continue
Added by Naagesh Padmanaban on April 4, 2015 at 4:30pm — No Comments
More than a thousand keywords with detailed explanations, and hundreds of machine learning / data science books categorized by programming language used to illustrate the concepts.
Here's a selection of keywords from the mega-list:
10 keywords starting with A; this is indeed a small subset of all the keywords starting with…Continue
Added by Mirko Krivanek on April 3, 2015 at 11:30am — No Comments