March 2017 Blog Posts (118)

Anime Reviews and Scores

Contributed by Yisong Tao. 

Anime, the abbreviated pronunciation of "animation" in Japanese, refers to animation from Japan. Its style is often characterized by colorful graphics, vibrant characters, and fantastical themes. As someone who watched lots of anime growing up, I turned my attention to the largest anime and manga database and community: MyAnimeList.net.

Web Scraping:

The website is…

Added by NYC Data Science Academy on March 31, 2017 at 12:00pm — No Comments

New York City Public High Schools

Contributed by Yisong Tao. 

Choosing a high school is one of the first big decisions in life. With over 400 public high schools in New York City, the families of students can be overwhelmed by the long list of schools, each of which promises secondary education with curricula ranging from biology to engineering to musical theater. The options seem endless. Using the public high school data sets released by the NYC Department of Education, I made this…

Added by NYC Data Science Academy on March 31, 2017 at 12:00pm — No Comments

Visualizing the Game Style and Shooting Performance among Superstars via NBA Shot-log

Contributed by Xinyuan Wu. 

In the NBA, a top player makes around a thousand shots during the entire regular season. A question worth asking is: What information can we get by looking at these shots? As a basketball fan for more than 10 years, I am particularly interested in discovering facts that cannot be seen directly on live TV. When I was surfing the web last week, I found a data set on Kaggle called NBA shot-log. This data set summarizes every shot made by each…

Added by NYC Data Science Academy on March 31, 2017 at 11:30am — No Comments

Exploring Vehicular Collisions within NYC

Contributed by Regan Yee.

New York City is known for its fast walkers, busy people, and, of course, aggressive drivers. Many people who live here know that it is a nightmare to drive here; there are too many cars, too few parking spots, jaywalkers, cyclists, flashing lights, honking, and the list goes on. It's no wonder that, after living here for the majority of my life, traffic conditions in Boston felt very tame in comparison. When I was in Boston for college, the…

Added by NYC Data Science Academy on March 31, 2017 at 11:00am — No Comments

Alternatives to algebraic modeling for complex data: topological modeling via Gunnar Carlsson

For many, mathematical modeling is exclusively about algebraic models, based on one form or another of regression or on differential equation modeling in the case of dynamical systems.  

However, this is too restrictive a point of view.  For example, a clustering algorithm can be regarded as a modeling mechanism applicable to data where linear regression simply isn’t applicable. Hierarchical clustering can also be regarded as a modeling mechanism, where the output is a dendrogram and…

Added by Jonathan Symonds on March 31, 2017 at 7:00am — No Comments

Where There Are Numbers, There Is Box Plot

With multiple numeric columns in the data and even more techniques at hand to analyse them (histograms, ANOVA, mean/median, contingency tables, scatter plots, variance…), what should you choose for exploratory or descriptive analytics?

Sounds a bit geeky! Let me simplify.

This is an everyday scenario faced by an analyst. There are too many numbers, and the challenge is to communicate the scenario to business folks. Whether it's competitive analysis, internal sales analysis,…

Added by saurabh ajmera on March 31, 2017 at 6:00am — No Comments

Finding "Gems" in Big Data

(Photo credit:  Rob Lavinsky, iRocks.com – CC-BY-SA-3.0)

In 1945, Count Richard Taaffe*, a Dublin gem collector,…

Added by Peter Bruce on March 30, 2017 at 2:30pm — No Comments

More than ML: Guide to the Components of AI

When I tell people that I work at an AI company, they often follow up with “So what kind of machine learning/deep learning do you do?” This isn’t surprising, as most of the market attention (and hype) in and around AI has centered on Machine Learning and its high-profile subset, Deep Learning, and on Natural Language Processing, with the rise of chatbots and virtual assistants. But while machine learning is a core component of artificial intelligence, AI is in fact more…

Added by Precy Kwan on March 30, 2017 at 8:00am — No Comments

Book: Advanced R (Chapman & Hall/CRC The R Series)

An Essential Reference for Intermediate and Advanced R Programmers

Advanced R presents useful tools and techniques for attacking many types of R programming problems, helping you avoid mistakes and dead ends. With more than ten years of…

Added by Emmanuelle Rieuf on March 30, 2017 at 3:30am — No Comments

The Future is Bright for Banking

2018 is likely to be a game-changing year for the banking and finance sector. As the General Data Protection Regulation (GDPR) and the revised Payment Services Directive (PSD2) are implemented across the European Union, the exclusive control that banks and other financial institutions hold over the financial data of their customers is about to end. These new…

Added by Ronald van Loon on March 30, 2017 at 2:00am — No Comments

Book: Neural Networks and Statistical Learning

About the Textbook:

Offering a broad but in-depth introduction to neural networks and machine learning in a statistical framework, this book provides a single, comprehensive resource for study and further research. All the major…

Added by Emmanuelle Rieuf on March 29, 2017 at 5:00pm — No Comments

How to think like a Data Scientist

A data scientist needs to be critical and always on the lookout for something that others miss. So here is some advice that you can apply in your day-to-day data science work to get better at it:

1. Beware of the Clean Data Syndrome

You need to ask yourself questions even before you start working on the data. **Does this data make sense?** Falsely assuming that the data is clean could lead you towards the wrong hypotheses. Apart from that, you can discern a…

Added by Rahul Agarwal on March 29, 2017 at 10:00am — 3 Comments

Misuses of Statistics: Examples and Solutions

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, Hadoop, decision trees, ensembles, correlation, outliers, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, time series, cross-validation, model fitting, dataviz, and many more. To keep receiving these articles, …

Added by Vincent Granville on March 29, 2017 at 9:30am — No Comments

The seven deadly sins of statistical misinterpretation, and how to avoid them

By Winnifred Louis, Associate Professor, Social Psychology, The University of Queensland, and Cassandra Chapman, PhD Candidate in Social Psychology, The University of Queensland.…

Added by Data Geek on March 29, 2017 at 8:30am — No Comments

Which Celebrity Movie Star Do You Look Like?

An interesting machine learning app was posted on Vonvon. You need to log on using Facebook, and it automatically fetches your profile pictures, but you can also upload some pictures from your computer. It then tries to find a match in a database of star pictures. It asks for your gender but, as you can see below (this is me, on the far left), obviously not for your age. It would be interesting to test two different pictures of the same…

Added by Vincent Granville on March 28, 2017 at 9:00am — No Comments

SLAM! The Sound of Autonomous Vehicles Colliding

Summary:  Autonomous Vehicles (AVs) are supposed to be just around the corner but the anecdotal evidence is that their claims to safety are way out ahead of reality.  The solution may be in a shared segment of on-board telematics, part of the SLAM group (simultaneous localization and mapping) and sharing some of that data car-to-car.

 …

Added by William Vorhies on March 28, 2017 at 8:54am — No Comments

Free Deep Learning Textbook

The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free. For more information about this free 700+ page book and its authors, click here.

The picture below represents a selection of (non-free) deep learning…

Added by Vincent Granville on March 28, 2017 at 8:30am — No Comments

Prediction Algorithms in One Picture

This infographic was produced by Dataiku.

Click here to find the original image, along with the article describing the various concepts.  Here are other interesting pictures illustrating…

Added by Vincent Granville on March 28, 2017 at 8:30am — No Comments

What is Hadoop?

This article was posted on Intellipaat. 

Hadoop is an open-source framework developed in Java, designed to store and analyze large sets of unstructured data. It is a highly scalable platform that allows multiple concurrent tasks to run on anywhere from a single server to thousands of servers without any delay.

It consists of a distributed file system that allows transferring data and files in split seconds between different nodes. Its ability to process efficiently even if a node…

Added by Emmanuelle Rieuf on March 28, 2017 at 7:30am — 1 Comment

© 2017 Data Science Central