
Gridlex's Blog (4)

Calculate Cosine Similarity Using Scipy – Data Sets & Sample Code

What is Cosine Similarity?

Cosine similarity measures how similar two vectors are by computing the cosine of the angle between them. It ranges from −1 (exactly opposite directions) to 1 (exactly the same direction); 0 means the vectors are orthogonal, which is usually read as unrelated, and in-between values indicate intermediate similarity or dissimilarity.…
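As a minimal sketch of the definition above (not the post's own sample code, which is truncated here), the cosine can be computed directly from the dot product and the vector norms. The function name `cosine_similarity` and the example vectors are illustrative assumptions. Note that SciPy's `scipy.spatial.distance.cosine` returns the cosine distance, that is, 1 minus this similarity.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical example vectors covering the three reference cases.
same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])        # close to 1
opposite_direction = cosine_similarity([1, 2, 3], [-1, -2, -3])  # close to -1
orthogonal = cosine_similarity([1, 0], [0, 1])                   # 0
```

Because only the angle matters, scaling either vector by a positive constant leaves the result unchanged.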


Added by Gridlex on April 12, 2015 at 8:30pm — 1 Comment

What Technology & Tool Skills Do Data Scientists Jobs Require?

People often ask, “What technology & tool skills do I need to develop to be a data scientist?” We decided to go straight to the source, job descriptions, & check which requirements employers ask for when hiring data scientists. We analyzed about 1660 job postings that had “Data Scientists” in the job title and searched them for the specific technology & tool skills they require.

Our first analysis involved understanding what…


Added by Gridlex on April 2, 2015 at 7:45pm — 1 Comment

6 Cloud Based Machine Learning Services

Developing machine learning solutions that give a lift over your existing prediction algorithms is not an easy task. Getting them right requires a multitude of activities, including cleaning up the data, setting up the infrastructure, testing & re-testing the model, & finally deploying the algorithm.

Here are five machine learning services that can help…


Added by Gridlex on April 1, 2015 at 7:30pm — No Comments

Solving Poisson Distribution Problems Using SciPy

Imagine the following business problem:

A call center has a rule that if more than 8 customers call within 24 hours about Issue X, an alarm should be raised & Issue X should be forwarded to the Tier 2 team for further investigation. However, the Tier 2 team believes that 24 hours is too long to wait, since the customer experience could suffer. They want to predict the alarm BEFORE the 24-hour interval ends. Therefore, they want the probability at any given time based on historical hourly…
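One way to frame this problem, sketched here under assumptions the excerpt does not confirm: calls about Issue X arrive as a Poisson process with a constant hourly rate estimated from historical data. The rate of 0.25 calls per hour, the function names, and the parameters are all hypothetical. The tail probability is computed with the standard library; `scipy.stats.poisson.sf(k, mu)` gives the same P(X > k).

```python
import math


def poisson_pmf(k, mu):
    """P(X = k) for a Poisson variable with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def poisson_sf(k, mu):
    """P(X > k): the survival function, one minus the CDF at k."""
    return 1.0 - sum(poisson_pmf(i, mu) for i in range(k + 1))


def prob_alarm(hourly_rate, hours_elapsed, calls_so_far, threshold=8, window=24):
    """Probability that total calls exceed `threshold` by the end of `window`.

    Assumes a constant-rate Poisson process, so the calls still to come are
    Poisson with mean hourly_rate * (window - hours_elapsed).
    """
    still_allowed = threshold - calls_so_far
    if still_allowed < 0:
        return 1.0  # the threshold has already been exceeded
    mu_remaining = hourly_rate * (window - hours_elapsed)
    return poisson_sf(still_allowed, mu_remaining)


# Hypothetical historical rate: 0.25 calls per hour about Issue X.
p_at_start = prob_alarm(0.25, hours_elapsed=0, calls_so_far=0)
p_after_six_hours = prob_alarm(0.25, hours_elapsed=6, calls_so_far=5)
```

Re-evaluating `prob_alarm` each hour with the running call count lets the Tier 2 team act as soon as the probability crosses whatever cutoff they choose, rather than waiting for the full window.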


Added by Gridlex on January 28, 2015 at 2:30am — No Comments

