How Spotify knows a lot about you using machine learning and AI


In this article, we will talk about:

  1. How is Spotify using artificial intelligence and machine learning to enhance the experience of listeners?
  2. How is it helping artists and creators?
  3. Which machine learning models, loss functions, and training techniques does Spotify use in its different applications?
  4. What is Spotify planning to do in the future?

Spotify is a music streaming service founded in 2006. It officially launched in India in February 2019 and already has millions of subscribers. Spotify is known for its user experience and its music recommendations, both of which are continuously improving. It uses artificial intelligence, machine learning, and big data to improve and personalize the music experience for its listeners.

Spotify needs no introduction. It is one of the best-known music streaming services on the market. But what excites us the most are the ways it enhances the user experience.

How is Spotify using artificial intelligence and machine learning to enhance the user experience of listeners?

We are all familiar with “Discover Weekly”, a personalized playlist unique to each user. It is generated by artificial intelligence and machine learning algorithms that learn from your music preferences, your streaming history, and how many times you have listened to a particular song. Everyone’s Discover Weekly is different, and it is refreshed every week.

While you are listening to music, Spotify monitors whether you listen to the whole song or skip through it. Over time, it builds up an understanding of the type of music you like. It even dissects this music by attributes such as beats per minute, style, and type of voice. This helps users who don’t have the time, energy, or skills to create their own playlists get playlists that match their interests.

The more you listen, the more data Spotify gets about you and the better its algorithm understands your kind of music, taking you on a personal listening journey.

In the sections that follow, we will discuss the workings of this system in more depth.

How is it helping artists and creators?


The traditional music industry had one problem: new creators had to struggle to reach an audience, even if they created music that people would like. Spotify’s music recommendation system uses machine learning to learn what kind of songs you like, then predicts and recommends new songs that you probably haven’t listened to but are likely to enjoy.

This gives music creators a chance to be discovered and gives listeners songs they will like. It makes both listeners and creators happy, and it especially helps creators become the best version of themselves. They don’t have to clear as many hurdles to get recognized, and they can focus on creating music.

Which machine learning models, loss functions, and training techniques does Spotify use in its different applications?

First, Spotify tries to collect as much data as it can and make sense of it in different ways. It builds a number of shared models that represent this data, and these models are reused across many different applications.

Some of them are discussed below:

1. Guess the missing track from a playlist.

Spotify has millions of playlists, and it filters them down to the ones that are relevant for training. Selection is an important factor here, because training on every available playlist will not necessarily give better results.

It then removes a song from a particular playlist and tries to guess which track is missing, using the context of other playlists. It uses a Word2Vec-type algorithm.

Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located close to one another in the space.

Out of that, they get a cloud of similarities between playlists, tracks, and artists, which they use to map how close different artists’ music types are to each other or how close an album is to a particular listener’s taste.
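The “hide a track and guess it back” idea can be sketched in a few lines of Python. This is only a minimal illustration: it uses plain co-occurrence counts as a much simpler stand-in for a Word2Vec-style embedding model, and the playlists, track names, and helper functions (`build_cooccurrence`, `guess_missing_track`) are made up for the example rather than taken from Spotify.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Toy playlists with hypothetical track names (not real Spotify data).
playlists = [
    ["rock_a", "rock_b", "rock_c"],  # held out: we hide "rock_b" from this one
    ["rock_a", "rock_b", "rock_d"],
    ["rock_b", "rock_c", "rock_d"],
    ["jazz_a", "jazz_b", "rock_a"],
    ["rock_a", "rock_b", "jazz_a"],
]


def build_cooccurrence(training_playlists):
    """Count how often each pair of tracks appears in the same playlist."""
    cooc = defaultdict(Counter)
    for playlist in training_playlists:
        for a, b in permutations(set(playlist), 2):
            cooc[a][b] += 1
    return cooc


def guess_missing_track(context, cooc):
    """Rank candidate tracks by how often they co-occur with the context."""
    scores = Counter()
    for track in context:
        for other, count in cooc[track].items():
            if other not in context:
                scores[other] += count
    return [track for track, _ in scores.most_common()]


# Learn only from the other playlists, then try to recover the hidden track.
cooc = build_cooccurrence(playlists[1:])
guesses = guess_missing_track(["rock_a", "rock_c"], cooc)
print(guesses[0])  # rock_b
```

A real Word2Vec-style model would replace the raw counts with learned vectors, so that tracks which never appear together can still end up close if they share similar neighbours.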

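2. Spotify Home screen: the Spotify Home screen uses a machine learning algorithm known as BaRT.

BaRT stands for Bandits for Recommendations as Treatments. It is a contextual multi-armed bandit approach: it balances exploitation (recommending what the model already predicts you will like) with exploration (trying less certain recommendations to learn more about your taste). It should not be confused with Bayesian Additive Regression Trees (BART), a Bayesian “sum-of-trees” model in which each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian backfitting MCMC algorithm that generates samples from a posterior.

At Spotify, BaRT is used to select and order the wide range of shelves on the Home screen. A shelf could be, for example, playlists made for you or recommendations related to your recent listening history.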

How does it give a personalized experience?

The BaRT algorithm works in a very interesting way to learn about its users.

  • It is optimized for streams longer than 30 seconds. If you listen to a song for more than 30 seconds, it counts that as a sign of interest.
  • The model is retrained once a day on the interaction data collected.
  • The system is built to correct for positional bias: a click on something near the top of the screen is worth less, and a click on something near the bottom is worth more.
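The three points above can be sketched as a small labeling step. This is a rough illustration under stated assumptions: inverse-propensity weighting is one common way to correct for positional bias, but the source does not say which method Spotify uses, and the `EXAMINATION_PROB` values and function names are hypothetical.

```python
# Hypothetical probability that a user even looks at each Home position
# (index 0 is the top of the screen). Values are made up for illustration.
EXAMINATION_PROB = [0.9, 0.6, 0.4, 0.2]

STREAM_THRESHOLD_SECONDS = 30


def reward(seconds_streamed):
    """Binary reward: 1 if the stream lasted longer than 30 seconds, else 0."""
    return 1 if seconds_streamed > STREAM_THRESHOLD_SECONDS else 0


def debiased_weight(position):
    """Inverse-propensity weight: rarely seen positions count for more."""
    return 1.0 / EXAMINATION_PROB[position]


def training_example(position, seconds_streamed):
    """Return (label, weight) for one logged interaction."""
    return reward(seconds_streamed), debiased_weight(position)


# Interactions logged during one day: (position on screen, seconds streamed).
daily_log = [(0, 45), (3, 120), (1, 10)]
examples = [training_example(pos, secs) for pos, secs in daily_log]
for (pos, secs), (label, weight) in zip(daily_log, examples):
    print(f"position={pos} seconds={secs} label={label} weight={weight:.2f}")
```

In this sketch, a daily retraining job would rebuild `examples` from the previous day's log and fit the model on the weighted labels.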

3. Search Bars on Spotify

Whenever a user types a query, Spotify scores candidate results using several kinds of features, such as the popularity of the item, whether the user has searched for this item before, the similarity of the item to the user’s taste, and the distance between the prefix query and the matched item. The ranking model is trained on search interaction logs and uses search sessions that end in a success action as positive examples. All of these predictions happen in just milliseconds.

So how does the ranking algorithm get its data?

It takes two things into account:

  • Search results seen by users in the past.
  • Successful interactions in the past.

Each item is then given a score on that basis: 4 for the success item, 2 for a related item, and 0 for everything else.
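As a minimal sketch, the 4 / 2 / 0 scoring could be turned into training labels like this. The session format, the item names, and the `label_results` helper are hypothetical, and the source does not define what counts as a “related” item, so here the related set is simply passed in.

```python
SUCCESS_SCORE = 4
RELATED_SCORE = 2
OTHER_SCORE = 0


def label_results(shown_items, success_item, related_items):
    """Assign a relevance score to every result shown in one search session."""
    labels = {}
    for item in shown_items:
        if item == success_item:
            labels[item] = SUCCESS_SCORE
        elif item in related_items:
            labels[item] = RELATED_SCORE
        else:
            labels[item] = OTHER_SCORE
    return labels


# Hypothetical session: four results were shown and the user played the track.
shown = ["daft_punk_artist", "one_more_time_track", "discovery_album", "daft_podcast"]
labels = label_results(
    shown_items=shown,
    success_item="one_more_time_track",
    related_items={"daft_punk_artist", "discovery_album"},
)
relevance_list = [labels[item] for item in shown]
print(relevance_list)  # [2, 4, 2, 0]
```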

The loss function used in this system is a listwise loss function.

The listwise approach addresses the ranking problem in the following way. In learning, it takes ranked lists of objects (e.g., ranked lists of documents in IR) as instances and trains a ranking function through the minimization of a listwise loss function defined on the predicted list and the ground truth list. The listwise approach captures the ranking problems, particularly those in IR in a conceptually more natural way than previous work.

The training model used is LambdaMART, which maximizes NDCG (averaged over the training dataset) using GBDT (gradient-boosted decision trees).

Training data consists of lists of items with some partial order specified between items in each list. This order is typically induced by giving a numerical score (4 for the success item, 2 for a related item, and 0 for everything else) to each item. The ranking model’s purpose is to rank, i.e. produce a permutation of items in new, unseen lists in a way that is “similar” to rankings in the training data in some sense.
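To make the NDCG target concrete, here is a small sketch of how NDCG can be computed for one ranked list and then averaged over a dataset. It assumes the common exponential-gain form of DCG; the source does not say which variant Spotify uses, and the example lists are made up.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with gain 2^rel - 1 and a log2 rank discount."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances)
    )


def ndcg(relevances):
    """DCG of the list divided by the DCG of its ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def mean_ndcg(ranked_lists):
    """Average NDCG over all lists, i.e. the quantity being maximized."""
    return sum(ndcg(rels) for rels in ranked_lists) / len(ranked_lists)


# Each list holds the scores of the items in the order the model ranked them.
perfect = [4, 2, 0, 0]   # success item first, related item second
swapped = [2, 4, 0, 0]   # related item ranked above the success item
print(ndcg(perfect))     # 1.0
print(ndcg(swapped) < ndcg(perfect))
print(mean_ndcg([perfect, swapped]))
```

Because the gain grows exponentially with the score, placing the success item (score 4) lower in the list costs far more than misplacing a related item (score 2).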

What helped Spotify achieve this level?

In a Hyperight AB keynote, Spotify attributes its successful implementation of machine learning to the following four factors.

  1. The large volume of playlists created by users.
  2. The emotion users invest in creating those playlists.
  3. 9 years of continuous iteration and hard work.
  4. A team of user researchers, data scientists, and data engineers.

What is Spotify planning to do in the future?

In one of his videos, Mr. Bernard Marr shared what the Spotify data science team revealed when he met with them:

  1. They will combine this data with other data sources such as GPS location, age, and work. For example, they could tell whether you are commuting to work in the morning or coming back from it, whether you are listening to music at home in the evening, or what type of music you like when you are going to the gym.
  2. When Spotify is connected to a fitness tracker band or an Apple Watch, it will know your pulse rate and what type of music will help you.
  3. In the future, they will use machine learning and artificial intelligence to automate their music recommendations.
  4. You won’t have to pick a playlist manually when you are traveling, going to the gym, or lifting heavy weights; Spotify will know which songs you will like at that moment.
  5. This will be a great implementation of AI that provides real value to customers.