Contributed by http://blog.nycdatascience.com/author/sjstebbins/“,”Spencer”;)”>Spencer James Stebbins. He enrolled in the NYC Data Science Academy 12-week full time Data Science Bootcamp program taking place between July 5th to September 23rd, 2016. This post is based on their third project – Web Scraping, due on 6th week of the program. The original article can be found here.
The average American spends 4 hours a day listening to music (SPIN) and 93% of Americans listen to music daily in some capacity (EDISON). Almost half of that listening is via AM/FM radio, but music streaming services are gaining a greater and greater market share and have even turned around the declining revenues of the music industry.
This success breeds competition among streaming providers to increase customer acquisition and retention by creating music recommendation algorithms that have just the right ingredients to serve up the tunes their customers crave.
Let’s first take a look at one of the pioneers in the music streaming service world: Pandora. Pandora was established in 2001 and has had a decreasing market share among newer streaming competitors.
Why is this? I personally believe it is because of the nature of the recommendations on Pandora. Pandora only allows users to create playlists based on one artist or song, which often doesn’t encapsulate the original mood the listener was looking for and yields poor suggestions that often are repeated after as little as 10 songs later. On the other end of the streaming service spectrum are platforms like 8 tracks, which do suggest playlists that a user may be interested in listening, but all of these playlists have to be manually created by users. Services like 8 tracks have more dynamic playlists than Pandora, but again these services rely on users to actively create playlists and then suggesting the correct playlists to users. Somewhere in the middle of the human editorial and algorithmically generated playlist spectrum of music streaming services is Spotify. This compromise is one of the main reasons Spotify has the largest market share as of March 2016. Spotify uses a combination of collaborative filtering and other playlist statistics to suggest very poignant songs to users that feel like that music guru friend you trust is making the suggestion. Spotify’s collaborative filtering works by analyzing a users’ playlist and finding other users’ playlists that have the same songs, but maybe a a few more, and then will suggest those additional songs to the original user. Spotify not only uses this method for song suggestion, but also weights songs based on whether a user favorited it and listened to it many times following the initial like or even when a user is suggested a song and skips it within the first minute. Because of this combination of collaborative filtering and certain statistics tracking, Spotify can suggest extremely accurate song recommendations that feel eerily familiar. As great as Spotify’s recommendation engine is, what if there was way to build upon its already impressive algorithm and to suggest songs that are even more playlist specific.
The general concept behind Muse is simple. Where Pandora suggests songs based on a single seed artist or song, Muse takes into account an entire playlist, the attributes of each song within a playlist, and the play counts of each song to form a better query on the new Spotify API endpoint for recommendations. In early 2014, Spotify acquired EchoNest; the industry’s leading music intelligence company, providing developers with the deepest understanding of music content and music fans. Through the use of Spotify’s API, which now allows access to these EchoNest services, developers can provide seed information and target attributes in a query and the API will send back recommended songs in its response. This EchoNest API endpoint is the backbone of many suggestive services like Shazam, Pandora etc.., As you can probably imagine, choosing the best query parameters is thus very important and indeed this is what determines the accuracy and relevance os the recommendations the API responds with. As stated previously, Muse optimizes these query parameters to be as representative of an entire playlist as possible and therefore bring back recommendations that are more relevant. So how does Muse do it?
The Muse Algorithm
Upon login via Spotify oAuth, users automatically see their local iTunes playlists and Spotify playlists on the left-hand column. If for some reason, their iTunes Library XML file is not at a standard location on their computer, users can specify the correct path for Muse to look for the file.
From here, a user can click on a playlist, which is where the magic of Muse takes place through the following algorithmic process which I have appended a diagram below followed by bullet points to explain each step in greater detail:
- Upon choosing a playlist, Muse will send each song within the playlist to Spotify’s API to get back the attributes of the song including tempo, danceability, energy etc..