Your Music Friend
Made by Smokey
A reconsideration of social data mining with respect to music recommendation engines. This project embraces 'little data', using only curated lists of music from specific tastemakers to generate recommendations, while using the data-processing methods of big data to automate the entire process. Hand-picked without the hand-picking.
Created: September 29th, 2015
Big data and the mining of social media trends have done great things for recommendation engines for all sorts of media, but several inherent design features of these systems have led to flaws. I propose a solution that circumvents many of the issues of a music recommendation system by focusing on ‘little data’: that is, on users whose influence is more significant than an average user's, whom I call “Tastemakers”. The core concept is to design a service that feels more like a staff wall at a local record store than any “web service”.
Music recommendation engines have the following issues:
Spotify recommends music based on the listening habits of its users. The system both influences those habits and is driven by them. This leads to a ‘long tail’, where an artist must hit a certain threshold of success to keep succeeding, which is detrimental to more modest artists. last.fm, iTunes, and other recommendation engines that are also listening services suffer from the same problem.
Stores or users who play Spotify as a radio station all day long have a disproportionate influence over the system, as their 24/7 listen count is far greater than that of any regular listener.
The average music listener cycles through the same artists and only infrequently discovers new bands. Not every user has listened to enough music to form a taste that is worth accounting for in data. For building recommendation engines aimed at new music discovery, their listening data is not as useful as that of a user who is actively searching for and trying out new artists.
Attempting a sort of ‘music genome’ to algorithmically determine similar artists to recommend hinges on the assumption that if a user likes one artist, they will like a similar artist. This usually correct assumption leads to an unaccounted-for inverse: that the user will not like artists that are not similar to the original artist. This is false, and it locks users into listening to only particular styles, genres, time periods, and other taxonomic categories instead of a diverse spread of music.
By valuing only the ‘freshest’, ‘hottest’ album drops, recommendation engines devalue music that was not recently produced. Look at the movie industry, where a movie must be successful in its first few weekends to make any money. This, in my opinion, is an unhealthy model for music listening, particularly because music is not bound to its time of production: a user can listen to far more music than they can watch movies, can enjoy music repeatedly, and can come back to it long after its release date. Recommendation engines that value fresh music too much encourage this environment.
One can avoid these issues if - in the automated spirit of big data - one embraces ‘small data’ and starts paying attention to specific users to give a service a hand-picked flair.
I built a music recommendation engine whose source data is built solely from the personal and intentional recommendations of certain users I call ‘tastemakers’. They not only listen to music, but actively select a track to be included in the database. The engine randomly selects tracks from this curated database, weighted towards the music most recently added. The engine pulls in a Spotify iframe for convenience, but the website is not a listening platform. When it comes to discovering new music, it gives the user no input on the matter. They are presented with a track and are forced to form their own opinion on it.
I wanted to create a music recommendation engine that feels more akin to the ‘staff recommendations’ wall at a local record store than to any infinite, algorithmically generated big-data database.
To create the site, I decided to go with a client-side JavaScript (and jQuery) solution, as I figured many of the libraries for interacting with services would already exist, and a website makes an easy platform for an audience. Just go to a URL and listen to ‘good’ music. No barriers. After dealing with a lot of API dead ends, I ended up pulling music from a curated database through Google Spreadsheets, and the engine - which now requires human effort to keep updated - was born very quickly after this point. The design and development stage was a breeze once I knew which tools I could easily work with. Almost all of the time was spent frustratedly trying to decipher various API documentation.
I created a website that, on every page load, shows you a random artist from a curated database. View it at listentothis.hdyar.com. It is built with jQuery, Miso Project, and IFTTT. I used Skeleton as a framework for the HTML and CSS.
The algorithm that re-assigns weight based on how recently the track was added has yet to be implemented, although it’s an easy process once I figure out the spreadsheet math. There are a lot of changes I would like to make, eventually removing the random component, so the weighting isn’t a high priority.
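The recency weighting described above could also be done on the client instead of in the spreadsheet. What follows is only a minimal sketch under stated assumptions, not the site's actual implementation: the helper names `recencyWeights` and `weightedIndex` and the `decay` parameter are hypothetical, and it assumes rows are stored in the order they were added, so the last row is the newest.

```javascript
// Hypothetical helper: build one weight per row, favoring later rows.
// Assumes rows are in the order they were added (newest last).
// `decay` is an assumed tuning knob between 0 and 1.
function recencyWeights(count, decay) {
  var weights = [];
  for (var i = 0; i < count; i++) {
    // The newest row gets weight 1; each step back multiplies by `decay`.
    weights.push(Math.pow(decay, count - 1 - i));
  }
  return weights;
}

// Hypothetical helper: pick an index with probability proportional to its weight.
// `rand` can be swapped out so the function is testable without real randomness.
function weightedIndex(weights, rand) {
  rand = rand || Math.random;
  var total = 0;
  for (var i = 0; i < weights.length; i++) {
    total += weights[i];
  }
  var target = rand() * total;
  for (var j = 0; j < weights.length; j++) {
    target -= weights[j];
    if (target < 0) {
      return j;
    }
  }
  return weights.length - 1; // guard against floating-point rounding
}
```

With something like this in place, the straight random pick could become `songs.rowByPosition(weightedIndex(recencyWeights(totalSongs, 0.9)))`, where 0.9 is an arbitrary starting value for the decay rather than a tuned one.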
My next step is to remove the randomness feature and just let the website work its way through the database (at a generated pace that roughly matches the pace at which music is being added), theming it to feel like a record store’s staff wall of ‘check this music out’ in design and in function. This will force viewers to respect the recommendation, and not just refresh their way through a lot of artists, judging the music by its album art.
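One way to work through the database at a fixed pace, without any randomness, is to derive the row position from the time elapsed since a start date. This is a rough sketch only: `currentIndex`, its parameters, and the idea of a constant `tracksPerDay` pace are all assumptions made for illustration, and a real pace would have to be estimated from how often tracks are actually added.

```javascript
var MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: which row should every visitor see right now?
// startMs and nowMs are timestamps in milliseconds; tracksPerDay is an assumed pace.
function currentIndex(startMs, nowMs, tracksPerDay, totalSongs) {
  // Clamp to zero so a date before the start still shows the first row.
  var daysElapsed = Math.max(0, (nowMs - startMs) / MS_PER_DAY);
  var position = Math.floor(daysElapsed * tracksPerDay);
  // Hold on the newest row until more tracks are added.
  return Math.min(position, totalSongs - 1);
}
```

Because the position depends only on the clock, refreshing the page shows the same track until the next one is due, which is the behavior the staff-wall idea calls for.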
I would also like to remove the ‘effort’ from maintaining the database, which would basically involve rewriting the whole thing. In effect, I made a statement about big data by avoiding big data, but I would like to pull automatically from tastemakers all over the country and then analyze what they are listening to, if I can figure out which users outside of my personal community are tastemakers and follow them (on Spotify, last.fm, music blogs, or wherever). The current implementation is ‘small data’, but I would like the data to be a bit bigger. Not big-data big, but a much larger selection of tastemakers.
// Opening lines shown above the recommended artist; one is picked at random per page load.
var intro = [
  'I\'m your music friend. I think you should listen to',
  'Life is too short to listen to the same music all the time. Try listening to',
  'If you love someone, give them good music. Like',
  'I like you. Here is some music I think is good.',
  'We should get together and listen to',
  'You have great taste. That\'s why you will enjoy',
  'Listen to whoever you want. But also listen to',
  'You should cook a fine Italian dinner while listening to',
  'Why don\'t you listen to',
  'Hey, I think you should listen to',
  'You know what artist you should check out?',
  'Go ahead and listen to this artist.',
  'Need new music? Why not',
  'Psst! Check the following band out',
  'I\'ve really been digging',
  'I think you might enjoy',
  'I think you might enjoy listening to',
  'I have a hunch that you should listen to',
  'The stars are telling me you should listen to',
  'I heard about this band from a friend of a friend.',
  'You probably have not heard of',
  'I demand that you listen to',
  'Steve told me about this band, but who trusts Steve?',
  'I like this artist',
  'I used to own this record on cassette.',
  'Want to come over and listen to the records of',
  'I once fell in love while listening to',
  'This band was playing during my first kiss.',
  'Listen to',
  'Check out this artist. Or don\'t. I won\'t judge you either way.'
];
var randomIntro = Math.floor(intro.length * Math.random());
// Load the curated track list from the Google spreadsheet via Miso Dataset.
var songs = new Miso.Dataset({
  importer : Miso.Dataset.Importers.GoogleSpreadsheet,
  parser : Miso.Dataset.Parsers.GoogleSpreadsheet,
  key : "1z3cDIY4ED3M-oZG8PNEG--867oJqCtMe1M0YRgfzx5M",
  worksheet : "1"
});
var totalSongs = 0;
songs.fetch({
  success : function() {
    songs.each(function(row, rowIndex) {
      totalSongs++;
      // Any other parsing goes here; the weight of each item will probably need to be parsed here.
      // Currently there is no script that changes the weights, so every weight is 1 (that is, the same).
      // Eventually the weights will be distributed to favor the more recently added items.
    }); // end each
    // Just straight random for now: a uniform position from 0 to totalSongs - 1.
    var Song = songs.rowByPosition(Math.floor(Math.random() * totalSongs));
    // The last row is the most recently added track.
    var latestSong = songs.rowByPosition(totalSongs - 1);
    // Strip the URL prefix to get the bare Spotify track ID.
    var songId = Song.link.replace("https://open.spotify.com/track/", "");
    $("#listentothis").append('<h1>' + Song.artist.toString() + '</h1><br /><h5><a href="' + Song.link.toString() + '">' + Song.title.toString() + '</a></h5><hr />');
    $("#listentothis").append('<iframe src="https://embed.spotify.com/?uri=spotify%3Atrack%3A' + songId + '&theme=white&view=coverart" width="500" height="580" frameborder="0" allowtransparency="true"></iframe>');
    $("#intro").text(intro[randomIntro]);
    $("#stats").text("Database last updated " + latestSong.date_added);
  }, // end success
  error : function() {
    console.log("Well, it's all over now. Give up. Cry.");
  } // end error
});