Outcome


Calvin & Hobbes

Here's the quick link to the project for the interested: autohobbes.azurewebsites.net


The comic Calvin & Hobbes has proven to be one of the most timeless and influential comics of the last century. Though it ended before I was born, I grew up reading it and now own the entire hardcover set of Bill Watterson's work. I decided to create a Calvin & Hobbes generator that would grind up all the existing comics (spanning over a decade) and spit out new ones from the pieces. When this proved harder than I had imagined, I threw machine learning at the problem and created a tool that, when trained with enough data, can produce sensible, if not funny, comics.

Process

The first challenge was getting the comics themselves. Unable to find a suitably high-quality version of the entire collection online, I resorted to my physical copy, three hardbound volumes, scanning each page and running the scans through Photoshop to remove background noise from the paper and shadows.

Once I had the comic strips saved, I wrote a Python script that went through the thousands of comics, removed the Sunday strips (which are colored and oddly shaped, and wouldn't quite fit in), then scanned each image and broke it into individual frames. Because the comic was not drawn to a standard template, the script had to look for the gaps between frames of differing sizes. After a few hours, it had produced the 11,997 individual frames that now make up the generator.
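
The gap-finding idea can be sketched roughly as follows. This is a minimal illustration, not the original script: it assumes a strip is already loaded as rows of grayscale pixel values (0 = black, 255 = white), and the function name find_frames and the thresholds white, min_gap, and min_width are hypothetical. A real version would read the scans with an imaging library and would also need to handle frames drawn without borders.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def find_frames(image: Image, white: int = 240, min_gap: int = 3,
                min_width: int = 10) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges of frames, split on blank vertical gutters.

    `end` is exclusive. All thresholds are illustrative guesses.
    """
    width = len(image[0])
    # A column is "blank" when every pixel in it is near-white.
    blank = [all(row[x] >= white for row in image) for x in range(width)]

    frames: List[Tuple[int, int]] = []
    start = None  # first column of the frame currently being scanned
    gap = 0       # length of the current run of blank columns
    for x in range(width):
        if blank[x]:
            gap += 1
            # Close the frame once the blank run is wide enough to be a gutter.
            if start is not None and gap == min_gap:
                end = x - min_gap + 1
                if end - start >= min_width:  # drop specks narrower than a frame
                    frames.append((start, end))
                start = None
        else:
            if start is None:
                start = x
            gap = 0
    # A frame may run up to (or near) the right edge of the scan.
    if start is not None:
        end = width - gap
        if end - start >= min_width:
            frames.append((start, end))
    return frames
```

Each returned range can then be cropped out of the strip and saved as its own frame image.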

Then, I created the database that powers the machine learning algorithm and keeps track of the individual panels. Each panel got its own record, and I kept several other tables mapping relationships between panels. I created a voting system for individual panels, panel duets, and whole comics. Having never done a machine learning project, I struggled to come up with a data structure that could store feedback on every potential 4-panel comic: with 11,997 frames available for each of four slots, there are roughly 2 × 10^16 of them. I ended up using a rather naive algorithm that records only the feedback actually provided, kept as a vector for each image, and saves a polynomial mapping of positive and negative correlations instead of a record for every possible comic.
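
One way to picture "recording only provided data" is a sparse tally in which a panel or a pair of panels takes up space only once someone has voted on it. The sketch below is an assumption for illustration, not the project's actual schema or algorithm: the class VoteStore, its two dictionaries, and the way a whole-comic vote is spread over panels and adjacent pairs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


class VoteStore:
    """Sparse vote tallies: only panels and pairs that were voted on take space."""

    def __init__(self) -> None:
        # (panel_id, slot) -> net score for that panel appearing in that slot
        self.position: Dict[Tuple[int, int], int] = defaultdict(int)
        # (left_panel_id, right_panel_id) -> net score for that adjacent pair
        self.pair: Dict[Tuple[int, int], int] = defaultdict(int)

    def vote_comic(self, panels: Sequence[int], up: bool) -> None:
        """Spread one vote on a whole comic over its panels and adjacent pairs."""
        delta = 1 if up else -1
        for slot, panel in enumerate(panels):
            self.position[(panel, slot)] += delta
        for left, right in zip(panels, panels[1:]):
            self.pair[(left, right)] += delta


store = VoteStore()
store.vote_comic([10, 20, 30, 40], up=True)   # a heart
store.vote_comic([10, 20, 5, 6], up=False)    # an X
```

Storage here grows with the number of votes cast rather than with the number of possible comics, which is the property the paragraph above is after.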

I then used web sockets (with Node.js) to show live feedback on what other people are voting on. Unless I can have thousands of people read and rate comics constantly, the data I'm gathering isn't going to make much of a difference to the algorithm that generates comics for quite some time. In the meantime, broadcasting upvoted comics to everyone gives an instant, and interesting, imitation of that feedback.
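
The broadcast pattern itself is small. The project does this with web sockets in Node.js; the sketch below only illustrates the fan-out idea in Python, with in-memory asyncio queues standing in for socket connections. The class VoteHub and the message fields panels and up are hypothetical.

```python
import asyncio
import json
from typing import List, Set


class VoteHub:
    """Fans each vote event out to every connected listener."""

    def __init__(self) -> None:
        self.listeners: Set[asyncio.Queue] = set()

    def connect(self) -> asyncio.Queue:
        """Register a listener; its queue stands in for one socket connection."""
        queue: asyncio.Queue = asyncio.Queue()
        self.listeners.add(queue)
        return queue

    def disconnect(self, queue: asyncio.Queue) -> None:
        self.listeners.discard(queue)

    def broadcast(self, panels: List[int], up: bool) -> None:
        """Send one vote, serialized as JSON, to every listener."""
        message = json.dumps({"panels": panels, "up": up})
        for queue in self.listeners:
            queue.put_nowait(message)


async def demo() -> List[str]:
    hub = VoteHub()
    first, second = hub.connect(), hub.connect()
    hub.broadcast([10, 20, 30, 40], up=True)
    # Both listeners receive the same upvoted comic.
    return [await first.get(), await second.get()]


received = asyncio.run(demo())
```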

Product

Here's the tool in its current form: http://autohobbes.azurewebsites.net. I haven't focused on its appearance; I've simply added a few tools for checking that everything is functioning correctly. Given many votes and much time, it should eventually produce more interesting and sensible comics using the position and relative-ordering data collected from upvotes (hearts) and downvotes (X's).
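
As a rough illustration of how position and relative-ordering scores could steer generation, the sketch below fills a comic slot by slot, weighting each candidate panel by its score for that slot plus its score as a neighbour of the previous panel. It is an assumption, not the site's algorithm: the function generate_comic, the two score dictionaries, and the exponential weighting are all hypothetical choices.

```python
import math
import random
from typing import Dict, List, Optional, Tuple


def generate_comic(
    panel_ids: List[int],
    position: Dict[Tuple[int, int], int],
    pair: Dict[Tuple[int, int], int],
    length: int = 4,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Pick panels slot by slot, favouring upvoted positions and neighbours."""
    rng = rng or random.Random()
    comic: List[int] = []
    for slot in range(length):
        weights = []
        for panel in panel_ids:
            # Net votes for this panel in this slot (0 if never rated).
            score = position.get((panel, slot), 0)
            if comic:
                # Net votes for this panel following the previous one.
                score += pair.get((comic[-1], panel), 0)
            # exp() keeps every weight positive, so unrated panels can still
            # appear; the clamp avoids overflow on extreme scores.
            weights.append(math.exp(max(-20, min(20, score))))
        comic.append(rng.choices(panel_ids, weights=weights, k=1)[0])
    return comic
```

With no votes recorded, every panel is equally likely and the output is a random comic; as hearts and X's accumulate, well-rated positions and pairings are drawn more often. Note that this sketch allows the same panel to appear more than once.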

Reflection

Given more time to work on this project, I'd do more front-end work and make the website a little nicer. I'd also probably spend a considerable amount of time rating comics myself to create at least a few hundred data points. Overall, though, I'm incredibly surprised by how interesting the comics it creates are, even in its current state.

Attribution

All comic strips were scanned and parsed from The Complete Calvin and Hobbes Collection. The updated algorithm (storing only logarithmic amounts of information) was conceived by my roommate, and all code (front- and back-end, roughly 1,400 lines) was written by hand.
