visualizing the humanities with computational methods

As a freshman at Harvard, I took HUM 10 and CS 50, the introductory classes to the humanities and computer science, respectively. Both classes had a tremendous impact on me: CS 50 was where I first learnt computer science, and HUM 10 was where I formed some deeply meaningful relationships with faculty and peers.

I submitted this web application, Project CodeRead, as my CS 50 final project — a marriage of what I learnt from these two classes. CodeRead applies a number of computational tools to analyze literature from various visual perspectives. It belongs to the field known as digital humanities, the use of computational methods to better understand works and art traditionally classified under the humanities.

This project is dear to my heart. As embarrassing as the code is to read and as rudimentary as it looks in retrospect, this website is the first significant piece of digital work I have created. It also speaks to my strong belief that there are many enriching connections between CS and the humanities waiting to be discovered.

Check out some of the cool features of the website below!

A novel can be visualized as a graph in which the nodes are words (the more frequent the word, the bigger the node) and the edges are relationships between words (e.g. appearing in the same sentence). The graph above was generated for Jane Austen's Emma.
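
To give a rough idea of how such a graph could be built, here is a minimal sketch in Python. It is not the original CodeRead code: the function name `build_cooccurrence_graph`, the naive sentence splitting on punctuation, and the `min_count` filter are all assumptions made for illustration. Node size is the word's frequency, and edge weight is the number of sentences two words share.

```python
import re
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(text, min_count=1):
    """Return (nodes, edges) for a same-sentence word co-occurrence graph.

    nodes maps word -> frequency (usable as node size).
    edges maps (word_a, word_b) -> number of sentences containing both.
    Hypothetical helper, not taken from the CodeRead source.
    """
    # Naive sentence split on ., ! and ? (an assumption; a real tokenizer
    # would handle abbreviations, quotes and so on).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    node_sizes = Counter()
    edge_weights = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        node_sizes.update(words)
        # One edge per unordered pair of distinct words in the sentence.
        for a, b in combinations(sorted(set(words)), 2):
            edge_weights[(a, b)] += 1

    # Drop rare words so the picture stays readable.
    nodes = {w: c for w, c in node_sizes.items() if c >= min_count}
    edges = {
        pair: c
        for pair, c in edge_weights.items()
        if pair[0] in nodes and pair[1] in nodes
    }
    return nodes, edges


sample = "Emma was happy. Emma was clever. Harriet was happy."
nodes, edges = build_cooccurrence_graph(sample)
print(nodes)
print(edges)
```

The resulting dictionaries can be handed to any graph-drawing library; for a full novel, raising `min_count` and filtering out common stop words would probably be needed to keep the graph legible.
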
Can we distinguish genres based on easily computable metrics? As we can see, philosophical works tend to have fewer past-tense verbs and longer sentences than sci-fi novels (unsurprisingly).
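
As an illustration of what "easily computable" might mean here, the sketch below computes the two metrics mentioned: average sentence length and the share of past-tense words. It is an assumption-laden approximation, not CodeRead's actual method. The function name `genre_metrics` is hypothetical, and past tense is guessed with a crude "ends in -ed" rule that misses irregular verbs ("went", "was") and wrongly counts some adjectives; a part-of-speech tagger would be the more faithful choice.

```python
import re


def genre_metrics(text):
    """Compute two simple style metrics for a text.

    Hypothetical helper for illustration only.
    """
    # Naive sentence and word splitting (assumption: plain English prose).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    avg_sentence_length = len(words) / len(sentences) if sentences else 0.0

    # Crude past-tense heuristic: words longer than three letters ending in "ed".
    past_like = sum(1 for w in words if len(w) > 3 and w.lower().endswith("ed"))
    past_tense_ratio = past_like / len(words) if words else 0.0

    return {
        "avg_sentence_length": avg_sentence_length,
        "past_tense_ratio": past_tense_ratio,
    }


print(genre_metrics("She walked home. He talked."))
```

Computing these two numbers for each book and plotting them against each other gives one point per work, which is enough to see whether genres form separate clusters.
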