Further adventures in the digital humanities?

Written by: Spencer Greenhalgh

Primary Source: Spencer Greenhalgh

[This post originally appeared on the MSU Digital Humanities blog]

As anticipated on this very blog, I recently spent a week in Indianapolis attending a workshop on computational text analysis at HILT 2016. We spent our time surveying a number of different tools, techniques, and concepts related to text analysis, so I walked away with a greater appreciation for data cleaning, Weka, HathiTrust, metadata, Python, and much more. The most frustrating part of the workshop was that we visited each topic so briefly and that we had so few opportunities to apply these techniques to our own work. I can’t fault the workshop organizers for these decisions—helping participants take a dozen wildly different datasets through deep dives into a particular technique would have been difficult—but I was excited enough by a lot of the concepts we covered that I was itching to try them out myself.

This was especially true of topic modeling, a technique for identifying different “topics” (or themes, or discourses, or…) in the documents of a particular corpus. As we tried out this technique on a corpus of slave narratives, I was amazed at how an algorithm was able to tease out what seemed to be clearly distinct themes within and across these narratives. One of our instructors warned us against being too impressed, explaining that the underlying math was actually really simple. He certainly had a point, and I know the importance of not being blindly wowed by what an algorithm seems to do. Still, refusing to find topic modeling amazing because it really comes down to conditional probabilities seemed to me akin to refusing to recognize the wonder of the French language because, at its roots, it’s an arbitrary collection of mouth sounds.
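For readers curious what "comes down to conditional probabilities" might look like, here is a minimal sketch in Python. It is not the tool or method used at the workshop, and every number, word, and topic name in it is invented for illustration. It assumes the common framing in which each topic is a probability distribution over words and each document is a mixture of topics, and it skips the hard part, which is learning those distributions from a corpus in the first place.

```python
# Toy values invented for illustration; they do not come from any real corpus.
# P(word | topic): each hypothetical topic is a distribution over words.
word_given_topic = {
    "topic_a": {"river": 0.5, "night": 0.3, "letter": 0.1, "school": 0.1},
    "topic_b": {"river": 0.1, "night": 0.1, "letter": 0.5, "school": 0.3},
}

# P(topic | document): one hypothetical document as a mixture of topics.
topic_given_doc = {"topic_a": 0.75, "topic_b": 0.25}


def word_probability(word):
    """P(word | doc) = sum over topics of P(word | topic) * P(topic | doc)."""
    return sum(
        word_given_topic[topic].get(word, 0.0) * weight
        for topic, weight in topic_given_doc.items()
    )


def topic_posterior(word):
    """P(topic | word, doc) via Bayes' rule: which topic likely produced the word?"""
    total = word_probability(word)
    if total == 0:
        raise ValueError(f"{word!r} has zero probability under every topic")
    return {
        topic: word_given_topic[topic].get(word, 0.0) * weight / total
        for topic, weight in topic_given_doc.items()
    }


if __name__ == "__main__":
    for word in ("river", "letter"):
        print(word, word_probability(word), topic_posterior(word))
```

In this made-up document, "river" is attributed mostly to `topic_a` and "letter" mostly to `topic_b`, even though the document as a whole leans toward `topic_a`. A real topic model runs this kind of bookkeeping in reverse, adjusting both tables until they explain the words actually observed.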

That said, neither French nor topic modeling can be really useful or truly amazing for me unless I spend some time figuring out how it works. I went to HILT hoping to learn a couple of neat tricks, but I came away convinced that topic modeling could have some real value for me. Over the past few weeks, I’ve added to my notebook full of dissertation brainstorming scribbles a number of references to topic modeling, and over the next few months, I hope to learn more about the process, dive more into the details, and make this a part of the work that I do.

Hi there! My name is Spencer Greenhalgh, and I am a student in the Educational Psychology and Educational Technology doctoral program at Michigan State University. I came to Michigan State University with a strong belief in the importance of an education grounded in the humanities. As an undergraduate, I studied French and political science and worked as a teaching assistant in both fields. After graduation, I taught French, debate, and keyboarding in a Utah private school before coming to MSU, where I plan to study how technology can be used to help students connect the humanities with their lives. I have a particular interest in the use of games and simulations to promote ethical reasoning and explore moral dilemmas, but am eager to study any technology that can help students see the relevance of studying language, culture, history, and government.