Written by: Spencer Greenhalgh
Primary Source: Spencer Greenhalgh
I recently stumbled upon a post on R-bloggers entitled “Qualitative Research in R.” This title got me pretty excited, since I’m generally excited about most things R and since I recently helped teach a qualitative methods course, which has had me thinking about adding more ethnographic and other qualitative elements to my work. I’d also heard of the RQDA package before but hadn’t had much luck in getting it to work.
The article has some good advice on text mining and word clouds—however, I was disappointed to see text mining and word clouds described as qualitative research. This seems to me to be an example of the oversimplified assumption that qualitative = words and quantitative = numbers. Consider the following passages from the post that illustrate some of the problems with assuming that work is qualitative just because it involves text:
[Text mining is a] method which enables us to highlight the most frequently used keywords in a paragraph of texts or compilation of several text documents.
There’s no denying that text mining (as described here) involves words, which makes it tempting to describe it as qualitative research. However, it’s clear from this description that text mining (as described here) is also based on frequency counts. Even more advanced forms of text analysis (like topic modeling) rely on underlying quantitative analysis that happens to manifest itself in terms of words.
However, it should be obvious that these text analysis techniques do not understand the meaning of the words that they manipulate, instead treating them as arbitrary values that happen to correlate, coincide, etc. It’s similar to a phenomenon described by Stefan Fatsis in his wonderful book Word Freak, where highly-competitive Scrabble players start to see words not as units of meaning but as combinations of game pieces, board spaces, and point values, a phenomenon that allowed a New Zealander who doesn’t speak a word of French to win the 2015 Francophone World Championship of Scrabble.
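To make the frequency-count point concrete, here is a minimal sketch (in Python rather than R, only to keep it dependency-free). The two-sentence corpus and the names `tokenize`, `ids`, and `masked_counts` are invented for illustration and do not come from the R-bloggers post. The sketch counts words, then swaps every distinct word for an arbitrary integer and counts again:

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


# Toy corpus, made up for this example.
corpus = "The cat sat on the mat. The dog sat on the log."

tokens = tokenize(corpus)
counts = Counter(tokens)

# Replace each distinct word with an arbitrary integer ID,
# assigned in order of first appearance.
ids = {}
for tok in tokens:
    ids.setdefault(tok, len(ids))

# Count the IDs exactly as the words were counted.
masked_counts = Counter(ids[tok] for tok in tokens)

print(counts.most_common(3))
print(masked_counts.most_common(3))

# The frequency profile is identical whether or not the tokens mean anything.
print(sorted(counts.values()) == sorted(masked_counts.values()))
```

The counting step produces the same frequency profile for the meaningless IDs as for the original words, which is the sense in which this kind of analysis treats words as arbitrary values.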
Another passage shows that the same is true of word clouds (which my colleague Josh Rosenberg has written further about here):
Finally, the frequency table of the words (document matrix) will be visualized graphically by plotting in a word cloud with the help of the following R code.
Again, the R function that is composing the word cloud doesn’t understand those words any more than our Kiwi friend understood French, even if they are both able to do something pretty cool with language. I’d go so far as to argue that they aren’t really working with words; they’re just working with the numbers underneath.
This isn’t to say that automated text analysis is useless (on the contrary, I’m a pretty big fan) or that text analysis couldn’t inform qualitative research (on the contrary, I think digital humanists are doing a great job of exploring this). It’s simply to say that qualitative research is a rich enough tradition that it would be a tremendous disservice to describe text mining alone as a qualitative research method.