Updates on code for Web scraping users’ Twitter profiles

Written by: Spencer Greenhalgh

Primary Source: Spencer Greenhalgh

A couple of weeks ago, I spent some time discussing how to use R and Web scraping to retrieve information on Twitter users’ locations, as stored in their profiles.

I’ve since updated the code to scrape not only locations but also names, descriptions, personal websites, join dates, number of tweets, number of accounts each user is following, number of followers, and number of favorited/liked tweets. I stopped short of tracking the number of lists each user has created, since I was having trouble with it. The little bits of code I wrote to clean white space out of the text have been commented out, since I can’t vouch for their accuracy (they were also pretty crude measures, so I need to come up with something better).
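
For the white-space cleanup mentioned above, one possible approach is sketched below in base R. This is not the code from the linked script: the function name `clean_whitespace` and the sample strings are made up for illustration, and the sketch assumes the scraped fields arrive as a plain character vector.

```r
# Hypothetical helper (an assumption, not the code from the linked script):
# tidy up white space in a character vector of scraped text.
clean_whitespace <- function(x) {
  # Replace each run of spaces, tabs, or newlines with a single space
  x <- gsub("[[:space:]]+", " ", x)
  # Remove any white space left at the start or end of each string
  trimws(x)
}

# Made-up strings standing in for scraped profile fields
raw_fields <- c("  East Lansing,\n   MI ", "\tdoctoral student   in  education\n")
cleaned <- clean_whitespace(raw_fields)
print(cleaned)
```

Collapsing runs of white space before trimming keeps the words in each field separated by exactly one space, which may be less crude than deleting white space outright.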

If this might be useful to you, feel free to check the code out here.

Also, Paul Pival at the University of Calgary recently took the #educattentats data that I was originally working with and tried a different approach for scraping profile information. If you’re interested in different methods for working with Web data, his post is worth a read!

Hi there! My name is Spencer Greenhalgh, and I am a student in the Educational Psychology and Educational Technology doctoral program at Michigan State University. I came to Michigan State University with a strong belief in the importance of an education grounded in the humanities. As an undergraduate, I studied French and political science and worked as a teaching assistant in both fields. After graduation, I taught French, debate, and keyboarding in a Utah private school before coming to MSU, where I plan to study how technology can be used to help students connect the humanities with their lives. I have a particular interest in the use of games and simulations to promote ethical reasoning and explore moral dilemmas, but am eager to study any technology that can help students see the relevance of studying language, culture, history, and government.