Sunday, August 05, 2007

Preparing for a Computer Content Analysis of Second Life media coverage

For my master's thesis at the Harvard Extension School, I performed a Computer Content Analysis (CCA) of news articles from the New China News Agency in the 1970s, 1980s and 1990s. Later this week, I am going to use some of the same CCA skills and tools to analyze Second Life-related news coverage. To that end, I've been spending my free moments this weekend creating a Yoshikoder-friendly version of the General Inquirer (GI) negative dictionary used for computer content analysis of political texts. This entails adding wildcards, which Yoshikoder recognizes but the original GI program did not. With wildcards, the dictionary will be far more sensitive to variations of common negative terms. The creators of the GI dictionary included some variants -- for instance, "exasperate" and "exasperation" -- but missed many other obvious ones, such as "exasperates" and "exasperating". Using "exasper*" will catch all of these terms.

Of course, wildcards don't work for every word. For instance, "envy" and "envious" could be replaced by "env*", which would pick up variants such as "envies", but would also catch unrelated words with neutral or even positive meanings -- "envelope", "envision", etc. In this case, I simply added "envies" to the list rather than using a wildcard.
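To make the trade-off concrete, here is a minimal Python sketch of how a trailing wildcard behaves. It is not Yoshikoder's own matching code; it assumes that "*" stands for any run of zero or more letters, and the helper names (wildcard_to_regex, matches) and the word list are made up for illustration.

```python
import re

def wildcard_to_regex(pattern):
    # Assumption: "*" means "zero or more letters"; everything else is literal.
    parts = [re.escape(p) for p in pattern.lower().split("*")]
    return re.compile(r"[a-z]*".join(parts))

def matches(pattern, word):
    # The whole word must match the pattern, not just a piece of it.
    return wildcard_to_regex(pattern).fullmatch(word.lower()) is not None

# Hypothetical sample of words that might turn up in news articles.
words = ["exasperate", "exasperates", "exasperating", "exasperation",
         "envy", "envious", "envies", "envelope", "envision"]

# "exasper*" picks up only the intended family of negative terms.
print([w for w in words if matches("exasper*", w)])

# "env*" picks up the intended terms but also "envelope" and "envision".
print([w for w in words if matches("env*", w)])
```

Under that assumption, "exasper*" is safe because nothing unrelated shares the stem, while "env*" is too short a stem to be trusted.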

Converting the General Inquirer dictionary is no easy task. There are 2,000 words in the original dictionary that my thesis director gave me (although I see another version contains 2,291 words). Each one requires manual review to ensure that wildcards are used effectively and don't introduce unwanted terms into the content analysis that I am planning -- a review of press coverage of Second Life from the past 18 months. Although the GI dictionaries were originally created to examine political texts, I believe they can be used to evaluate other types of text as well. The GI negative dictionary lacks some of the terms that one typically sees in American or British media articles about new technologies, but it does provide a very solid baseline list of negative terms that one might see anywhere.
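The manual review could be made a little less tedious with a small audit script. The sketch below is only an illustration, not part of Yoshikoder or the General Inquirer: it assumes the same "*" behavior (zero or more letters), and the function names (compile_entry, audit), the sample entries and the vocabulary are hypothetical. For each dictionary entry it lists every word in a vocabulary that the entry would match, so that unwanted matches stand out.

```python
import re

def compile_entry(entry):
    # Assumption: "*" means "zero or more letters"; everything else is literal.
    parts = [re.escape(p) for p in entry.lower().split("*")]
    return re.compile(r"[a-z]*".join(parts))

def audit(entries, vocabulary):
    """Map each dictionary entry to the sorted vocabulary words it matches."""
    unique_words = {w.lower() for w in vocabulary}
    report = {}
    for entry in entries:
        pattern = compile_entry(entry)
        report[entry] = sorted(w for w in unique_words if pattern.fullmatch(w))
    return report

# Hypothetical candidate entries and a hypothetical vocabulary drawn from articles.
entries = ["exasper*", "env*", "envies"]
vocabulary = ["exasperating", "exasperation", "envy", "envies",
              "envelope", "envision", "virtual", "avatar"]

for entry, matched in audit(entries, vocabulary).items():
    print(entry, "->", ", ".join(matched))
```

In practice the vocabulary would be the set of distinct words in the articles being analyzed, and any entry whose list contains neutral or positive words would be a candidate for replacing the wildcard with explicit variants.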
