By Luling Huang
My first blog post (1) briefly introduces my research plan at the DSC and (2) describes the data I am gathering.
What does the project look like?
I am studying the relationship between conversational structure and belief or attitude change in political discussion. For example, in a presidential election, are Candidate A’s supporters more likely to reply when they are addressed by Candidate B’s supporters? Or when they are addressed by someone whose political leaning reflects their own? My goal is to capture the dynamics of belief or attitude change during naturally occurring social interaction. I think online political discussion forums are ideal for my research questions. These forums allow researchers to capture naturally occurring social interaction and to apply computational methods to analyze a large amount of textual data.
Methodologically, there are two main components. First, conversational structure is operationalized as a stream of “participation shifts” (Gibson, 2003; 2005). A participation shift tells us “who addresses whom” in any two adjacent turns in a group conversation. A shift is denoted as [initial speaker][initial addressee][hyphen][following speaker][following addressee]. Gibson (2003, p. 1342) identified thirteen categories of participation shifts. Here are three examples from his inventory:
- AB-BA denotes the situation when the initial addressee B replies to the initial speaker A in a following turn.
- AB-XB denotes the situation when a third individual X addresses the initial addressee B.
- A0-XA denotes the situation when the initial addressee is the group (denoted by “0”), and one of the group members X replies to A.
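To make the notation concrete, here is a minimal Python sketch that labels a pair of adjacent turns. It is only an illustration, not Gibson's procedure: the function name `label_shift`, the use of `None` to stand for the group, and the choice of "Y" for an addressee who appears in neither role of the first turn are my assumptions. It also does not check that a label belongs to the thirteen-category inventory.

```python
GROUP = None  # assumption: None marks a turn addressed to the whole group


def label_shift(first, second):
    """Label two adjacent turns, each given as a (speaker, addressee) pair."""
    speaker1, addressee1 = first
    speaker2, addressee2 = second

    # Roles are fixed by the first turn: its speaker is A, its addressee is B.
    roles = {speaker1: "A"}
    if addressee1 is not GROUP:
        roles[addressee1] = "B"
    before = "A" + ("0" if addressee1 is GROUP else "B")

    # Anyone who held no role in the first turn is X (as speaker) or Y (as addressee).
    new_speaker = roles.get(speaker2, "X")
    new_addressee = "0" if addressee2 is GROUP else roles.get(addressee2, "Y")
    return f"{before}-{new_speaker}{new_addressee}"


print(label_shift(("ann", "bob"), ("bob", "ann")))  # AB-BA
print(label_shift(("ann", "bob"), ("cat", "bob")))  # AB-XB
print(label_shift(("ann", GROUP), ("cat", "ann")))  # A0-XA
```

The three calls reproduce the three examples listed above, with made-up user names.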
We can do two things with participation shifts. (1) Group members can be differentiated based on how often they are involved in certain categories of shifts. Then we can look at how social ties or members’ beliefs (see below) may affect this differentiation. (2) A relational event model can be built by organizing participation shifts sequentially (Butts, 2008; Liang, 2014). We can then test whether a set of exogenous and endogenous variables (e.g., political lean, popularity of post authors, opinion congruity, and past posting behavior) affects the relational event stream.
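As a rough illustration of point (1), the sketch below tallies how often each member appears with each shift category. The input format (a list of member and shift-label pairs), the function name `shift_profiles`, and the sample records are all hypothetical; how "involvement" in a shift should be defined is a separate analytic decision that this sketch does not settle.

```python
from collections import Counter, defaultdict


def shift_profiles(labeled_shifts):
    """Count, for each member, how many times each shift category occurs."""
    profiles = defaultdict(Counter)
    for member, label in labeled_shifts:
        profiles[member][label] += 1
    return profiles


# Made-up records: (member involved, participation shift category)
records = [
    ("bob", "AB-BA"),
    ("cat", "AB-XB"),
    ("bob", "AB-BA"),
    ("cat", "A0-XA"),
]

profiles = shift_profiles(records)
for member, counts in profiles.items():
    print(member, dict(counts))
```

Profiles like these could then be compared across members, for example by political lean.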
Second, my study’s contribution to previous work is to add another “layer” to the participation shifts and the relational event model just described. Attitude or belief change may occur during interaction, just as participation shifts do. Therefore, if social ties among individuals can be related to participation shifts (Gibson, 2005), why not attitudes or beliefs? If, for each turn, attitudes or beliefs can be measured from what individuals actually say, rather than from pre- or post-discussion surveys, we can build a stream of attitude or belief shifts. The challenge for me here is to use computational methods to measure psychological states from a large amount of textual data. I will be learning how to apply opinion mining to extract attitudes or beliefs from texts. If you’re interested in my research topics or are an expert on opinion mining, I’m more than happy to have a discussion and take advice.
What does the data look like?
My data is collected from debatepolitics.com. The forum is a popular website for political discussion in the U.S. It has moderators and promotes a nonpartisan and civil environment. Anyone can register as a user with an email account. Like most forums, debatepolitics.com is asynchronous, which means that conversation occurs without requiring users to be on the website at the same time. I also narrow my focus to the subforum on the 2016 presidential election.
I’m collecting information about all threads in the subforum, including thread title and thread author. For each thread, I collect information about all posts, including the post author, the author’s ideological lean and gender, the author’s total number of posts in the forum, and the post date and time. For post content, I collect the entire message, as well as any quotes it contains and those quotes’ original authors. This latter information allows me to construct participation shifts post by post.
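Here is one way the quote information could be turned into turns, sketched in Python. The rule it uses is my simplifying assumption, not a settled coding scheme: a post that quotes someone addresses the first quoted author, and a post with no quote addresses the group (represented as `None`). The `Post` class and its field names are hypothetical and do not reflect the scraped column names.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    quoted_authors: list = field(default_factory=list)  # authors of quoted text, in order


def to_turn(post):
    """Return (speaker, addressee); None means the post addresses the group."""
    addressee = post.quoted_authors[0] if post.quoted_authors else None
    return (post.author, addressee)


def adjacent_turn_pairs(posts):
    """Pair each turn with the one that follows it, in thread order."""
    turns = [to_turn(p) for p in posts]
    return list(zip(turns, turns[1:]))


# A made-up three-post thread
thread = [Post("ann"), Post("bob", ["ann"]), Post("cat", ["bob"])]
pairs = adjacent_turn_pairs(thread)
for first, second in pairs:
    print(first, "->", second)
```

Each pair of adjacent turns is the unit that a participation shift describes. Posts that quote several authors, or that quote themselves, would need a more careful rule than the one assumed here.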
How am I getting the data?
Considering my limited programming skills and the fairly neat structure of the forum website, I used the free Chrome extension Web Scraper to get my data. The extension uses CSS selectors to locate the type of data we want to scrape. Basically users just need to click the parts they want on a website, and Web Scraper will make its best guess at the corresponding CSS selector.
In my next blog post, I will describe how I use the Chrome extension Web Scraper to collect the data and how I proceed to clean it.
Butts, C. T. (2008). A relational event framework for social action. Sociological Methodology, 38, 155-200. doi:10.1111/j.1467-9531.2008.00203.x
Gibson, D. R. (2003). Participation shifts: Order and differentiation in group conversation. Social Forces, 81, 1335-1380. doi:10.1353/sof.2003.0055
Gibson, D. R. (2005). Taking turns and talking ties: Networks and conversational interaction. American Journal of Sociology, 110, 1561-1597. doi:10.1086/428689
Liang, H. (2014). The organizational principles of online political discussion: A relational event stream model for analysis of web forum deliberation. Human Communication Research, 40, 483-507. doi:10.1111/hcre.12034