Loretta C. Duckworth Scholars Studio

Exploring Stylo to Wrap up the Semester

Posted on January 12, 2017 (updated August 26, 2019)

By Jillian Benedict

I spent most of my time proofreading every book I had scanned to make sure I had a clean digital copy, but I was unable to finish them all. The data below therefore covers only six books and excludes Jon Krakauer’s most recent work, Missoula. Since I did finish converting Krakauer’s earlier books, I wanted to use the little time I have left at Temple to explore my corpus in R (I chose RStudio because I am more comfortable with its interface).

The six books can be divided into two groups based on their copyright dates. Eiger Dreams, Into the Wild, and Into Thin Air were published in the 1990s; Under the Banner of Heaven, Where Men Win Glory, and Three Cups of Deceit were published between 2003 and 2011. All six cover different topics, but all are clearly works of investigative journalism written by the same author.

Before installing the stylo package in R, I made sure that all the text files in my corpus were the same file type. I chose plain text to avoid unnecessary programming. From there I installed and loaded the stylo package, which gives RStudio access to stylo’s functions. Once stylo was installed, I set my working directory to the parent folder containing my corpus folder. After that, all I had to do was call the stylo function, and R did the rest of the legwork.

Code in console section of RStudio
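
The console screenshot is not reproduced here, so the following is only a minimal sketch of the steps described above, not the exact code that was run. The folder name `krakauer_project` and the helper `non_plain_text` are hypothetical; the sketch assumes the corpus sits in a subfolder named `corpus`, which is where stylo looks by default.

```r
# Hypothetical helper: list any files in the corpus folder that are not .txt
non_plain_text <- function(corpus_dir) {
  files <- list.files(corpus_dir)
  files[!grepl("\\.txt$", files, ignore.case = TRUE)]
}

# Hypothetical paths: the parent folder and the "corpus" subfolder inside it
project_dir <- "~/krakauer_project"
corpus_dir  <- file.path(project_dir, "corpus")

# stylo() opens a settings window, so only run it in an interactive session
if (interactive() && dir.exists(corpus_dir)) {
  # install.packages("stylo")   # run once to install the package from CRAN
  library(stylo)

  # Stop early if anything other than plain text is in the corpus
  stopifnot(length(non_plain_text(corpus_dir)) == 0)

  setwd(project_dir)   # the parent folder, not the corpus folder itself
  stylo()              # choose settings in the window, then R draws the plot
}
```

Checking the file types first is optional, but it catches a stray .docx or .pdf before stylo tries to read it.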

Because stylo handles the statistics itself, there was not much programming for me to do; still, I was glad I had spent so much time getting comfortable with R, because it made it easier to identify what was wrong with my code when I received an error message. The difficulty with the stylo package lies in taking the results of the function, placing them in the context of your project, and working out what they mean. When I applied the stylo function to my corpus of plain text files of Krakauer’s books, a plot appeared in the interface.

Cluster Analysis Plot

This plot, a cluster analysis, sorted the texts (identified by different colors) into branches based on how close they are in style. Although the graph does not say what is different about the style of each book, there is a clear division between the books published in the 1990s and those published after 2000. Perhaps Krakauer’s style differs depending on how much autobiographical information a book includes (the works published in the 1990s contain more). Perhaps his style changed after 9/11. It is impossible to know for sure from the plot alone; Krakauer himself may not even be aware of a change in his style.
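
To give a rough sense of what "proximity in style" can mean, here is a simplified sketch in base R. It is an illustration under my own assumptions, not stylo's actual implementation: the four toy "texts" are invented, and the choices of eight frequent words, a Delta-style distance (the mean absolute difference of z-scores), and Ward linkage are picked only for the example.

```r
# Invented toy "texts" standing in for whole books
texts <- c(
  early_a = "the climb was cold and i was afraid and the wind was cold",
  early_b = "i was tired and the ridge was steep and i was alone",
  late_a  = "he said that the report that they filed was not what he said",
  late_b  = "they said that he knew that the money was not what they said"
)

# Split each text into lowercase word tokens
tokens <- lapply(strsplit(tolower(texts), "[^a-z]+"), function(w) w[w != ""])
names(tokens) <- names(texts)

# Take the most frequent words across the whole toy corpus
n_mfw <- 8
mfw <- names(sort(table(unlist(tokens)), decreasing = TRUE))[seq_len(n_mfw)]

# Relative frequency of each frequent word in each text (rows = texts)
freqs <- t(vapply(tokens, function(w) {
  as.numeric(table(factor(w, levels = mfw))) / length(w)
}, numeric(n_mfw)))
colnames(freqs) <- mfw

# Standardize each word's frequencies, then average the absolute differences
z <- scale(freqs)
d <- dist(z, method = "manhattan") / n_mfw

# Group the texts hierarchically and draw the tree
tree <- hclust(d, method = "ward.D2")
plot(tree)
```

Texts that use common words at similar rates end up on the same branch, which is the kind of grouping the plot above shows for the 1990s books and the later ones.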

To make sure I was reading the cluster analysis plot correctly, I changed the plot type in stylo’s settings, and R produced a consensus tree. The separation between the two groups was not as clear at first, but when I looked closely I saw that the books were once again divided into the same groups as before and sat on opposite ends of the tree.
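
The same switch can also be made from code instead of the settings window. The sketch below is an assumption about how one might do it, not a record of the settings used for this post: the word-list range of 100 to 1000 in steps of 100 and the consensus strength of 0.5 are illustrative values only.

```r
# Illustrative settings for a bootstrap consensus tree ("BCT") in stylo
bct_args <- list(
  gui                = FALSE,     # skip the settings window
  corpus.dir         = "corpus",  # subfolder of the working directory
  analysis.type      = "BCT",     # consensus tree instead of a single cluster analysis
  mfw.min            = 100,       # smallest most-frequent-word list
  mfw.max            = 1000,      # largest most-frequent-word list
  mfw.incr           = 100,       # step between word-list sizes
  consensus.strength = 0.5        # share of runs that must agree on a branch
)

# Run only if stylo is installed and the corpus folder is present
if (requireNamespace("stylo", quietly = TRUE) && dir.exists(bct_args$corpus.dir)) {
  results <- do.call(stylo::stylo, bct_args)
}
```

With these values stylo repeats the analysis for ten different word-list sizes and keeps only the groupings that show up often enough, which is why a consensus tree is a useful check on a single cluster plot.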

Consensus Tree Plot

Due to a lack of time and experience, I did not get to explore the stylo package as much as I would have liked. I would not say that I reached a real conclusion. Still, based on my proofreading (I read each book to correct OCR errors) and the data provided by the stylo package, I feel comfortable suggesting that there appears to be some difference in style between Krakauer’s earlier works and his more recent ones.

To learn more about stylometry and the stylo package, check out “Stylometry with R: A Package for Computational Text Analysis” by Maciej Eder, Jan Rybicki, and Mike Kestemont.

Books and copyright dates:

Eiger Dreams: 1990
Into the Wild: 1996
Into Thin Air: 1997
Under the Banner of Heaven: 2003
Where Men Win Glory: 2009
Three Cups of Deceit: 2011
Missoula: 2015
