Loretta C. Duckworth Scholars Studio

Text Mining Pitchfork Music Reviews for Descriptors of Vocal Sounds

Posted on May 9, 2024 by Rodney McGhee

By R.J. McGhee

Introduction

Have you ever wondered how music reviewers write about vocalists in their reviews? How do music reviewers depict the way vocalists sing? Which adjectives and verbs help create a depiction of vocal styles?

As a student in a master’s program in Information Science and Technology and a musician, my goal for the Cultural Analytics Certificate Practicum project is to use my background knowledge to explore how reviewers of music albums describe sound, specifically the description of the musical properties of vocal sounds. I selected a dataset of music reviews from the prominent music journalism platform, Pitchfork, for this analysis.

To analyze the words used to describe vocal sounds in Pitchfork music reviews, I employ natural language processing techniques, using Python libraries such as pandas and spaCy to tally Tagg’s “Sound Descriptor Words” within the Pitchfork reviews. To see my full code, follow this link.

Background:

In his book, “Music’s Meanings,” musicologist Philip Tagg investigates the ways we discuss music and its significance to the world (Tagg, 2013). Tagg identifies the phenomenon of individuals talking about vocals using the “Aesthetic Perspective,” which he defines in terms of how vocal sounds are perceived, interpreted, and reacted to by listeners (Tagg, 2013).

Tagg created a list of ‘Directly Sound Descriptive Adjectives and Verbs’ to catalog the language commonly used when discussing the qualities of vocal sound. See his list below:

Sound Descriptor Word List:

hum, high-pitched, whiny, squeaky, booming, low-pitched, deep, full-throated, gruff, breathy, husky, guttural, distinct, harsh, indistinct, muffled, plaintive, rasping, roaring, shrill, stammering, loud, declamatory, soft, quiet, monotone, lispy, bird-like, hoarse, throaty, babble, bark, bawl, belch, bellow, bleat, blubber, boom, buzz, cackle, caterwaul, chant, chatter, chuckle, chirp, cluck, complain, cough, croak, croon, cry, declaim, denounce, drone, exclaim, gargle, gasp, giggle, growl, grumble, gurgle, hiccup, hiss, hoot, howl, lament, laugh, lilt, moan, mumble, mutter, praise, preach, proclaim, pronounce, quack, quip, rant, recite, roar, scream, screech, shout, shriek, sigh, snap [at], snarl, snigger, snore, snort, sob, spit, splutter, squawk, squeak, stammer, stutter, twitter, ululate, wail, warble, weep, wheeze, whimper, whine, whinge, whisper, whistle, whoop, yammer, yap, yawn, yell, yelp, yow

By examining the frequency of these Sound Descriptors and their association with genre categorization in Pitchfork reviews, I hope to find insights into whether and how reviewers use sound descriptors in their writing. This involves tallying the occurrences of words from Philip Tagg’s list in sentences discussing vocal sounds.

The Dataset

For my dataset, I downloaded a set of 24,169 Pitchfork reviews spanning from January 5, 1999, to December 12, 2021 that Nolan Conaway web-scraped from the Pitchfork website.

I chose this dataset because of its accessibility and its cultural importance to music journalism. Pitchfork is a cornerstone of the online music review culture. Pitchfork started in 1996 as a small music review website and has now grown to be one of the most well-known music review websites internationally (“The History of Pitchfork’s Reviews Section in 38 Reviews,” 2021).

The dataset can be downloaded for free as a SQLite3 file. The tables contain columns which include the name of the artists, the title of the album, the score, the author of the review, the primary genres, the release year of the album, the publication year, and the review itself. 

Text Cleaning Process

My first step in this research project was to consolidate these tables into one functional table, which I could then filter to answer my research questions. My main tool for organizing this data was the Python library pandas. I first loaded each table into a pandas dataframe using SQL queries, and then joined the dataframes on their shared table keys. Because a genre is assigned to every article in the dataset, I then created a separate pandas dataframe for each genre.
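The consolidation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the table names (`reviews`, `genres`, `content`), the `reviewid` key, and the toy rows are assumptions standing in for the downloaded SQLite3 file.

```python
import sqlite3

import pandas as pd

# Toy in-memory database standing in for the downloaded SQLite3 file.
# Table names, column names, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reviews (reviewid INTEGER, title TEXT, artist TEXT, score REAL);
    CREATE TABLE genres  (reviewid INTEGER, genre TEXT);
    CREATE TABLE content (reviewid INTEGER, content TEXT);
    INSERT INTO reviews VALUES (1, 'Album A', 'Artist A', 7.5), (2, 'Album B', 'Artist B', 6.0);
    INSERT INTO genres  VALUES (1, 'rock'), (2, 'metal'), (2, 'rock');
    INSERT INTO content VALUES (1, 'Her voice is soft.'), (2, 'He screams over the riffs.');
""")

# Load each table into its own dataframe with a SQL query.
reviews = pd.read_sql_query("SELECT * FROM reviews", conn)
genres = pd.read_sql_query("SELECT * FROM genres", conn)
content = pd.read_sql_query("SELECT * FROM content", conn)
conn.close()

# Join the dataframes on the shared key (assumed here to be `reviewid`).
merged = reviews.merge(content, on="reviewid", how="inner").merge(
    genres, on="reviewid", how="left"
)

# Split the consolidated table into one dataframe per genre.
# In this toy data, a review tagged with two genres appears in both.
by_genre = {
    genre: frame.reset_index(drop=True) for genre, frame in merged.groupby("genre")
}
```

With the real file, `sqlite3.connect` would point at the downloaded database path instead of `":memory:"`, and the `CREATE`/`INSERT` statements would be dropped.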

Pitchfork Genres

  • Experimental
  • Rock
  • Electronic
  • Folk/Country
  • Rap
  • Pop/R&B
  • Jazz
  • Global
  • Metal

I also needed to prepare the data for analysis, so I applied spaCy’s lemmatization feature to each sentence to reduce words to their root forms. This step ensures that terms from the sound descriptor and vocal term lists are identified even when they appear in inflected forms, such as plurals or different verb tenses (“Linguistic Features · spaCy Usage Documentation,” n.d.).
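A lemmatization helper along these lines might look like the sketch below. The function names and the `en_core_web_sm` model name are assumptions for illustration; the only spaCy features relied on are `spacy.load`, calling the pipeline on a string, and the `lemma_` and `is_alpha` token attributes.

```python
def load_pipeline(model_name="en_core_web_sm"):
    """Load a spaCy pipeline. The model name is an assumption; any
    English pipeline that includes a lemmatizer should work."""
    import spacy  # imported lazily so the helper below works with any pipeline object

    return spacy.load(model_name)


def lemmatize_sentence(sentence, nlp):
    """Return the lowercase lemmas of the alphabetic tokens in a sentence."""
    doc = nlp(sentence)
    # token.is_alpha filters out punctuation and numbers.
    return [token.lemma_.lower() for token in doc if token.is_alpha]
```

In use, one would call `nlp = load_pipeline()` once and then `lemmatize_sentence(sentence, nlp)` for each sentence, so that an inflected form such as "whispered" can be matched against the list entry "whisper".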

Text Mining Process

To begin mining these texts, I needed to ensure that I counted only instances where the sound descriptive words described vocal sounds. I created a basic list of terms that typically refer to vocals and appear in reviews discussing them. See the list below:

  • Vocal
  • Vocalist
  • Voice
  • Lyric
  • Speak
  • Sing
  • Sings
  • Sung
  • Sang
  • Falsetto
  • Rap

I segmented the reviews into sentences using the NLTK library and checked each sentence for the predefined terms associated with vocals. For each sentence, I recorded the result as a boolean value in the dataframe.
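The filtering step can be illustrated with the short sketch below. To keep it dependency-free, it swaps in a naive regular-expression sentence splitter as a stand-in for NLTK's sentence tokenizer; the function names and the sample review are hypothetical.

```python
import re

# The vocal term list from above, lowercased for matching.
VOCAL_TERMS = {
    "vocal", "vocalist", "voice", "lyric", "speak", "sing",
    "sings", "sung", "sang", "falsetto", "rap",
}


def split_sentences(text):
    """Naive splitter on ., !, or ? followed by whitespace.
    This is a simplified stand-in for NLTK's sentence tokenizer."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def mentions_vocals(sentence):
    """Return True if any word in the sentence exactly matches a vocal term."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return any(word in VOCAL_TERMS for word in words)


# Hypothetical review text for illustration.
review = (
    "Her voice is soft and breathy. The drums boom throughout. "
    "He sang in a deep falsetto!"
)
flags = [(sentence, mentions_vocals(sentence)) for sentence in split_sentences(review)]
```

Here the second sentence is flagged `False`, so its descriptor "boom" would not be counted as a vocal description. Because this sketch matches exact lowercase words, a form such as "vocals" would only match after lemmatization.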

Result

Once my data was cleaned and filtered, I searched through these reviews and counted how often each word from Philip Tagg’s list occurred in each genre. I then divided each count by the total number of words in that genre to create a frequency ratio, expressed as a percentage. See some examples of my results below:
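The counting and normalization described here might look like the following sketch. The function name and the sample lemmas are hypothetical, and the choice of denominator is an assumption: the sketch divides by the length of whatever word list is passed in.

```python
from collections import Counter


def descriptor_frequencies(lemmas, descriptors):
    """Count descriptor lemmas and express each count as a percentage
    of all words in the list (assumed denominator)."""
    total_words = len(lemmas)
    if total_words == 0:
        return {}
    counts = Counter(lemma for lemma in lemmas if lemma in descriptors)
    return {
        word: (count, 100 * count / total_words)
        for word, count in counts.most_common()
    }


# A small subset of Tagg's list and a toy list of lemmas for one genre.
descriptors = {"deep", "soft", "whisper", "scream", "drone"}
rock_lemmas = ["her", "deep", "voice", "be", "soft", "and", "deep", "she", "whisper", "low"]

result = descriptor_frequencies(rock_lemmas, descriptors)
# "deep" occurs 2 times in 10 words, so its ratio is 20.0 percent.
```

Normalizing by word count is what makes genres comparable: a genre with far more reviews would otherwise dominate on raw counts alone.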


Top 5 Vocal Sound Descriptors for Rock Genre 

Word       Frequency   Frequency ratio (%)
Deep       303         .0092
Soft       271         .0082
Whisper    210         .0064
Scream     204         .0062
Drone      189         .0057

Top 5 Vocal Sound Descriptors for Folk/Country Genre 

Word       Frequency   Frequency ratio (%)
Deep       50          .0162
Quiet      43          .0139
Soft       43          .0139
Whisper    35          .0113
Drone      25          .0081

Top 5 Vocal Sound Descriptors for Rap Genre 

Word       Frequency   Frequency ratio (%)
Deep       132         .0144
Spit       90          .0098
Boom       82          .0089
Shout      58          .0063
Soft       52          .0056

Top 5 Vocal Sound Descriptors for Jazz Genre 

Word       Frequency   Frequency ratio (%)
Deep       10          .0076
Soft       8           .0061
Whisper    8           .0061
Scream     6           .0046
Drone      5           .0038

Top 5 Vocal Sound Descriptors for Global Genre 

Word       Frequency   Frequency ratio (%)
Soft       11          .0186
Deep       10          .0169
Chant      5           .0084
Wail       5           .0084
Plaintive  4           .0084

Top 5 Vocal Sound Descriptors for Metal Genre 

Word       Frequency   Frequency ratio (%)
Scream     42          .0174
Growl      36          .0149
Howl       25          .0103
Shout      25          .0103
Deep       25          .0103

To see all of the results, please follow this link to access my GitHub page.

The term ‘deep’ appears as one of the top words in every single genre. This may reflect reviewers describing singers who use a ‘lower,’ ‘deeper’ voice, a vocal quality that singers often use and that likely stands out to reviewers.

After reviewing these top vocal sound descriptor words, I found that the most frequent descriptors in each genre tend to be words commonly associated with that genre. For example, folk reviews mention soft, whisper, and quiet, whereas metal reviews contain scream, growl, howl, and shout. I also find it interesting that the top descriptors for the jazz and global genres tend toward soft and deep vocals.

Conclusions and Next Steps

After analyzing these results, we can conclude that the list of Directly Sound Descriptive Adjectives and Verbs can be an effective way to identify terms that relate to vocal inflections, and that it offers a starting point for expanding the list. We should continue to engage with Philip Tagg’s philosophy and seek insights into the diverse ways in which vocal sound is commonly perceived and described in everyday contexts.

The next steps involve continuing to expand the list of sound descriptor words mentioned earlier. In his book, Philip Tagg notes that this list is by no means a comprehensive catalog of the words used to describe vocals aesthetically. I am interested in using spaCy’s part-of-speech tagging to identify other descriptive words in the reviews that I could add to this list.

References

Linguistic Features · spaCy Usage Documentation. (n.d.). Retrieved from https://spacy.io/usage/linguistic-features

Tagg, P. (2013a). Music’s meanings: A modern musicology for non-musos (p. 346). New York: The Mass Media Music Scholars’ Press.

Tagg, P. (2013b). Music’s meanings: A modern musicology for non-musos (p. 353). New York: The Mass Media Music Scholars’ Press.

The History of Pitchfork’s Reviews Section in 38 Reviews. (2021, May 25). Pitchfork. Retrieved from https://pitchfork.com/features/lists-and-guides/the-history-of-the-pitchfork-reviews-section-in-38-important-reviews/
