

{"id":795,"date":"2015-07-14T11:44:27","date_gmt":"2015-07-14T15:44:27","guid":{"rendered":"https:\/\/sites.temple.edu\/tudsc\/?p=795"},"modified":"2020-09-17T06:20:41","modified_gmt":"2020-09-17T10:20:41","slug":"testing-authorship-attribution-with-signature","status":"publish","type":"post","link":"https:\/\/sites.temple.edu\/tudsc\/2015\/07\/14\/testing-authorship-attribution-with-signature\/","title":{"rendered":"Testing Authorship Attribution with Signature"},"content":{"rendered":"<p>By Jaclyn Partyka<\/p>\n<p><!--more--><\/p>\n<p>David Holmes <a href=\"http:\/\/llc.oxfordjournals.org\/content\/13\/3\/111.abstract\" target=\"_blank\" rel=\"noopener noreferrer\">claims<\/a> that \u201c[t]he major problem inhibiting stylometry&#8217;s acceptance within humanities scholarship is that, as yet, there is no consensus as to correct methodology or technique.\u201d So, even though scholars have been practicing various methods of stylometry for almost a century, those methods seem to have evolved individually, with little agreement among researchers. This conclusion is somewhat surprising within the context of the digital humanities, considering how the increased processing power of computers allows scholars to perform forms of \u201c<a href=\"https:\/\/books.google.com\/books?id=YKMCy9I3PG4C&amp;printsec=frontcover&amp;source=gbs_ge_summary_r&amp;cad=0#v=onepage&amp;q&amp;f=false\" target=\"_blank\" rel=\"noopener noreferrer\">distant reading<\/a>\u201d analytics on an ever-increasing archive of digitized texts. In embarking on my own stylometric research, I have tried a number of different programs to test their usefulness. 
Today I\u2019m going to review one called\u00a0<em>Signature<\/em>.<\/p>\n<p><a href=\"http:\/\/www.philocomp.net\/texts\/signature.htm\" target=\"_blank\" rel=\"noopener noreferrer\"><em>Signature<\/em><\/a> is a stylometric program designed by <a href=\"http:\/\/www.millican.org\/index.htm\" target=\"_blank\" rel=\"noopener noreferrer\">Peter Millican<\/a>\u00a0of Oxford University, previously of Leeds University. I wanted to begin my reviews of various stylometric programs here because <em>Signature<\/em> is centered on the most basic and straightforward stylometric techniques while also providing computer-generated visualizations. <em>Signature<\/em> interprets stylometric data using five different frequency criteria: Word lengths, Sentence lengths, Paragraph lengths, Letters, and Punctuation. These kinds of frequency measures are typical of some of the earliest stylometric research, so they are a great place to start interpreting your specific\u00a0corpus.<\/p>\n<p><em>Signature<\/em> is a free software download and has a relatively simple user interface, allowing you to upload your corpora using text or HTML files. You can also combine individual files into a corpus, which is very useful when working with multiple authors at once. <em>Signature<\/em> also gives you the option, when uploading your files, to divide them into halves, which is helpful for statistical analysis but offers the researcher little precise control.<\/p>\n<p>As I mentioned in <a href=\"https:\/\/sites.temple.edu\/tudsc\/2015\/06\/18\/starting-out-with-stylometry\/\" target=\"_blank\" rel=\"noopener noreferrer\">my last blog post<\/a>, my project is somewhat atypical of stylometry projects, since I\u2019m not comparing multiple authors but looking at a single author\u2019s work for significant stylistic variations. 
For this reason, any results I get using a program like <em>Signature<\/em> will likely show only slight variation across individual texts or corpora, whereas a comparison of two or more different authors would likely show more significant variation.<\/p>\n<p>The first test I did using <em>Signature<\/em> was to visualize Philip Roth\u2019s corpus* by sentence and paragraph length.<\/p>\n<p><a href=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-802\" src=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength.jpg\" alt=\"Signature_RothCorpusSentenceLength\" width=\"807\" height=\"559\" srcset=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength.jpg 807w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength-300x208.jpg 300w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength-700x485.jpg 700w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength-232x161.jpg 232w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength-464x321.jpg 464w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusSentenceLength-624x432.jpg 624w\" sizes=\"auto, (max-width: 807px) 100vw, 807px\" \/><\/a><\/p>\n<p>The Sentence Length visualization shows that the majority of the novels maintain a relatively even distribution of sentences of 3 to 10 words, with <em>Exit Ghost<\/em> and <em>American Pastoral<\/em> as the outliers. And thank goodness for outliers! 
I think the variation here actually reinforces the specific stylistics of each novel, since <em>Exit Ghost<\/em> is composed primarily as dramatic dialogue, which would account for the shorter sentences, and <em>American Pastoral<\/em> tends to include a significant number of longer, stream-of-consciousness-style musings. And, generally, the paragraph graph reinforces this interpretation, with the dramatic structure of\u00a0<em>Exit Ghost<\/em>\u00a0resulting in\u00a0shorter paragraphs. So far, so good.<\/p>\n<p><a href=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-801\" src=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength.jpg\" alt=\"Signature_RothCorpusParagraphLength\" width=\"805\" height=\"564\" srcset=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength.jpg 805w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength-300x210.jpg 300w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength-700x490.jpg 700w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength-232x163.jpg 232w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength-464x325.jpg 464w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusParagraphLength-624x437.jpg 624w\" sizes=\"auto, (max-width: 805px) 100vw, 805px\" \/><\/a><\/p>\n<p>But what I\u2019m really interested in is how Roth\u2019s more \u201cautobiographical\u201d novels compare to his typical fiction. To look at this I chose <em>The\u00a0Facts: The Novelist\u2019s Autobiography<\/em>, because its structure truly bridges the gap between these different genres. 
Basically, the novel begins with a letter from \u201cRoth\u201d** to Nathan Zuckerman, an author-character whom many readers view as a mask for Roth\u2019s own understanding of authorship. \u201cRoth\u201d asks Nathan whether he should bother publishing this foray into autobiography instead of his normal fiction. The novel then recounts, in relatively typical autobiographical reportage, Roth\u2019s childhood in Newark, his college years, and his disastrous first marriage, ending with the inspiration for and publication of <em>Portnoy\u2019s Complaint<\/em>. This section is followed by Zuckerman\u2019s reply to Roth, in which he cautions him not to publish. This metafictional and dialogic style paratextually incorporates fiction into Roth\u2019s autobiography, so I was interested in whether this unconventional approach would be reflected at the stylistic level.<\/p>\n<p><a href=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-800\" src=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength.jpg\" alt=\"Signature_RothCorpusFactsSentenceLength\" width=\"803\" height=\"563\" srcset=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength.jpg 803w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength-300x210.jpg 300w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength-700x491.jpg 700w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength-232x163.jpg 232w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength-464x325.jpg 464w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsSentenceLength-624x437.jpg 624w\" sizes=\"auto, (max-width: 803px) 100vw, 803px\" \/><\/a><\/p>\n<p><a 
href=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-798\" src=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength.jpg\" alt=\"Signature_RothCorpusCounterlifeSentenceLength\" width=\"802\" height=\"561\" srcset=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength.jpg 802w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength-300x210.jpg 300w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength-700x490.jpg 700w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength-232x162.jpg 232w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength-464x325.jpg 464w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeSentenceLength-624x436.jpg 624w\" sizes=\"auto, (max-width: 802px) 100vw, 802px\" \/><\/a><\/p>\n<p>To do this, I compared <em>The Facts<\/em> with the Roth corpus loaded in <em>Signature<\/em> according to sentence and paragraph length. 
I also contrasted these results by looking at the novel immediately preceding <em>The Facts<\/em> chronologically, <em>The Counterlife<\/em>.<\/p>\n<p><a href=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-799\" src=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength.jpg\" alt=\"Signature_RothCorpusFactsParagraphLength\" width=\"808\" height=\"560\" srcset=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength.jpg 808w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength-300x208.jpg 300w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength-700x485.jpg 700w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength-232x161.jpg 232w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength-464x322.jpg 464w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusFactsParagraphLength-624x432.jpg 624w\" sizes=\"auto, (max-width: 808px) 100vw, 808px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-803\" src=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength.jpg\" alt=\"Signature_RothCorpusCounterlifeParagraphLength\" width=\"800\" height=\"559\" srcset=\"https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength.jpg 800w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength-300x210.jpg 300w, 
https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength-700x489.jpg 700w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength-232x162.jpg 232w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength-464x324.jpg 464w, https:\/\/sites.temple.edu\/tudsc\/files\/2015\/07\/Signature_RothCorpusCounterlifeParagraphLength-624x436.jpg 624w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<p>While these are just preliminary results, it\u2019s clear from the graphs that <em>The Counterlife<\/em> aligns much more closely with works typical of Roth\u2019s oeuvre while <em>The Facts<\/em> deviates significantly. The difference may be attributed to a number of factors, but one of them may be that Roth\u2019s writing style changes slightly when he switches between autobiographical and fictional modes.<\/p>\n<p>As we can see from these few comparisons, stylometric probability tests provide only limited insight into the actual authorship of a specific text. Stylometry, like most digital humanities methods, cannot be used in a vacuum. These tools can help us visualize and confirm the significance of textual similarities and differences, but they cannot definitively determine the probability of authorship.<\/p>\n<p>Finally, <em>Signature<\/em> is a useful but very limited stylometric tool, owing to a lack of resources and updates. Even though the designers of <em>Signature<\/em>\u00a0may have had the intention to develop additional versions, very little has been updated since 2013. Additionally, while the designers provide a helpful PowerPoint file to explain how to use the program, there is no additional help file or resource contained within the program. 
There are some case studies using <em>Signature<\/em> available via Millican\u2019s <a href=\"http:\/\/www.millican.org\/other.htm\" target=\"_blank\" rel=\"noopener noreferrer\">website<\/a>, but there is no active community of <em>Signature<\/em> users.<\/p>\n<p>This is not to say that <em>Signature<\/em> has no benefits. It would probably be most useful in the classroom, since students could practice loading corpora and generating some basic graphs, but it is too limited for substantial stylometric research.<\/p>\n<p>&nbsp;<\/p>\n<p>*If you are a Roth scholar, you will notice that this is not Roth\u2019s complete oeuvre, but it does provide a basic overview for these early tests.<\/p>\n<p>**So as not to conflate the author with the fictionalized version of his implied author, I\u2019ll refer to this character as \u201cRoth.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"<p>By Jaclyn Partyka<\/p>\n","protected":false},"author":1418,"featured_media":797,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[2],"tags":[34,36,37,29],"class_list":["post-795","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-grad-students","tag-authorship","tag-stylometry","tag-textual-analysis","tag-visual-analysis"],"_links":{"self":[{"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/posts\/795","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/users\/1418"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/
wp\/v2\/comments?post=795"}],"version-history":[{"count":0,"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/posts\/795\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/media\/797"}],"wp:attachment":[{"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/media?parent=795"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/categories?post=795"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sites.temple.edu\/tudsc\/wp-json\/wp\/v2\/tags?post=795"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}