A series of binaries permeates Nan Z. Da’s article “The Computational Case against Computational Literary Studies”: computation OR reading; numbers OR words; statistics OR critical thinking. Working from these false oppositions, the article conjures a conflict between computation and criticism. The field of cultural analytics, however, rests on the discovery of compatibilities between these binaries: the ability of computation to work hand in hand with literary criticism, and the use of critical interpretation by its practitioners to make sense of their statistics.
These oppositions lead Da to focus exclusively on the null hypothesis testing of confirmatory data analysis (CDA): graphs are selected, hypotheses are proposed, and errors in significance are sought.
But for mathematician John Tukey, the founder of exploratory data analysis (EDA), letting the data speak for itself, visualized without a predetermined hypothesis, allows researchers to avoid the pitfalls of confirmation bias. This is what psychologist William McGuire (1989) calls “the hypothesis testing myth”: a researcher who begins by believing a hypothesis (for example, that literature is too complex for computational analysis) can, with a simple manipulation of statistics, prove herself correct (by cherry-picking examples that support her argument). Practitioners bound by the orthodoxy of their fields often miss the new patterns revealed when statistics are integrated into new areas of research.
In literary studies, the visualizations produced by EDA do not replace the act of reading but instead redirect it to new ends. Each site of statistical significance reveals a new locus of reading: the act of quantification is no more a reduction than any other interpretation. Statistical rigor remains crucial, but equally essential are the ways in which these data objects are embedded within a theoretical apparatus that draws on literary interpretation. And yet, in her article, Da plucks single statistics from thirteen articles with an average length of about 10,250 words. It is only by ignoring these 10,000 words, by refusing to read the context of the graph, the arguments, justifications, and dissensions, that she can marshal her arguments.
In Da’s adherence to CDA, her critiques require a hypothesis: where none exists in the context she ignores, she is forced to invent one. Even a cursory reading of “The Werther Topologies” reveals that we are not interested in questions of the “influence of Werther on other texts”; rather, we are interested in exploring the effect on the corpus when it is reorganized around the language of Werther. The topology creates new adjacencies, prompting new readings: it does not prove or disprove; it is not right or wrong. To suggest otherwise is to make a category error.
Cultural analytics is not a virtual humanities that replaces the interpretive skills developed by scholars over centuries with mathematical rigor. It is an augmented humanities that, at its best, presents new kinds of evidence, often invisible to even the closest reader, alongside carefully considered theoretical arguments, both working in tandem to produce new critical work.
MARK ALGEE-HEWITT is an assistant professor of English and Digital Humanities at Stanford University where he directs the Stanford Literary Lab. His current work combines computational methods with literary criticism to explore large scale changes in aesthetic concepts during the eighteenth and nineteenth centuries. The projects that he leads at the Literary Lab include a study of racialized language in nineteenth-century American literature and a computational analysis of differences in disciplinary style. Mark’s work has appeared in New Literary History, Digital Scholarship in the Humanities, as well as in edited volumes on the Enlightenment and the Digital Humanities.
Many of the articles cited by Da combine both CDA and EDA, a movement in the field noted by Ted Underwood in Distant Horizons (p. xii).
Tukey, John. Exploratory Data Analysis. New York: Pearson, 1977.
McGuire, William J. “A Perspectivist Approach to the Strategic Planning of Programmatic Scientific Research.” In Psychology of Science: Contributions to Metascience, ed. B. Gholson et al. Cambridge: Cambridge UP, 1989. 214-245. See also Frederick Hartwig and Brian E. Dearing on the need not to rely exclusively on CDA (Exploratory Data Analysis. Newbury Park: Sage Publications, 1979) and John Behrens on the “hypothesis testing myth” (“Principles and Procedures of Exploratory Data Analysis.” Psychological Methods 2(2): 1997. 131-160).
Da, Nan Z. “The Computational Case against Computational Literary Studies.” Critical Inquiry 45(3): 2019. 601-639.
See, for example, Gemma, Marissa, et al. “Operationalizing the Colloquial Style: Repetition in 19th-Century American Fiction.” Digital Scholarship in the Humanities 32(2): 2017. 312-335; or McGrath, Laura B., et al. “Measuring Modernist Novelty.” The Journal of Cultural Analytics (2018).
See, for example, our argument about the “modularity of criticism” in Algee-Hewitt, Mark, Fredner, Erik, and Walser, Hannah. “The Novel As Data.” In The Cambridge Companion to the Novel, ed. Eric Bulson. Cambridge: Cambridge UP, 2018. 189-215.
Absent the two books, which have a different relationship to length, Da extracts visualizations or numbers from 13 articles totaling 133,685 words (including notes and captions).
Da (2019), 634; Piper and Algee-Hewitt. “The Werther Effect I.” In Distant Readings: Topologies of German Culture in the Long Nineteenth Century, ed. Matt Erlin and Lynn Tatlock. Rochester: Camden House, 2014. 156-157.