Trust in Numbers
Hoyt Long and Richard Jean So
Nan Da’s “The Computational Case against Computational Literary Criticism” stands out from past polemics against computational approaches to literature in that it purports to take computation seriously. It recognizes that a serious engagement with this kind of research means developing literacy in statistical and other concepts. Insofar as her essay promises to move the debate beyond a flat rejection of numbers, and towards something like a conversation about replication, it is a useful step forward.
This, however, is where its utility ends. “Don’t trust the numbers,” Da warns. Or rather, “Don’t trust their numbers, trust mine.” But should you? If you can’t trust their numbers, she implies, the entire case for computational approaches falls apart. Trust her numbers and you’ll see this. But her numbers cannot be trusted. Da’s critique of fourteen articles in the field of cultural analytics is rife with technical and factual errors. This is not merely quibbling over details. The errors reflect a basic lack of understanding of fundamental statistical concepts and are akin to an outsider to literary studies calling George Eliot a “famous male author.” Even more concerning, Da fails to understand statistical method as a contextual, historical, and interpretive project. The essay’s greatest error, to be blunt, is a humanist one.
Here we focus on Da’s errors related to predictive modeling. This is the core method used in the two essays of ours that she critiques. In “Turbulent Flow,” we built a model of stream-of-consciousness (SOC) narrative with thirteen linguistic features and found that ten of them, in combination, reliably distinguished passages that we identified as SOC (as compared with passages taken from a corpus of realist fiction). Type-token ratio (TTR), a measure of lexical diversity, was the most distinguishing of these, though uninformative on its own. The purpose of predictive modeling, as we carefully explain in the essay, is to understand how multiple features work in concert, rather than in isolation, to identify stylistic patterns. Nothing in Da’s critique suggests she is aware of this fundamental principle.
Indeed, Da interrogates just one feature in our model (TTR) and argues that modifying it invalidates our modeling. Specifically, she tests whether the strong association of TTR with SOC holds after removing the words in her “standard stopword list,” rather than those in the stopword list we used. She finds it doesn’t. There are two problems with this. First, TTR and “TTR minus stopwords” are two separate features. We actually included both in our model and found the latter to be minimally distinctive. Second, while the intuition to test for feature robustness is appropriate, it is undercut by the assertion that there is a “standard” stopword list that should be universally applied. Ours was specifically curated for use with nineteenth- and early twentieth-century fiction. Even if there were good reason to adopt her “standard” list, one would still have to rerun the model to test whether the remeasured “TTR minus stopwords” feature changes the overall predictive accuracy. Da doesn’t do this. It’s like fiddling with a single piano key and, without playing another note, declaring the whole instrument to be out of tune.
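The distinction between the two features can be made concrete with a minimal sketch. It is an illustration only, not the code used in “Turbulent Flow”: the tokenizer is a simplifying assumption, and the passage and both stopword lists (`LIST_A`, `LIST_B`) are hypothetical stand-ins for the curated and “standard” lists discussed above.

```python
import re

def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes.
    This is a simplifying assumption, not the essay's tokenizer."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(tokens):
    """Distinct word types divided by total tokens (0.0 if there are no tokens)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def ttr_features(text, stopwords):
    """Return TTR and "TTR minus stopwords" as two separate features."""
    tokens = tokenize(text)
    content_tokens = [t for t in tokens if t not in stopwords]
    return {
        "ttr": type_token_ratio(tokens),
        "ttr_minus_stopwords": type_token_ratio(content_tokens),
    }

# Hypothetical stopword lists, invented for illustration.
LIST_A = {"the", "and", "a", "of"}
LIST_B = LIST_A | {"she", "was", "her", "it"}

# Hypothetical passage, invented for illustration.
passage = "She was the sea and the sea was her and it was the sea"

features_a = ttr_features(passage, LIST_A)
features_b = ttr_features(passage, LIST_B)
print(features_a)
print(features_b)
```

In this sketch, swapping the stopword list changes only the “TTR minus stopwords” value; plain TTR is computed over all tokens and is the same under either list. Whether such a change matters for classification can only be judged by refitting the full multi-feature model with the remeasured feature and comparing predictive accuracy, a step this sketch does not attempt.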
But the errors run deeper than this. In Da’s critique of “Literary Pattern Recognition,” she tries to invalidate the robustness of our model’s ability to distinguish English-language haiku from nonhaiku poems. She does so by creating a new corpus of “English translations of Chinese couplets” and testing our model on it. Why do this? She suggests that it is because they are filled “with similar imagery” to English haiku and are similarly “Asian.” This is a misguided decision that smacks of Orientalism. It completely erases context and history, suggesting an ontological relation where there is none. This is why we spend over twelve pages delineating the English haiku form in both critical and historical terms.
These errors exemplify a consistent refusal to contextualize and historicize one’s interpretative practices (indeed to “read well”), whether statistically or humanistically. We do not believe there exist “objectively” good literary interpretations or that there is one “correct” way to do statistical analysis: Da’s is a position most historians of science, and most statisticians themselves, would reject. Conventions in both literature and science are continuously debated and reinterpreted, not handed down from on high. And like literary studies, statistics is a body of knowledge formed from messy disciplinary histories, as well as diverse communities of practice. Da’s essay insists on a highly dogmatic, “objective,” black-and-white version of knowledge, a disposition totally antithetical to both statistics and literary studies. It is not a version that encourages much trust.
Hoyt Long is associate professor of Japanese literature at the University of Chicago. He publishes widely in the fields of Japanese literary studies, media history, and cultural analytics. His current book project is Figures of Difference: Quantitative Approaches to Modern Japanese Literature.
Richard Jean So is assistant professor of English and cultural analytics at McGill University. He works on computational approaches to literature and culture with a focus on contemporary American writing and race. His current book project is Redlining Culture: A Data History of Race and US Fiction.