Nan Z. Da’s statistical review of computational literary studies (CLS) takes issue with an approach I also have concerns about, but her review is misconceived in its framing of the field and of statistical inquiry. Her definition of CLS—using statistics, predominantly machine learning, to investigate word patterns—excludes most of what I would categorize as computational literary studies, including research that: employs data construction and curation as forms of critical analysis; analyzes bibliographical and other metadata to explore literary trends; deploys machine-learning methods to identify literary phenomena for noncomputational interpretation; or theorizes the implications of methods such as data visualization and machine learning for literary studies. (Interested readers will find diverse forms of CLS in the work of Ryan Cordell, Anne DeWitt, Johanna Drucker, Lauren Klein, Matthew Kirschenbaum, Anouk Lang, Laura B. McGrath, Stephen Ramsay, and Glenn Roe, among others.)
Beyond its idiosyncratic and restrictive definition of CLS, what strikes me most about Da’s essay is its constrained and contradictory framing of statistical inquiry. For most of the researchers Da cites, the pivot to machine learning is explicitly conceived as rejecting a positivist view of literary data and computation in favor of modelling as a subjective practice. Da appears to argue, first, that this pivot has not occurred enough (CLS takes a mechanistic approach to literary interpretation) and, second, that it has gone too far (CLS takes too many liberties with statistical inference, such as “metaphor[izing] … coding and statistics” [p. 606 n. 9]). On the one hand, then, Da repeatedly implies that, if CLS took a slightly different path—that is, trained with more appropriate samples, demonstrated greater rigor in preparing textual data, avoided nonreproducible methods like topic modelling, used natural language processing with the sophistication of corpus linguists—it could reach a tipping point at which the data used, methods employed, and questions asked became appropriate to statistical analysis. On the other, she precludes this possibility by identifying “reading literature well” as the “cut-off point” at which computational textual analysis ceases to have “utility” (p. 639). This limited conception of statistical inquiry also emerges in Da’s two claims about statistical tools for text mining: they are “ethically neutral”; and they must be used “in accordance with their true function” (p. 620), which Da defines as reducing information to enable quick decision making. Yet as with any intellectual inquiry, surely any measurements—let alone measurements with this particular aim—are interactions with the world that have ethical dimensions.
Statistical tests of statistical arguments are vital. And I agree with Da’s contention that applications of machine learning to identify word patterns in literature often simplify complex historical and critical issues. As Da argues, these simplifications include conceiving of models as “intentional interpretations” (p. 621) and of word patterns as signifying literary causation and influence. But there is a large gap between identifying these problems and insisting that statistical tools have a “true function” that is inimical to literary studies. Our discipline has always drawn methods from other fields (history, philosophy, psychology, sociology, and others). Perhaps it’s literary studies’ supposed lack of functional utility (something Da claims to defend) that has enabled these adaptations to be so productive; perhaps such adaptations have been productive because the meaning of literature is not singular but forged constitutively with a society where the prominence of particular paradigms (historical, philosophical, psychological, sociological, now statistical) at particular moments shapes what and how we know. In any case, disciplinary purity is no protection against poor methodology; and cross-disciplinarity can increase methodological awareness.
Da’s rigid notion of a “true function” for statistics prevents her from asking more “argumentatively meaningful” (p. 639) questions about possible encounters between literary studies and statistical methods. These might include: If not intentional or interpretive, what is the epistemological—and ontological and ethical—status of patterns discerned by machine learning? Are there ways of connecting word counts with other, literary and nonliterary, elements that might enhance the “explanatory power” (p. 604) and/or critical potential of such models and, if not, why not? As is occurring in fields such as philosophy, sociology, and science and technology studies, can literary studies apply theoretical perspectives (such as feminist empiricism or new materialism) to reimagine literary data and statistical inquiry? Without such methodological and epistemological reflection, Da’s statistical debunking of statistical models falls into the same trap she ascribes to those arguments: confusing “what happens mechanistically with insight” (p. 639). We very much need critiques of mechanistic—positivist, reductive, and ahistorical—approaches to literary data, statistics, and machine learning. Unfortunately, Da’s critique demonstrates the problems it decries.
KATHERINE BODE is associate professor of literary and textual studies at the Australian National University. Her latest book, A World of Fiction: Digital Collections and the Future of Literary History (2018), offers a new approach to literary research with mass-digitized collections, based on the theory and technology of the scholarly edition. Applying this model, Bode investigates a transnational collection of around 10,000 novels and novellas, discovered in digitized nineteenth-century Australian newspapers, to offer new insights into phenomena ranging from literary anonymity and fiction syndication to the emergence and intersections of national literary traditions.