More Responses to “The Computational Case against Computational Literary Studies” 

Earlier this month, Critical Inquiry hosted an online forum featuring responses to and discussion about Nan Z. Da’s “The Computational Case against Computational Literary Studies.”  To accommodate further commentary on Da’s article and on the forum itself, we have created a new page for responses.

RESPONSES

  • Taylor Arnold (University of Richmond).
  • Duncan Buell (University of South Carolina, Columbia).

 


Taylor Arnold

As a statistician who has worked and published extensively within the fields of digital humanities (DH) and computational linguistics over the past decade, I have been closely following Nan Z. Da’s article “The Computational Case against Computational Literary Studies” and the ensuing conversations in the online forum. It has been repeatedly pointed out that the article contains numerous errors and misunderstandings about statistical inference, Bayesian inference, and mathematical topology. It is not my intention here to restate these same objections. I want to focus instead on an aspect of the work that has gone relatively undiscussed: the larger role to be played by statistics and statisticians within computational DH.

Da correctly points out that computational literary studies, and computational DH more generally, takes a large proportion of its methods, theories, and tools from the field of statistics. And yet, she also notes, scholars have had only limited collaborations with statisticians. It is easy to produce quantitative evidence of this fact. There are a total of zero trained statisticians (that is, people holding either a Ph.D. in statistics or an academic position with statistics in the title) amongst: the 25 members of the editorial board of Cultural Analytics, the 11 editors of Digital Humanities Quarterly, the 22 members of the editorial board for Digital Scholarship in the Humanities, the 10 members of the executive committee for the Australasian Association for Digital Humanities, the 9 members of the executive committee for the Association for Computers and the Humanities, the 9 members of the executive committee for the European Association for Digital Humanities, and the 4 executive council members of the Canadian Society for Digital Humanities.[1] While I have great respect for these organizations and many of the people involved with them, the total absence of professional statisticians—and, in many of the cited examples, of scholars with a terminal degree in any technical field—is a problem for a field grounded, at least in part, in the analysis of data.

In the last line of her response “Final Comments,” Da calls for a peer-review process “in which many people,” meaning statisticians and computer scientists, “are brought into peer review.” That is a good place to start, but it is not nearly sufficient. I, and likely many other computationally trained scholars, am already frequently asked to review papers and abstract proposals for the aforementioned journals and professional societies. Da has also stated that her Critical Inquiry article was vetted by a computational reviewer. The actual problem is that statisticians need to be involved in computational analyses from the start. Bringing computational scholars in only at the level of peer review risks falling into the classic trap famously described by Sir Ronald Fisher: consulting a statistician after the data have already been collected amounts to nothing more than asking for “a post mortem examination.”[2]

To see the potential of working closely with statisticians, one need look no further than Da’s own essay. She critiques the overuse and misinterpretation of term frequencies, latent Dirichlet allocation, and network analysis within computational literary studies. Without a solid background in these methods, however, the article opens itself up to the obvious (at least to a statistician) counterarguments offered in the forum by scholars such as Lauren Klein, Andrew Piper, and Ted Underwood. Had Da cowritten the article with someone with a background in statistics—she even admits that she is “far from being the ideal candidate for assessing this work,”[3] so why she undertook this task alone in the first place is a mystery—these mistakes could have been avoided and replaced with stronger arguments. As a statistician, I agree with many of her stated concerns about the particular methods listed in the article.[4] However, the empty critiques of what not to do could and should have been replaced with alternative methods that address some of Da’s concerns over reproducibility and multiple hypothesis testing. These corrections and additions would have been possible if she had heeded her own advice about engaging with statisticians.
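
To give one concrete sense of what such an alternative might look like, consider the following toy simulation in Python (my own illustrative sketch, not anything drawn from Da’s article; the corpus sizes and the Poisson model of word counts are invented purely for demonstration). It shows how uncorrected per-word significance tests manufacture findings out of noise, and how a standard correction such as the Benjamini-Hochberg procedure addresses exactly the multiple hypothesis testing concern raised above:

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_docs, n_words = 40, 2000  # two invented corpora of 20 documents each; 2,000 word types

# Null world: both "corpora" draw every word's counts from the same distribution,
# so any per-word difference we find is noise by construction.
group_a = rng.poisson(lam=5.0, size=(n_docs // 2, n_words))
group_b = rng.poisson(lam=5.0, size=(n_docs // 2, n_words))

# One two-sample t-test per word type: the classic multiple-testing setup.
pvals = np.array([ttest_ind(group_a[:, j], group_b[:, j]).pvalue
                  for j in range(n_words)])

naive_hits = int(np.sum(pvals < 0.05))  # expect about 5 percent, i.e., ~100 words

# Benjamini-Hochberg: reject the k smallest p-values, where k is the largest
# rank with p_(k) <= (k / m) * alpha; this controls the false discovery rate.
alpha = 0.05
sorted_p = np.sort(pvals)
thresholds = alpha * np.arange(1, n_words + 1) / n_words
passing = np.nonzero(sorted_p <= thresholds)[0]
bh_hits = 0 if passing.size == 0 else int(passing.max()) + 1

print(f"words 'significant' at p < .05, uncorrected: {naive_hits}")
print(f"words significant after Benjamini-Hochberg:  {bh_hits}")

In a null world with no real differences, roughly a hundred of the two thousand words clear the naive p < .05 bar by chance alone, while the corrected procedure typically flags none. The same logic carries over, with real data, to the word-frequency comparisons at issue in Da’s critique.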

My research in computational digital humanities has been a mostly productive and enjoyable experience. I have been fortunate to have colleagues who treat me as an equal within our joint research, and I believe this has been the primary reason for the success of these projects. These relationships are unfortunately far from the norm. Collaborations with statisticians and computer scientists are too frequently either unattributed or avoided altogether. The field of DH often sees itself as challenging epistemological constraints on the study of the humanities and as transcending traditional disciplinary boundaries. These lofty goals are attainable only if scholars from other intellectual traditions are fully welcomed into the conversation as equal collaborators.

[1] I apologize in advance if I have missed anyone in the tally. I did my best to be diligent, but not every website provided easily checked contact information.

[2] R. A. Fisher, “Presidential Address to the First Indian Statistical Congress,” Sankhyā 4 (1938): 14–17.

[3] https://critinq.wordpress.com/2019/04/03/computational-literary-studies-participant-forum-responses-day-3-4/

[4] As a case in point, just last week I had a paper accepted for publication in which we lay out an argument and methodologies for moving beyond word-counting methods in DH. See Arnold, T., Ballier, N., Lissón, P., and Tilton, L., “Beyond Lexical Frequencies: Using R for Text Analysis in the Digital Humanities,” Language Resources and Evaluation, to appear.

TAYLOR ARNOLD is an assistant professor of statistics at the University of Richmond. He codirects the Distant Viewing Lab with Lauren Tilton, an NEH-funded project that develops computational techniques to analyze visual culture on a large scale. He is the coauthor of the books Humanities Data in R and A Computational Approach to Statistical Learning.

 


Duncan Buell

As a computer scientist who has been collaborating in the digital humanities for ten years now, I found Da’s article both well written and dead-on in its arguments about the shallow use of computation. I am teaching a course in text analysis this semester, and I find myself repeatedly discussing with my students the fact that they can computationally find patterns which are almost certainly not causal.

Because the purpose of computing is insight, not numbers (to quote Richard Hamming), computation in any area that looks like data mining is an iterative process. The first couple of iterations can be used to suggest directions for further study. That further study requires more careful analysis and computation, and at the end one comes back to analysis by scholars to determine whether there is really anything there. This can be especially true of text, more so than of scientific data, because text as data is so inherently messy; many of the most important features of text are almost impossible to quantify statistically and almost impossible to set rules for a priori.

Those first few iterations are the fun 90 percent of the work because new things show up that might only be seen by computation. It’s the next 90 percent of the work that isn’t so much fun and that often doesn’t get done. Da argues that scholars should step back from their perhaps too-easy conclusions and dig deeper. Unlike with much scientific data, we don’t have natural laws and equations to fall back on with which the data must be consistent. Ground truth is much harder to tease out, and skeptical calibration of numerical results is crucial.
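
To make those two points concrete, here is a small Python sketch (an invented toy example of mine, with made-up essay counts and features, not data from any real corpus): the first pass “discovers” the most striking pattern in pure noise, and the calibration step then asks how often chance alone produces a pattern that striking.

import numpy as np

rng = np.random.default_rng(1)
n_docs, n_feats = 30, 500  # 30 imaginary essays, 500 arbitrary numeric text features
X = rng.normal(size=(n_docs, n_feats))  # pure noise: no real structure whatsoever
labels = np.arange(n_docs) % 2  # split the essays into two arbitrary groups

def max_gap(data, groups):
    """Largest absolute difference in group means across all features."""
    return np.max(np.abs(data[groups == 0].mean(axis=0) - data[groups == 1].mean(axis=0)))

# First pass: the most striking "pattern" the computation can find.
observed = max_gap(X, labels)

# Calibration: how big a gap does chance alone produce? Permute the labels.
perm_gaps = np.array([max_gap(X, rng.permutation(labels)) for _ in range(1000)])
p_value = float(np.mean(perm_gaps >= observed))

print(f"most extreme feature gap found: {observed:.2f}")
print(f"permutation p-value for that gap: {p_value:.2f}")  # typically nowhere near .05

The first pass reliably turns up a gap that looks noteworthy; the permutation test just as reliably shows that chance produces gaps of the same size. That second, unglamorous step is the skeptical calibration that too often goes undone.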

Part of Da’s criticism, which seems to have been echoed by one respondent (Piper), is that scholars are perhaps too quick to conclude a “why” for the numbers they observe. Although for the purpose of making things seem more intuitive scientists often speak as if there were a “why,” there is in fact none of that. Physics, as I learned in my freshman class at university, describes “what”; it does not explain “why.” The acceleration due to gravity at the earth’s surface is 9.8 meters per second per second, as described by Newton’s equations. The empirical scientist will not ask why this is but will use the fact to provide models for physical interactions. It is the job of the theorist to provide a justification for the equations.

There is a need for more of this in the digital humanities. One can perform all kinds of computations (my collaborators and I, for example, have twenty thousand first-year-composition essays collected over several years). But to really provide value to scholarship one needs to frame quantitative questions that might correlate with ideas of scholarly interest, do the computations, calibrate the results, and verify that there is causation behind the results. This can be done and has been done in the digital humanities, but it isn’t as common as it should be, and Da is only pointing out this unfortunate fact.

DUNCAN BUELL is the NCR Professor of Computer Science and Engineering at the University of South Carolina, Columbia.


Bruno Latour and Dipesh Chakrabarty: Geopolitics and the “Facts” of Climate Change

Bruno Latour and Dipesh Chakrabarty visited WB202 to discuss new “questions of concern” and the fight over “facts” and climate change in the world after Trump’s election. Latour and Timothy Lenton’s “Extending the Domain of Freedom, or Why Gaia Is So Hard to Understand” appeared in the Spring 2019 issue of Critical Inquiry. Chakrabarty’s “The Planet: An Emergent Humanist Category” is forthcoming in Autumn 2019.

You can also listen and subscribe to WB202 at:

iTunes

Google Play

TuneIn


Computational Literary Studies: Participant Forum Responses, Day 3

 

Stanley Fish

Some commentators in this forum object to my inclusion in it, in part because I have no real credentials in the field. They are correct. Although I have now written five pieces on the Digital Humanities—three brief op-eds in the New York Times, an essay entitled “The Interpretive Poverty of Data” published in the blog Balkinization, and a forthcoming contribution to the New York University Journal of Law & Liberty with the title “If You Count It They Will Come”—in none of these do I display any real knowledge of statistical methods. My only possible claim to expertise, and it is a spurious one, is that my daughter is a statistician. I recently heard her give an address on some issue in biomedical statistics, and I barely understood 20 percent of it.

Nevertheless, I would contend that this confessed ignorance is no bar to my pronouncing on the Digital Humanities, because my objections to it are lodged on a theoretical level in relation to which actual statistical work in the field is beside the point. I don’t care what form these analyses take. I know in advance that they will fail (at least in relation to the claims made from them) in one of two ways: either they crank up a huge amount of machinery in order to produce something that was obvious from the get-go—they just dress up garden-variety literary intuition in numbers—or the interpretive conclusions they draw from the assembled data are entirely arbitrary, without motivation except the motivation to have their labors yield something, yield anything. Either their herculean efforts do nothing, or, when something is done with them, it is entirely illegitimate. This is so (or so I argue) because the underlying claim of the Digital Humanities (and of its legal variant, Corpus Linguistics)—that formal features, anything from sentence length to image clusters to word frequencies to collocations of words to passive constructions to you name it, carry meaning—is uncashable. They don’t, unless all of the factors the Digital Humanities procedures leave out—including, but not limited to, context, intention, literary history, the idea of literature itself—are put back in.

I was pleased therefore to find that Professor Da, possessed of a detailed knowledge infinitely greater than mine, supports my relatively untutored critique. When she says that work in Computational Studies comes in two categories—“papers that present a statistical no result finding as a finding” and “papers that draw conclusions from its finding that are wrong”—I can only cheer. When she declares “CLS as it currently exists has very little explanatory power,” I think that she gives too much credit to the project with the words “very little”; it has no explanatory power. And then there is this sentence, which, to my mind, absolutely clinches the case: “there are many different ways of extracting factors and loads of new techniques for odd data sets, but these are atheoretical approaches, meaning, strictly, that you can’t use them with the hope that they will work magic for you in producing interpretations that are intentional” and “have meaning and insight.” For me the word intentional is the key. The excavation of verbal patterns must remain an inert activity until added to it is the purpose of some intentional agent whose project gives those patterns significance. Once you detach the numbers from the intention that generated them, there is absolutely nothing you can do with them, or, rather (it is the same thing), you can do with them anything you like.
At bottom, CLS or Digital Humanities is a project dedicated to irresponsibility masked by diagrams and massive data mining. The antidote to the whole puffed-up thing is nicely identified by Professor Da in her final paragraph: “just read the texts.”

 

STANLEY FISH is a professor of law at Florida International University and a visiting professor at the Benjamin N. Cardozo School of Law. He is also a member of the extended Critical Inquiry editorial board.


Computational Literary Studies: Participant Forum Responses, Day 3

Final Comments

Nan Z. Da

(This is the last of three responses to the online forum. The others are “Errors” and “Argument.”)

I want to state that nothing about this forum has been unbalanced or unfair. I wrote the article. Those who may not agree with it (in part or in its entirety) have every right to critique it in an academic forum.

What my critics and neutral parties on this forum seem to want from “The Computational Case” is nothing short of: (1) an across-the-board reproducibility check (qua OSC, as Piper suggests), plus (2) careful analyses of CLS work in which even the “suppression” of tiny hedges would count as misrepresentation, plus (3) a state-of-the-field for computational literary studies and related areas of the digital humanities, past and emergent. To them, that’s the kind of intellectual labor that would make my efforts valid.

Ted Underwood’s suggestion that my article and this forum have in effect been stunts designed to attract attention does a disservice to a mode of scholarship that we may simply call critical inquiry. He is right that this might be a function of the times. The demand, across social media and elsewhere, that I answer for myself right away for critiquing CLS in a noncelebratory manner is a symptom of the social and institutional power that computational studies and the digital humanities have garnered for themselves.

Yes, “field-killing” is a term that doesn’t belong in scholarship, and one more indication that certain kinds of academic discourse should only take place in certain contexts. That said, an unrooted rhetoric of solidarity and “moreness”—we’re all in this together—is a poor way to argue. Consider what Sarah Brouillette has powerfully underscored about the institutional and financial politics of this subfield: it is time, as I’ve said, to ask some questions.

Underwood condemns social media and other public responses. He has left out the equally pernicious efforts on social media and in other circles to invalidate my article by whispering—or rather, publicly publishing doubts—about Critical Inquiry’s peer-review process. It has been suggested, by Underwood and many other critics of this article, that it was not properly peer reviewed by someone out of field. This is untrue—my paper was reviewed by an expert in quantitative analysis and mathematical modeling—and it is damaging. It suggests that anyone who dares to check the work of leading figures in CLS will be tried by gossip.

Does my article make empirical mistakes? Yes, a few, mostly in section 3. I will list them in time, but they do not bear on the macro-claims in that section. With the exception of a misunderstanding in the discussion of Underwood’s essay, none of the rebuttals presented in this forum on empirical grounds has any substance. Piper’s evidence that I “failed at basic math” refers to a simple rhetorical example in which I rounded down to the nearest thousand for the sake of legibility.

Anyone who does serious quantitative analysis will see that I am far from being the ideal candidate for assessing this work. Still, I think the fundamental conflict of interest at issue here should be obvious to all. People who can do this work at a high level tend not to care to critique it, or else they tend not to question how quantitative methods intersect with the distinctiveness of literary criticism, in all its forms and modes of argumentation. In the interest of full disclosure: after assessing the validity of my empirical claims, my out-of-field peer reviewer did not finally agree with me that computational methods work poorly on literary objects. This is the crux of the issue. Statisticians or computer scientists can check for empirical mistakes and errors in implementation; they do not understand what would constitute a weak or conceptually confused argument in literary scholarship. This is why the guidelines I lay out in my appendix, in which many people are brought into peer review, should be considered.

NAN Z. DA teaches literature at the University of Notre Dame.


Computational Literary Studies: Participant Forum Responses, Day 3

Mark Algee-Hewitt

In 2010, as a new postdoctoral fellow, I presented a paper on James Thomson’s 1730 poem The Seasons to a group of senior scholars. The argument was modest: I used close readings to suggest that in each section of the poem Thomson simulated an aesthetic experience for his readers before teaching them how to interpret it. The response was mild and mostly positive. Six months later, having gained slightly more confidence, I presented the same project with a twist: I included a graph that revealed my readings to be based on a pattern of repeated discourse throughout the poem. The response was swift and polarizing: while some in the room thought that the quantitative methods deepened the argument, others argued strongly that I was undermining the whole field. For me, the experience was formative: the simple presence of numbers was enough to enrage scholars many years my senior, long before Digital Humanities gained any prestige, funding, or institutional support.

My experience suggests that this project passed what Da calls the “smell test”: the critical results remained valid even without the supporting apparatus of the quantitative analysis. And while Da might argue that this proves that the quantitative aspect of the project was unnecessary in the first place, I would respectfully disagree. The pattern I found was the basis for my reading, and to present it as if I had discovered it through reading alone would have been, at best, disingenuous. The quantitative aspect of my argument also allowed me to connect the poem to a larger pattern of poetics throughout the eighteenth century. And I would go further and contend that just as the introduction of quantification into a field changes the field, so too does the field change the method to suit its own ends, and that confirming a statistical result through its agreement with conclusions derived from literary historical methods is just as powerful as a null hypothesis test. In other words, Da’s “smell test” suggests a potential way forward in synthesizing these methods.

But the lesson I learned remains as powerful as ever: regardless of how they are embedded in research, regardless of who uses them, computational methods provoke an immediate, often negative, response in many humanities scholars. And it is worth asking why. Just as it is always worth reexamining the institutional, political, and gendered history of methods such as new history, formalism, and even close reading, so too is it important, as Katherine Bode suggests, to think through these same issues in Digital Humanities as a whole. And it is crucial that we do so without erasing the work of the new, emerging, and often structurally vulnerable members of the field that Lauren Klein highlights. These methods have a powerful appeal among emerging groups of students and young scholars. And to seek to shut down scholarship by asserting a blanket incompatibility between method and object is to do a disservice to the fascinating work of emerging scholars that is reshaping our critical practices and our understanding of literature.

MARK ALGEE-HEWITT is an assistant professor of English and Digital Humanities at Stanford University, where he directs the Stanford Literary Lab. His current work combines computational methods with literary criticism to explore large-scale changes in aesthetic concepts during the eighteenth and nineteenth centuries. The projects that he leads at the Literary Lab include a study of racialized language in nineteenth-century American literature and a computational analysis of differences in disciplinary style. Mark’s work has appeared in New Literary History and Digital Scholarship in the Humanities, as well as in edited volumes on the Enlightenment and the Digital Humanities.


Computational Literary Studies: Participant Forum Responses, Day 3

Katherine Bode

Da’s is the first article (that I’m aware of) to offer a statistical rejection of statistical approaches to literature. The exaggerated ideological agendas of earlier criticisms—which described the use of numbers or computers to analyze literature as neoliberal, neoimperialist, neoconservative, and more—made those criticisms easy to dismiss. Yet to some extent, this routinized dismissal instituted a binary in CLS, wherein numbers, statistics, and computers became distinct from ideology. If nothing else, this debate will hopefully demonstrate that no arguments––including statistical ones––are ideologically (or ethically) neutral.

But this realization doesn’t get us very far. If all arguments have ideological and ethical dimensions, then making and assessing them requires something more than proving their in/accuracy; more than establishing their reproducibility, replicability, or lack thereof. Da’s “Argument” response seemed to move us toward what is needed when it described the aim of her article as: “to empower literary scholars and editors to ask logical questions about computational and quantitative literary criticism should they suspect a conceptual mismatch between the result and the argument or perceive the literary-critical payoff to be extraordinarily low.” However, she closes that path down by allowing only one possible answer to such questions: “in practice” there can be no “payoff … [in terms of] literary-critical meaning, from these methods”; CLS “conclusions”––whether “corroborat[ing] or disprov[ing] existing knowledge”––are only ever “tautological at best, merely superficial at worse.”

Risking blatant self-promotion, I’d say I’ve often used quantification to show “something interesting that derives from measurements that are nonreductive.” For instance, A World of Fiction challenges the prevailing view that nineteenth-century Australian fiction replicates the legal lie of terra nullius by not representing Aboriginal characters, in establishing their widespread prevalence in such fiction; and contrary to the perception of the Australian colonies as separate literary cultures oriented toward their metropolitan centers, it demonstrates the existence of a largely separate, strongly interlinked, provincial literary culture.[1] To give just one other example from many possibilities, Ted Underwood’s “Why Literary Time is Measured in Minutes” uses hand-coded samples from three centuries of literature to indicate an acceleration in the pace of fiction.[2] Running the gauntlet from counting to predictive modelling, these arguments are all statistical, according to Da’s definition: “if numbers and their interpretation are involved, then statistics has come into play.” And as in this definition, they don’t stop with numerical results, but explore their literary critical and historical implications.

If what happens prior to arriving at a statistical finding cannot be justified, the argument is worthless; the same is true if what happens after that point is of no literary-critical interest. Ethical considerations are essential in justifying what is studied, why, and how. This is not––and should not be––a low bar. I’d hoped this forum would help build connections between literary and statistical ways of knowing. The idea that quantification and computation can only yield superficial or tautological literary arguments shows that we’re just replaying the same old arguments, even if both sides are now making them in statistical terms.

KATHERINE BODE is associate professor of literary and textual studies at the Australian National University. Her latest book, A World of Fiction: Digital Collections and the Future of Literary History (2018), offers a new approach to literary research with mass-digitized collections, based on the theory and technology of the scholarly edition. Applying this model, Bode investigates a transnational collection of around 10,000 novels and novellas, discovered in digitized nineteenth-century Australian newspapers, to offer new insights into phenomena ranging from literary anonymity and fiction syndication to the emergence and intersections of national literary traditions.

[1] Katherine Bode, A World of Fiction: Digital Collections and the Future of Literary History (Ann Arbor: University of Michigan Press, 2018).

[2] Ted Underwood, “Why Literary Time Is Measured in Minutes,” ELH 85.2 (2018): 341–365.


Computational Literary Studies: Participant Forum Responses, Day 3

 

Lauren F. Klein

The knowledge that there are many important voices not represented in this forum has prompted me to think harder about the context for the lines I quoted at the outset of my previous remarks. Parham’s own model for “The New Rigor” comes from diversity work, and the multiple forms of labor—affective as much as intellectual—that are required of individuals, almost always women and people of color, in order to compensate for the structural deficiencies of the university. I should have provided that context at the outset, both to do justice to Parham’s original formulation, and because the same structural deficiencies are at work in this forum, as they are in the field of DH overall.

In her most recent response, Katherine Bode posed a series of crucial questions about why literary studies remains fixated on the “individualistic, masculinist mode of statistical criticism” that characterizes much of the work that Da takes on in her essay. Bode further asks why the field of literary studies has allowed this focus to overshadow so much of the transformative work that has been pursued alongside—and, at times, in direct support of––this particular form of computational literary studies.

But I think we also know the answers, and they point back to the same structural deficiencies that Parham explores in her essay: a university structure that rewards certain forms of work and devalues others. In a general academic context, we might point to mentorship, advising, and community-building as clear examples of this devalued work. But in the context of the work discussed in this forum, we can align efforts to recover overlooked texts, compile new datasets, and preserve fragile archives with the undervalued side of this equation as well. It’s not only that these forms of scholarship, like the “service” work described just above, are performed disproportionately by women and people of color. It is also that, because of the ways in which archives and canons are constructed, projects that focus on women and people of color require many more of these generous and generative scholarly acts. Without these acts, and the scholars who perform them, much of the formally published work on these subjects could not begin to exist.

Consider Kenton Rambsy’s “Black Short Story Dataset,” a dataset-creation effort that he undertook because his own research questions about the changing composition of African American fiction anthologies could not be answered by any existing corpus; Margaret Galvan’s project to create an archive of comics in social movements, which she has undertaken in order to support her own computational work as well as her students’ learning; or any number of the projects published with Small Axe Archipelagos, a born-digital journal edited and produced by a team of librarians and faculty that has been intentionally designed to be read by people who live in the Caribbean as well as by scholars who work on that region. These projects each involve sophisticated computational thinking—at the level of resource creation and platform development as well as of analytical method. They respond both to specific research questions and to larger scholarly need. They require work, and they require time.

It’s clear that these projects provide significant value to the field of literary studies, as they do to the digital humanities and to the communities to which their work is addressed. In the end, the absence of the voices of the scholars who lead these projects, both from this forum and from the scholarship it explores, offers the most convincing evidence of what—and who—is valued most by existing university structures; and what work—and what people—should be at the center of conversations to come.

LAUREN F. KLEIN is associate professor at the School of Literature, Media, and Communication, Georgia Institute of Technology.
