Monthly Archives: April 2019

Seventy Into ’48: The State as a Scandal

Khaled Furani

A state, is called the coldest of all cold monsters. Coldly lieth it also; and this lie creepeth from its mouth: ‘I, the state, am the people’…where all are poison-drinkers, the good and the bad: the state, where all lose themselves, the good and the bad: the state, where the slow suicide of all—is called ‘life.’ —Friedrich Nietzsche, Thus Spake Zarathustra, 1883

We are summoned today to reflect on the seventieth anniversary of 1948. On this occasion, I present a certain “gift” to my conqueror. It is in a sense an absurd gift. In a “birthday card,” I extend a gift of truth, or rather regions of truth that may come with an effort towards self-recognition. These are regions that both conqueror and conquered—inhabiting discrepant conditions of fear due to discrepant power at their disposal—may rarely visit, just as one may rarely plunge into one’s own darkness. It is a gift of recognition that 1948 is a truth of a darkness unfolding. That year—and probably a further past—lives with us still, not behind us in the past. We are seventy years into 1948, not simply since 1948. What does it mean to be seventy years into the darkness of 1948?

I do not claim 1948 as ongoing merely due to the ongoing conquest of land, by means both legal and extra-legal. Rather, 1948 stands for unfinished business, by which I mean the variegated business of finishing off the Palestinian body, one-by-one and collectively. The Palestinian’s language, home, memory, land, water, and physical and political body must be cleared away, must vanish, for purity to be attained, for victory to be declared, for death itself to be conquered, for security to be achieved. Or so runs the illusion.

So long as purity stands for security, we ought to be on the alert for a “genocidal desire” at work. This is a desire for massive death for the sake of purity of the Jewish state (meaning composed purely of Jewish bodies) whose symptoms include: erasure of the Arabic language, destruction of historic and living homes, excision and criminalization of native memory, confiscation of lands, pollution of fields, obliteration and ghettoization of villages and towns, theft and contamination of water supplies, withholding of medicine and medical care, experimentation and weapons testing on populations, and elimination of bodies, directly and by proxy.

This genocidal desire seems to find nourishment in fear, fear that lives, for example, in the hoary but protean slogan promising a people said to be without a land a land said to have no people. That is, the Palestinian must not be so that the Israeli can be, just as wild nature must be extirpated from civilization. This genocidal desire has a traceable frequency of appearances, as well as effects. A common alarmist call maligns even Palestinian eggs and sperm going about their work. I am talking about the refrain of “demographic threat.” Then there is the frequent appearance of inciteful graffiti under bridges, on highways, and in streets and alleys throughout the country—“death to the Arabs” and “Kahane was right”—etched with apparent impunity. For tracing some of this desire’s effects, consider all those uprooted from the land. Read their poets. Fadwa Tuqan inscribed their unmet wish on her tomb: “It is enough for me to die on her and be buried in her, under her soil, melt and vanish, and come back to life as weed in her soil, as a flower.” Her wish to escape dying in exile, a wish to return to life in her own soil, even if only as a weed, should perhaps be enough to recognize the destruction wrought by this genocidal desire. In case it is not, I offer some numbers.

Photo by Mohamad Badarne

Traces of a Genocidal Desire

One woman each month. Two children each month. One man each day last month, and perhaps every month since 2000. I am citing a rough but rather probable “slow trickle” of hidden murder: a generally unreported rate of destroyed Palestinian bodies under Israel’s many hands, not including mass killings as in declared military “operations,” also known as “mowing the lawn.” Some bodies are murdered by “on duty” weapons and others by rampant “off duty” weapons. Some bodies are eradicated by soldiers or police in Jerusalem, the West Bank, or Gaza. Other bodies are annihilated in a carefully managed self-destruction of Palestinian citizenry of Israel. Via its selective surveillance and “law enforcement,” one eye of the state never sleeps—it watches for and prosecutes words, even poems in cyberspace—while the other eye “turns blind” when it comes to the influx of weapons for killing ourselves. As one hand tracks weapons and words across the physical and virtual earth, the other appears paralyzed to act against them in this very land.

This destruction of physical bodies is perhaps the most brutal of lenses through which to see how we are seventy years now into an abyss that is ’48, seventy years into the unfinished business of finishing off the Palestinian body, multifariously, collectively, and yes, corporally. Seventy years, but actually longer, of not only wanting more land but also fewer and fewer Palestinians. Thus, by no means a deviation, the “Nationality Law,” like the “Law of Return,” is but one law in a battery of legislation for fulfilling the principle of purity.

This protean principle stems from the fear of impurity and can even be found at work every time fear lives in uttering “Arab” as a way not to see or say “Palestinian,” and “minorities” or “the sector” to see neither. But who is really a minority in this landscape? What enables a powerful minority of immigrants not to recognize a majority in whose midst it keeps bulldozing its way to a fortress? Who pays for this fortress and its enabling landscape that is the modern “Middle East”? At what price?

Photo by Razan Shalabi

The Price of Traps

Trap 1: Cement

In its relentless quest for purity, I see Israel caught in a kind of scandal, from the Greek skandalon, in the sense of a trap, one that can be typified by “cement and weeds.” Clearly, like any metaphor, it has its limits, but it helps me express the recurring drama of an Israel as a prevailing culture of cement and a peasants’ verdant and fecund Palestine, now destroyed and buried over, remaining only as weeds that grow through cracks, to pollinate and spread out through the air. The debacle for Israel is that despite all efforts at purification and eradication, “the weeds” never really go away. Israel is doomed to pour ever-sprawling cement and spew ever-toxic pesticides, to ultimately no avail. I am not sure what degree of obtuseness is required to not recognize where life, any life, is or is not viable: in the thorny, undesired, yet green of the weeds or in the cold, hard, grey of cement.

Trap 2: The Ghetto Incarnate

While Jews coming from Europe aspired for a kind of freedom when colonizing Palestine, it is unfreedom that they have built with their own hands. This kind of unfreedom is the same kind that comes with models like the shtetl or crusader’s castle, crisscrossed by all sorts of ramparts, immediately visible and less so. Aspiring for rootedness at “home,” rather than grow amidst the age-old olive trees, they sought to uproot them and plant instead fast-growing, concealing, highly flammable pines imported from their xenophobic oppressors. Loyal to its European baggage, the more Israel purges the roots of Palestine the more it plunges into its own grave. Through a coursing river, it planted a mikveh, a still pool for purification. And the river in this case would be the Arab-Muslim “civilizational space”—historically a home for flourishing Jewish traditions, among others—reduced to a fragmented, faltering complex of nation-states. Caught in a pendulum between Jewish and democratic, Israel fails to wonder if it should be a state or something better than a state. Fleeing from the diseases of purificatory Europe with its plaguing “cures,” Israel brings putrefication to the entire body of the “Middle East,” by which I mean modern sovereignty’s aseptic powers.

Trap 3: Vitality and Vitiation

The cage of the Ghetto Incarnate is ensnared by other cages, peculiar to Israel being a state, and being a state here, making the Jews’ “homecoming” very impiously unbecoming. As a state, and like any state, Israel is so worried about its death that it suffocates the possibility of its citizens coming into an authentic relation with theirs. And it so venerates “life,” that is, its life, that it vitiates access to a genuine life that recognizes life’s companion: death. It calls upon God only to end up acting like one. And on its altar, its citizenry is requested to surrender and sacrifice a basic sense of humility, a basic recognition of interdependence and fragility in themselves and in the universe. Israel thereby doubles down on its zarut, that is, its foreignness, as a kind of avodah zarah (idol worship), which should be a stranger to Abrahamic tradition and strange to take root in the land from which this very tradition grew.

In the meantime, we as autochthones of this place, descendants of its fellaheen and Bedouin, as organic guardians of the land’s evolving consciousness, including the Sumerian, Akkadian, Babylonian, Assyrian, Pharaonic, Persian, Phoenician, Philistine, Nabatean, Canaanite, Syriac, Aramaic, Hebraic, Hellenic, and Latin, among others to be sure that make up Palestine, attempt to thrive among their remains or risk our own calcification. Doing so means recognizing and confronting the cages first erected seventy years ago, but maybe much earlier. Perhaps we should be asking what does it mean to be 102 years into the darkness of Sykes-Picot and 370 years into the darkness of the Peace of Westphalia, the peace that pacified us by waging a fatal war on our sense of life and above all on life’s precariousness?

Photo by Razan Shalabi

[This paper originated as a talk given at a panel on “70 to ’48: Reflections on Local Time,” held by the Sociology and Anthropology Department at Tel Aviv University on December 27, 2018.]

Khaled Furani is an associate professor in the Department of Sociology and Anthropology, Tel Aviv University.


More Responses to “The Computational Case against Computational Literary Studies” 

Earlier this month, Critical Inquiry hosted an online forum featuring responses to and discussion about Nan Z. Da’s “The Computational Case against Computational Literary Studies.” To accommodate further commentary on Da’s article and on the forum itself, we have created a new page for responses.

RESPONSES

  • Taylor Arnold (University of Richmond).
  • Duncan Buell (University of South Carolina, Columbia).

 


Taylor Arnold

As a statistician who has worked and published extensively within the fields of digital humanities (DH) and computational linguistics over the past decade, I have been closely following Nan Z. Da’s article “The Computational Case against Computational Literary Studies” and the ensuing conversations in the online forum. It has been repeatedly pointed out that the article contains numerous errors and misunderstandings about statistical inference, Bayesian inference, and mathematical topology. It is not my intention here to restate these same objections. I want to focus instead on an aspect of the work that has gone relatively undiscussed: the larger role to be played by statistics and statisticians within computational DH.

Da correctly points out that computational literary studies, and computational DH more generally, takes a large proportion of its methods, theories, and tools from the field of statistics. And yet, she also notes, scholars have had only limited collaborations with statisticians. It is easy to produce quantitative evidence of this fact. There are a total of zero trained statisticians (having either a Ph.D. in statistics or an academic position with statistics in its title) amongst: the 25 members on the editorial board of Cultural Analytics, 11 editors of Digital Humanities Quarterly, 22 members of the editorial board for Digital Scholarship in the Humanities, 10 members of the executive committee for the Australasian Association for Digital Humanities, 9 members of the executive committee for the Association for Computers and the Humanities, 9 members of the executive committee for the European Association for Digital Humanities, and the 4 executive council members in the Canadian Society for Digital Humanities.[1] While I do have great respect for these organizations and many of the people involved with them, the total absence of any professional statisticians—and in many of the cited examples, lack of scholars with a terminal degree in any technical field—is a problem for a field grounded, at least in part, in the analysis of data.

In the last line of her response “Final Comments,” Da calls for a peer-review process “in which many people,” meaning statisticians and computer scientists, “are brought into peer review.” That is a good place to start but not nearly sufficient. I, and likely many other computationally trained scholars, am already frequently asked to review papers and abstract proposals for the aforementioned journals and professional societies. Da has also claimed that her Critical Inquiry article was vetted by a computational reviewer. The actual problem is instead that statisticians need to be involved in computational analyses from the start. To use computational scholars only at the level of peer review risks falling into the classic trap famously described by Sir Ronald Fisher: consulting a statistician after already having collected data is nothing more than “a post mortem examination.”[2]

To see the potential for working closely with statisticians, one must look no further than Da’s own essay. She critiques the overuse and misinterpretation of term frequencies, latent Dirichlet allocation, and network analysis within computational literary studies. Without a solid background in these methods, however, the article opens itself up to the obvious (at least to a statistician) counterarguments offered in the forum by scholars such as Lauren Klein, Andrew Piper, and Ted Underwood. Had Da cowritten the article with someone with a background in statistics—she even admits that she is “far from being the ideal candidate for assessing this work,”[3] so why she would undertake this task alone in the first place is a mystery—these mistakes could have been avoided and replaced with stronger arguments. As a statistician, I also agree with many of her stated concerns over the particular methods listed in the article.[4] However, the empty critiques of what not to do could and should have been replaced with alternative methods that address some of Da’s concerns over reproducibility and multiple hypothesis testing. These corrections and additions would have been possible if she had heeded her own advice about engaging with statisticians.

My research in computational digital humanities has been a mostly productive and enjoyable experience. I have been fortunate to have colleagues who treat me as an equal within our joint research and I believe this has been the primary reason for the success of these projects. These relationships are unfortunately far from the norm. Collaborations with statisticians and computer scientists are too frequently either unattributed or avoided altogether. The field of DH often sees itself as challenging epistemological constraints towards the study of the humanities and transcending traditional disciplinary boundaries. These lofty goals are attainable only if scholars from other intellectual traditions are fully welcomed into the conversation as equal collaborators.

[1] I apologize in advance if I have missed anyone in the tally. I did my best to be diligent, but not every website provided easily checked contact information.

[2] Presidential Address to the First Indian Statistical Congress, 1938. Sankhya 4, 14-17.

[3] https://critinq.wordpress.com/2019/04/03/computational-literary-studies-participant-forum-responses-day-3-4/

[4] As a case in point, just last week I had a paper accepted for publication in which we lay out an argument and methodologies for moving beyond word counting methods in DH. See: Arnold, T., Ballier, N., Lissón, P., and Tilton, L. “Beyond lexical frequencies: Using R for text analysis in the digital humanities.” Language Resources and Evaluation. To appear.

TAYLOR ARNOLD is an assistant professor of statistics at the University of Richmond. He codirects the Distant Viewing Lab with Lauren Tilton, an NEH-funded project that develops computational techniques to analyze visual culture on a large scale. He is the coauthor of the books Humanities Data in R and A Computational Approach to Statistical Learning.

 


Duncan Buell

As a computer scientist who has been collaborating in the digital humanities for ten years now, I found Da’s article both well-written and dead on in its arguments about the shallow use of computation. I am teaching a course in text analysis this semester, and I find myself discussing repeatedly with my students the fact that they can computationally find patterns which are almost certainly not causal.
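The point about patterns that are almost certainly not causal can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "documents," "features," and "outcome" are pure random noise, the function names are hypothetical, and the cutoff of 0.444 is roughly the conventional 5 percent threshold for a correlation computed on 20 observations. Nothing here comes from the course or from any real corpus.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_chance_patterns(n_docs=20, n_features=1000, threshold=0.444, seed=0):
    """Count noise features whose correlation with a noise outcome
    exceeds the threshold in absolute value.

    Hypothetical setup: each 'feature' stands in for something like a
    word frequency measured on n_docs texts, but every value is random,
    so any 'pattern' that clears the threshold arose by chance alone.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_docs)]
    hits = 0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_docs)]
        if abs(pearson(feature, outcome)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    print("noise features that look like patterns:", count_chance_patterns())
```

With a thousand candidate features and a 5 percent cutoff, one would expect something on the order of fifty to clear the bar even though none of them means anything, which is why calibration and correction for multiple comparisons matter before any interpretation is attempted.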

The purpose of computing being insight and not numbers (to quote Richard Hamming), computation in any area that looks like data mining is an iterative process. The first couple of iterations can be used to suggest directions for further study. That further study requires more careful analysis and computation. And at the end one comes back to analysis by scholars to determine if there’s really anything there. This can be especially true of text, more so than with scientific data, because text as data is so inherently messy; many of the most important features of text are almost impossible to quantify statistically and almost impossible to set rules for a priori.

Those first few iterations are the fun 90 percent of the work because new things show up that might only be seen by computation. It’s the next 90 percent of the work that isn’t so much fun and that often doesn’t get done. Da argues that scholars should step back from their perhaps too-easy conclusions and dig deeper. Unlike with much scientific data, we don’t have natural laws and equations to fall back on with which the data must be consistent. Ground truth is much harder to tease out, and skeptical calibration of numerical results is crucial.

Part of Da’s criticism, which seems to have been echoed by one respondent (Piper), is that scholars are perhaps too quick to conclude a “why” for the numbers they observe. Although for the purpose of making things seem more intuitive scientists often speak as if there were a “why,” there is in fact none of that. Physics, as I learned in my freshman class at university, describes “what”; it does not explain “why.” The pull of gravity is 9.8 meters per second per second, as described by Newton’s equations. The empirical scientist will not ask why this is but will use the fact to provide models for physical interactions. It is the job of the theorist to provide a justification for the equations.

There is a need for more of this in the digital humanities. One can perform all kinds of computations (my collaborators and I, for example, have twenty thousand first-year-composition essays collected over several years). But to really provide value to scholarship one needs to frame quantitative questions that might correlate with ideas of scholarly interest, do the computations, calibrate the results, and verify that there is causation behind the results. This can be done and has been done in the digital humanities, but it isn’t as common as it should be, and Da is only pointing out this unfortunate fact.

DUNCAN BUELL is the NCR Professor of Computer Science and Engineering at the University of South Carolina, Columbia.


Bruno Latour and Dipesh Chakrabarty: Geopolitics and the “Facts” of Climate Change

Bruno Latour and Dipesh Chakrabarty visited WB202 to discuss new “questions of concern” and the fight over “facts” and climate change in the world after Trump’s election. Latour and Timothy Lenton’s “Extending the Domain of Freedom, or Why Gaia Is So Hard to Understand” appeared in the Spring 2019 issue of Critical Inquiry. Chakrabarty’s “The Planet: An Emergent Humanist Category” is forthcoming in Autumn 2019.

You can also listen and subscribe to WB202 on iTunes, Google Play, and TuneIn.


Computational Literary Studies: Participant Forum Responses, Day 3

 

Stanley Fish

Some commentators to this forum object to my inclusion in it in part because I have no real credentials in the field. They are correct. Although I have now written five pieces on the Digital Humanities—three brief op-eds in the New York Times, an essay entitled “The Interpretive Poverty of Data” published in the blog Balkinization, and a forthcoming contribution to the New York University Journal of Law & Liberty with the title “If You Count It They Will Come”—in none of these do I display any real knowledge of statistical methods. My only possible claim to expertise, and it is a spurious one, is that my daughter is a statistician. I recently heard her give an address on some issue in bio-medical statistics and I barely understood 20 percent of it. Nevertheless, I would contend that this confessed ignorance is no bar to my pronouncing on the Digital Humanities because my objections to it are lodged on a theoretical level in relation to which actual statistical work in the field is beside the point. I don’t care what form these analyses take. I know in advance that they will fail (at least in relation to the claims made from them) in two ways: either they crank up a huge amount of machinery in order to produce something that was obvious from the get go—they just dress up garden variety literary intuition in numbers—or the interpretive conclusions they draw from the assembled data are entirely arbitrary, without motivation except the motivation to have their labors yield something, yield anything. Either their herculean efforts do nothing or when something is done with them, it is entirely illegitimate. This is so (or so I argue) because the underlying claim of the Digital Humanities (and of its legal variant Corpus Linguistics) that formal features––anything from sentence length, to image clusters, to word frequencies, to collocations of words, to passive constructions, to you name it—carry meaning is uncashable. 
They don’t unless all of the factors the Digital Humanities procedures leave out—including, but not limited to, context, intention, literary history, the idea of literature itself—are put back in. I was pleased therefore to find that Professor Da, possessed of a detailed knowledge infinitely greater than mine, supports my relatively untutored critique. When she says that work in Computational Studies comes in two categories—“papers that present a statistical no result finding as a finding” and “papers that draw conclusions from its finding that are wrong”—I can only cheer. When she declares “CLS as it currently exists has very little explanatory power,” I think that she gives too much credit to the project with the words “very little”; it has no explanatory power. And then there is this sentence, which to my mind, absolutely clinches the case: “there are many different ways of extracting factors and loads of new techniques for odd data sets, but these are atheoretical approaches, meaning, strictly, that you can’t use them with the hope that they will work magic for you in producing interpretations that are intentional” and “have meaning and insight.” For me the word intentional is the key. The excavation of verbal patterns must remain an inert activity until added to it is the purpose of some intentional agent whose project gives those patterns significance. Once you detach the numbers from the intention that generated them, there is absolutely nothing you can do with them, or, rather (it is the same thing) you can do with them anything you like. At bottom CLS or Digital Humanities is a project dedicated to irresponsibility masked by diagrams and massive data mining. The antidote to the whole puffed-up thing is nicely identified by Professor Da in her final paragraph: “just read the texts.”

 

STANLEY FISH is a professor of law at Florida International University and a visiting professor at the Benjamin N. Cardozo School of Law. He is also a member of the extended Critical Inquiry editorial board.


Computational Literary Studies: Participant Forum Responses, Day 3

Final Comments

Nan Z. Da

(This is the last of three responses to the online forum. The others are “Errors” and “Argument.”)

I want to state that nothing about this forum has been unbalanced or unfair. I wrote the article. Those who may not agree with it (in part or in its entirety) have every right to critique it in an academic forum.

What my critics and neutral parties on this forum seem to want from “The Computational Case” is nothing short of: (1) an across-the-board reproducibility check (qua OSC, as Piper suggests), plus (2) careful analyses of CLS work in which even the “suppression” of tiny hedges would count as misrepresentation, plus (3) a state-of-the-field for computational literary studies and related areas of the digital humanities, past and emergent. To them, that’s the kind of intellectual labor that would make my efforts valid.

Ted Underwood’s suggestion that my article and this forum have in effect been stunts designed to attract attention does a disservice to a mode of scholarship that we may simply call critical inquiry. He is right that this might be a function of the times. The demand, across social media and elsewhere, that I must answer for myself right away for critiquing CLS in a noncelebratory manner is a symptom of the social and institutional power computational studies and the digital humanities have garnered to themselves.

Yes, “field-killing” is a term that doesn’t belong in scholarship, and one more indication that certain kinds of academic discourse should only take place in certain contexts. That said, an unrooted rhetoric of solidarity and “moreness”—we’re all in this together—is a poor way to argue. Consider what Sarah Brouillette has powerfully underscored about the institutional and financial politics of this subfield: it is time, as I’ve said, to ask some questions.

Underwood condemns social media and other public responses. He has left out the equally pernicious efforts on social media and in other circles to invalidate my article by whispering—or rather, publicly publishing doubts—about Critical Inquiry’s peer review process. It has been suggested, by Underwood and many other critics of this article, that it was not properly peer-reviewed by someone out-of-field. This is untrue—my paper was reviewed by an expert in quantitative analysis and mathematical modeling—and it is damaging. It suggests that anyone who dares to check the work of leading figures in CLS will be tried by gossip.

Does my article make empirical mistakes? Yes, a few, mostly in section 3. I will list them in time, but they do not bear on the macro-claims in that section. With the exception of a misunderstanding in the discussion of Underwood’s essay, none of the rebuttals made on empirical grounds in this forum has any substance. Piper’s evidence that I “failed at basic math” refers to a simple rhetorical example in which I rounded down to the nearest thousand for the sake of legibility.

Anyone who does serious quantitative analysis will see that I am far from being the ideal candidate for assessing this work. Still, I think the fundamental conflict of interest at issue here should be obvious to all. People who can do this work on a high level tend not to care to critique it, or else they tend not to question how quantitative methods intersect with the distinctiveness of literary criticism, in all its forms and modes of argumentation. In the interest of full disclosure: after assessing the validity of my empirical claims, my out-of-field peer reviewer did not finally agree with me that computational methods work poorly on literary objects. This is the crux of the issue. Statisticians or computer scientists can check for empirical mistakes and errors in implementation; they do not understand what would constitute a weak or conceptually confused argument in literary scholarship. This is why the guidelines I lay out in my appendix, in which many people are brought into peer review, should be considered.

NAN Z. DA teaches literature at the University of Notre Dame.

 


Computational Literary Studies: Participant Forum Responses, Day 3

Mark Algee-Hewitt

In 2010, as a new postdoctoral fellow, I presented a paper on James Thomson’s 1730 poem The Seasons to a group of senior scholars. The argument was modest: I used close readings to suggest that in each section of the poem Thomson simulated an aesthetic experience for his readers before teaching them how to interpret it. The response was mild and mostly positive. Six months later, having gained slightly more confidence, I presented the same project with a twist: I included a graph that revealed my readings to be based on a pattern of repeated discourse throughout the poem. The response was swift and polarizing: while some in the room thought that the quantitative methods deepened the argument, others argued strongly that I was undermining the whole field. For me, the experience was formative: the simple presence of numbers was enough to enrage scholars many years my senior, long before Digital Humanities gained any prestige, funding, or institutional support.

My experience suggests that this project passed what Da calls the “smell test”: the critical results remained valid, even without the supporting apparatus of the quantitative analysis. And while Da might argue that this proves that the quantitative aspect of the project was unnecessary in the first place, I would respectfully disagree. The pattern I found was the basis for my reading and to present it as if I had discovered it through reading alone was, at best, disingenuous. The quantitative aspect to my argument also allowed me to connect the poem to a larger pattern of poetics throughout the eighteenth century. And I would go further to contend that just as the introduction of quantification into a field changes the field, so too does the field change the method to suit its own ends; and that confirming a statistical result through its agreement with conclusions derived from literary historical methods is just as powerful as a null hypothesis test. In other words, Da’s “smell test” suggests a potential way forward in synthesizing these methods.

But the lesson I learned remains as powerful as ever: regardless of how they are embedded in research, regardless of who uses them, computational methods provoke an immediate, often negative, response in many humanities scholars. And it is worth asking why. Just as it is always worth reexamining the institutional, political, and gendered history of methods such as new history, formalism, and even close reading, so too is it important, as Katherine Bode suggests, to think through these same issues in Digital Humanities as a whole. And it is crucial that we do so without erasing the work of the new, emerging, and often structurally vulnerable members of the field that Lauren Klein highlights. These methods have a powerful appeal among emerging groups of students and young scholars. And to seek to shut down scholarship by asserting a blanket incompatibility between method and object is to do a disservice to the fascinating work of emerging scholars that is reshaping our critical practices and our understanding of literature.

MARK ALGEE-HEWITT is an assistant professor of English and Digital Humanities at Stanford University where he directs the Stanford Literary Lab. His current work combines computational methods with literary criticism to explore large-scale changes in aesthetic concepts during the eighteenth and nineteenth centuries. The projects that he leads at the Literary Lab include a study of racialized language in nineteenth-century American literature and a computational analysis of differences in disciplinary style. Mark’s work has appeared in New Literary History and Digital Scholarship in the Humanities, as well as in edited volumes on the Enlightenment and the Digital Humanities.

Computational Literary Studies: Participant Forum Responses, Day 3

Katherine Bode

Da’s is the first article (I’m aware of) to offer a statistical rejection of statistical approaches to literature. The exaggerated ideological agenda of earlier criticisms, which described the use of numbers or computers to analyze literature as neoliberal, neoimperialist, neoconservative, and more, made them easy to dismiss. Yet to some extent, this routinized dismissal instituted a binary in CLS, wherein numbers, statistics, and computers became distinct from ideology. If nothing else, this debate will hopefully demonstrate that no arguments––including statistical ones––are ideologically (or ethically) neutral.

But this realization doesn’t get us very far. If all arguments have ideological and ethical dimensions, then making and assessing them requires something more than proving their in/accuracy; more than establishing their reproducibility, replicability, or lack thereof. Da’s “Argument” response seemed to move us toward what is needed in describing the aim of her article as: “to empower literary scholars and editors to ask logical questions about computational and quantitative literary criticism should they suspect a conceptual mismatch between the result and the argument or perceive the literary-critical payoff to be extraordinarily low.” However, she closes that path down in allowing only one possible answer to such questions: “in practice” there can be no “payoff … [in terms of] literary-critical meaning, from these methods”; CLS “conclusions”––whether “corroborat[ing] or disprov[ing] existing knowledge”––are only ever “tautological at best, merely superficial at worst.”

Risking blatant self-promotion, I’d say I’ve often used quantification to show “something interesting that derives from measurements that are nonreductive.” For instance, A World of Fiction challenges the prevailing view that nineteenth-century Australian fiction replicates the legal lie of terra nullius by not representing Aboriginal characters, in establishing their widespread prevalence in such fiction; and contrary to the perception of the Australian colonies as separate literary cultures oriented toward their metropolitan centers, it demonstrates the existence of a largely separate, strongly interlinked, provincial literary culture.[1] To give just one other example from many possibilities, Ted Underwood’s “Why Literary Time is Measured in Minutes” uses hand-coded samples from three centuries of literature to indicate an acceleration in the pace of fiction.[2] Running the gamut from counting to predictive modelling, these arguments are all statistical, according to Da’s definition: “if numbers and their interpretation are involved, then statistics has come into play.” And as in this definition, they don’t stop with numerical results, but explore their literary critical and historical implications.

If what happens prior to arriving at a statistical finding cannot be justified, the argument is worthless; the same is true if what happens after that point is of no literary-critical interest. Ethical considerations are essential in justifying what is studied, why, and how. This is not––and should not be––a low bar. I’d hoped this forum would help build connections between literary and statistical ways of knowing. The idea that quantification and computation can only yield superficial or tautological literary arguments shows that we’re just replaying the same old arguments, even if both sides are now making them in statistical terms.

KATHERINE BODE is associate professor of literary and textual studies at the Australian National University. Her latest book, A World of Fiction: Digital Collections and the Future of Literary History (2018), offers a new approach to literary research with mass-digitized collections, based on the theory and technology of the scholarly edition. Applying this model, Bode investigates a transnational collection of around 10,000 novels and novellas, discovered in digitized nineteenth-century Australian newspapers, to offer new insights into phenomena ranging from literary anonymity and fiction syndication to the emergence and intersections of national literary traditions.

[1]Katherine Bode, A World of Fiction: Digital Collections and the Future of Literary History (Ann Arbor: University of Michigan Press, 2018).

[2]Ted Underwood, “Why Literary Time is Measured in Minutes,” ELH 85.2 (2018): 341–365.

Computational Literary Studies: Participant Forum Responses, Day 3

Lauren F. Klein

The knowledge that there are many important voices not represented in this forum has prompted me to think harder about the context for the lines I quoted at the outset of my previous remarks. Parham’s own model for “The New Rigor” comes from diversity work, and the multiple forms of labor—affective as much as intellectual—that are required of individuals, almost always women and people of color, in order to compensate for the structural deficiencies of the university. I should have provided that context at the outset, both to do justice to Parham’s original formulation, and because the same structural deficiencies are at work in this forum, as they are in the field of DH overall.

In her most recent response, Katherine Bode posed a series of crucial questions about why literary studies remains fixated on the “individualistic, masculinist mode of statistical criticism” that characterizes much of the work that Da takes on in her essay. Bode further asks why the field of literary studies has allowed this focus to overshadow so much of the transformative work that has been pursued alongside—and, at times, in direct support of––this particular form of computational literary studies.

But I think we also know the answers, and they point back to the same structural deficiencies that Parham explores in her essay: a university structure that rewards certain forms of work and devalues others. In a general academic context, we might point to mentorship, advising, and community-building as clear examples of this devalued work. But in the context of the work discussed in this forum, we can align efforts to recover overlooked texts, compile new datasets, and preserve fragile archives, with the undervalued side of this equation as well. It’s not only that these forms of scholarship, like the “service” work described just above, are performed disproportionally by women and people of color. It is also that, because of the ways in which archives and canons are constructed, projects that focus on women and people of color require many more of these generous and generative scholarly acts. Without these acts, and the scholars who perform them, much of the formally-published work on these subjects could not begin to exist.

Consider Kenton Rambsy’s “Black Short Story Dataset,” a dataset creation effort that he undertook because his own research questions about the changing composition of African American fiction anthologies could not be answered by any existing corpus; Margaret Galvan’s project to create an archive of comics in social movements, which she has undertaken in order to support her own computational work as well as her students’ learning; or any number of the projects published with Small Axe Archipelagos, a born-digital journal edited and produced by a team of librarians and faculty that has been intentionally designed to be read by people who live in the Caribbean as well as for scholars who work on that region. These projects each involve sophisticated computational thinking—at the level of resource creation and platform development as well as of analytical method. They respond both to specific research questions and to larger scholarly need. They require work, and they require time.

It’s clear that these projects provide significant value to the field of literary studies, as they do to the digital humanities and to the communities to which their work is addressed. In the end, the absence of the voices of the scholars who lead these projects, both from this forum and from the scholarship it explores, offers the most convincing evidence of what—and who—is valued most by existing university structures; and what work—and what people—should be at the center of conversations to come.

LAUREN F. KLEIN is associate professor at the School of Literature, Media, and Communication, Georgia Institute of Technology.

Computational Literary Studies: Participant Forum Responses, Day 2

Ted Underwood

More could be said about specific claims in “The Computational Case.” But frankly, this forum isn’t happening because literary critics were persuaded by (or repelled by) Da’s statistical arguments. The forum was planned before publication because the essay’s general strategy was expected to make waves. Social media fanfare at the roll-out made clear that rumors of a “field-killing” project had been circulating for months among scholars who might not yet have read the text but were already eager to believe that Da had found a way to hoist cultural analytics by its own petard—the irrefutable authority of mathematics.

That excitement is probably something we should be discussing. Da’s essay doesn’t actually reveal much about current trends in cultural analytics. But the excitement preceding its release does reveal what people fear about this field—and perhaps suggest how breaches could be healed.

While it is undeniably interesting to hear that colleagues have been anticipating your demise, I don’t take the rumored plans for field-murder literally. For one thing, there’s no motive: literary scholars have little to gain by eliminating other subfields. Even if quantitative work had cornered a large slice of grant funding in literary studies (which it hasn’t), the total sum of all grants in the discipline is too small to create a consequential zero-sum game.

The real currency of literary studies is not grant funding but attention, so I interpret excitement about “The Computational Case” mostly as a sign that a large group of scholars have felt left out of an important conversation. Da’s essay itself describes this frustration, if read suspiciously (and yes, I still do that). Scholars who tried to critique cultural analytics in a purely external way seem to have felt forced into an unrewarding posture—“after all, who would not want to appear reasonable, forward-looking, open-minded?” (p. 603). What was needed instead was a champion willing to venture into quantitative territory and borrow some of that forward-looking buzz.

Da was courageous enough to try, and I think the effects of her venture are likely to be positive for everyone. Literary scholars will see that engaging quantitative arguments quantitatively isn’t all that hard and does produce buzz. Other scholars will follow Da across the qualitative/quantitative divide, and the illusory sharpness of the field boundary will fade.

Da’s own argument remains limited by its assumption that statistics is an alien world, where humanistic guidelines like “acknowledge context” are replaced by rigid hypothesis-testing protocols. But the colleagues who follow her will recognize, I hope, that statistical reasoning is an extension of ordinary human activities like exploration and debate. Humanistic principles still apply here. Quantitative models can test theories, but they are also guided by theory, and they shouldn’t pretend to answer questions more precisely than our theories can frame them. In short, I am glad Da wrote “The Computational Case” because her argument has ended up demonstrating—as a social gesture—what its text denied: that questions about mathematical modeling are continuous with debates about interpretive theory.

TED UNDERWOOD is professor of information sciences and English at the University of Illinois, Urbana-Champaign. He has published in venues ranging from PMLA to the IEEE International Conference on Big Data and is the author most recently of Distant Horizons: Digital Evidence and Literary Change (2019).

Computational Literary Studies: Participant Forum Responses, Day 2

Katherine Bode

The opening statements were fairly critical of Da’s article, less so of CLS. To balance the scales, I want to suggest that Da’s idiosyncratic definition of CLS is partly a product of problematic divisions within digital literary studies.

Da omits what I’d call digital literary scholarship: philological, curatorial, and media archaeological approaches to digital collections and data. Researchers who pursue these approaches, far from reducing all digit(al)ized literature(s) to word counts, maintain––like Da––that analyses based purely or predominantly on such features tend to produce “conceptual fallacies from a literary, historical, or cultural-critical perspective” (p. 604). Omitting such research is part of the way in which Da operationalizes her critique of CLS: defining the field as research that focuses on word counts, then criticizing the field as limited because focused on word counts.

But Da’s perspective is mirrored by many of the researchers she cites. Ted Underwood, for instance, describes “otiose debates about corpus construction” as “well-intentioned red herrings” that detract attention from the proper focus of digital literary studies on statistical methods and inferences.[1] Da has been criticized for propagating a male-dominated version of CLS. But those who pursue the methods she criticizes are mostly men. By contrast, much digital literary scholarship is conducted by women and/or focused on marginalized literatures, peoples, or cultures. The tendency in CLS to privilege data modeling and analysis––and to minimize or dismiss the work of data construction and curation––is part of the culture that creates the male dominance of that field.

More broadly, both the focus on statistical modelling of word frequencies in found datasets, and the prominence accorded to such research in our discipline, put literary studies out of step with digital research in other humanities fields. In digital history, for instance, researchers collaborate to construct rich datasets––for instance, of court proceedings (as in The Proceedings of the Old Bailey)[2] or social complexity (as reported in a recent Nature article)[3]––that can be used by multiple researchers, including for noncomputational analyses. Where such research is statistical, the methods are often simpler than machine learning models (for instance, trends over time; measures of relationships between select variables) because the questions are explicitly related to scale and the aggregation of well-defined scholarly phenomena, not to epistemologically-novel patterns discerned among thousands of variables.

Some things I want to know: Why is literary studies so hung up on (whether in favor of, or opposed to) this individualistic, masculinist mode of statistical criticism? Why is this focus allowed to marginalize earlier, and inhibit the development of new, large-scale, collaborative environments for both computational and noncomputational literary research? Why, in a field that is supposedly so attuned to identity and inequality, do we accept––and foreground––digital research that relies on platforms (Google Books, HathiTrust, EEBO, and others) that privilege dominant literatures and literary cultures? What would it take to bridge the scholarly and critical––the curatorial and statistical––dimensions of (digital) literary studies and what alternative, shared futures for our discipline could result?

KATHERINE BODE is associate professor of literary and textual studies at the Australian National University. Her latest book, A World of Fiction: Digital Collections and the Future of Literary History (2018), offers a new approach to literary research with mass-digitized collections, based on the theory and technology of the scholarly edition. Applying this model, Bode investigates a transnational collection of around 10,000 novels and novellas, discovered in digitized nineteenth-century Australian newspapers, to offer new insights into phenomena ranging from literary anonymity and fiction syndication to the emergence and intersections of national literary traditions.

[1]Ted Underwood, Distant Horizons: Digital Evidence and Literary Change (Chicago: Chicago University Press, 2019): 180; 176.

[2]Tim Hitchcock, Robert Shoemaker, Clive Emsley, Sharon Howard and Jamie McLaughlin, et al., The Proceedings of the Old Bailey (http://www.oldbaileyonline.org, version 8.0, March 2018).

[3]Harvey Whitehouse, Pieter François, Patrick E. Savage, Thomas E. Currie, Kevin C. Feeney, Enrico Cioni, Rosalind Purcell, et al., “Complex Societies Precede Moralizing Gods Throughout World History,” Nature March 20 (2019): 1.

Computational Literary Studies: Participant Forum Responses, Day 2

Argument

(This response follows Nan Da’s previous “Errors” response)

Nan Z. Da

First, a qualification. Due to the time constraints of this forum, I can only address a portion of the issues raised by the forum participants and in ways still imprecise. I do plan to issue an additional response that addresses the more fine-grained technical issues.

“The Computational Case against Computational Literary Studies” was not written for the purposes of refining CLS. The paper does not simply call for “more rigor” or for replicability across the board. It is not about figuring out which statistical mode of inquiry best suits computational literary analysis. It is not a method paper; as some of my respondents point out, those are widely available.

The article was written to empower literary scholars and editors to ask logical questions about computational and quantitative literary criticism should they suspect a conceptual mismatch between the result and the argument or perceive the literary-critical payoff to be extraordinarily low.

The paper, I hope, teaches us to recognize two types of CLS work. First, there is statistically rigorous work that cannot actually answer the question it sets out to answer or doesn’t ask an interesting question at all. Second, there is work that seems to deliver interesting results but is either nonrobust or logically confused. The confusion sometimes issues from something like user error, but it is more often the result of the suboptimal or unnecessary use of statistical and other machine-learning tools. The paper was an attempt to demystify the application of those tools to literary corpora and to explain why technical errors are amplified when your goal is literary interpretation or description.

My article is the culmination of a long investigation into whether computational methods and their modes of quantitative analyses can have purchase in literary studies. My answer is that what drives quantitative results and data patterns often has little to do with the literary critical or literary historical claims being made by scholars that claim to be finding such results and uncovering such patterns—though it sometimes looks like it. If the conclusions we find in CLS corroborate or disprove existing knowledge, this is not a sign that they are correct but that they are tautological at best, merely superficial at worst.

The article is agnostic on what literary criticism ought to be and makes no prescriptions about interpretive habits. The charge that it takes a “purist” position is pure projection. The article aims to describe what scholarship ought not to be. Even the appeal to reading books in the last pages of the article does not presume the inherent meaningfulness of “actually reading” but only serves as a rebuttal to the use of tools that wish to do simple classifications for which human decision would be immeasurably more accurate and much less expensive.

As to the question of Exploratory Data Analysis versus Confirmatory Data Analysis: I don’t prioritize one over the other. If numbers and their interpretation are involved, then statistics has to come into play; I don’t know any way around this. If you wish to simply describe your data, then you have to show something interesting that derives from measurements that are nonreductive. As to the appeal to exploratory tools: if your tool will never be able to explore the problem in question, because it lacks power or is overfitted to its object, your exploratory tool is not needed.

It seems unobjectionable that quantitative methods and nonquantitative methods might work in tandem.  My paper is simply saying: that may be true in theory but it falls short in practice. Andrew Piper points us to the problem of generalization, of how to move from local to global, probative to illustrative. This is precisely the gap my article interrogates because that’s where the collaborative ideal begins to break down. One may call the forcible closing of that gap any number of things—a new hermeneutics, epistemology, or modality—but in the end, the logic has to clear.

My critics are right to point out a bind. The bind is theirs, however, not mine. My point is also that, going forward, it is not for me or a very small group of people to decide what the value of this work is, nor how it should be done.

Ed Finn accuses me of subjecting CLS to a double standard: “Nobody is calling in economists to assess the validity of Marxist literary analysis, or cognitive psychologists to check applications of affect theory, and it’s hard to imagine that scholars would accept the disciplinary authority of those critics.”

This is faulty reasoning. For one thing, literary scholars ask for advice and assessment from scholars in other fields all the time. For another, the payoff of the psychoanalytic reading, even as it seeks extraliterary meaning and validity, is not for psychology but for literary-critical meaning, where it succeeds or fails on its own terms. CLS wants to say, “it’s okay that there isn’t much payoff in our work itself as literary criticism, whether at the level of prose or sophistication of insight; the payoff is in the use of these methods, the description of data, the generation of a predictive model, or the ability for someone else in the future to ask (maybe better) questions. The payoff is in the building of labs, the funding of students, the founding of new journals, the cases made for tenure lines and postdoctoral fellowships and staggeringly large grants.” When these are the claims, more than one discipline needs to be called in to evaluate the methods, their applications, and their result. Because printed critique of certain literary scholarship is generally not refuted by pointing to things still in the wings, we are dealing with two different scholarly models. In this situation, then, we should be maximally cross-disciplinary.

NAN Z. DA teaches literature at the University of Notre Dame.

Nan Z. Da, Critical Response III. On EDA, Complexity, and Redundancy: A Response to Underwood and Weatherby

Computational Literary Studies: Participant Forum Responses, Day 2

Errors

Nan Z. Da

This first of two responses addresses errors, real and imputed; the second response is the more substantive.

1. There is a significant mistake in footnote 39 (p. 622) of my paper. In it I attribute to Hugh Craig and Arthur F. Kinney the argument that Marlowe wrote parts of some late Shakespeare plays after his (Marlowe’s) death. The attribution is incorrect. What Craig asks in “The Three Parts of Henry VI” (pp. 40-77) is whether Marlowe wrote segments of these plays. I would like to extend my sincere apologies to Craig and to the readers of this essay for the misapprehension that it caused.

2. The statement “After all, statistics automatically assumes” (p. 608) is incorrect. A more correct statement would be: In standard hypothesis testing a 95 percent confidence level means that, when the null is true, you will correctly fail to reject 95 percent of the time.

3. The description of various applications of text-mining/machine-learning (p. 620) as “ethically neutral” is not worded carefully enough. I obviously do not believe that some of these applications, such as tracking terrorists using algorithms, are ethically neutral. I meant that there are myriad applications of these tools: for good, ill, and otherwise. On balance it’s hard to assign an ideological position to them.

4. Ted Underwood is correct that, in my discussion of his article on “The Life Cycle of Genres,” I confused the “ghastly stew” with the randomized control sets used in his predictive modeling. Underwood also does not make the elementary statistical mistake I suggest he has made in my article (“Underwood should train his model on pre-1941” [p. 608]).

As to the charge of misrepresentation: paraphrasing a paper whose “single central thesis … is that the things we call ‘genres’ may be entities of different kinds, with different life cycles and degrees of textual coherence” is difficult. Underwood’s thesis here refers to the relative coherence of detective fiction, gothic, and science fiction over time, with 1930 as the cutoff point.

The other things I say about the paper remain true. The paper cites various literary scholars’ definitions of genre change, but its implicit definition of genre is “consistency over time of 10,000 frequently used terms.” It cannot “reject Franco Moretti’s conjecture that genres have generational cycles” (a conjecture that most would already find too reductive) because it is not using the same testable definition of genre or change.

5. Topic Modeling: my point isn’t that topic models are non-replicable but that, in this particular application, they are non-robust. Among other evidence: if I remove one document out of one hundred, the topics change. That’s a problem.

6. As far as Long and So’s essay “Turbulent Flow” goes, I need a bit more time than this format allows to rerun the alternatives responsibly. So and Long have built a tool in which there are thirteen features for predicting the difference between two genres—Stream of Consciousness and Realism. They say: most of these features are not very predictive alone but together become very predictive, with that power being concentrated in just one feature. I show that that one feature isn’t robust. To revise their puzzling metaphor: it’s as if someone claims that a piano plays beautifully and that most of that sound comes from one key. I play that key; it doesn’t work.

7. So and Long argue that by proving that their classifier misclassifies nonhaikus—not only using English translations of Chinese poetry, as they suggest, but also Japanese poetry that existed long before the haiku—I’ve made a “misguided decision that smacks of Orientalism. . . . It completely erases context and history, suggesting an ontological relation where there is none.” This is worth getting straight. Their classifier lacks power because it can only classify haikus with reference to poems quite different from haikus; to be clear, it will classify equally short texts with overlapping keywords close to haikus as haikus. Overlapping keywords is their predictive feature, not mine. I’m not sure how pointing this out is Orientalist. As for their model, I would if pushed say it is only slightly Orientalist, if not determinatively so.

8. Long and So claim that my “numbers cannot be trusted,” that my “critique . . . is rife with technical and factual errors”; in a similar vein their response ends with the assertion that my essay doesn’t “encourag[e] much trust.” I’ll admit to making some errors in this article, though not in my analyses of Long and So’s papers (the errors mostly occur in section 3). I hope to list all of these errors in the more formal response that appears in print or else in an online appendix. That said, an error is not the same as a specious insinuation that the invalidation of someone’s model indicates Orientalism, pigheadedness, and so on. Nor is an error the same as the claim that “CI asked Da to widen her critique to include female scholars and she declined” recently made by So, which is not an error but a falsehood.
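
The corrected statement in item 2 can be checked with a small simulation. The following is only an illustrative sketch, not part of the original article: it assumes a two-sided z-test with known variance, and the function name, sample size, and trial count are hypothetical choices made for the example. It draws many samples for which the null hypothesis is true and counts how often the test fails to reject; that share should come out close to 95 percent.

```python
import random

def fail_to_reject_rate(trials=20000, n=30, z_crit=1.96, seed=0):
    """Share of trials in which a two-sided z-test does not reject a true null."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(trials):
        # Draw n observations from N(0, 1), so the null hypothesis (mean = 0) is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic with known sigma = 1: sample mean divided by 1 / sqrt(n).
        z = (sum(sample) / n) * n ** 0.5
        if abs(z) <= z_crit:  # 1.96 is the two-sided 5 percent critical value
            kept += 1
    return kept / trials

print(f"failed to reject a true null in {fail_to_reject_rate():.1%} of trials")
```

The remaining share of trials, roughly 5 percent, are false rejections of a true null, which is what the significance level of the test controls.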

NAN Z. DA teaches literature at the University of Notre Dame.

Computational Literary Studies: Participant Forum Responses

Ted Underwood

In the humanities, as elsewhere, researchers who work with numbers often reproduce and test each other’s claims. Nan Z. Da’s contribution to this growing genre differs from previous examples mainly in moving more rapidly. For instance, my coauthors and I spent 5,800 words describing, reproducing, and partially criticizing one article about popular music. By contrast, Da dismisses fourteen publications that use different methods in thirty-eight pages. The article’s energy is impressive, and its long-term effects will be positive.

But this pace has a cost. Da’s argument may be dizzying if readers don’t already know the works summarized, as she rushes through explanation to get to condemnation. Readers who know these works will recognize that Da’s summaries are riddled with material omissions and errors. The time is ripe for a theoretical debate about computing in literary studies. But this article is unfortunately too misleading—even at the level of paraphrase—to provide a starting point for the debate.

For instance, Da suggests that my article “The Life Cycles of Genres” makes genres look stable only because it forgets to compare apples to apples: “Underwood should train his model on pre-1941 detective fiction (A) as compared to pre-1941 random stew and post-1941 detective fiction (B) as compared to post-1941 random stew, instead of one random stew for both” (p. 608).[3]

This perplexing critique tells me to do exactly what my article (and public code) make clear that I did: compare groups of works matched by publication date.[4] There is also no “random stew” in the article. Da’s odd phrase conflates a random contrast set with a ghastly “genre stew” that plays a different role in the argument.
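
For readers unfamiliar with the idea, a contrast set matched by publication date can be sketched in a few lines. This is a hypothetical illustration, not the code released with the article: the function names, the choice to match by decade, and the toy titles are all assumptions made for the example. Each genre work is paired with a randomly chosen non-genre work from the same decade, so that a model comparing the two groups cannot separate them on date alone.

```python
import random
from collections import defaultdict

def decade(year):
    """Map a publication year to the start of its decade, e.g. 1932 -> 1930."""
    return year // 10 * 10

def matched_contrast_pairs(genre_works, other_works, seed=0):
    """Pair each genre work with a random non-genre work from the same decade."""
    rng = random.Random(seed)
    pools = defaultdict(list)
    for work in other_works:
        pools[decade(work[1])].append(work)
    for pool in pools.values():
        rng.shuffle(pool)  # random order, so pop() below draws at random
    pairs = []
    for work in genre_works:
        pool = pools[decade(work[1])]
        if pool:  # genre works with no same-decade candidate are left unpaired
            pairs.append((work, pool.pop()))  # pop() ensures no work is reused
    return pairs

# Hypothetical toy records: (title, publication year).
genre = [("Detective A", 1925), ("Detective B", 1932), ("Detective C", 1958)]
other = [("Novel P", 1921), ("Novel Q", 1938), ("Novel R", 1934),
         ("Novel S", 1951), ("Novel T", 1880)]

for genre_work, contrast_work in matched_contrast_pairs(genre, other):
    print(genre_work, "<->", contrast_work)
```

Matching by decade is only one possible granularity; a real study would choose the matching window to suit the size and date range of its corpus.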

More importantly, Da’s critique suppresses the article’s comparative thesis—which identifies detective fiction as more stable than several other genres—in order to create a straw man who argues that all genres “have in fact been more or less consistent from the 1820s to the present” (p. 609). Lacking any comparative yardstick to measure consistency, this straw thesis becomes unprovable. In other cases Da has ignored the significant results of an article, in order to pour scorn on a result the authors acknowledge as having limited significance—without ever mentioning that the authors acknowledge the limitation. This is how she proceeds with Jockers and Kirilloff (p. 610).

In short, this is not an article that works hard at holistic critique. Instead of describing the goals that organize a publication, Da often assumes that researchers were trying (and failing) to do something she believes they should have done. Topic modeling, for instance, identifies patterns in a corpus without pretending to find a uniquely correct description. Humanists use the method mostly for exploratory analysis. But Da begins from the assumption that topic modeling must be a confused attempt to prove hypotheses of some kind. So, she is shocked to discover (and spends a page proving) that different topics can emerge when the method is run multiple times. This is true. It is also a basic premise of the method, acknowledged by all the authors Da cites—who between them spend several pages discussing how results that vary can nevertheless be used for interpretive exploration. Da doesn’t acknowledge the discussion.

Finally, “The Computational Case” performs some crucial misdirection at the outset by implying that cultural analytics is based purely on linguistic evidence and mainly diction. It is true that diction can reveal a great deal, but this is a misleading account of contemporary trends. Quantitative approaches are making waves partly because researchers have learned to extract social relations from literature and partly because they pair language with external social testimony—for instance the judgments of reviewers. Some articles, like my own on narrative pace, use numbers entirely to describe the interpretations of human readers. Once again, Da’s polemical strategy is to isolate one strand in a braid, and critique it as if it were the whole.

A more inquisitive approach to cultural analytics might have revealed that it is not a monolith but an unfolding debate between several projects that frequently criticize each other. Katherine Bode, for instance, has critiqued other researchers’ data (including mine), in an exemplary argument that starts by precisely describing different approaches to historical representation. Da could have made a similarly productive intervention—explaining, for instance, how researchers should report uncertainty in exploratory analysis. Her essay falls short of that achievement because a rush to condemn as many examples as possible has prevented it from taking time to describe and genuinely understand its objects of critique.

TED UNDERWOOD is professor of information sciences and English at the University of Illinois, Urbana-Champaign. He has published in venues ranging from PMLA to the IEEE International Conference on Big Data and is the author most recently of Distant Horizons: Digital Evidence and Literary Change (2019).

1. Andrew Goldstone, “Of Literary Standards and Logistic Regression: A Reproduction,” January 4, 2016, https://andrewgoldstone.com/blog/2016/01/04/standards/. Jonathan Goodwin, “Darko Suvin’s Genres of Victorian SF Revisited,” Oct 17, 2016, https://jgoodwin.net/blog/more-suvin/.

2. Ted Underwood, “Can We Date Revolutions in the History of Literature and Music?”, The Stone and the Shell, October 3, 2015, https://tedunderwood.com/2015/10/03/can-we-date-revolutions-in-the-history-of-literature-and-music/. Ted Underwood, Hoyt Long, Richard Jean So, and Yuancheng Zhu, “You Say You Found a Revolution,” The Stone and the Shell, February 7, 2016, https://tedunderwood.com/2016/02/07/you-say-you-found-a-revolution/.

3. Nan Z. Da, “The Computational Case against Computational Literary Studies,” Critical Inquiry 45 (Spring 2019): 601-39.

4. Ted Underwood, “The Life Cycles of Genres,” Journal of Cultural Analytics, May 23, 2016, http://culturalanalytics.org/2016/05/the-life-cycles-of-genres/.

5. Eve Kraicer and Andrew Piper, “Social Characters: The Hierarchy of Gender in Contemporary English-Language Fiction,” Journal of Cultural Analytics, January 30, 2019, http://culturalanalytics.org/2019/01/social-characters-the-hierarchy-of-gender-in-contemporary-english-language-fiction/

6. Ted Underwood, “Why Literary Time is Measured in Minutes,” ELH 25.2 (2018): 341-65.

7. Katherine Bode, “The Equivalence of ‘Close’ and ‘Distant’ Reading; or, Toward a New Object for Data-Rich Literary History,” MLQ 78.1 (2017): 77-106.

 

Ted Underwood, Critical Response II. The Theoretical Divide Driving Debates about Computation

Computational Literary Studies: Participant Forum Responses

The Select

Andrew Piper

Nan Z. Da’s study published in Critical Inquiry participates in an emerging trend across a number of disciplines that falls under the heading of “replication.”[1] In this, her work follows major efforts in other fields, such as the Open Science Collaboration’s “reproducibility project,” which sought to replicate past studies in the field of psychology.[2] As the authors of the OSC collaboration write, the value of replication, when done well, is that it can “increase certainty when findings are reproduced and promote innovation when they are not.”

And yet despite arriving at sweeping claims about an entire field, Da’s study fails to follow any of the procedures and practices established by projects like the OSC.[3] While invoking the epistemological framework of replication—that is, to prove or disprove the validity of both individual articles as well as an entire field—her practices follow instead the time-honoured traditions of selective reading from the field of literary criticism. Da’s work is ultimately valuable not because of the computational case it makes (that work still remains to be done), but because of the way it foregrounds so many of the problems that accompany traditional literary critical models when used to make large-scale evidentiary claims. The good news is that this article has made the problem of generalization, of how we combat the problem of selective reading, into a central issue facing the field.

Start with the evidence chosen. When undertaking their replication project, the OSC generated a sample of one hundred studies taken from three separate journals within a single year of publication to approximate a reasonable cross-section of the field. Da on the other hand chooses “a handful” of articles (fourteen by my count) from different years and different journals with no clear rationale of how these articles are meant to represent an entire field. The point is not the number chosen but that we have no way of knowing why these articles and not others were chosen and thus whether her findings extend to any work beyond her sample. Indeed, the only linkage appears to be that these studies all “fail” by her criteria. Imagine if the OSC had found that 100 percent of articles sampled failed to replicate. Would we find their results credible? Da by contrast is surprisingly only ever right.

Da’s focus within articles exhibits an even stronger degree of nonrepresentativeness. In their replication project, the OSC establishes clearly defined criteria through which a study can be declared not to replicate, while also acknowledging the difficulty of arriving at this conclusion. Da by contrast applies different criteria to every article, making debatable choices, as well as outright errors, that are clearly designed to foreground differences.[4] She misnames authors of articles, mis-cites editions, mis-attributes arguments to the wrong book, and fails at some basic math.[5] And yet each of these assertions always adds up to the same certain conclusion: failed to replicate. In Da’s hands, part is always a perfect representation of whole.

Perhaps the greatest limitation of Da’s piece is her extremely narrow (that is, nonrepresentative) definition of statistical inference and computational modeling. In Da’s view, the only appropriate way to use data is to perform what is known as significance testing, where we use a statistical model to test whether a given hypothesis is “true.”[6] There is no room for exploratory data analysis, for theory building, or predictive modeling in her view of the field.[7] This is particularly ironic given that Da herself performs no such tests. She holds others to standards to which she herself is not accountable. Nor does she cite articles where authors explicitly undertake such tests[8] or research that calls into question the value of such tests[9] or research that explores the relationship between word frequency and human judgments that she finds so problematic.[10] The selectivity of Da’s work is deeply out of touch with the larger research landscape.

All of these practices highlight a more general problem that has for too long gone unexamined in the field of literary study. How are we to move reliably from individual observations to general beliefs about things in the world? Da’s article provides a tour de force of the problems of selective reading when it comes to generalizing about individual studies or entire fields. Addressing the problem of responsible and credible generalization will be one of the central challenges facing the field in the years to come. As with all other disciplines across the university, data and computational modeling will have an integral role to play in that process.

ANDREW PIPER is Professor and William Dawson Scholar in the Department of Languages, Literatures, and Cultures at McGill University. He is the author most recently of Enumerations: Data and Literary Study (2018).

[1] Nan Z. Da, “The Computational Case Against Computational Literary Studies,” Critical Inquiry 45 (Spring 2019): 601-639. For accessible introductions to what has become known as the replication crisis in the sciences, see Ed Yong, “Psychology’s Replication Crisis Can’t Be Wished Away,” The Atlantic, March 4, 2016.

[2] Open Science Collaboration, “Estimating the Reproducibility of Psychological Science,” Science 28 Aug 2015: Vol. 349, Issue 6251, aac4716. DOI: 10.1126/science.aac4716.

[3] Compare Da’s sweeping claims with the more modest ones made by the OSC in Science even given their considerably larger sample and far more rigorous effort at replication, reproduced here. For a discussion of the practice of replication, see Brian D. Earp and David Trafimow, “Replication, Falsification, and the Crisis of Confidence in Social Psychology,” Frontiers in Psychology May 19, 2015: doi.org/10.3389/fpsyg.2015.00621.

[4] For a list, see Ben Schmidt, “A computational critique of a computational critique of a computational critique.” I provide more examples in the scholarly response here: Andrew Piper, “Do We Know What We Are Doing?” Journal of Cultural Analytics, April 1, 2019.

[5] She cites Mark Algee-Hewitt as Mark Hewitt, cites G. Casella as the author of Introduction to Statistical Learning when it was Gareth James, cites me and Andrew Goldstone as co-authors in the Appendix when we were not, claims that “the most famous example of CLS forensic stylometry” was Hugh Craig and Arthur F. Kinney’s book, which she says advances a theory of Marlowe’s authorship of Shakespeare’s plays when it does not, and miscalculates the number of people it would take to read fifteen thousand novels in a year. The answer is 1250, not 1000 as she asserts. This statistic is also totally meaningless.

[6] Statements like the following also suggest that she is far from a credible guide to even this aspect of statistics: “After all, statistics automatically assumes that 95 percent of the time there is no difference and that only 5 percent of the time there is a difference. That is what it means to look for p-value less than 0.05.” This is not what it means to look for a p-value less than 0.05. A p-value is the estimated probability of getting data at least as extreme as what we observed, assuming our null hypothesis is true. The smaller the p-value, the more unlikely it is to observe what we did assuming the null hypothesis is true. The aforementioned 5% threshold says nothing about how often there will be a “difference” (in other words, how often the null hypothesis is false). Instead, it says: “if there is in fact no difference, we estimate that our data will mistakenly lead us to conclude there is one 5% of the time.” Nor does “statistics” “automatically” assume that .05 is the appropriate cut-off. It depends on the domain, the question and the aims of modeling. These are gross over-simplifications.

[7] For reflections on literary modeling, see Andrew Piper, “Think Small: On Literary Modeling.” PMLA 132.3 (2017): 651-658; Richard Jean So, “All Models Are Wrong,” PMLA 132.3 (2017); Ted Underwood, “Algorithmic Modeling: Or, Modeling Data We Do Not Yet Understand,” The Shape of Data in Digital Humanities: Modeling Texts and Text-based Resources, eds. J. Flanders and F. Jannidis (New York: Routledge, 2018).

[8] See Andrew Piper and Eva Portelance, “How Cultural Capital Works: Prizewinning Novels, Bestsellers, and the Time of Reading,” Post-45 (2016); Eve Kraicer and Andrew Piper, “Social Characters: The Hierarchy of Gender in Contemporary English-Language Fiction,” Journal of Cultural Analytics, January 30, 2019. DOI: 10.31235/osf.io/4kwrg; and Andrew Piper, “Fictionality,” Journal of Cultural Analytics, Dec. 20, 2016. DOI: 10.31235/osf.io/93mdj.

[9] The literature debating the values of significance testing is vast. See Simmons, Joseph P., Leif D. Nelson, and Uri Simonsohn. “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science 22, no. 11 (November 2011): 1359–66. doi:10.1177/0956797611417632.

[10] See Rens Bod, Jennifer Hay, and Stefanie Jannedy, Probabilistic Linguistics (Cambridge, MA: MIT Press, 2003); Dan Jurafsky and James Martin, “Vector Semantics,” Speech and Language Processing, 3rd Edition (2018): https://web.stanford.edu/~jurafsky/slp3/6.pdf; for the relation of communication to information theory, M.W. Crocker, Demberg, V. & Teich, E. “Information Density and Linguistic Encoding,” Künstliche Intelligenz 30.1 (2016): 77-81. https://doi.org/10.1007/s13218-015-0391-y; and for the relation to language acquisition and learning, Erickson LC, Thiessen ED, “Statistical learning of language: theory, validity, and predictions of a statistical learning account of language acquisition,” Dev. Rev. 37 (2015): 66–108. doi:10.1016/j.dr.2015.05.002.
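
The point in note 6 about what a 0.05 threshold does and does not mean can be illustrated with a small simulation. The sketch below is an illustration only, not drawn from any of the studies discussed: it assumes normally distributed data with known variance, uses a simple two-sided z-test, and the function names, sample size, and number of trials are all hypothetical choices. Every simulated experiment compares two samples drawn from the same distribution, so the null hypothesis is true by construction, and the share of experiments that nevertheless fall below the threshold should land near 5 percent.

```python
import math
import random


def two_sided_p(sample_a, sample_b, sigma=1.0):
    """Two-sided z-test p-value for a difference in means.

    Assumes both samples come from normal distributions with the
    same known standard deviation `sigma` (an illustrative assumption).
    """
    mean_a = sum(sample_a) / len(sample_a)
    mean_b = sum(sample_b) / len(sample_b)
    standard_error = sigma * math.sqrt(1 / len(sample_a) + 1 / len(sample_b))
    z = (mean_a - mean_b) / standard_error
    # P(|Z| >= |z|) for a standard normal variable Z
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(trials=4000, n=30, alpha=0.05, seed=0):
    """Share of experiments with p < alpha when the null is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both groups are drawn from the SAME distribution: no real difference.
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        if two_sided_p(group_a, group_b) < alpha:
            rejections += 1
    return rejections / trials


rate = false_positive_rate()
print(f"share of no-difference experiments with p < 0.05: {rate:.3f}")
```

The simulation says nothing about how often a difference exists in real data. It only shows how often a test of this kind raises a false alarm when there is none, which is the quantity the threshold controls.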

Computational Literary Studies: Participant Forum Responses

Trust in Numbers

Hoyt Long and Richard Jean So

 

Nan Da’s “The Computational Case against Computational Literary Studies” stands out from past polemics against computational approaches to literature in that it purports to take computation seriously. It recognizes that a serious engagement with this kind of research means developing literacy in statistical and other concepts. Insofar as her essay promises to move the debate beyond a flat rejection of numbers, and towards something like a conversation about replication, it is a useful step forward.

This, however, is where its utility ends. “Don’t trust the numbers,” Da warns. Or rather, “Don’t trust their numbers, trust mine.” But should you? If you can’t trust their numbers, she implies, the entire case for computational approaches falls apart. Trust her numbers and you’ll see this. But her numbers cannot be trusted. Da’s critique of fourteen articles in the field of cultural analytics is rife with technical and factual errors. This is not merely quibbling over details. The errors reflect a basic lack of understanding of fundamental statistical concepts and are akin to an outsider to literary studies calling George Eliot a “famous male author.” Even more concerning, Da fails to understand statistical method as a contextual, historical, and interpretive project. The essay’s greatest error, to be blunt, is a humanist one.

Here we focus on Da’s errors related to predictive modeling. This is the core method used in the two essays of ours that she critiques. In “Turbulent Flow,” we built a model of stream-of-consciousness (SOC) narrative with thirteen linguistic features and found that ten of them, in combination, reliably distinguished passages that we identified as SOC (as compared with passages taken from a corpus of realist fiction). Type-token ratio (TTR), a measure of lexical diversity, was the most distinguishing of these, though uninformative on its own. The purpose of predictive modeling, as we carefully explain in the essay, is to understand how multiple features work in concert to identify stylistic patterns, not alone. Nothing in Da’s critique suggests she is aware of this fundamental principle.

Indeed, Da interrogates just one feature in our model (TTR) and argues that modifying it invalidates our modeling. Specifically, she tests whether the strong association of TTR with SOC holds after removing words in her “standard stopword list,” instead of in the stopword list we used. She finds it doesn’t. There are two problems with this. First, TTR and “TTR minus stopwords” are two separate features. We actually included both in our model and found the latter to be minimally distinctive. Second, while the intuition to test for feature robustness is appropriate, it is undercut by the assertion that there is a “standard” stopword list that should be universally applied. Ours was specifically curated for use with nineteenth- and early twentieth-century fiction. Even if there were good reason to adopt her “standard” list, one still must rerun the model to test if the remeasured “TTR minus stopwords” feature changes the overall predictive accuracy. Da doesn’t do this. It’s like fiddling with a single piano key and, without playing another note, declaring the whole instrument to be out of tune.
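
To make the distinction between TTR and “TTR minus stopwords” concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the code behind “Turbulent Flow” or Da’s critique: the tiny stopword set, the tokenizer, the function names, and the sample sentence are all hypothetical. It shows only that the two measurements are computed over different token sequences and can therefore yield different values for the same passage, and that the second value depends on which stopword list is chosen.

```python
import re

# Hypothetical stopword list for illustration only; it is neither the
# curated list used in the essay nor any "standard" list.
STOPWORDS = {"the", "a", "and", "of", "to", "in", "was", "it"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Number of distinct word types divided by number of tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def ttr_features(text, stopwords=STOPWORDS):
    """Return TTR and TTR-minus-stopwords as two separate features."""
    tokens = tokenize(text)
    content_tokens = [t for t in tokens if t not in stopwords]
    return {
        "ttr": type_token_ratio(tokens),
        "ttr_minus_stopwords": type_token_ratio(content_tokens),
    }


sample = "The sea was the sea and the sky was the sky"
features = ttr_features(sample)
print(features)
```

Swapping in a different stopword set changes only the second feature. Whether that change matters for classification can be settled only by retraining the full model with the remeasured feature and comparing predictive accuracy, a step this sketch does not attempt.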

But the errors run deeper than this. In Da’s critique of “Literary Pattern Recognition,” she tries to invalidate the robustness of our model’s ability to distinguish English-language haiku poems from nonhaiku poems. She does so by creating a new corpus of “English translations of Chinese couplets” and testing our model on this corpus. Why do this? She suggests that it is because they are filled “with similar imagery” to English haiku and are similarly “Asian.” This is a misguided decision that smacks of Orientalism. It completely erases context and history, suggesting an ontological relation where there is none. This is why we spend over twelve pages delineating the English haiku form in both critical and historical terms.

These errors exemplify a consistent refusal to contextualize and historicize one’s interpretative practices (indeed to “read well”), whether statistically or humanistically. We do not believe there exist “objectively” good literary interpretations or that there is one “correct” way to do statistical analysis: Da’s is a position most historians of science, and most statisticians themselves, would reject. Conventions in both literature and science are continuously debated and reinterpreted, not handed down from on high. And like literary studies, statistics is a body of knowledge formed from messy disciplinary histories, as well as diverse communities of practice. Da’s essay insists on a highly dogmatic, “objective,” black-and-white version of knowledge, a disposition totally antithetical to both statistics and literary studies. It is not a version that encourages much trust.

Hoyt Long is associate professor of Japanese literature at the University of Chicago. He publishes widely in the fields of Japanese literary studies, media history, and cultural analytics. His current book project is Figures of Difference: Quantitative Approaches to Modern Japanese Literature.

Richard Jean So is assistant professor of English and cultural analytics at McGill University. He works on computational approaches to literature and culture with a focus on contemporary American writing and race. His current book project is Redlining Culture: A Data History of Race and US Fiction.

Computational Literary Studies: Participant Forum Responses

What the New Computational Rigor Should Be

Lauren F. Klein

Writing about the difficulties of evaluating digital scholarship in a recent special issue of American Quarterly devoted to DH, Marisa Parham proposes the concept of “The New Rigor” to account for the labor of digital scholarship as well as its seriousness: “It is the difference between what we say we want the world to look like and what we actually carry out in our smallest acts,” she states (p. 683). In “The Computational Case against Computational Literary Studies,” Nan Z. Da also makes the case for a new rigor, although hers is more narrowly scoped. It entails both a careful adherence to the methods of statistical inquiry and a concerted rejection of the application of those methods to domains—namely, literary studies—that fall beyond their purported use.

No one would argue with the former. But it is the latter claim that I will push back against. Several times in her essay, Da makes the case that “statistical tools are designed to do certain things and solve specific problems,” and for that reason, they should not be employed to “capture literature’s complexity” (pp. 619-20, 634). To be sure, there exists a richness of language and an array of ineffable—let alone quantifiable—qualities of literature that cannot be reduced to a single model or diagram. But the complexity of literature exceeds even that capaciousness, as most literary scholars would agree. And for that very reason, we must continue to explore new methods for expanding the significance of our objects of study. As literary scholars, we would almost certainly say that we want to look at—and live in—a world that embraces complexity. Given that vision, the test of rigor then becomes, to return to Parham’s formulation, how we usher that world into existence through each and every one of “our smallest acts” of scholarship, citation, and critique.

In point of fact, many scholars already exhibit this new computational rigor. Consider how Jim Casey, the national codirector of the Colored Conventions Project, is employing social network analysis—including the centrality scores and modularity measures that Da finds lacking in the example she cites—in order to detect changing geographic centers for this important nineteenth-century organizing movement. Or how Lisa Rhody has found an “interpretive space that is as vital as the weaving and unraveling at Penelope’s loom” in a topic model of a corpus of 4,500 poems. This interpretive space is one that Rhody creates in no small part by accounting for the same fluctuations of words in topics—the result of the sampling methods employed in almost all topic model implementations—that Da invokes, instead, in order to dismiss the technique out of hand. Or how Laura Estill, Dominic Klyve, and Kate Bridal have employed statistical analysis, including a discussion of the p-values that Da believes (contra many statisticians) are always required, in order to survey the state of Shakespeare studies as a field.

That these works are authored by scholars in a range of academic roles, including postdoctoral fellows and DH program coordinators as well as tenure-track faculty, and are published in a range of venues, including edited collections and online as well as domain-specific journals, further points to the range of extant work that embraces the complexity of literature in precisely the ways that Da describes. But these works do more: they also embrace the complexity of the statistical methods that they employ. Each of these essays involves a creative repurposing of the methods they borrow from more computational fields, as well as a trenchant self-critique. Casey, for example, questions how applying techniques of social network analysis, which are premised on a conception of sociality as characterized by links between individual “nodes,” can do justice to a movement celebrated for its commitment to collective action. Rhody, for another, considers the limits of the utility of topic modeling, as a tool “designed to be used with texts that employ as little figurative language as possible,” for her research questions about ekphrasis. These essays each represent “small acts” and necessarily so. But taken alongside the many other examples of computational work that are methodologically sound, creatively conceived, and necessarily self-critical, they constitute the core of a field committed to complexity in both the texts they elucidate and the methods they employ.

In her formulation of “The New Rigor,” Parham—herself a literary scholar—places her emphasis on a single word: “Carrying, how we carry ourselves in our relationships and how we carry each other, is the real place of transformation,” she writes. Da, the respondents collected in this forum, and all of us in literary studies—computational and not—might linger on that single word. If our goal remains to celebrate the complexity of literature—precisely because it helps to illuminate the complexity of the world—then we must carry ourselves, and each other, with intellectual generosity and goodwill. We must do so, moreover, with a commitment to honoring the scholarship, and the labor, that has cleared the path up to this point. Only then can we carry forward the field of computational literary studies into the transformative space of future inquiry.

LAUREN F. KLEIN is associate professor at the School of Literature, Media, and Communication, Georgia Institute of Technology.

Computational Literary Studies: Participant Forum Responses

What Is Literary Studies?

Ed Finn

This is the question that underpins Da’s takedown of what she calls computational literary studies (CLS). The animus with which she pursues this essay is like a searchlight that creates a shadow behind it. “The discipline is about reducing reductionism,” she writes (p. 638), which is a questionable assertion about a field that encompasses many kinds of reduction and contradictory epistemic positions, from thing theory to animal studies. Da offers no evidence or authority to back up her contention that CLS fails to validate its claims. Being charitable, what Da means, I think, is that literary scholars should always attend to context, to the particulars of the works they engage.

Da’s essay assails what she terms the false rigor of CLS: the obsession with reductive analyses of large datasets, the misapplied statistical methods, the failure to disentangle artifacts of measurement from significant results. And there may be validity to these claims: some researchers use black box tools they don’t understand, not just in the digital humanities but in fields from political science to medicine. The most helpful contribution of Da’s article is tucked away in the online appendix, where she suggests a very good set of peer review and publication guidelines for DH work. I can imagine a version of this essay that culminated with those guidelines rather than the suggestion that “reading literature well” is a bridge too far for computational approaches.

The problem with the spotlight Da shines on the rigor of CLS is that shadow looming behind it. What does rigor look like in “the discipline” of literary studies, which is defined so antagonistically to CLS here? What are the standards of peer review that ensure literary scholarship validates its methods, particularly when it draws those methods from other disciplines? Nobody is calling in economists to assess the validity of Marxist literary analysis, or cognitive psychologists to check applications of affect theory, and it’s hard to imagine that scholars would accept the disciplinary authority of those critics. I am willing to bet Critical Inquiry’s peer review process for Da’s article did not include federal grants program officers, university administrators, or scholars of public policy being asked to assess Da’s rhetorical—but central—question “of why we need ‘labs’ or the exorbitant funding that CLS has garnered” (p. 603).

I contend this is actually a good idea: literary studies can benefit from true dialog and collaboration with fields across the entire academy. Da clearly feels that this is justified in the case of CLS, where she calls for more statistical expertise (and brings in a statistician to guide her analysis in this paper). But why should CLS be singled out for this kind of treatment?

Either one accepts that rigor sometimes demands literary studies should embrace expertise from other fields—like Da bringing in a statistician to validate her findings for this paper—or one accepts that literary studies is made up of many contradictory methods and that “the discipline” is founded on borrowing methods from other fields without any obligation to validate findings by the standards of those other fields. What would it look like to generalize Da’s proposals for peer review to other areas of literary studies? The contemporary research I find most compelling makes this more generous move: bringing scholars in the humanities together with researchers in the social sciences, the arts, medicine, and other arenas where people can actually learn from one another and do new kinds of work.

To me, literary studies is the practice of reading and writing in order to better understand the human condition. And the condition is changing. Most of what we read now comes to us on screens that are watching us as we watch them. Many of the things we think about have been curated and lobbed into our consciousness by algorithmic feeds and filters. I studied Amazon recommendation networks because they play an important role in contemporary American literary reception and the lived experience of fiction for millions of readers—at least circa 2010, when I wrote the article. My approach in that work hewed to math that I understand and a scale of information that I call small data because it approximates the headspace of actual readers thinking about particular books. Small data always leads back to the qualitative and to the particular, and it is a minor example of the contributions humanists can make beyond the boundaries of “the discipline.”

We desperately need the humanities to survive the next century, when so many of our species’ bad bets are coming home to roost. Text mining is not “ethically neutral,” as Da gobsmackingly argues (p. 620), any more than industrialization was ethically neutral, or the NSA using network analysis to track suspected terrorists (Da’s example of a presumably acceptable “operationalizable end” for social network analysis) (p. 632). The principle of charity would, I hope, preclude Da’s shortsighted framing of what matters in literary studies, and it would open doors to other fields like computer science where many researchers are, either unwittingly or uncaringly, deploying words like human and read and write with the same kind of facile dismissal of methods outside “the discipline” that is on display here. That is the context in which we read and think about literature now, and if we want to “read literature well,” we need to bring the insights of literary study to broader conversations where we participate, share, educate, and learn.

ED FINN is the founding director of the Center for Science and the Imagination at Arizona State University where he is an associate professor in the School of Arts, Media, and Engineering and the Department of English.

Computational Literary Studies: Participant Forum Responses

Sarah Brouillette

DH is here to stay, including in the CLS variant whose errors Nan Da studies. This variant is especially prevalent in English programs, and it will continue to gain force there. Even when those departments have closed or merged with other units, people with CLS capacities will continue to find positions (though likely contractually) when others no longer can. This is not to say that DH is somehow itself the demise of the English department. The case rather is that both the relative health of DH and the general decline in literary studies (measured via enrollments, number of tenured faculty, and university heads’ dispositions toward English) arise from the same underlying factors. The pressures that English departments face are grounded in the long economic downturn and rising government deficits, deep cuts to funding for higher education, rising tuition, and a turn by university administrators toward boosting business and STEM programs. We know this. There has been a foreclosure of futurity for students who are facing graduation with significant debt burdens and who doubt that they will find stable work paying a good wage. Who can afford the luxury of closely reading five hundred pages of dense prose? Harried, anxious people accustomed to working across many screens, many open tabs, with constant pings from social media, often struggle with sustained reading. Myself included. DH is a way of doing literary studies without having to engage in long periods of sustained reading, while acquiring what might feel like job skills. It doesn’t really matter how meaningful CLS labs’ findings are. As Da points out, practitioners themselves often emphasize how tentative their findings are or stress flaws in the results or the method that become the occasion for future investment and development. That is the point: investment and development.

The key to DH’s relative health is that it supports certain kinds of student training and the development of technologically enhanced learning environments. One of the few ways to get large sums of grant money from the Social Sciences and Humanities Research Council of Canada (SSHRC) is to budget for equipment and for student training. Computer training is relatively easy to describe in a budget justification. Universities for their part often like DH labs because they attract these outside funders, and because grants don’t last forever, a campus doesn’t have to promise anything beyond short-term training and employment. As for the students: to be clear, those with DH skills don’t necessarily walk more easily into jobs than those without them. But DH labs, which at least in Canada need to be able to list training as a priority, offer an experience of education that has an affective appeal for many students, an appeal that universities work hard to cultivate and reinforce. This cultivation is there in the constant contrasts made between old-fashioned and immersive learning, between traditional and project-based classrooms, between the dull droning lecture and the experiential . . . well, experience. (The government of Ontario has recently mandated that every student have an opportunity to experience “work-integrated learning” before graduation.) It is there also in the push to make these immersive experiences online ones, mediated by learning management systems such as Brightspace or Canvas, which store data via Amazon Web Services. Learning in universities increasingly occurs in data-capturable forms. The experience of education, from level of participation to test performance, is cultivated, monitored, and tracked digitally. Students who have facility with digital technologies are, needless to say, at an advantage in this environment.

Meanwhile the temptation to think that courses that include substantial digital components are more practical and professional, less merely academic, is pretty understandable, as universities are so busily cultivating and managing engagement in a context in which disengagement otherwise makes total sense. DH is simply far more compatible with all of these observable trends than many other styles of literary inquiry.

SARAH BROUILLETTE is a professor in the Department of English at Carleton University in Ottawa, Canada.

Computational Literary Studies: Participant Forum Responses

 

Katherine Bode

Nan Z. Da’s statistical review of computational literary studies (CLS) takes issue with an approach I also have concerns about, but it is misconceived in its framing of the field and of statistical inquiry. Her definition of CLS—using statistics, predominantly machine learning, to investigate word patterns—excludes most of what I would categorize as computational literary studies, including research that: employs data construction and curation as forms of critical analysis; analyzes bibliographical and other metadata to explore literary trends; deploys machine-learning methods to identify literary phenomena for noncomputational interpretation; or theorizes the implications of methods such as data visualization and machine learning for literary studies. (Interested readers will find diverse forms of CLS in the work of Ryan Cordell, Anne DeWitt, Johanna Drucker, Lauren Klein, Matthew Kirschenbaum, Anouk Lang, Laura B. McGrath, Stephen Ramsay, and Glenn Roe, among others.)

Beyond its idiosyncratic and restrictive definition of CLS, what strikes me most about Da’s essay is its constrained and contradictory framing of statistical inquiry. For most of the researchers Da cites, the pivot to machine learning is explicitly conceived as rejecting a positivist view of literary data and computation in favor of modelling as a subjective practice. Da appears to argue, first, that this pivot has not occurred enough (CLS takes a mechanistic approach to literary interpretation) and, second, that it has gone too far (CLS takes too many liberties with statistical inference, such as “metaphor[izing] … coding and statistics” [p. 606 n. 9]). On the one hand, then, Da repeatedly implies that, if CLS took a slightly different path—that is, trained with more appropriate samples, demonstrated greater rigor in preparing textual data, avoided nonreproducible methods like topic modelling, used Natural Language Processing with the sophistication of corpus linguists—it could reach a tipping point at which the data used, methods employed, and questions asked became appropriate to statistical analysis. On the other, she precludes this possibility in identifying “reading literature well” as the “cut-off point” at which computational textual analysis ceases to have “utility” (p. 639). This limited conception of statistical inquiry also emerges in Da’s two claims about statistical tools for text mining: they are “ethically neutral”; and they must be used “in accordance with their true function” (p. 620), which Da defines as reducing information to enable quick decision making. Yet as with any intellectual inquiry, surely any measurements—let alone measurements with this particular aim—are interactions with the world that have ethical dimensions.

Statistical tests of statistical arguments are vital. And I agree with Da’s contention that applications of machine learning to identify word patterns in literature often simplify complex historical and critical issues. As Da argues, these simplifications include conceiving of models as “intentional interpretations” (p. 621) and of word patterns as signifying literary causation and influence. But there’s a large gap between identifying these problems and insisting that statistical tools have a “true function” that is inimical to literary studies. Our discipline has always drawn methods from other fields (history, philosophy, psychology, sociology, and others). Perhaps it’s literary studies’ supposed lack of functional utility (something Da claims to defend) that has enabled these adaptations to be so productive; perhaps such adaptations have been productive because the meaning of literature is not singular but forged constitutively with a society where the prominence of particular paradigms (historical, philosophical, psychological, sociological, now statistical) at particular moments shapes what and how we know. In any case, disciplinary purity is no protection against poor methodology, and cross-disciplinarity can increase methodological awareness.

Da’s rigid notion of a “true function” for statistics prevents her asking more “argumentatively meaningful” (p. 639) questions about possible encounters between literary studies and statistical methods. These might include: If not intentional or interpretive, what is the epistemological—and ontological and ethical—status of patterns discerned by machine learning? Are there ways of connecting word counts with other, literary and nonliterary, elements that might enhance the “explanatory power” (p. 604) and/or critical potential of such models and, if not, why not? As is occurring in fields such as philosophy, sociology, and science and technology studies, can literary studies apply theoretical perspectives (such as feminist empiricism or new materialism) to reimagine literary data and statistical inquiry? Without such methodological and epistemological reflection, Da’s statistical debunking of statistical models falls into the same trap she ascribes to those arguments: of confusing “what happens mechanistically with insight” (p. 639). We very much need critiques of mechanistic—positivist, reductive, and ahistorical—approaches to literary data, statistics, and machine learning. Unfortunately, Da’s critique demonstrates the problems it decries.

 

KATHERINE BODE is associate professor of literary and textual studies at the Australian National University. Her latest book, A World of Fiction: Digital Collections and the Future of Literary History (2018), offers a new approach to literary research with mass-digitized collections, based on the theory and technology of the scholarly edition. Applying this model, Bode investigates a transnational collection of around 10,000 novels and novellas, discovered in digitized nineteenth-century Australian newspapers, to offer new insights into phenomena ranging from literary anonymity and fiction syndication to the emergence and intersections of national literary traditions.

Computational Literary Studies: Participant Forum Responses

 

Criticism, Augmented

Mark Algee-Hewitt

A series of binaries permeates Nan Z. Da’s article “The Computational Case against Computational Literary Studies”: computation OR reading; numbers OR words; statistics OR critical thinking. Working from these false oppositions, the article conjures a conflict between computation and criticism. The field of cultural analytics, however, rests on the discovery of compatibilities between these binaries: the ability of computation to work hand in hand with literary criticism and the use of critical interpretation by its practitioners to make sense of their statistics.

The oppositions she posits lead Da to focus exclusively on the null hypothesis testing of confirmatory data analysis (CDA): graphs are selected, hypotheses are proposed, and errors in significance are sought.[1]

But, for mathematician John Tukey, the founder of exploratory data analysis (EDA), allowing the data to speak for itself, visualizing it without an underlying hypothesis, allows researchers to avoid the pitfalls of confirmation bias.[2] This is what psychologist William McGuire (1989) calls “the hypothesis testing myth”: if a researcher begins by believing a hypothesis (for example, that literature is too complex for computational analysis), then, with a simple manipulation of statistics, she or he can prove herself or himself correct (by cherry-picking examples that support her argument).[3] Practitioners bound by the orthodoxy of their fields often miss the new patterns revealed when statistics are integrated into new areas of research.

In literary studies, the visualizations produced by EDA do not replace the act of reading but instead redirect it to new ends.[4] Each site of statistical significance reveals a new locus of reading: the act of quantification is no more a reduction than any interpretation.[5] Statistical rigor remains crucial, but equally essential are the ways in which these data objects are embedded within a theoretical apparatus that draws on literary interpretation.[6] And yet, in her article, Da plucks single statistics from thirteen articles with an average length of about 10,250 words each.[7] It is only by ignoring these 10,000 words, by refusing to read the context of the graph, the arguments, justifications, and dissensions, that she can marshal her arguments.

In Da’s adherence to CDA, her critiques require a hypothesis: when one does not exist outside of the absent context, she is forced to invent one. Even a cursory reading of “The Werther Topologies” reveals that we are not interested in questions of the “influence of Werther on other texts”: rather we are interested in exploring the effect on the corpus when it is reorganized around the language of Werther.[8] The topology creates new adjacencies, prompting new readings: it does not prove or disprove, it is not right or wrong. To suggest otherwise is to make a category error.

Cultural analytics is not a virtual humanities that replaces the interpretive skills developed by scholars over centuries with mathematical rigor. It is an augmented humanities that, at its best, presents new kinds of evidence, often invisible to even the closest reader, alongside carefully considered theoretical arguments, both working in tandem to produce new critical work.

 

MARK ALGEE-HEWITT is an assistant professor of English and Digital Humanities at Stanford University, where he directs the Stanford Literary Lab. His current work combines computational methods with literary criticism to explore large-scale changes in aesthetic concepts during the eighteenth and nineteenth centuries. The projects that he leads at the Literary Lab include a study of racialized language in nineteenth-century American literature and a computational analysis of differences in disciplinary style. Mark’s work has appeared in New Literary History and Digital Scholarship in the Humanities, as well as in edited volumes on the Enlightenment and the Digital Humanities.

[1] Many of the articles cited by Da combine both CDA and EDA, a movement of the field noted by Ted Underwood in Distant Horizons (p. xii).

[2] Tukey, John. Exploratory Data Analysis. New York: Pearson, 1977.

[3] McGuire, William J. “A perspectivist approach to the strategic planning of programmatic scientific research.” In Psychology of Science: Contributions to Metascience, ed. B. Gholson et al. Cambridge: Cambridge UP, 1989. 214-245. See also Frederick Hartwig and Brian Dearing on the need to not rely exclusively on CDA (Exploratory Data Analysis, Newbury Park: Sage Publications, 1979) and John Behrens on the “hypothesis testing myth” (“Principles and Procedures of Exploratory Data Analysis.” Psychological Methods, 2(2): 1997, 131-160).

[4] Da, Nan Z. “The Computational Case against Computational Literary Studies.” Critical Inquiry 45(3): 2019. 601-639.

[5] See, for example, Gemma, Marissa, et al. “Operationalizing the Colloquial Style: Repetition in 19th-Century American Fiction.” Digital Scholarship in the Humanities, 32(2): 2017. 312-335; or Laura B. McGrath et al. “Measuring Modernist Novelty.” The Journal of Cultural Analytics (2018).

[6] See, for example, our argument about the “modularity of criticism” in Algee-Hewitt, Mark, Fredner, Erik, and Walser, Hannah. “The Novel As Data.” Cambridge Companion to the Novel, ed. Eric Bulson. Cambridge: Cambridge UP, 2018. 189-215.

[7] Absent the two books, which have a different relationship to length, Da extracts visualizations or numbers from 13 articles totaling 133,685 words (including notes and captions).

[8] Da (2019), 634; Piper and Algee-Hewitt, “The Werther Effect I,” in Distant Readings: Topologies of German Culture in the Long Nineteenth Century, ed. Matt Erlin and Lynn Tatlock (Rochester: Camden House, 2014), 156-157.
