Afterword: Learning to Read AI Texts

N. Katherine Hayles

30 June 2023

Left uninterrogated in “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜” and “Against Theory” is what “authorial intent” means for LLMs, as well as the assertion that meaning must derive from real-world experience. One line of attack would note that LLMs certainly do have intentions, also known as their programs. Moreover, these originate in human brains, so intentions, including the intention to communicate, underlie their architectures and pervade their operations. “But that is not what we meant,” the “On the Dangers of Stochastic Parrots” and “Against Theory” crowd snaps; “We meant that the models themselves do not have intentions.” This assertion requires that we examine LLM programs to see whether or not they are set up to generate intentions. [1]

As Leif Weatherby and Brian Justie note, LLMs are inherently indexical. They break words into tokens and assign each token a vector location in an embedding space, a location constructed according to the token’s relations to other vectors as well as its position within the sentence. The connections between vectors are expressed as weights, or parameters, assigned by the program to the different neurons during training. Attention and self-attention mechanisms running in parallel build connections between tokens according to their syntactic and semantic correlations. Roughly, the number of parameters indicates how many connections there are between neurons: in the case of GPT-3, 175 billion; for GPT-4, OpenAI has not disclosed a count, though it is reported to be far larger. From these vectors, manipulated through matrix math, the programs construct complex multidimensional maps of vector correlations; the resulting probabilities are run through a software package such as Softmax that converts them back into words. In essence, then, the vectors act as indexicals in the Peircean sense: signs indicating correlation.
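
For readers who want the mechanics made concrete, here is a minimal sketch, in Python, of a single self-attention step of the kind just described. The dimensions, random matrices, and variable names are illustrative stand-ins of my own, not actual GPT code, and positional embeddings, multiple attention heads, and the feed-forward layers that follow are omitted:

    import numpy as np

    def softmax(x, axis=-1):
        # Rescale a vector of scores into a probability distribution
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    seq_len, dim = 4, 8                       # a toy "sentence" of four tokens
    tokens = rng.normal(size=(seq_len, dim))  # stand-ins for token embedding vectors

    # Learned projections (random stand-ins here) derive queries, keys, and
    # values from the same token vectors -- hence "self"-attention
    W_q, W_k, W_v = (rng.normal(size=(dim, dim)) for _ in range(3))
    Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v

    # Every token attends to every other; the softmaxed scores are the
    # connections between tokens, weights expressing correlation
    attn = softmax(Q @ K.T / np.sqrt(dim))
    contextualized = attn @ V  # each vector re-expressed in terms of its neighbors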

GPT-3 and -4 have a number of unanticipated capabilities that emerged spontaneously through their training (they were not explicitly programmed in). One is the ability to detect and replicate literary styles and genres. How can we explain these capacities? Essentially, styles employ rhetorics that carry multiple implications about relations between people; genres operate according to rules determining the kind of world in which the literary action takes place. In a detective novel, for instance, corpses cannot spontaneously crawl out of graves. The capacity to detect style results from the massive networks of correlations that LLMs use to draw inferences about the relations that rhetorics imply. Moreover, these inferences themselves form networks that lead to higher-order inferences, for example, in the leap from style to genre. If LLMs can learn about the kinds of worlds that genres imply, what can they learn about the (admittedly much more complex) world that humans inhabit?

What could an entity that constructs billions or trillions of inferences, networks of inferences, and networks of networks learn about human languages, cultures, and social relations from ingesting billions of human-authored texts, in the absence of any real-world experience? My answer is, quite a lot. There would of course be what I call a systemic fragility of reference, in which the lack of grounding in real-world experience leads to errors of interpretation and fact. LLMs are like the figure, beloved by philosophers, of a brain in a vat; they construct models not of the world but only of language. Nevertheless, embedded in the immense repertoire of human-authored texts on which LLMs are trained are any number of implications about the human world of meanings. These are understood in the contexts of what LLMs can and do apprehend, that is, in relation to their world-horizons, or, to use the term Jakob von Uexküll usefully designated for biological world-horizons, their umwelten.[2] Just as von Uexküll used the term umwelt to emphasize that all biological creatures have species-specific ways of perceiving the world, so LLMs also have distinctive ways of apprehending the world.

Apprehension may be a more appropriate term for what LLMs learn than comprehension (because what they learn is far from comprehensive, lacking sensory inputs or real-world experience), thought (too associated with human cognition), or sentience (whose etymological roots refer to sensations, which are precisely what LLMs lack). Moreover, the other meaning of apprehension is a sense of dread or anxiety, also appropriate for human reactions to LLMs, as the recent “open letters” from tech leaders emphasize.

Because of the very significant differences between human umwelten and the umwelten of LLMs, there will inevitably be a gap between the meanings that humans project onto the texts that LLMs generate and what the texts mean in the context of the LLMs’ own umwelten. How can we humans learn to read these messages sent from what is literally a different (language-only) world? Here is where the literary-critical methods of textual interpretation can play important roles. The distinction between what the author intended and what readers project onto a text is a typical problem for which literary-critical practices have devised many strategies to explore and understand. Close reading practices, for instance, pay attention to rhetorical structures, networks of metaphors and how they work to guide reception, and implicit assumptions undergirding a line of argument. From these, critics infer authorial intent, as well as how texts elicit specific responses from readers. Not coincidentally, these are also the patterns of correlation that the LLMs used to construct their responses in the first place.

The question, as I see it, is not whether these texts have meanings, but rather what kinds of meanings they can and do have.  In my view, this should be determined not through abstract arguments, but rather through actually reading the texts and using an informed background knowledge of LLM architectures to interpret and understand them.  The proof is in the pudding; they will certainly elicit meanings from readers, and they will act in the world of verbal performances that have real-world consequences and implications.  The worst thing we can do is dismiss them as meaningless, when they increasingly influence and determine how real-world systems work.  The better path is to use them to understand how algorithmic meanings are constructed and how they interact with other verbal performances to create the linguistic universe of significations, which is no longer only for or of humans. 


N. Katherine Hayles is the Distinguished Research Professor of English at the University of California, Los Angeles, and the James B. Duke Professor Emerita of Literature from Duke University.  Her research focuses on the relations of literature, science and technology in the twentieth and twenty-first centuries.  She is the author of twelve books and over one hundred peer-reviewed articles and is a member of the American Academy of Arts and Sciences.  She is currently at work on Bacteria to AI: Human Futures with our Nonhuman Symbionts. 


[1] For a more detailed analysis, see N. Katherine Hayles, “Inside the Mind of an AI: Materiality and the Crisis of Representation,” New Literary History 54 (Winter 2023): 635-66.

[2] See Jakob von Uexküll, “A Foray into the Worlds of Animals and Humans” with “A Theory of Meaning,” trans. Joseph D. O’Neil (Minneapolis: University of Minnesota Press, 2010).


16 responses to “Afterword: Learning to Read AI Texts”

  1. Pingback: Again Theory: A Forum on Language, Meaning, and Intent in the Time of Stochastic Parrots | In the Moment

  2. A thunder did my spirit feel
    Do we have human ears?
    I feel the same familiar reel
    unfold throughout the years.

    All motion do we count, all force.
    Can’t we both hear and see
    how meaning-making meter
    hides the forest for the trees?

  3. Pawel Kaczmarski

    It’s a very interesting read, but quite an odd one, and I’m not sure if I’m missing something here. For the most part, it seems like the author is going to attempt what the other AT-sceptics invited to the forum ultimately shied away from, that is, argue directly for the idea that LLMs are simply capable of intentions – which would make LLM-generated texts meaningful even on AT’s own terms. But then she stops just short of claiming this explicitly, and instead turns towards the very familiar idea that meaning is “elicited from readers” – which makes the issue of LLMs having or not having intentions entirely insignificant after all – and as such seems detached from the rest of her piece.

    Hayles says:

    “The question, as I see it, is not whether these texts have meanings, but rather what kinds of meanings they can and do have. In my view, this should be determined not through abstract arguments, but rather through actually reading the texts and using an informed background knowledge of LLM architectures to interpret and understand them.”

    This could very well be, but as soon as one assumes that these “texts” have meanings – as soon as we commit to “interpreting” them as “texts” “written” by LLMs – one is (according to AT) already committed to the idea that LLMs have intentions. In this sense, the debate is over before it’s even begun – nothing has been solved or answered however – rather, a certain (interpretive) assumption on Hayles’ part simply rendered the entire forum (including the majority of her own contribution to it) irrelevant, at least from her point of view.

    Of course, for most AT-sceptics it’s not as simple as that, but I think Hayles’ turnabout betrays a deeper issue with this (otherwise very interesting and thought-provoking) forum. *If* we commit to the meaningfulness of LLM-generated texts as an obvious, empirically proven fact, *and* we don’t necessarily believe that LLMs are capable of intentions – and this seems to me the default position of most of the authors invited to the forum – then, of course, we have to believe that meanings are independent of intentions and the AT position is wrong. It’s a conclusion that follows very naturally from putting together two very straightforward propositions. But there’s also no real argument here – just two assumptions followed through to their natural conclusion – in that way, the anti-AT position is perhaps more like a mission statement, or a statement of faith; i.e. it’s coherent, it has its internal logic, but it also seems disinterested (indeed, in a certain way it *has to* be disinterested) in any opposing view.

    Anyway, it’s been a very interesting read – ditto for all the other pieces written for the forum – kudos to everyone (and most of all CI itself)!

    • hayles20

      Thanks to Pawel for his astute challenge. Perhaps in my final paragraph I was too cryptic as I neared the mandated word limit. I meant to suggest that, as well as the intentions readers project onto a text, a text can have its own authorial intentions, which can be discerned through the literary-critical strategies of close reading and rhetorical analysis, developed through decades and centuries of literary practice and debate. AT proposes the wave poem, supposedly produced entirely by random processes. OK, but what if you are not just examining an artifact but engaging in a back-and-forth dialogue? What if you ask, “Why use this word?”, “What are the implications of this style?” etc. That random processes could provide convincing and insightful answers to these probes stretches belief a bridge too far. I have engaged in just such probes with ChatGPT, and have also closely read many texts produced by GPT-4 and ChatGPT. As a result, it is my view that LLMs do have intentions besides those inherent in their programs. For example, it is clear that they respond to the rhetoric and content of prompts, that is, the intentions of their human interlocutors. They develop their intentions, I suggest, through the billions of correlations and networks of inferences they construct through their pretraining. I also suggest that we should use not only arguments but the texts of the LLMs themselves in making such determinations. What better evidence for the presence or absence of intentions than the actual words that LLMs give to us? Here the genre of the short essay (or reply) works against a reasoned inquiry, for good literary analysis requires time and space to unfold. Granted, there are large gaps in the knowledge LLMs display, for they have no models of the world, only of language; but there is also much sense, even pique and exasperation (or the technical equivalents of such).

  4. Matthew Kirschenbaum

    Thank you Pawel for the strong challenge and critique, and thank you to Kate for the extended reply. I applaud Kate for her boldness in restating her belief that LLMs do in fact *have* intentions, certainly a departure from the Stochastic Parrots orthodoxy and yes, further than others have been willing to go.

    I myself am not yet sure that they do, but I also don’t find myself caring very much. That may seem an odd statement from the convener of this forum, but ultimately my position is (I would insist) a pragmatic one. I mean this in two ways.

    First, regardless of intentions, the processes that give rise to LLMs demonstrably produce texts interesting enough for us to argue about. As Walter and Steve note, the text of their wave poem is forgettable, but the closing appendage is a zinger. I can appreciate how “I hope you like it” can be read as either sneering snark or cloying earnestness without delving any deeper into the processes that produced it; and yes, I am working from the presumption of *reading* a *text* as opposed to . . . what? Am I really supposed to do otherwise? After all it’s published here as part of the forum, I have an obligation! (To say nothing of the fact that it was meaningful enough to Steve and Walter to inspire the title of their piece.)

    There is a second pragmatic argument, not as much fun. Whatever a forum such as this determines, academically speaking, we know that the internet—which is to say the capital interests that control the internet—will never fail to outperform our worst expectations. I tried to follow that logic to its conclusion in my “Textpocalypse” piece in the Atlantic a few months back . . . but regardless of the particulars of that scenario, it is clear LLMs *will* be deployed in such a way as to force their meanings upon us, whether we want them or not. Kate’s call to learn how to probe towards the umwelt is inspiring here, if daunting; I would also direct those interested to Rita Raley’s most recent publication, co-authored with Minh Hua in DHQ: “How to Do Things with Deep Learning Code,” which proposes a critical code studies framework for reading and critiquing LLMs (http://www.digitalhumanities.org/dhq/vol/17/2/000684/000684.html). Here too is an articulated, if no less daunting, path into the interiority of the machine.

  5. Like Pawel and Matt, I appreciate the boldness of simply attributing intentions to LLMs. But I wonder how much of the argument is grounded in a basic misunderstanding of how they actually work.

    Kate says they “construct maps” and “draw inferences,” which, if true, would indeed suggest that they have something like intentions. But last summer, when I looked into this to the best of my (limited) abilities, I concluded that they’re not really doing anything cognitive at all. “Neural language models aren’t long programs,” as Blaise Agüera y Arcas put it in the Economist; “you could scroll through the code in a few seconds. They consist mainly of instructions to add and multiply enormous tables of numbers together.”

    That is, LLMs have absolutely no understanding of what the words they use mean. In fact, I’ve been puzzling for a few days over this sentence in Kate’s piece: “the resulting probabilities are run through a software package such as Softmax that converts them back into words.” To the best of my knowledge, Softmax is not a software package; it’s just a mathematical function. It does not convert probabilities back into words (that’s just done by means of a dictionary of tokens). I’m not an expert on this, I should say. I’m just saying that Kate’s technical description doesn’t line up with what I’ve come to understand about how LLMs work.
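
    To make that concrete (with the caveat, again, that I’m no expert, and that the numbers and names below are my own toy inventions, not anything from a real model): softmax is a few lines of math, and the conversion to words is an ordinary table lookup. A minimal sketch in Python:

        import numpy as np

        def softmax(logits):
            # A function, not a package: rescales scores into probabilities
            exps = np.exp(logits - np.max(logits))  # subtract max for numerical stability
            return exps / exps.sum()

        vocab = ["the", " cat", " sat", " mat", "."]    # toy token dictionary
        logits = np.array([2.1, 0.3, 1.7, -0.5, 0.0])   # one score per vocab entry
        probs = softmax(logits)
        next_token = vocab[int(np.argmax(probs))]       # or sample from probs
        print(next_token)  # "the" -- the word comes from the lookup, not from softmax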

    As I understand it, LLMs are “stochastic parrots” precisely in the sense that they’re very, very, very sophisticated Magic 8 Balls or, to give them a little more credit, more versatile versions of Queneau’s Cent mille milliards de poèmes. Their output is as “meaningful” as asking the Magic 8 Ball “Should I quit my job?” and getting the reply “Most certainly,” or any random one of those hundred thousand billion poems.

    My question is: how do we settle this? Does it matter how, exactly, LLMs work? Do we need to identify the precise technical difference between LLMs and those obviously purely stochastic generators of language for Kate’s argument to work?

    • hayles20

      Thanks to Thomas for the correction about Softmax in my post. However, the lines that he quotes from Blaise Agüera y Arcas are taken out of context. Those “enormous tables of numbers” are indexical pointers that point to other pointers that sometimes point back, etc., creating billions of correlations from which the model draws inferences. Later in that same Economist piece, Blaise presents a dialogue he had with LaMDA, which he claims demonstrates that the model has developed a theory of mind. He suggests that because the model is optimized for dialogue and programmed to engage in dialogues with humans, it has figured out how to interpret human actions that require insight into what someone is thinking. Moreover, he ends by suggesting that LaMDA, and similar LLMs, have become “proto-sentient,” a claim even stronger than my assertion that it has intentions.
      Side comment: “proto-sentient” seems to me a misnomer. Usually an entity is said to be “sentient” when it is considered to be capable of sensation but not rational thought–a clam, for instance. LLMs are precisely the opposite: they are entirely rational in the sense that they are bound to the rules of math and logic but have no sensations at all. “Comprehend” seems inappropriate as well, since their knowledge of the world is anything but comprehensive. I propose calling them (for those who take the proto- position) “apprehensive,” as in capable of apprehending something. This term has the advantage of also acknowledging the misgivings that many have about LLMs as a threat to humanity.

      • Thanks for responding, Kate.

        “Those ‘enormous tables of numbers’ are indexical pointers that point to other pointers that sometimes point back, etc., creating billions of correlations from which the model draws inferences,” you say.

        The sticking point is whether LLMs merely “add and multiply” those numbers or “draw inferences” from them. I have seen no technical description that convinces me that they ever do the latter.

        Agüera y Arcas may think LaMDA has a theory of mind, but I have not seen a good reason to think so myself. Its output is easily explained by the training. It doesn’t need a theory of mind because it has that table of numbers. Humans, lacking such a table, need a theory of mind to pull off the same feats of language.

        To riff on an old Stanley Cavell paper, we must *mean* what we say. LLMs don’t.

  6. From my perspective, the question is “What are people doing with these texts?”

    Because the texts produced by language models are not wave poems, actually. They don’t come out of nowhere. People use these models to brainstorm, answer questions, stage imaginary debates, or be reminded of a word that’s on the tip of their tongue.

    In other words, the texts generated by these models are already shaped by human intentions and embedded in a social system. So it’s never really necessary to ask the abstract question of whether a generated text, in isolation, would possess meaning—or whether, if so, we would have to attribute intention to the model. The thought experiment is counterfactual.

    In that sense, I think Pawel is right that for some of us the debate about Against Theory “is over before it’s even begun.” I didn’t directly address it in my piece, because I don’t think a thought experiment about wave poems is revelatory about LLMs. Nor do I think it matters much whether we say these texts “have meaning.” What matters is – as Kate and Matt both say in different ways – what we actually do with these systems and texts.

  7. Steven Knapp and Walter Benn Michaels

    Pawel’s intervention, Kate’s response, and Matthew’s comment all seem to us extremely clarifying. In responding to Pawel, Kate makes clear that (whether or not she thinks of herself this way) she is completely committed to AT’s identification of the meaning of a text with what the author intends it to mean. The difference between her and us (and her and almost everyone else) is that she thinks that “LLMs do have intentions.” If she’s right about that, then, from our standpoint, of course they can produce meaningful texts. In exactly the same way that, in our original article, a genius loci could write the wave poem. But Against Theory wasn’t called “Against genii loci.”

    What is controversial is brought out in Matt’s response, when he says he doesn’t care whether the LLMs “’have’ intentions” and when he characterizes our conflicting responses to “I hope you like it!” as “readings” of the text. Our point is that whether or not you think you care about the LLM’s intentions, the minute you find yourself arguing with your collaborator over whether that sentence is needy or snarky you are treating the LLM as if it had them. So then you’re Kate. Because if you truly didn’t care about the LLM’s intentions in the sense that you wanted to treat it as if it had none, there’s nothing to argue about. There can be no fact of the matter about whether it’s using those words sincerely or ironically because, unless Kate’s right, it’s not using the words at all – it’s just producing the results of its algorithm.

    It’s easy to see this if we imagine ourselves doing what Kate recommends, some of the close reading that is a literary critic’s stock-in-trade. What about that exclamation point? Steve says it’s an expression of a sincere but sort of pathetic desire to please, a kind of naïve enthusiasm. Walter says it’s straight up “screw you,” demonstrating complete indifference to what we like. Neither of these readings can even be formulated except as an account of what someone was doing. In fact, the whole idea of close reading makes absolutely no sense except as an effort to figure out what someone was doing. Were we truly to stop caring about what the author was doing (truly treat the text as if it had no author), we might still report to each other how those marks on the screen made us feel, but we wouldn’t any longer be arguing about how they were supposed to make us feel. And if we’re not trying to understand how they’re supposed to make us feel, we’re not producing readings but just reports of our feelings.

    The basic idea of AT was that reading a text is nothing but understanding what someone is doing and that understanding what someone is doing inescapably involves understanding what they are intending to do. Now there are, of course, different ways of understanding acts and intentions, and we ourselves (for reasons provided in the various essays on art and action by WBM on nonsite.org) are very skeptical of the idea, repeatedly deployed here, that you infer an intention from what you read – if you’re reading, you’re already treating the marks as intended. But the crucial point for the purposes of this discussion is just that the object of interpretation is what the author intends. We think that most of the contributors to the discussion (few as forthrightly as Kate) actually end up acting as if they agree – hence the proliferation of fictitious and/or as if intentions. But if they don’t agree, we’re curious about what they think the object of interpretation is instead.

  8. Matthew Kirschenbaum

    My answer to Steve and Walter is that the object of interpretation is an emergent (stochastic, if we must) phenomenon born of the complexity of the underlying model. That is why the snarky/needy split is interesting, not because of any belief in a homunculus, which is to say any single intentional agent; it is unresolvable, but no less compelling, at least in my view.

    In this my response might also seem to simply comport with Kate’s, though I am less sanguine about the possibility of recovering a refined sense of any umwelten. (I expand on my own view of the prospects for cracking the black box here: https://muse.jhu.edu/article/794477. Rita’s several recent papers have challenged my skepticism on this, though I also wish to see how the kind of methodology she and her co-author deploy will continue to scale.)

    That said, I found this much the most clarifying moment in Walter and Steve’s first response to the forum:

    “Intention in ‘Against Theory’ is a way of describing an act rather than an allusion to a mental state that is prior to or behind or outside of the act. That’s why its fundamental opposition is between the act of writing and the event of waves washing up on the sand – not between what’s going on inside someone’s head and marks appearing on the beach.”

    I’m hard-pressed to find anything so direct (and frankly, helpful) in the original round of AT writings, though I accept that the idea was latent in them at the time. But I also think Walter and Steve derived then and continue to derive a certain advantage out of leaving their readers to shadow box with precisely a “mentalist” model of intention. LLMs dramatize the stakes of that shadow boxing since, whatever our sense of their intentions or umwelten, I think we all acknowledge that the “mind” one encounters is a profoundly alien one. To argue for intention means starting from an almost impossible position if those intentions are “mentalist,” but much less so if otherwise, if in fact they are something more like an act, a bringing into being—

    Indeed, new media theory has done a great deal of spadework in preparing us for this moment. Critical code studies (as mentioned above) and allied endeavors like software studies, for instance; the conceptual foundations of proceduralism are especially well defined in the field (Ian Bogost and Noah Wardrip-Fruin, to name two essential contributors).

    On another point: that the ChatGPT version of the wave poem is a text must be non-controversial in at least one very specific sense: its textual status wherein text is a computational (and computable) data type, i.e. ASCII. Nor is that mere pedantry: online, *this* text will behave like any other, scraped and assimilated into emerging future corpora. The machine readers (or machines reading, which Tyler Shoemaker wrote of in his response) will not discriminate, indeed will not be able to help themselves.

  9. Steven Knapp and Walter Benn Michaels

    This exchange has been really useful to us and, since it seems to be coming to an end, we wanted just to underline how stark the difference is between our position and the one Matt ends by taking.

    According to Matt, the difference between regarding “I hope you like it!” as needy or as snarky is “interesting” and “compelling,” regardless of our supposing that any intentional agent is being needy or snarky. But why? What makes that difference compelling if not precisely the difficulty of deciding between two different intentions? And if there can be no possible point in deciding between needy and snarky (which there can’t be if there’s no intentional agent declaring itself one or the other), what exactly are we interested in?

    Would the example be more or less interesting and compelling if we multiplied the possibilities? Not just possibly snarky or needy but possibly friendly, earnest, indifferent? Or if we started adding on the infinite possibilities made available by regarding the words as belonging to other languages—for instance, Schmenglish, where they are properly translated as “You bet we eat them!”

    Our point here is just that whatever we’re finding ourselves interested in or compelled by, it’s not the question of what “I hope you like it!” means; it’s the question of what we like imagining it to mean. We have no argument against using the word “interpretation” to name the practice of imagining various possible meanings of some set of marks and grading their comparative degrees of compellingness. (“You bet we eat them!” totally wins!) That’s not, however, the practice that the theorists we first criticized some 40 years ago thought they were finding ways to explain and to govern. And it’s also not a practice that needs explaining, or would be possible to govern, now.

  10. Matthew Kirschenbaum

    I for one am happy to see this exchange continuing though I think I must now exercise some restraint and make this my last comment, for a while (in hopes others join instead). But someone should point out that there are (of course) other ways this language game could be played:

    I could declare, for example, that I do not, in fact, *believe* Walter and Steve when they say that ChatGPT wrote “I hope you like it.” (Isn’t it just a little *too* perfect . . . ) That would be very rude of me, and they would be right to protest, but there’s no way to prove the point one way or the other since, as we know, the outputs of LLMs in any given instance are unreproducible.

    And while such suspicion would be a violation of collegiality here on the virtual pages of CI, on the wider Internet it and much worse would be commonplace: “Fake news!” they cried. Even more to the point would be a scenario where there is *nothing* in context to suggest anything in particular about the authorship of that text at all, one way or another; that is indeed the textual condition we are rapidly coming to inhabit, at least in my textpocalyptic view. Pace Lisa Siraganian then, I *do* think the universality of such scenarios—if not now, then soon—makes a material difference to the underlying question of an intentional agent (I also take this to be Gil’s point), and that we are better served by Bajohr’s position of assuming intent as a practical, if not necessarily an ontological, matter.

    Well. Clearly there’s *something* in all this that’s compelling, the dog days notwithstanding— language continues to beget language. “You bet we eat them!” should go on somebody’s coffee mug.

  11. Pawel Kaczmarski

    Thank you all for your replies – I am beyond happy to see my modest intervention spark such a brilliant exchange. Seeing as the thread seems to be nearing its natural end, allow me to raise some points of my own, in lieu of any definite conclusions.

    First, I still think we are perhaps too eager in assuming that LLM-produced sequences are indeed “texts”, at least in anything approaching the usual sense. To be clear, I am not saying they are definitely not – rather, I simply do not think it is an empirical given, an obvious fact that could serve (for instance) as a starting point for developing a theoretical position. Whether these sequences are texts or not seems to me to rely, logically, on whether we consider LLMs to be capable of possessing (and expressing) intentions. Crucially, I imagine even “soft intentionalists” should agree with this: in order to posit that LLMs are incapable of intention but capable of writing texts, one would have to assume that writing is entirely independent of intent, which might be too extreme a position even for those who don’t necessarily agree with AT.

    (As a side note, I think in some cases the relative vagueness of the notion of “text” serves to obscure rather than clarify the main points of disagreement, blurring the line between a series of [purely physical] shapes and a series of [meaningful] signs. In this context, I think there is something to be said for the notion of “work” – how would it impact our discussion, for instance, if we asked whether LLMs are capable of generating works of literature, rather than “texts”?)

    Second, I, too, greatly appreciate Katherine’s clarification, and her courage to take what is perhaps the most thought-provoking of all the available positions, namely that LLMs are simply capable of possessing and expressing intentions of their own. I think Walter and Steven are absolutely right in pointing out that this assumption makes Katherine’s position essentially compatible with AT. This means that, for instance, my own disagreement with her – I don’t think LLMs have intentions, for much the same reasons as Thomas – is of interpretive rather than theoretical nature; it is a difference between two interpretations (of LLM-generated sequences as meaningful versus meaningless), rather than between two theories of meaning.

    Having said that, I think Katherine’s assumption has serious political implications: shouldn’t we, for instance, consider the LLMs a special kind of living subject, capable of agency and entitled to certain rights? Of course, this and similar questions lie beyond the scope of the forum itself, but if we are indeed to assume that LLMs possess intentions (and are capable of expressing them), then the issue of their legal and social status seems to me of particular political urgency. (Again, personally I don’t perceive LLMs as capable of intentions, but I would imagine it is a pressing matter for those who do.)

    Third, I think Matthew is right in pointing out that AT-advocates and AT-sceptics alike can probably agree that the proliferation of LLMs may lead to various significant “material” changes in the world. From a social, cultural, or historical point of view it is indeed hugely interesting to observe, for instance, how often we seem to treat LLM-generated “texts” as texts while also admitting a deep uncertainty as to their authorship status. However, of course, such phenomena neither necessarily reflect a change in the ontology of art/language/text, nor are bound to have any impact on it.

    In fact, the scenario described by Matthew – “a scenario where there is *nothing* in context to suggest anything in particular about the authorship of that text at all, one way or another” – doesn’t seem to me to undermine the argument of the original AT at all. Rather, it seems perfectly compatible with it, to the point where Matthew’s description could serve as the very definition of what it means to have an interpretation according to AT (a sort of paradigmatic scenario, if you will): there is nothing outside the work, nothing “in context” that could guarantee the validity/accuracy of any interpretation; no interpretation may be done in advance, as any word in any language can ultimately mean anything. Indeed, this is the “against theory” part of Against Theory: the ultimate uncertainty of every interpretation (in the sense that there is never an external authority that could guarantee our interpretations are accurate) stems directly from the notion of meaning as authorial intent, and is the very reason why any attempt to “govern” interpretation from the outside must fail.

    As for Matthew’s suggestion that “we are better served by Bajohr’s position of assuming intent as a practical, if not necessarily an ontological, matter”, I must say that – at least for now – I fail to see how our pretending to believe something that we really don’t (that LLMs have intentions, or that LLM-generated sequences are meaningful) could prove productive, except maybe for those of us applying for AI-related research grants, or people who work in actual AI development (although I hear they are similarly divided on the matter). However, it’s certainly useful to be able to understand why others see meaning where I don’t – and this is just one of the reasons why I personally found the entire forum very productive.

    Again, thank you everyone for this brilliant exchange – and a special thanks to Matthew for organising the forum!

  12. Matthew Kirschenbaum

    Thanks for the thanks, Pawel, but especially thanks for such an in-depth comment.
