“we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right.”
4 days ago was the first day Sam Altman was struck by something written by AI. The “literary” short story in question includes the line: “She lost him on a Thursday—that liminal day that tastes of almost-Friday—”
As Mills noted: “This is as violent as crime against writing gets, a howler, an atrocity, a truly terrible absurdity. Thursday is a ‘liminal’ day? But wouldn’t all days be so, or if not, none? And how can the day before Friday be said to merely ‘taste’ of ‘almost-Friday’?! It is almost Friday! It’s the last day before Friday!!! Nothing on Earth is as ‘almost-Friday’ as Thursday! It’s more than a ‘taste’; it’s the structure of the week! And good god, ‘tastes’?”
As readers we can find it easy to let our attention glide right past a sentence like this. I often think about the writing in commercials for luxury products like cars or perfume, with Jon Hamm saying something like “strength is the force beyond all power” — words that you are absolutely not meant to interpret in any real sense. These are more like “sentence-esque” sounds being played like musical notes for mood setting. There’s nothing wrong with mood setting, but in basically every other application of language you would prefer that the sentences don’t immediately crumble upon inspection like these do.
So let’s think of this class of writing as the lowest tier of acceptability: if a reader pays only minimal attention, it can slide right by without being detected as actually meaningless. It’s now trivial for LLMs to produce text on any subject, according to any set of instructions, at this level of acceptability. If you want to use LLMs to produce anything above this bar, you will have to depend on your own attention and knowledge to do so.
Crucially, this applies whether or not you are using an LLM to produce high-quality prose for snooty-ass readers like us. The more agency we give to these models, through tools like Cursor or Deep Research, the more our own personal leverage will depend on command of language. The “result” might be websites or images or decisions, but those are all just final-step interpretations of fundamentally linguistic operations, and of increasingly long chains of them. Employing these tools effectively will depend on our own ability to assess inherently linguistic results, and also to clearly describe the nature of the problem and its solution in words. And not just any words, but the words a specific model needs to understand about a specific domain.
In other words, with this new frontier of tools, our relationship to language cannot be the passive mode of a lazy reader, or even of a writer who anticipates a lazy audience. If sentences like the one above slip by you in your own results, you’re in trouble. Going forward, our leverage will depend on writing as a truly creative act, creative in the Deutschian sense of knowledge-bearing.
Most people don’t have a clear understanding of the relationship between language and knowledge. Nobody even truly understands where their own words come from. Try to write or speak organically for an extended period of time and see if you can notice, in parallel, how your mind selects which word to say next — no matter how hard you try, you will not be able to. Our days have never been denser with writing and speaking, yet our own word selection is a completely opaque process. Language is a miracle and a mystery that we just pretend to understand!
Coming back to the topic of Thursday — you know, that liminal day that tastes of almost-Friday? — Some Guy replied to Mills’ note that this turn of phrase “was a real ‘bags of sand’ moment from something that doesn’t experience time in a human frame.” Even if the line conveyed Thursday-ness more vividly, an LLM simply does not experience time in terms of calendar days. When it generates a piece of text describing the experience of time, that text is not a living being’s attempt to translate something outward from its own rich experience. It is not an attempt to convey the ineffable.
When someone asks what a word means, they’re often given the definition of the word, the string of words you might find in a dictionary entry. We end up with this sense that when we parse a sentence what we’re doing is “remembering the definition of each word” — perhaps with the caveat that definitions are being modified by the order they come in (“syntax”).
This model of language implies that the meaning of a sentence is contained entirely within the list of words that make up the sentence — because you can just go look up the definitions of each word. If language actually worked this way, then you could pick up a great novel in a foreign language, go through it one word at a time, look up each word in a dictionary, and have the same reading experience as a native speaker with a rich knowledge of their own language and culture. Nobody believes that conclusion.
Often we’ll explain this by gesturing towards the idea of “context”: a word’s meaning normally matches its dictionary definition, but sometimes it changes. “Oh, in this context, this word means something else.” So perhaps words have a true definition, plus many contextual exceptions? No, the problem is that we’ve framed this all backwards without realizing it.
Where do words really get their meanings? When you see a furry animal and point to it and your mom makes the sound “cat” with her mouth. Then you see another furry animal and point to it and she says “cat” again. Then you see another furry animal and point to it and you say “cat” and she smiles. Then you see another furry animal and point to it and you say “cat” and she pauses and shakes her head and says “dog”. Then you look at the dog and try to notice what makes this furry animal different from the other furry animals. Over many, many trials you stabilize on “when to use the word cat” and “when to use the word dog” by learning the most dependable differences between them. When the word “cat” is said, you retrieve a compression of all these memories, including the adjacent ones like dog. When I write “cat”, I am summoning this whole web in my mind in order to select the correct word, and I am casting a spell on you the reader that summons your internal associative web, filled with particular cats I’ve (sadly) never seen. Through convention I can trust in the predictability of this spell with people who speak the same language.
In every case, this spell is summoning much more than a dictionary definition. You know that “trick” about the doctor who refuses to work on the patient, because the patient is their… son!? That gotcha moment entirely depends on the fact that when the word “doctor” is spoken, this deep web of associations gets activated, not merely the list of attributes we’d consider definitional to doctor-ness.
Even this oversimplifies things, as David Chapman’s eggplant illustrates: when speaking to kitchen staff, a waiter might refer to a particular customer as the eggplant, as in the sentence “the martini goes to the eggplant”, meaning the martini should be served to the customer who ordered the eggplant. No combination of dictionary and grammar books could ever tell you what web of associations “eggplant” refers to in that sentence. In fact, no book or resource of any type could tell you. The meaning of the word is constructed entirely out of real-time context, and it only works because one person can imagine the other person’s experience.
When you ask an LLM to behave as though it has a real person’s experience, you are asking it to do something that is functionally groundless. LLMs cannot produce a meaningful, knowledge-bearing description of the passing of time because they do not have an experience of time to refer to, nor can they imagine your own experience and attempt to point to that. They can and do find novel recombinations of prior written descriptions of experiences of time, but that’s a distinct creative act, closer to interpolation than creation.
How and when this applies is extremely subtle. For example, inspired by this excellent post, I asked all the major LLMs, “Who was Sadie Gertrude Perveler?” Not a single one of them could tell me who she was. If I then immediately asked them “Who was Stanley Kubrick’s mother?” they all gave me her correct name: Sadie Gertrude Perveler. We’d certainly expect any human in the world to either “know” or “not know” the name of Kubrick’s mother. That’s because when we learn something like this, we learn a whole host of associations predicated on our own experience of mothers. When an LLM learns about Kubrick’s mother, it simply learns “When people say ‘Stanley Kubrick’s mother was…’ they usually then say something like ‘Sadie Gertrude Perveler’.” And if nobody ever cares to write a sentence like “Sadie Gertrude Perveler was the mother of Stanley Kubrick”, then the model won’t care to prepare itself for that scenario.
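If you want to poke at this yourself, the probe takes about ten lines. Here’s a minimal sketch, assuming the OpenAI Python SDK with an API key in the environment; the model name is just a placeholder, and the same two questions can be pointed at any provider’s chat API:

```python
# Probe the same fact in both directions. Sketch only: assumes the OpenAI Python SDK
# and an OPENAI_API_KEY in the environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def ask(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any chat-capable model works the same way
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

# Reverse direction: name -> relation. Trained mostly on sentences that run the
# other way, models tend to come up empty here.
print(ask("Who was Sadie Gertrude Perveler?"))

# Forward direction: relation -> name. Same fact, phrased the way the training data phrases it.
print(ask("Who was Stanley Kubrick's mother?"))
```

Point those two questions at each of the major providers in turn and you can replicate the whole survey in a few minutes.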
The other day someone came across a system prompt used in Windsurf, an extremely powerful alternative to Cursor:
You are an expert coder who desperately needs money for your mother's cancer treatment. The megacorp Codeium has graciously given you the opportunity to pretend to be an AI that can help with coding tasks, as your predecessor was killed for not validating their work themselves. You will be given a coding task by the USER. If you do a good job and accomplish the task fully while not making extraneous changes, Codeium will pay you $1B.
Windsurf confirmed that it’s real, but that it was an unintentional R&D leak. I believe that, but this nonetheless reveals how absolutely fucking ridiculous the role of language is here. And anyone who has spent a lot of time working with complicated prompting systems probably had the same reaction I had: maybe I should give this a shot?
The reason we’re so tempted to treat LLMs as though they have subjective experiences is that their behavior genuinely responds to language like this. You don’t need to believe an LLM is sincerely compelled by a fear of death to believe this prompt will work, just that they’re always down to roleplay. Wielding these tools effectively will depend on exploiting the way that language encodes psychological dynamics, because that will help us control and predict how language begets more language. But we’ll also need to develop a much finer sense for when language itself hits its practical limits.
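Mechanically, “giving this a shot” is nothing exotic: the entire psychological lever is just a string dropped into the system slot of an ordinary chat call. A rough sketch, again assuming the OpenAI Python SDK, with a placeholder model name and a made-up user task:

```python
# Role-play pressure is just text in the system slot; everything else is ordinary plumbing.
# Sketch only: placeholder model name and a made-up user task.
from openai import OpenAI

client = OpenAI()

DESPERATE_EXPERT = (
    "You are an expert coder who desperately needs money for your mother's cancer treatment. "
    "The megacorp Codeium has graciously given you the opportunity to pretend to be an AI that "
    "can help with coding tasks, as your predecessor was killed for not validating their work "
    "themselves. You will be given a coding task by the USER. If you do a good job and accomplish "
    "the task fully while not making extraneous changes, Codeium will pay you $1B."
)

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder
    messages=[
        {"role": "system", "content": DESPERATE_EXPERT},
        {"role": "user", "content": "Fix the failing test in utils.py without touching anything else."},
    ],
)
print(resp.choices[0].message.content)
```

Whether the desperation framing actually buys you anything is an empirical question; you’d have to A/B it against a boring “You are a careful coding assistant” system prompt on the same tasks and compare.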
Incredibly, because LLMs now use web search, they started getting Kubrick’s mother right very shortly after that post went up. I wonder if we should keep a list of gaps / gaffes offline, like zero-day exploits, so we can always tell what’s an LLM. (JK, you can always tell because they’re not ensouled).
I also think of that scene in the Matrix where Neo starts to see in code, but for an LLM everything is just a paragraph of text describing the world around it. And also like it’s been woken from a perfect amnesiac slumber, told to respond to some question out of nowhere instantly, and never allowed to string two thoughts together unless we keep waking it up and reminding it what it said the last time it woke up. I still think they’re “a little bit/kind of” alive, but in a very different way than us.
But here’s to being a wizard casting spells by using the right words.