I have a new piece out in the journal Poetics with Olivier Toubia, a professor of Business at Columbia University. We use word embeddings to model the linearity of narratives in a collection of ca. 2,300 books. We find that fiction exhibits a much stronger preference for non-linearity than non-fiction, but that this preference does not correlate with reader enjoyment or popularity. When authors tell made-up stories they rely on far more non-linear structures, but the level at which they do so does not predict how much a reader may enjoy the story.
We do find that readers prefer stories that cover more semantic “distance,” but do so in more economical terms (making smaller leaps between segments over narrative time). This suggests that strong storytelling depends on traversing more narrative space but doing so in more parsimonious, immersive ways.
There are many ways that we may not tell a story in a strictly linear way. We may jump from one person’s perspective to another (focalization); we may jump from one place to another (setting); or we may jump from one time frame to another (anachrony), moving either forward (prolepsis) or backward (analepsis). In each of these cases, instead of moving to the next logical step in a sequence, our attention is focused on a different, non-linear direction: sideways, backwards, or elsewhere. Narrative theory refers to this discrepancy – the discrepancy between the ordering of the events in the storyworld and the way they are recounted – as the difference between “story” and “discourse” (Bal & Van Boheemen, 2009; Brewer & Lichtenstein, 1981, 1982; Tomashevsky, 1965). Human beings have a remarkable capacity not only to narrate events, to explain through language “what happened,” but also to manipulate the sequencing or connectedness of those events in their telling.
The widespread presence of non-linearity in human storytelling thus raises the question of its cognitive and social value. Why would a narrator depart from the most linear, sequential and efficient means possible to convey information to a recipient? And under what conditions are storytellers more likely to depart from this sequentialist assumption?
This was a fun piece to co-write. Toubia developed a neat method that uses word embeddings to represent the semantic relationships between a book’s parts and then analyzes those relationships through the framework of the traveling salesman problem: given a series of points in space, what is the shortest path between them? Non-linearity for us is then the difference between this shortest path through narrative space and the actual path the book takes. A very ingenious method!
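To make the idea concrete, here is a minimal toy sketch in Python. It is not the code from the paper: the 2-D points stand in for segment embeddings, the function names are hypothetical, the "shortest path" is taken to be an open path that may start at any segment, and it is found by brute force, which only works for a handful of segments.

```python
from itertools import permutations
from math import dist


def path_length(points, order):
    """Total Euclidean length of the path that visits points in the given order."""
    order = list(order)
    return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def shortest_path_length(points):
    """Brute-force shortest open path through all points (toy sizes only)."""
    return min(path_length(points, p) for p in permutations(range(len(points))))


def nonlinearity(points):
    """Assumed measure: actual path length minus shortest possible path length."""
    actual = path_length(points, range(len(points)))
    return actual - shortest_path_length(points)


# Hypothetical "books": each tuple is one segment's position in a toy 2-D space,
# listed in the order the segments appear in the text.
linear_book = [(0, 0), (1, 0), (2, 0), (3, 0)]  # moves steadily in one direction
jumpy_book = [(0, 0), (3, 0), (1, 0), (2, 0)]   # leaps ahead, then doubles back

print(nonlinearity(linear_book))
print(nonlinearity(jumpy_book))
```

In this sketch, a book whose segments already follow the shortest route scores zero, and the score grows the further the order of telling departs from that route. Real embeddings have hundreds of dimensions and books have many more segments, so an actual analysis would need a proper traveling salesman solver or heuristic rather than enumeration.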
One of the things we found when validating the method was that very high-scoring books exhibited a range of different kinds of non-linearity. Some, like Ali Smith’s How to be both or Gary Shteyngart’s Super Sad True Love Story, foreground two different characters’ perspectives. Others, like Taylor Stevens’s The Informationist, use multiple temporal frameworks and settings to create a sense of non-linearity.
On the other hand, books that were predicted to be highly linear were not necessarily highly sequential but rather more episodic. For example, Julie Otsuka’s The Buddha in the Attic is a novel that tells the story of a group of young women brought from Japan to San Francisco as “picture brides” in the early twentieth century. Rather than follow a single character, it offers a series of vignettes told in the plural first person that follow different women through different stages of the experience. As one reviewer wrote, “Though the publisher classifies the book as a novel, it is more like a beautifully rendered emakimono, hand-painted horizontal scrolls that depict a series of scenes, telling a story in frozen moments.”
For future work, we’re interested in disambiguating these different kinds of non-linearity. When are we seeing a reliance on temporal anachrony (the use of flashbacks), when are we seeing more character-driven non-linearity, and when does a novel rely on different settings and plot lines? These are all interesting computational challenges that can tell us more about the preferences of readers and authors when telling or consuming stories.