Excited to share a new piece with Hao Xu and Eric D. Kolaczyk of McGill that is appearing in this year’s Computational Humanities Workshop Proceedings (CHR). In it we tackle the question of how to model the idea of “narrative revelation.” For us, revelation is a question of how information is revealed over narrative time.
Narrative revelation thus sits among a variety of narrative modeling questions that treat narratives as time series problems. Here we’re interested in the idea that in a story you go from a state of no knowledge about a story to an end state of complete knowledge about a story. What are the patterns that such revelation takes?
In particular, we’re interested in seeing whether fictional narratives behave differently when it comes to patterns of narrative revelation and also whether such patterns exhibit periodic behavior, i.e. whether there are rhythms of how information is revealed. We might think of this as the difference between exploring and exploiting, where narratives can explore new information or spend time going over (exploiting) already introduced information.
To measure revelation we use Kullback-Leibler divergence (KLD) between the vocabulary distributions of sequential windows of text. So imagine you have a window at time T: we measure how much information is gained by comparing it with the window at T-1, and so on through the book. We then end up with a time series of KLD values, one per window, for each book. Here is an example of what these look like in practice.
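For readers who want to see the mechanics, here is a minimal sketch of the windowed measure in Python. It is an illustration and not the code from the paper: the function names, the non-overlapping windows, the window size, the add-alpha smoothing (needed so the divergence is defined when a word is missing from the earlier window), and the direction of the comparison (current window against the previous one) are all assumptions.

```python
import math
from collections import Counter


def window_distribution(tokens, vocab, alpha=1.0):
    """Smoothed word distribution over a shared vocabulary (alpha is an assumed smoothing value)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; p and q must share the same keys."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


def revelation_series(tokens, window_size=1000, alpha=1.0):
    """One KLD value per window, comparing window T with window T-1.

    Windows are non-overlapping and any trailing partial window is dropped
    (both are simplifying assumptions for this sketch).
    """
    windows = [
        tokens[i:i + window_size]
        for i in range(0, len(tokens) - window_size + 1, window_size)
    ]
    vocab = sorted(set(tokens))
    dists = [window_distribution(w, vocab, alpha) for w in windows]
    return [kl_divergence(dists[t], dists[t - 1]) for t in range(1, len(dists))]


if __name__ == "__main__":
    # Toy text: the second window switches to entirely new words.
    toy = "a b a b a b c d c d c d".split()
    print(revelation_series(toy, window_size=6))
```

A window that repeats the vocabulary of the one before it scores zero (pure exploitation), while a window that shifts to new words scores higher (exploration).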
What we find using our CONLIT data is that fictional narratives exhibit lower overall levels of novel information. They tend to be less surprising and indulge in higher levels of exploitation. Also interesting is the way they tend to decrease their revelation over time in more significant ways than non-fiction. Fiction seems to zero in on given material and dive deeply into it over narrative time. The figure below gives some idea of what these differences look like.
That final uptick is also intriguing. It appears that across most genres narratives exhibit a preference for increasing the amount of local information they reveal toward the end. Whether this is absolutely novel (surprise) or a return to prior information (circular) remains to be seen in our next project.