Quantifying the Weepy Bestseller

I have a new piece out in The New Republic. In a number of recent book reviews, literary critics and novelists arrive at the consensus that to be a great writer, one must avoid being “sentimental.” One famous novelist describes it as a “cardinal sin” of writing. But is it actually true? Using a computer science method called “sentiment analysis,” we tested this claim on a large corpus of novels from the early twentieth century to the present, and found the opposite. Writers who win book prizes and get reviewed in the New York Times are not any less sentimental than novelists who write popular fiction, such as romances or bestsellers. The only exception was the group of the 50 most canonical novels written since about 1950. Our analysis tells us that if you want to write one of the most important books of the next half century, then you should tone down the sentiment. But if you want to be reviewed in a major newspaper, sell books, or win prizes, go ahead and emote away.

But the larger point for us is the way our cultural taste-makers are often wrong or extremely biased in their assumptions about what matters. We found that a computer, ironically, can paint a more nuanced picture of what makes great literature.

Here is an excerpt:

If you want to be a great writer, should you withhold your sentimental tendencies? The answer for most critics and writers seems to be yes. Sentimentality is often seen as a useful way of distinguishing between serious literature and the not-so-serious, probably best-selling kind. “Sentimentality,” James Baldwin wrote, is “the ostentatious parading of excessive and spurious emotion…the mark of dishonesty, the inability to feel.” While sentimentality is false, grandiose, manipulative, and over-boiled, high literature is subtle, nuanced, cool, and true. As Roland Barthes, the dean of high cultural criticism, once remarked: “It is no longer the sexual which is indecent, it is the sentimental.” This sentiment (yes sentiment) has been around since at least the early twentieth century and is still a subject of debate in the review pages of numerous media outlets today. But is it true? Whether you are for subtlety or against sentimentality, is this a good way to think about writing your next novel?

Read more here.

Where did all the love go? Feelings in the novel.

I have been increasingly focusing on the history of feeling in the novel, especially as a way of differentiating the study of feeling from sentiment analysis. Emotions aren’t the same as sentiments as they are commonly defined today (usually only in binary fashion: happy/unhappy or positive/negative). Instead, I am interested in the ways different kinds of emotions change as the novel evolves and what kinds of configurations one might find, such as the way words for sadness move and reposition in relation to words for joy or love or even anger.

I’m still working on this, but I did want to share an interesting insight about the overall decline of emotionality in the novel. While I know we associate Romanticism with the pathetic and the bathetic, I was still surprised at how much the vocabulary of emotions declined as a percentage of words within the novel overall. We have indeed gotten colder — at least the list of canonical novels has.


These numbers are based on novels in English, with the dictionaries drawn from both thesaurus.com and the online edition of Johnson’s 1755 dictionary (to try to capture historical change in language). Each emotion is represented by roughly 100 synonyms, for an overall dictionary size of roughly 500 words. I found thesaurus.com able to give a very robust range of words used to capture different emotions. I expected the contemporary bias to favour more recent texts, but these dictionaries were very good at tracking the historical use of emotions as well.
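The measure described here, emotion words as a share of all words in a novel, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script behind the numbers in this post: the tiny word lists, the tokenizer, and the function name `emotion_shares` are all hypothetical stand-ins for the roughly 100-synonym dictionaries described above.

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries. The post describes roughly 100 synonyms
# per emotion; these few words are placeholders for illustration only.
EMOTION_WORDS = {
    "joy": {"joy", "delight", "glad", "happy"},
    "sadness": {"sad", "sorrow", "grief", "melancholy"},
    "anger": {"anger", "rage", "fury", "wrath"},
}

def emotion_shares(text, dictionaries=EMOTION_WORDS):
    """Return, for each emotion, its dictionary hits divided by total words."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    shares = {}
    for emotion, words in dictionaries.items():
        hits = sum(counts[w] for w in words)
        shares[emotion] = hits / total if total else 0.0
    return shares

sample = "She felt joy and then grief; his rage was a sorrow to her."
shares = emotion_shares(sample)
# Summing the shares gives an overall figure only because no word
# appears in more than one of these dictionaries.
overall = sum(shares.values())
print(shares, overall)
```

Averaging the overall share across every novel in a group would give a single figure per group that can be compared across periods. Exact-match lookup like this ignores inflected forms (for example "grieved"), so a fuller version would need stemming or expanded word lists.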


Things really fall off the charts if you compare contemporary novels with the novel of the long nineteenth century. The contemporary novels here were published between 2012 and 2014 and reviewed in the NY Times Book Review, suggesting that they have some kind of highbrow (but not too high) identity. Here are the average shares of emotion words, across all emotions, for each group:

19C Novel = 0.015117854

Contemporary Novel = 0.008936122

Look again at those numbers. The first one is about 1.7 times the second, so not quite twice as much. Put the other way, the contemporary novel devotes roughly 40 percent less of its vocabulary to feelings than its predecessors of the long nineteenth century (1770-1930). That’s a striking drop (and usually grounds for suspecting the data is off, but I keep checking it…Keep in mind too that the words are coming from a contemporary thesaurus). The Young Adult novel interestingly gets closer (avg = 0.01030404), suggesting an answer as to where the novel’s feelings are hiding these days.
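As a quick check on the comparison, the ratios can be computed directly from the three averages reported in this post. The variable names are my own; only the numbers come from the text above.

```python
# Average share of emotion words per group, as reported in this post.
avg_19c = 0.015117854
avg_contemporary = 0.008936122
avg_young_adult = 0.01030404

# How many times larger the 19C average is than the contemporary one.
ratio = avg_19c / avg_contemporary

# Relative drop from the 19C novel to the contemporary novel.
drop = 1 - avg_contemporary / avg_19c

# Young Adult average as a fraction of the 19C average.
ya_share = avg_young_adult / avg_19c

print(f"19C vs contemporary: {ratio:.2f}x")
print(f"Relative drop: {drop:.1%}")
print(f"Young Adult as share of 19C: {ya_share:.1%}")
```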

Last, for the Romanticists out there, I was intrigued by the centrality of anger to the Romantic period, which is where most of the highest-scoring angry novels are located. So things Romantic are not just more emotional, but also more volatile, fitting for a period of revolutionary and post-revolutionary unrest.


I’m still wondering where all those feelings went for the adults.