Rethinking the Table of Contents

I wanted to share an experiment that I worked on with Mark Algee-Hewitt to reconstruct the table of contents of our new collaboratively authored book, Interacting with Print. The book was written by 22 co-authors around the theme of interactivity. Mark and I thought it would be great to do a digital intervention into the print convention of the TOC.

Below you can see two network graphs of relationships between chapters. The first is a network of the “renvois” inside the chapters: as in the French Encyclopédie, we used a system of cross-references, marked like this [Paper], to point to other chapters in the book that dealt with related themes. The second shows relationships derived by topic-modeling each of the chapters and drawing connections based on the presence of shared topics. The first represents authors’ explicit beliefs about which chapters are most related, while the second represents latent connections derived through shared language. In each case, we move past the linear table towards the more reticular network.

These networks tell us different things about the relationships within our book. The renvoi network shows that binding, a chapter about constructing books, is the most centrally connected, followed by thickening, which is about adding pages to books. One can see how visual chapters like Frontispieces, Engraving, and Stages mark out one pole while non-print spaces mark out another. You can also follow the directionality and move from letters to manuscripts, or from advertising to catalogs, or from spacing to disruption to ephemerality in a suggestive causal sequence.
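For readers curious how a renvoi network like this might be measured, here is a minimal sketch in Python. It assumes the cross-references are stored as a mapping from each chapter to the chapters it points to. The links below are invented for illustration and are not the book's actual renvois, and the helper name `degree_centrality` is hypothetical.

```python
from collections import defaultdict

# Hypothetical cross-references: chapter -> chapters it points to.
renvois = {
    "Binding": ["Thickening", "Paper"],
    "Thickening": ["Binding"],
    "Letters": ["Manuscripts", "Binding"],
    "Advertising": ["Catalogs", "Binding"],
}

def degree_centrality(links):
    """Count incoming plus outgoing links per chapter, divided by (n - 1)."""
    degree = defaultdict(int)
    for source, targets in links.items():
        for target in targets:
            degree[source] += 1  # outgoing link
            degree[target] += 1  # incoming link
    n = len(degree)
    return {chapter: d / (n - 1) for chapter, d in degree.items()}

centrality = degree_centrality(renvois)
most_central = max(centrality, key=centrality.get)
print(most_central, round(centrality[most_central], 3))
```

Degree is only one of several ways to define "most centrally connected"; betweenness or eigenvector centrality could rank the chapters differently.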

Where the first network privileges gerunds, the network of latent topics is more centrally organized around qualities like Paper and Ephemerality. That these are the two most linguistically central chapters suggests an interesting medial centre point of our history (paper), as well as an interesting new temporal framework that has been far less central to print studies in the past. Print has most often been associated with notions of permanence and reproducibility. Focusing on interacting with print seems to move that focus more towards the fleeting and contingent aspects of print media.
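One way to draw connections from shared topics can be sketched as follows. This assumes each chapter has already been reduced to a vector of topic proportions by a topic model. The numbers, the cosine-similarity measure, and the 0.8 threshold are all illustrative assumptions, not the settings we used.

```python
from itertools import combinations
from math import sqrt

# Hypothetical topic proportions per chapter (three topics each).
topic_weights = {
    "Paper": [0.5, 0.4, 0.1],
    "Ephemerality": [0.4, 0.5, 0.1],
    "Engraving": [0.05, 0.05, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def topic_edges(weights, threshold=0.8):
    """Link two chapters when their topic vectors are similar enough."""
    edges = {}
    for a, b in combinations(sorted(weights), 2):
        similarity = cosine(weights[a], weights[b])
        if similarity >= threshold:
            edges[(a, b)] = similarity
    return edges

edges = topic_edges(topic_weights)
print(edges)
```

Raising or lowering the threshold changes how dense the resulting network is, so any centrality reading depends on that choice.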

This is obviously just a beginning in experimenting with ways that computational methods can interact with print conventions and change the way we organize and structure information. Surprisingly, we still remain in a very print-centric universe when it comes to sharing and archiving information. We hope experiments like this one will nudge us towards trying out more alternatives.



Connectivity. A Conference

Looking forward to this event tomorrow. Bringing together researchers from different disciplines to develop models of cultural connectivity. Connectivity has become the dominant framework through which contemporary knowledge is increasingly understood. From networks to clouds to close reading to reconstructing historical social worlds, making connections is at the core of what academics are expected to do. And yet the very ubiquity of the term has largely hidden it from critical view. This workshop is devoted to exploring the diversity of what it means to be connected.

This animation represents the emotional network of the family in the eighteenth-century novel. It measures the co-occurrence of emotions and family members within sentences in a sample of eighty novels in English published between 1750 and 1800. It begins with the most strongly weighted connection (“man”-“good”) and then gradually grows to include the entire network. Overall what is striking about this network (compared to the general emotion network) is the high degree of heterogeneity of emotions surrounding family members. I had expected far clearer divisions, but while the eighteenth-century family does have a fairly coherent core, its larger network appears to involve quite a range of emotions. Families have been complicated for a long time.

Some notable moments to look for:

– the opening dyad of “man” and “good” tells us a great deal about beliefs about the family;

– the dyad gradually grows to include man, woman, and god organized around good, love, and fortune.

– “person” appears before “mother”

– the first negative emotion is “cried”

– with “passion” comes “pleasure” and “death”

– brothers appear before sisters, but girls appear before boys

– “fear” comes before “bad,” which is followed by “pride”

– “child” enters quite late, along with those moral words like “respect”, “friendship”, “care”

– “desire” and “melancholy” enter with “afraid” but also “tenderness”

– more and more sad words will accrue around “mind”, while more and more happy words will accrue around “woman”

– finally, a load of anger words (“revenge”, “aversion”, “prejudice”) enter the latest.
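The order in which words enter the animation can be sketched as below, assuming the network is stored as weighted edges and that edges are added from strongest to weakest. The weights are invented for illustration, and `entry_order` is a hypothetical helper, not the code behind the animation.

```python
# Hypothetical co-occurrence counts between family and emotion words.
weighted_edges = {
    ("man", "good"): 120,
    ("man", "love"): 95,
    ("woman", "love"): 90,
    ("mother", "cried"): 40,
}

def entry_order(edges):
    """List words in the order they first appear when edges are added strongest first."""
    order = []
    for (a, b), _weight in sorted(edges.items(), key=lambda item: -item[1]):
        for word in (a, b):
            if word not in order:
                order.append(word)
    return order

order = entry_order(weighted_edges)
print(order)
```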

Emotion Networks in the Novel

For my ongoing project on the history of emotions in the novel, I thought I’d post a first pass of emotion networks that appear in the Romantic Novel versus the Postwar Novel. The networks are based on emotion words that occur in the same sentence. The more often two emotion words appear in the same sentence, the stronger their connection and the closer together they appear in the graph. The size of a word indicates the number of different emotion words it connects with.
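Here is a minimal sketch of how such a sentence-level co-occurrence network might be built. The five-word dictionary and the sample sentences are invented (the dictionary behind the graphs is much larger), and the sentence splitting is deliberately naive.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical five-word dictionary, for illustration only.
EMOTION_WORDS = {"joy", "love", "fear", "grief", "anger"}

def cooccurrence_edges(text, vocabulary):
    """Count how often each pair of vocabulary words shares a sentence."""
    edges = Counter()
    # Naive split on ., ! and ?; real prose needs a proper sentence tokenizer.
    for sentence in re.split(r"[.!?]+", text):
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        found = sorted(tokens & vocabulary)
        for pair in combinations(found, 2):
            edges[pair] += 1  # edge weight = number of shared sentences
    return edges

def distinct_neighbours(edges):
    """Number of different words each word connects with (the node size)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {word: len(others) for word, others in neighbours.items()}

sample = ("Her love was mixed with fear. Love and joy filled the room! "
          "Fear and grief followed love.")
edges = cooccurrence_edges(sample, EMOTION_WORDS)
sizes = distinct_neighbours(edges)
print(edges.most_common(), sizes)
```

The edge weights would drive how close two words sit in a force-directed layout, while `sizes` corresponds to how large each word is drawn.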

The initial finding of interest here is the way the postwar network is both less dense and also more heterogeneous (what network scientists would call a decline of assortativity). The emotional intensity of the novel has declined, but the emotional complexity has arguably increased. Emotion words are not grouping quite as strongly with other words from their own emotion category. The hypothesis would be that there is more emotional conflict happening at the sentence level of the novel as it appears in the second half of the twentieth century.

These networks represent small sets of around 40 novels each. I am taking a second pass on larger data sets and am curious if the results hold. I will also be calculating the actual measures of things like density and assortativity to better understand the extent of this shift. The next step will be going in and finding out what it means when different kinds of emotion words appear in sentences together. What is being captured here?
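As a rough sketch of what those measures involve, the snippet below computes density and a categorical assortativity coefficient (in Newman's formulation) for a toy undirected network. The words, their categories, and the edges are invented, and the function names are hypothetical.

```python
from collections import defaultdict

def density(n_nodes, edges):
    """Share of possible undirected edges that are actually present."""
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))

def attribute_assortativity(edges, category):
    """Newman's assortativity coefficient for a categorical node attribute.

    +1 means words link only within their own category and negative values
    mean they link mostly across categories. Undefined (division by zero)
    if every node belongs to the same category.
    """
    mixing = defaultdict(float)
    for u, v in edges:
        # Count each undirected edge in both directions.
        mixing[(category[u], category[v])] += 1
        mixing[(category[v], category[u])] += 1
    total = sum(mixing.values())
    cats = {c for pair in list(mixing) for c in pair}
    within = sum(mixing[(c, c)] for c in cats) / total
    # Fraction of edge ends attached to each category.
    ends = {c: sum(mixing[(c, d)] for d in cats) / total for c in cats}
    expected = sum(share ** 2 for share in ends.values())
    return (within - expected) / (1 - expected)

# Hypothetical emotion words and categories.
category = {"joy": "Joy", "delight": "Joy", "grief": "Sadness", "sorrow": "Sadness"}
sorted_net = [("joy", "delight"), ("grief", "sorrow")]  # like links with like
mixed_net = [("joy", "grief"), ("delight", "sorrow")]   # only cross-category links

r_sorted = attribute_assortativity(sorted_net, category)
r_mixed = attribute_assortativity(mixed_net, category)
d = density(len(category), sorted_net)
print(r_sorted, r_mixed, d)
```

Both toy networks have the same density, which shows that density and assortativity capture different things: one the amount of connection, the other the sorting of connections by emotion type.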

I think these graphs give a nice initial idea of the ways in which the emotional networks of the novel have changed over the course of two centuries.

Network of emotions in 40 novels written in English between 1800 and 1851, from Maria Edgeworth’s Castle Rackrent to Nathaniel Hawthorne’s House of the Seven Gables. Yellow = Joy, Green = Love, Blue = Sadness, Purple = Fear, and Red = Anger. The underlying edges between emotion words have been removed for clarity. Emotions are based on a dictionary of 872 emotion words.
Network of emotions in 42 novels written in English between 1943 and 2000, from Betty Smith’s A Tree Grows in Brooklyn to Zadie Smith’s White Teeth. Yellow = Joy, Green = Love, Blue = Sadness, Purple = Fear, and Red = Anger. The underlying edges between emotion words have been removed for clarity. Emotions are based on a dictionary of 872 emotion words.



When innovation isn’t

Having moved through two models of poetic careers — the compaction of Whitman and the expansion of Goethe — I thought I had found a third model in the case of William Wordsworth. Using the same measures as before, I found that it was Wordsworth’s middle period that registered as the most “innovative” or experimental. We can imagine how in this sense a poet grows into greater degrees of diverse writing styles — a sense of “maturity” — which gradually then compress as the poet ages. There is a strong biological model of rise and fall at work behind this theory.

For the purposes of this project I am understanding innovation primarily as a sense of linguistic diversity — the more a poet experiments with different vocabularies the more a particular period in his or her life can be thought of as a period of experimentation or innovation. That’s clearly just one way of thinking about periods and innovation of course, and I’m open to many more suggestions of how to think about this.

But what interested me about Wordsworth was the way the middle period appears at first glance to fit this model and then doesn’t. Here are the same measures used for each of the three periods. (Period 1 = 1787-1815; Period 2 = 1816-1832; Period 3 = 1833-1851). As you can see, the average lexical distance between works increases in the middle period, as does the overall diameter of the network, just as the transitivity (roughly, how often connected triples of works close into triangles) decreases.
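To make two of these measures concrete, here is a minimal sketch. It assumes lexical distance is a Jaccard distance between the vocabularies of two works, which is a stand-in and not necessarily the distance used for the numbers in this post. The texts are invented and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean

def jaccard_distance(text_a, text_b):
    """1 minus the share of vocabulary two texts have in common."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return 1 - len(a & b) / len(a | b)

def pairwise_distances(works):
    """Distance for every pair of works, keyed by the pair of titles."""
    return {
        (i, j): jaccard_distance(works[i], works[j])
        for i, j in combinations(sorted(works), 2)
    }

def transitivity(edges):
    """Closed triples divided by all connected triples in an undirected graph."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    closed = triples = 0
    for nbrs in neighbours.values():
        # Every pair of neighbours of a node forms a connected triple.
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in neighbours[a]:
                closed += 1
    return closed / triples if triples else 0.0

# Hypothetical miniature "works".
works = {
    "poem_a": "the river flows",
    "poem_b": "the river sleeps",
    "poem_c": "stone church bells",
}
distances = pairwise_distances(works)
average_distance = mean(distances.values())

# A triangle (a, b, c) with one extra work d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
t = transitivity(toy_edges)
print(round(average_distance, 3), t)
```

In the toy graph, three of the five connected triples are closed, so the transitivity is 0.6; a period whose works cluster less tightly would score lower.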



There’s obviously lots to quibble with in terms of how to mark out the periods of Wordsworth’s writing, but I’ll leave that aside for now (suggestions welcome). To Wordsworth scholars, though, the claim that the years 1816-1832 were marked by a great deal of innovation will seem deeply counter-intuitive. This is after all the era in his writing most defined by the ecclesiastical sonnets. Surely a turn to religion, and its formulaic writing, can’t be seen as innovative (I’m parodying here). If we plot those distances, however, we begin to see a different story.

[Figure: Wordsworth plots, distance between works by period]

What you can see happening in the middle period is the way the average lexical distance between works increases, but the variance shrinks (the standard deviation score that we saw above decreases in the middle period). That big clump in the middle where the curve rises is the ecclesiastical sonnets. So there is in fact a great deal of formal consolidation at work, but one that looks on average to be greater than the rest of the corpus.

It’s not that we should ignore the fact of an overall lexical heterogeneity, but that we need to see it in a particular light. There is a greater average difference between poems but also a greater degree of homogeneity to that difference.
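A tiny worked example may help show how both things can be true at once. The distances below are invented, not Wordsworth's: the second set has a higher mean but a much smaller standard deviation.

```python
from statistics import mean, pstdev

# Hypothetical pairwise lexical distances within two periods.
early_period = [0.2, 0.5, 0.8]     # widely scattered distances
middle_period = [0.6, 0.65, 0.7]   # larger on average, but tightly clustered

early_mean, early_sd = mean(early_period), pstdev(early_period)
middle_mean, middle_sd = mean(middle_period), pstdev(middle_period)

print(round(early_mean, 3), round(early_sd, 3))
print(round(middle_mean, 3), round(middle_sd, 3))
```

The works in the second set are further apart on average, yet they are all about equally far apart, which is the pattern described above.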

I’m not sure if innovative or experimental would be useful terms anymore here, but it does tell us something about the poet’s career and the different shape periods can take.

In my next post I’ll try a different method using community detection to model this idea of linguistic coherence as a marker of poetic period.

M.A. Fellowship in Digital Humanities

The Department of Languages, Literatures and Cultures at McGill University is offering a 2-year fellowship to an incoming master’s student to participate in the recently funded Digging into Data project, “Global Currents: Cultures of Literary Networks, 1050-1900,” directed by Prof. Andrew Piper.

Project Overview

This project undertakes the cross-cultural study of literary networks in a global context, by integrating new image-processing techniques with social network analysis. Four unique databases will be examined, ranging from post-classical Islamic philosophy to the European Enlightenment. For more information, see here.

Research Team

Humanities scholars and computer scientists located at McGill University, Stanford University (U.S.A.) and Groningen University (Netherlands) will be involved in this project. The McGill team includes researchers from the Department of Languages, Literatures and Cultures, the Department of East Asian Studies, the Institute of Islamic Studies, and the School of Computer Science. The master’s fellow will be based in the Department of Languages, Literatures and Cultures and will be supervised by Prof. Andrew Piper.

Duties and Responsibilities

The master’s fellow will participate in all facets of the research project, learning about new techniques of visual language processing and social network analysis. Previous experience in digital methods and/or text mining is an advantage. The fellow will participate in group meetings, develop a thesis around the project, and prepare research reports and other materials to facilitate knowledge sharing within the Digging into Data/Global Currents team and with external audiences.

Financial Support

The student will receive a stipend of $12,000 per year for two years.

To apply

Students wishing to be considered for this fellowship must apply for admission to the Languages, Literatures and Cultures master’s program by Jan 30, 2014. The successful student will begin work in the Fall 2014 semester. Please see the departmental admission requirements and application procedures and apply via McGill’s online application system. In addition, please send a CV and a cover letter expressing your interest in this fellowship opportunity to as soon as possible.