Rethinking the Table of Contents

I wanted to share an experiment that I worked on with Mark Algee-Hewitt to reconstruct the table of contents of our new collaboratively authored book, Interacting with Print. The book was written by 22 co-authors around the theme of interactivity. Mark and I thought it would be great to do a digital intervention into the print convention of the TOC.

Below you can see two network graphs of relationships between chapters. The first is a network of links between the “renvois” inside of chapters. We used a system of cross-references (for example, [Paper]) to point to other chapters within the book that dealt with related themes, as in the French Encyclopédie. The second is built by topic-modeling each of the chapters and drawing connections between chapters that share topics. The first represents authors’ explicit beliefs about which chapters are most related, while the second represents latent connections derived through shared language. In each case, we move past the linear table towards the more reticular network.

These networks tell us different things about the relationships within our book. The renvoi network shows that Binding, a chapter about constructing books, is the most centrally connected, followed by Thickening, which is about adding pages to books. One can see how visual chapters like Frontispieces, Engraving, and Stages mark out one pole while non-print spaces mark out another. You can also follow the directionality and move from Letters to Manuscripts, or from Advertising to Catalogs, or from Spacing to Disruption to Ephemerality in a suggestive causal sequence.
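To make the idea of a "most centrally connected" chapter concrete, here is a minimal sketch in Python of degree centrality on a directed cross-reference network. The edge list is invented for illustration and is not the book's actual set of renvois, and the function name is hypothetical. Degree centrality is only one of several ways to measure centrality, and it may not be the one behind the graph above.

```python
from collections import defaultdict

# Hypothetical cross-references: (source chapter, target chapter).
# These pairs are illustrative only, not the book's real renvois.
RENVOIS = [
    ("Letters", "Manuscripts"),
    ("Advertising", "Catalogs"),
    ("Spacing", "Disruption"),
    ("Disruption", "Ephemerality"),
    ("Thickening", "Binding"),
    ("Paper", "Binding"),
    ("Binding", "Thickening"),
]


def degree_centrality(edges):
    """Return {chapter: (in-degree + out-degree) / (n - 1)} for a directed edge list."""
    degree = defaultdict(int)
    for source, target in edges:
        degree[source] += 1  # outgoing reference
        degree[target] += 1  # incoming reference
    n = len(degree)
    return {chapter: d / (n - 1) for chapter, d in degree.items()}


centrality = degree_centrality(RENVOIS)
# Chapters ordered from most to least connected.
ranked = sorted(centrality, key=centrality.get, reverse=True)
```

With this toy edge list, `ranked[0]` is the chapter that sends and receives the most cross-references.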

Where the first network privileges gerunds, the network of latent topics is more centrally organized around qualities like Paper and Ephemerality. That these are the two most linguistically central chapters suggests an interesting medial centre point of our history (paper), as well as an interesting new temporal framework that has been far less central to print studies in the past. Print has most often been associated with notions of permanence and reproducibility. Focusing on interacting with print seems to move that focus more towards the fleeting and contingent aspects of print media.
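For readers curious how a topic-based network might be drawn, the following Python sketch links two chapters whenever some topic is prominent in both. The topic names, weights, and the 0.25 threshold are all assumptions made up for the example. The actual topic model and linking rule behind our graph may differ.

```python
from itertools import combinations

# Hypothetical topic proportions per chapter (each row sums to 1).
TOPIC_WEIGHTS = {
    "Paper":        {"materials": 0.5, "time": 0.3, "images": 0.2},
    "Ephemerality": {"materials": 0.3, "time": 0.6, "images": 0.1},
    "Engraving":    {"materials": 0.2, "time": 0.0, "images": 0.8},
}


def shared_topic_edges(weights, threshold=0.25):
    """Link two chapters when at least one topic meets the threshold in both."""
    edges = []
    for a, b in combinations(sorted(weights), 2):
        shared = sorted(
            topic
            for topic in weights[a]
            if weights[a][topic] >= threshold
            and weights[b].get(topic, 0.0) >= threshold
        )
        if shared:
            edges.append((a, b, shared))
    return edges


edges = shared_topic_edges(TOPIC_WEIGHTS)
```

Each edge records which topics two chapters share, so the same list can feed a graph layout or a simple count of connections per chapter.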

This is obviously just a beginning in experimenting with ways that computational methods can interact with print conventions and change the way we organize and structure information. Surprisingly, we still remain in a very print-centric universe when it comes to sharing and archiving information. We hope experiments like this one will nudge us towards trying out more alternatives.



Congratulations to Victoria Svaikovsky, ARIA Intern for 2017

McGill hosted its annual event to showcase the work of undergraduate summer research projects. Among the many amazing projects was one led by .txtLAB’s Victoria Svaikovsky, who worked with two other students, Anne Meisner and Eve Kraicer-Melamed, on studying the intersections of race and Hollywood film using computational analysis.

Their project aimed to better understand the racial inequality that has long been identified in academic and popular criticisms of Hollywood. Focusing on questions of visible and audible marginalization as well as linguistic tokenization with respect to visible minorities, Svaikovsky and her team have produced an impressive study of over 800 screenplays.

Their work will be coming out as a lab white paper in the near future. Keep an eye out!

Reviewer Interviews: Megan McLennan


Megan McLennan is in her third year of undergraduate studies at Western University, pursuing an Honors Specialization Degree in Creative Writing and English Language and Literature. McLennan is known for her YouTube channel, “The Teen Book Addict,” which has over 980 subscribers and on which she regularly shares video critiques and reviews of novels.


It’s so frustrating to tell someone that . . . it sounds like your kid will like the book, but you’re refusing to buy it because you’re gendering the content.

Ms. McLennan, what drew you to studying literature and dedicating your post-secondary degree to the literary field?

In high school, I had English classes, but I felt indifferent towards the books we were required to read. They were all focused on male authors, specifically white male authors. I wanted to study English because I love reading and I wanted to get graded on talking about books from a more diverse spectrum. That’s partially why I started my YouTube channel: the BookTube community of YouTube was a place where I could freely talk about books that interested me with other readers.

Let’s talk about your YouTube channel. When did you make the transition from consumer to producer?

I started my channel in late 2013. Originally, I would come across a book I wanted to read, and then would go and look at book reviews online to see if it was something I would like. This led to me finding blogs of book reviewers, which then led me to their YouTube channels. I only had one friend in my immediate friend group who liked to read, but she wasn’t interested in the same books as me, so I went looking for people who wanted to talk about what I wanted to. Even if I came across a book that had a two-star rating from a famous reviewer, I would look up what other, everyday readers thought of it before and after I read it, to see if we had the same opinions or were critiquing the same things.

As a consumer of the reviews on BookTube, did you notice any correlation regarding gender and genre? Were there more female authors being reviewed in genres typically seen as belonging to a woman’s sphere, such as romance?

Most of the books I was reading when I first started integrating myself into the book reviewer world were Young Adult fantasy novels. Most of those were written by women, and it’s often female teenagers or young adults reading and reviewing them, so there was a drive for female authors. But I also saw many people looking down on those books because they were “Young Adult.” Many people suggested that because they were written and consumed by girls, they weren’t “serious” books. However, as I got older and started reading the “Adult” novels, the majority of authors were male, which meant that all the books that were being reviewed were by men. I noticed quite a big shift once I branched out from “Young Adult” to “Adult”: even if I stuck to Fantasy, male authors had suddenly become dominant across all genres.

You can’t have a white person write a book about what it’s like to be a Black child, you want someone from that community to represent and tell their own story.

With that understanding, when you became a producer of book reviewing content, did you make an effort to recognize the demographics of the authors you were reviewing?

I’ve been trying to do it more, trying to find the genuine voice and be more conscious about the voice and the authenticity of the perspective. I have a responsibility to be more aware of it, especially being an English student where the majority of the works we study are written by white men. I’m trying to stick to books that push for our “own voices,” you can’t have a white person write a book about what it’s like to be a Black child, you want someone from that community to represent and tell their own story. I think that’s something that’s important to think about when reading and reviewing books. On a more general note, there were some books I reviewed that I was genuinely interested in and wanted to read, but there were also a lot where I was like this is new, this is what everyone’s talking about. As a reader you don’t want to have FOMO – the fear of missing out – so because the majority of the BookTube community is reading it, I want to see what the hype is about and form my own opinion.

BookTube seems like a great platform for you to share those opinions on, did you ever interact with your consumers, did you notice any trends in their demographics, specifically with regard to gender?

BookTube is a community, and because it’s such a welcoming one I was definitely able to interact with my viewers and other YouTubers. Most of the people I interacted with were other self-identifying women because BookTube is predominantly women. Obviously, there are some male YouTubers, but often it is women and I think that’s something that BookTube in itself addresses, because a large portion of consumers are female, and thus a large portion of the creators are female. There were quite a few other YouTubers that started their channels around the same time as me. From there I would like and comment on their videos, and they would do the same for me. One time I did a “Buddy Read” and a live show, which really helped facilitate BookTube as a reading community rather than me just posting and waiting for people to watch my videos without any feedback or discussion.

Even if you don’t have a personal connection to the producer or the consumer, you’re still connected by this shared love of reading which in itself is personal. Reading is personal.

YouTubers are often criticized for being too capitalistic, would you say BookTube as a community is as well? You mentioned the FOMO of not reviewing the most popular books, do you think then, that the presence of capitalism affects what books are being reviewed?

I think for the most part with more popular YouTubers no matter what it can be capitalistic. If you can make money doing something you love, wouldn’t you capitalize on it? But, to be honest, the BookTube community seems to just be about people having someone to talk to about the books they want to talk about. Most people, including myself, don’t have people in their immediate life with whom they can talk about the books they want to, as in-depth as they want to. BookTube felt very genuine. No matter what, YouTube videos will have a capitalistic shadow, but on BookTube it feels like people are there more for the books and their love of reading. Even if you don’t have a personal connection to the producer or the consumer, you’re still connected by this shared love of reading which in itself is personal. Reading is personal, reviewing is personal.

Aside from your YouTube Channel, you’ve also worked at a Bookstore for years recommending books to the public. Can you tell me about any notable demographic assessments on the consumers you interacted with?

Yes! I’ve been working at a popular bookstore for four years now, alternating between two branches. With consumer demographics, usually the consistent customers are women. There were some customers who I knew by name and some who would come in weekly and purchase books. This proves that women are huge consumers of literature, but from what I’ve noticed only certain stereotypical books are targeted towards them. My friends that are interested in reading are predominantly women, the customers that come in are predominantly women, and they’re all interested in such a diverse spectrum of genres! It’s so odd to me that publishers are not looking at what women are reading, because the majority of women come in looking for mysteries and thrillers but all the marketing is done to point them to romance and fantasy. This is a problem.

At Bookstores, parents are often buying books for their children, have you ever come across cases where gender division exists both explicitly and implicitly with regard to youth?

There is so much gender division, with parents and sometimes even with young readers. Often, I’ll talk to a parent who is looking for a recommendation to try to get a sense of what the child is interested in, then I’ll suggest a book and they’ll right away ask “is it a boy book, or is it a girl book?” If my answer isn’t what they want to hear, often they won’t buy it. It’s so frustrating to tell someone that, from what it sounds like, your kid will like the book, but you’re refusing to buy it because you’re gendering the content. The author most likely didn’t write this book for one specific demographic; they just want to reach as many readers as they can. To be honest, this rarely happens when parents are looking for books for their daughters. It’s usually when someone is coming in and looking for a book for their sons. It’s honestly this fragile masculinity, “if I give them this book, what does it mean?” Even if the book just has a picture of a girl on the cover, people will dismiss it and consider it a “girl book.” Not all parents are like this, but the majority are.

Is there a difference in the books that are being marketed as gendered? Are the books that are being suggested to young girls the same as those that are being suggested to young boys with regard to structure and content?

Something I have noticed is that the books that are marketed for girls are heavy on character development, often include a romance, are emotionally driven, and lack a substantial plot, whereas the books that are marketed for boys are often heavily plot-driven with cardboard characters. There are people that prefer heavy character development and there are people that prefer heavy plot construction, so why does it need to be gendered? A preference for plot or character is just a preference; it has nothing to do with gender. By marketing it as gendered you are socializing that content to be gendered. It’s especially saddening when you see this happening to young children. There are some books, the ones that aren’t marketed as gendered, that do extremely well, for example, Harry Potter and the Percy Jackson series. Both series have well-developed characters and strong plots. However, they are the minority, as the majority of other books fall into gendered demographics.

We’ve talked about quite a few issues in the literary world that need addressing, but in your opinion what do you think is the largest and most immediate problem, specifically with regard to reviewing?

Right now, I think the biggest problem is the difficulty that people of colour, especially women of colour, have with getting published, being recognized once they are published, and having their work reviewed. Their work is just as good, and often even better than the books that are being published. Like we talked about earlier, there is and needs to be a push for “own voices” so that people from minority communities can take control and share their own stories and perspectives. Hopefully, a push for this will create less whiteness in authorship and more diversity in the literary world. I can only imagine how frustrating it must be to see these people who are profiting off of your experience to make capital, and when you try and tell your story just to tell it, you’re turned down. I can’t imagine how awful that must be. I think there’s starting to be a turnaround. There’s a contemporary novel called “The Hate U Give” by Angie Thomas that has been on the New York Times bestseller list for weeks and has been doing so incredibly well that it’s being made into a movie. So I think we’re starting to see a shift because now the demand is being noticed, but there’s still a long way to go for representation, especially at the higher end of who chooses who gets to be published and what books are being reviewed.

A lot of the change making has to do with the ground level. It’s the support of readers; readers continuing to support authors who are part of these marginalized communities by promoting their books and talking about their books.

If the problem then is the lack of women and minority women that are being published, how do we go about getting the attention of publications so that we can make this change?

A lot of the change making has to do with the ground level. It’s the support of readers; readers continuing to support authors who are part of these marginalized communities by promoting their books and talking about their books, because if it’s being talked about it forces the publishers to have to hear it. Read books by women, and review them, include them in the conversation. There’s this excuse that “the stories of marginalized people don’t sell” that gets tossed around a lot as an excuse for publishers and reviewers to ignore the work of minority groups. They don’t want to take a chance on it but they’ll take a chance on a white author who is writing about the same thing, or something of lesser quality. By having these books on the New York Times bestseller list, reviewing them and making them into movies, it forces publishers and other reviewers to see it. For them, I’m sure it’s about capital and money. It’s all systematic. The racism, the sexism and homophobia, it’s all institutionalized. The value of the book has to be shown through money; it’s as if, “Oh, it sold a lot, maybe we should make more books like this.” It’s the responsibility of readers to drive the push and also the responsibility of those at the higher publication levels to listen and work with these women and be willing to edit and help them on their stories in order to start making the change.

To keep up to date with Megan and her reviews, check out her social media platforms:

YouTube: []

Twitter: []


Gender and Equity in Publishing

The Just Review team held an inspiring event last night. It was a roundtable of six women discussing their experiences with academic and literary publishing. It was an amazing conversation covering many different perspectives. We had two academics, one editor, one publisher, a novelist and a poet. Here are some of the themes they touched on.

Cultivating Confidence

Putting oneself forward was a theme that kept recurring. Whether it was the confidence to send off your manuscript or speak up at a literary festival or reach out to a mentor, many of the panelists discussed how they consistently had to work against their own inner inhibitions. Based on their success as individuals you would never guess that this is something they wrestled with. But something they strongly emphasized was cultivating the confidence, at as early an age as possible, to take risks, speak out, and put oneself forward.

Prioritizing Carework and Generosity

Another key theme was about avoiding the myth of scarcity, by which they meant seeing gender and job competition as a zero-sum game. Instead, they encouraged all of us to think about how to cultivate the work of others and how, in the words of one participant, “to take up less space.” This might seem in contradiction to the first point about putting oneself out there, but it offers another way to think of literary work. Not only find your place, but do the work to make it possible for others, especially others who may have less privilege than you, to find their place. Generosity and empathy were two states of mind that were strongly emphasized.

Creating Parastructures

Finally, a core theme that kept emerging was the importance of creating peer-networks and “collectives.” Inevitably as a woman you will be subject to some kind of bias or discrimination in your career. These extra-institutional structures can be an important way of finding more rewarding spaces to work and create and find more open feedback loops to help improve your work. Creating these networks takes time. But the participants emphasized just how valuable such spaces have been in their lives and careers, whether it was creating independent presses, writing groups, or women-led gaming communities.

Much more was discussed over the hour and a half event that I can’t cover here. But I think it was a really crucial conversation to have and one that I hope inspired the many students who were present. I know I learned an incredible amount.

Investigating Topic Bias and Gender Representation in Syllabi

Topic bias occurs when certain topics are disproportionately associated with specific genders. For example, we know that in book reviews female authors are more often associated with topics that are traditionally coded as feminine, such as love, family and relationships, while male authors are more likely to be reviewed when writing about science, politics, and economics. This is not to say that there are no female authors in these other more “public” fields, but that male authors have a much higher chance of being perceived as experts on these topics. Topic bias is important because it shapes public perception about who can be an expert in a given area.

We decided to study topic bias at McGill by examining undergraduate syllabi from subjects in five faculties: arts, science, management, engineering, and education. We were interested in measuring topic bias at McGill to see the extent to which courses are coded through gender. Selecting courses within the last three years, we calculated the ratio of male and female authors for each syllabus. Below are some graphs that consolidate our findings. We were disheartened to find that courses at McGill upheld gender bias, and across all faculties, male authors largely overwhelmed female authors. Gender studies was the only subject dominated by female authors, suggesting that women are only perceived as majority experts on issues related specifically to women. Furthermore, we were unable to find a single female author in any of the engineering syllabi we looked at. We were also surprised to see that the social sciences (economics, political science) were more male-dominated than certain fields in the sciences.
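As a rough illustration of the calculation described above, here is a short Python sketch that computes the share of female authors on each syllabus. The course codes and gender labels are made up for the example, and the function name is hypothetical. It assumes each author has already been hand-coded as "M" or "F", which is a simplification of how gender would be recorded in practice.

```python
from collections import Counter

# Hypothetical syllabi: course code -> hand-coded gender of each assigned author.
SYLLABI = {
    "COURSE-A": ["M", "M", "F", "M"],
    "COURSE-B": ["F", "F", "M"],
    "COURSE-C": ["M", "M", "M"],
}


def female_share(genders):
    """Fraction of coded authors who are women, or None if no authors were coded."""
    counts = Counter(genders)
    total = counts["M"] + counts["F"]
    return counts["F"] / total if total else None


shares = {course: female_share(authors) for course, authors in SYLLABI.items()}
```

A value of 0.5 would mean parity on a given syllabus. Averaging the per-syllabus values within a subject gives a simple figure for comparing departments.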

This is not to say that professors intentionally draw from certain authors when putting together syllabi for the year, but rather that the books and print materials that receive the most attention for certain fields are linked directly to the gender of authors.

Take a look at the reading materials on your syllabi. Is there a gender skew? Do they uphold current gender norms elsewhere? If you have a syllabus that contradicts our results and challenges topic bias, send it our way!

Our results:

Engineering, where n=12:

Education, where n=12:


Arts, where n varies depending on major:

Management, where n varies depending on major:

Science, where n=5:


Data comparison:

Are novels getting easier to read?

I’ve been experimenting with using readability metrics lately (code for the below is here). They offer a very straightforward way of measuring textual difficulty, usually consisting of some ratio of sentence and word length. They date back to the work of Rudolf Flesch, who developed the “Flesch Reading Ease” metric. Today, there are over 30 such measures.

Flesch was a Viennese immigrant who fled Austria and the Nazis and came to the U.S. in 1938. He ended up as a student in Lyman Bryson’s Readability Lab at Columbia University. The study of “readability” emerged as a full-fledged science in the 1930s when the U.S. government began to invest more heavily in adult education during the Great Depression. Flesch’s insight, which was based on numerous surveys and studies of adult readers, was simple. While there are many factors behind what makes a book or story comprehensible (i.e. “readable”), the two most powerful predictors are sentence length and word length. The longer a book’s sentences and the more long words it uses, the more difficult readers will likely find it. Flesch reduced this insight into a single predictive, and somewhat bizarre, formula:


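\[ \text{Reading Ease} = 206.835 - 1.015 \left( \frac{W}{St} \right) - 84.6 \left( \frac{Sy}{W} \right) \]

W = # words, St = # sentences, Sy = # syllables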
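For anyone who wants to try this out, here is a minimal Python sketch of the Flesch Reading Ease score in its commonly published form: 206.835 minus 1.015 times the average words per sentence, minus 84.6 times the average syllables per word. The syllable counter is a crude vowel-group heuristic of my own choosing and the sentence splitter is a bare regular expression, so scores will differ somewhat from those of dedicated tools such as koRpus.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher values mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    w = len(words)                               # W  = number of words
    st = len(sentences)                          # St = number of sentences
    sy = sum(count_syllables(x) for x in words)  # Sy = number of syllables
    return 206.835 - 1.015 * (w / st) - 84.6 * (sy / w)


simple = flesch_reading_ease("The cat sat. The dog ran.")
dense = flesch_reading_ease(
    "Institutional considerations invariably complicate interpretation."
)
```

Short sentences of one-syllable words push the score up, while long words pull it down sharply. Very dense text can even score below zero.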


According to Flesch’s measure, Rudyard Kipling’s The Jungle Book has a higher readability score (87.5) than James Joyce’s Ulysses (81.0). Presidential inaugural speeches have been getting more readable over time. The question that I began to ask was, have novels as well?

The answer, at first glance, is yes. Considerably so. Below you see a plot of the mean readability score per decade for a sample of ca. 5,000 English-language novels. These novels are drawn from the Stanford Literary Lab collection and Chicago Text Lab. The higher the value the more “readable” (i.e. less difficult) a text is assumed to be. The calculations are made by taking 20 sample passages of 15 sentences each from every novel and calculating the Flesch reading ease for every passage. Then for every decade I use a bootstrapping process to estimate the mean reading ease for that decade. Error bars give you some idea of the variability around the mean per decade. What this masks is a very high variability at the passage level. Nevertheless, the overall average is clearly moving up in significant ways.
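The sampling and bootstrapping steps can be sketched roughly as follows in Python. This is an illustration under stated assumptions rather than the code behind the plot. It assumes passages are contiguous runs of sentences drawn at random start positions, and that the bootstrap resamples scores with replacement and reports a percentile interval. The function names, the 1,000 resamples, and the 95% interval are my own choices for the example.

```python
import random
import statistics


def sample_passages(sentences, n_passages=20, passage_len=15, seed=0):
    """Draw contiguous passages of `passage_len` sentences at random start points."""
    last_start = len(sentences) - passage_len
    if last_start < 0:
        raise ValueError("text is shorter than one passage")
    rng = random.Random(seed)
    starts = [rng.randint(0, last_start) for _ in range(n_passages)]
    return [sentences[s:s + passage_len] for s in starts]


def bootstrap_mean(values, n_resamples=1000, seed=0):
    """Return (estimate, lower, upper): bootstrap mean with a 95% percentile interval."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, keeping the original sample size.
        resample = [rng.choice(values) for _ in values]
        means.append(statistics.fmean(resample))
    means.sort()
    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    return statistics.fmean(means), lower, upper
```

In use, the passage-level scores for all novels in a decade would go into `bootstrap_mean`, and the lower and upper values would supply the error bars.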

One question that immediately came to mind was the extent to which these scores are being driven by an increase in dialogue. Dialogue is notably “simpler” in structure with considerably shorter sentences, and potentially shorter words to capture spoken language. I wondered whether this might be behind this change.

Below you see a second graph with the percentage of quotation marks per decade. Here I simply calculated the number of quotation mark pairs per novel and used bootstrapping to estimate the decade mean. As you can see, they rise in very similar fashion, though with a noticeable break where two data sets are joined together. Mark Algee-Hewitt has a lot to say on this issue of combining data sets. It’s interesting that typographic things like quotation marks are far more problematic for this issue than something more complex like “readability.” A lot also depends on my very simple way of modelling dialogue. It could just be that quotation marks get more standardized and thus appear more frequent, but I don’t think that’s entirely the case. Either way, this could definitely use improvement.
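A very simple dialogue proxy of this kind might look like the Python sketch below. It counts straight and curly double quotation marks, halves the count to get pairs, and normalizes per 1,000 words. The normalization and the function name are assumptions for the sake of the example, and texts that mark speech with single quotes or dashes would be missed entirely.

```python
import re


def quotation_pairs_per_1000_words(text):
    """Count double quotation mark pairs, scaled per 1,000 words."""
    marks = len(re.findall('["“”]', text))
    words = len(re.findall(r"[A-Za-z']+", text))
    if words == 0:
        return 0.0
    pairs = marks // 2  # an unmatched trailing mark is ignored
    return 1000 * pairs / words


sample = '"Hello," she said. "Go on."'
rate = quotation_pairs_per_1000_words(sample)
```

Because older and newer editions often follow different typographic conventions, a measure like this is sensitive to exactly the kind of break between data sets noted above.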

With these caveats in mind, there is a very strong correlation between the number of quotation marks used per decade and the readability of novels (r = 0.86). It suggests that dialogue is a big part of this shift towards more readable novels.
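For reference, a correlation like this can be computed with Pearson's r, which the Python sketch below implements directly. The decade values are invented placeholders, not the data behind the plots, so the result will not match the figure reported above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical decade means: quotation rate and reading ease.
quote_rate = [10.0, 12.0, 15.0, 19.0, 22.0]
reading_ease = [70.0, 71.5, 74.0, 78.0, 79.0]
r = pearson_r(quote_rate, reading_ease)
```

A value near 1 means the two series rise together, though with only a couple of dozen decades a high r on its own says little about cause.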

But what if we remove dialogue? Are novel sentences outside of dialogue getting simpler, too?

I don’t have an answer to that yet. And while answering it will be important for nuancing this issue, either way what we are seeing is how the novel, as represented in these two collections, follows a very straightforward trajectory towards simpler sentence and word lengths over the past two centuries. Much of that can be explained by greater reliance on dialogue, but that too is an important part of the readability story.

Why has this been the case? Commercialization, growth of the reading public…I don’t know. I think these are potential explanations but they require more data to show causality. What I can say, based on the work I’m doing with Richard So on fan fiction, is that fan-based writing (non-professional, yet high volume) does not exhibit significantly higher readability scores than “canon” does (i.e. the novels on which fanfic is based). In other words, in this one case expanding the user/reader base doesn’t correlate with simpler texts like you might expect.

It also looks as though readability has plateaued. Perhaps we’re seeing a cultural maximum being achieved in terms of the readability of novels. Then again, only time will tell.


* The other nice thing about readability is there is a great R package called koRpus to implement it. You can access the code through GitHub here.

An Open Letter to the MLA

Dear Prof. Taylor,

I am writing to you as a member of the MLA who has concerns about the practices and policies relating to the society’s data and its impact on research. This is an issue that affects many scholarly organizations. For this reason I have chosen to write an open letter.

The MLA has emerged as an important champion of the principles of open access scholarship. The creation of the MLA Commons represents a recent positive example of such pro-active work.

It is all the more troubling to realize that such open access does not apply to the MLA’s own data. I was recently served with a take-down notice by my university library for publicly sharing data and code used in a recent publication in PMLA. The data was drawn from the MLA database and represented two years’ worth of records, one collection from 2015 and one from 1970. When I contacted the MLA to ask for the data outside of such corporate mediation I was refused. Here we have a case where data from the MLA was used to support an article published by the flagship publication of the MLA that is now being suppressed from public view.

The MLA database is an essential source of knowledge about the practices within our field. As we have begun to learn, metadata alone can reveal a great deal of information about the behaviour of a community. In my own work I am interested in studying the concentration of attention surrounding literary authors, especially with respect to gender and racial diversity and how such concentration has changed (or not) over time.

Below I attach a screenshot of the licensing agreement that my university has signed with ProQuest, who distributes the data for the MLA to our university library. As you can see, principles i-k all violate essential norms of research. Not being able to mine a database (i) means that it has been walled off from standard research practices. Not being able to communicate materials received from the service (j) means that the evidentiary bases of claims using the data cannot be publicly shared or externally validated. And not being able to download parts of the service in a systematic manner (k) means that we cannot study the contents of the database in any responsible fashion. These are all principles that favour a mode of interaction with information that is both out of date and prohibitive in terms of the accepted norms of academic research today.

The MLA and, it should be added, numerous other scholarly organizations have contracted out the organization of and access to their data to third parties, most of whom are private, for-profit initiatives. These parties’ business models are in direct conflict with the scholarly mission of the society, indeed any academic society. While this may have been an arrangement that was initially convenient, not to mention profitable, it is no longer an acceptable way of curating data within an academic context. Libraries need to stop signing license agreements that limit access to data in the library. And scholarly organizations need to stop signing license agreements that limit access to and the public circulation of their data. Anything short of this represents a serious abdication of scholarly responsibility.

I would be happy to work with you to craft data policies that are more in line with the values and norms of scholarship. The MLA has an opportunity once again to take the lead in this important matter.


Andrew Piper