Tag: text mining

A Sense of an Ending: Poetry and Periods

As part of my ongoing fascination with punctuation, in Enumerations I look at the words that are most likely to be followed by a period in a collection of 75,000 twentieth-century poems. What we see are the very pronounced ways that poems tend to end […]

Where’s the data? Notes from an international forum on limited use text mining

I’m attending a two-day workshop on issues related to data access for text and data mining (TDM). We are 25 participants from different areas, including researchers who do TDM, librarians who oversee digital content, and content providers who package and sell data to academic libraries […]

The Replication Crisis I: Restoring confidence in research through replication clusters

Much has been written about the so-called “replication crisis” going on across the sciences today. There are many ways that these issues impact literary and cultural studies, but not always in the most straightforward way. “Replication” has a complicated fit with more interpretive disciplines and […]

Are novels getting easier to read?

I’ve been experimenting with using readability metrics lately (code for the below is here). They offer a very straightforward way of measuring textual difficulty, usually consisting of some ratio of sentence and word length. They date back to the work of Rudolf Flesch, who developed […]

LLCU 255: Intro to Literary Text Mining — New Syllabus 2017

Less but better. That’s the essentialist’s motto and that’s the one I use every year when I revise my syllabus. I keep removing things and students keep learning more every year. While there is clearly a ceiling for this approach, it works remarkably well as […]

Just Review, a student-led project on gender bias in book reviewing

For years, women have been aware that their books are less likely to get reviewed in the popular press and they are also less likely to serve as reviewers of books. Projects like VIDA and CWILA were started to combat this kind of exclusion. Over […]

LIWC for Literature: Releasing Data on 25,000 Documents

Increasing emphasis is being placed in the humanities on sharing data. Projects like the Open Syllabus Project, for example, have made a tremendous effort in discovering, collecting, and cleaning large amounts of data relevant to humanities research. Much of our data, however, is still locked up behind […]

