Tag: text mining

Gettin’ into GitHub

One of my New Year’s resolutions was to get stuff off my computer and onto GitHub. I know I’m late to the party, but better late than never. GitHub is an amazing resource where researchers all over the world share code. When we talk about 

The difference Queer FanFic makes

Fanfic isn’t all about sex. It’s about connection. Two students in our lab, Nikoo Sarraf and Jennifer Chen, have a new lab collaboration paper out that explores the way queer fanfiction differs from mainstream publishing. As they write in their introduction: Fanfiction is a powerful 

A Sense of an Ending: Poetry and Periods

As part of my ongoing fascination with punctuation, in Enumerations I look at the words that are most likely to be followed by a period in a collection of 75,000 twentieth-century poems. What we see are the very pronounced ways that poems tend to end 

Where’s the data? Notes from an international forum on limited use text mining

I’m attending a two-day workshop on issues related to data access for text and data mining (TDM). We are 25 participants from different areas, including researchers who do TDM, librarians who oversee digital content, and content providers who package and sell data to academic libraries 

The Replication Crisis I: Restoring confidence in research through replication clusters

Much has been written about the so-called “replication crisis” going on across the sciences today. There are many ways that these issues impact literary and cultural studies, but not always in the most straightforward way. “Replication” has a complicated fit with more interpretive disciplines and 

Are novels getting easier to read?

I’ve been experimenting with using readability metrics lately (code for the below is here). They offer a very straightforward way of measuring textual difficulty, usually consisting of some ratio of sentence and word length. They date back to the work of Rudolf Flesch, who developed 
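As a rough illustration of the kind of ratio these metrics rely on, here is a minimal Python sketch of the Flesch Reading Ease score, which combines average sentence length with average syllables per word. It is not the code linked in the post: the function names, the regex-based tokenization, and especially the vowel-group syllable counter are assumptions made for illustration, and a real study would likely want a more careful tokenizer and syllable dictionary.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count runs of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent (e.g. "time"), except in "-le" (e.g. "table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(round(flesch_reading_ease(sample), 3))
```

Short sentences made of short words score high, while long sentences full of polysyllabic words score low, so comparing scores across novels from different decades is one simple way to ask whether difficulty has shifted over time.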

LLCU 255: Intro to Literary Text Mining — New Syllabus 2017

Less but better. That’s the essentialist’s motto and that’s the one I use every year when I revise my syllabus. I keep removing things and students keep learning more every year. While there is clearly a ceiling for this approach, it works remarkably well as