Tag: data

Hathi1M: Introducing a Million Page Historical Prose Dataset in English from the Hathi Trust

Really pleased to announce the release of a new data set that I’ve been working on with my collaborator Sunyam Bagga. In it we build on the prior work of Ted Underwood and his team to develop parallel corpora of fiction and non-fiction writing over 

Introducing the World Literature Data Collective

Welcome to the next moonshot. Together with a growing and dynamic group of researchers I am extremely proud to announce a new initiative aimed at understanding human storytelling across numerous world cultures. The goal of globalizing our understanding of storytelling has long been a dream 

Can We Be Wrong?

I have a new book out. It’s called “Can We Be Wrong? The Problem of Textual Evidence in a Time of Data.” The goal of the book is to change the terms of debate surrounding the place of computational literary analysis within the field of literary 

Where’s the data? Notes from an international forum on limited use text mining

I’m attending a two-day workshop on issues related to data access for text and data mining (TDM). We are 25 participants from different areas, including researchers who do TDM, librarians who oversee digital content, and content providers who package and sell data to academic libraries 

An Open Letter to the MLA

Dear Prof. Taylor, I am writing to you as a member of the MLA who has concerns about the practices and policies relating to the society’s data and its impact on research. This is an issue that affects many scholarly organizations. For this reason I 

Data, data, data. Why Katherine Bode’s new piece is so important and why it gets so much wrong about the field

Katherine Bode has written an excellent new piece asking us to reflect more on the data we use for computational literary studies. Her argument is that many of the current data sets available, which rely on date of first publication as a criterion for selection, 

Why your dissertation needs data

Dear Future Graduate Students, It’s that time of year to start thinking about grad school. It’s one hell of a big decision. In fact, it is, without doubt, one of the biggest decisions you will ever have to make in your life. This decision can