Introducing the CONLIT dataset of contemporary literature

Excited to announce the release of a new dataset curated by my lab. Special thanks go to Joey Love and Eve Kraicer for their work in helping bring this to fruition. This dataset includes derived data on a collection of ca. 2,700 books in 

Hathi1M: Introducing a Million Page Historical Prose Dataset in English from the Hathi Trust

Really pleased to announce the release of a new dataset that I’ve been working on with my collaborator Sunyam Bagga. In it we build on the prior work of Ted Underwood and his team to develop parallel corpora of fiction and non-fiction writing over