Introducing the World Literature Data Collective

Welcome to the next moonshot. Together with a growing and dynamic group of researchers, I am extremely proud to announce a new initiative aimed at understanding human storytelling across numerous world cultures.

The goal of globalizing our understanding of storytelling has long been a dream of literary studies. And yet the development of what has come to be known as “World Literature” has been hampered by some severe limitations. First, there are a lot of human languages! No one researcher can command even a small fraction of the total. Second, there are a lot of stories! No one researcher can command even a fraction of the stories written in a single language during a single period. Third and finally, there is no standard corpus, collection or dataset of world literature. There is no agreed-upon definition, let alone a list of the books that should be included or studied.

This is where the World Literature Data Collective (WLDC) comes in. We are a volunteer collective that pools expertise in different languages to digitize small collections of stories across a large number of languages and cultures. Our goal is to model World Literature as the set of regional literatures across the world, thereby promoting a comparative approach to understanding human storytelling.

You can read all about the collective, as well as the criteria for getting involved, at our new website. We’re looking for experts in different regional literatures to help us select books, and we will bear the costs of digitizing them. We are also looking for experts in NLP to help build analytical systems across multiple languages.

Together we can create a diverse data set to study human storytelling in a more comparative, multilingual way.