Introducing The Fish and the Painting: a new open access handbook-in-progress on data-driven humanities research

It’s like hitting a painting with a fish. This was the quip that a well-known British novelist made when asked about using computation to study literature. You could, but why would you?

I think it’s becoming increasingly clear to people just how many ways data is giving us insight into how literature and culture work.

What’s not totally clear is how we make this transition institutionally and pedagogically.

There are still a ton of roadblocks to adopting these methods in the humanities. Some of those are ideological (insert obvious examples), but many are simply practical. Like, how do you create courses that can help students with little to no quantitative or programming skills learn to apply these methods to studying culture? If you’re a faculty member, journalist, or independent researcher interested in studying a large collection of documents, how do you know where to begin?

This is where The Fish and the Painting comes in. I’ve been teaching a course like this for several years now and have learned many of the challenges students face, but I’ve also seen the rewards and the exciting outcomes when things finally click. Watching students become competent data scientists and seeing their thinking about culture change has been truly transformative for me as a teacher.

I don’t pretend to have all the answers about the best ways to introduce students or peers to these new methods. But my hope is that this handbook can be a useful tool to help teachers create syllabi that give students hands-on access to data-driven research or give faculty an overview of the type of thinking that goes into successful data-driven research.

This part is key: it’s not just a programming how-to guide. There are plenty of those out there. It’s meant to be a guide to thinking like a data scientist (or data humanist), which for me is still the biggest missing piece of the puzzle. How can we frame questions and think about data collection and operationalizing theoretical constructs? These are some of the key questions we need to answer first.

But it’s also meant to give readers an introduction to the practical problems surrounding accessing new programming skills. These can be challenging, and I’ve tried to introduce them in a way that is lighthearted and hopefully not too boring. But I’ve also tried to do so in a way that elaborates on the key issues and problems surrounding each step. It’s not just click and go. Finally, there will be a bunch of case studies thrown in, drawn from the latest research in the field. It’s not all hypothetical anymore. There’s great stuff happening out there that can and should inspire others.

“Access” is of course a hot topic in the academy right now, and I have been thinking more and more about how to tackle it. In this book, I try to address it in three significant ways. First, access is about format and cost. This is why I wanted to do this as an open web-based book in progress, and then as an open-access textbook after that. I want people to have as few material mediations as possible between them and the contents of the book. It won’t change the broken ecology of academic publishing, but it can hopefully move us in the right direction.

I also want people to have access to the content in the sense that they can drive its direction. I’ve turned on all the comments sections so that people can, ideally, provide feedback about bugs or missing pieces of the research puzzle. We can turn our research inside out: instead of magical things popping out of a black box, we can invert the process and reveal things as we go and, ideally, learn and be transformed in the process. The open science movement provides an amazing blueprint for a very different way of doing research in the humanities, one tied far more to principles of the visibility of knowledge that we are studying in another project.

Finally, access is also conceptual. One of the biggest issues surrounding the adoption of computational methods is that they can seem opaque to people with no training in this area. Just making something “free” doesn’t make it “accessible.” It takes a lot of work and a long time to get comfortable with a discipline’s terminology and methodologies. I’m still working at it for sure. So the book is also really geared towards facilitating conceptual access to these methods for a wider audience, whether that be advanced undergrads, grad students, or faculty peers.

We need to continue to work to break down the barriers and mythologies surrounding data-driven research. I hope The Fish and the Painting can contribute to that.