SEASR Grant | Matthew L. Jockers

This month a group of researchers at Stanford, the University of Illinois, the University of Maryland, and George Mason was awarded a $790,000 grant from the Mellon Foundation to advance the prior work of the SEASR project. I’ll be serving as the overall Project Director and as one of the researchers in the Stanford component of the grant. In this phase of the SEASR project, we will focus on leveraging the existing SEASR infrastructure in support of four “use cases.” But “use case” hardly describes the research-intensive nature of the proposed work, nor does it capture its strongly humanistic bias. Each partner has committed to a specific research project, and each has the expressed goal of advancing humanities research and publishing the results. I’d like to emphasize this point about advancing humanities research.

This grant represents an important step beyond the tool building, QA, and UI testing stages of software development. All too often, it seems, our digital humanities projects devote a great deal of time, money, and labor to infrastructure and prototyping, and then the results languish in the great sea of hammers without a nail. Sure, a few journeymen carpenters stick these tools in their belts and hammer away, but more effort seems to go into building the tools than into using them, and the resources sit around gathering dust while humanities research marches on in the time-tested modes with which we are most familiar.

Of course, I don’t mean this to be a criticism of the tool builders or the tools built. The TAPOR project, for example, offers many useful text analysis widgets, and I frequently send my colleagues and students there for quick and dirty text analysis. And just last month I had occasion to use and cite Stéfan Sinclair’s Voyeur application. I was thrilled to have Voyeur at my fingertips; it provided a quick and easy way to do exactly what I wanted.

But often, the analytic tasks involved in our projects are multifaceted and cannot be addressed by any one tool. Instead, these projects involve “flows” in which our “humanistic” data travels through a series of analytic “filters” and comes out on the other end in some altered form. The TAPOR project attempts to be a virtual text analysis “workbench” in which the craftsman can slide a project around the bench from one tool to the next. This model works well for smallish projects but is not robust enough for large-scale projects and, despite some significant interface improvements over the years, remains, for me at least, a bit clunky. I find it great for quick tasks with one or two texts, but inefficient for processing multiple texts or running multiple processes. Part of the TAPOR mission was to develop a suite of tools that could be used by the average, ordinary humanist, which is to say, the humanist without any real technical chops. It succeeds on that front, to be sure.

SEASR offers an alternative approach, and what it provides in terms of processing power and computational elegance it gives up in terms of ease of use and transparency. The SEASR “interface” is one that involves constructing modular “workflows” in which each module corresponds to some computational task. These modules are linked together such that one process feeds into the next, and the business of “sliding” a project around from one tool to another on the virtual workbench is taken over by the workflow manager.
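For readers who think in code, here is a minimal sketch of that idea in Python. It is not SEASR's actual API: the module names (`tokenize`, `drop_stopwords`, `count_terms`), the tiny stopword list, and the `run_workflow` helper are all hypothetical, invented only to show modules chained so that the output of one becomes the input of the next.

```python
from collections import Counter
from functools import reduce
from typing import Any, Callable, List

# A "module" here is just a function that takes one input and returns one output.
# (Hypothetical type alias; not part of SEASR.)
Module = Callable[[Any], Any]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def drop_stopwords(tokens: List[str]) -> List[str]:
    """Remove a few very common words (illustrative stopword list only)."""
    stopwords = {"the", "a", "of", "and"}
    return [t for t in tokens if t not in stopwords]


def count_terms(tokens: List[str]) -> Counter:
    """Count how many times each remaining token occurs."""
    return Counter(tokens)


def run_workflow(modules: List[Module], data: Any) -> Any:
    """Play the role of the workflow manager: feed each module's
    output into the next module, in order."""
    return reduce(lambda value, module: module(value), modules, data)


# Link three modules into one workflow and run a text through it.
workflow = [tokenize, drop_stopwords, count_terms]
result = run_workflow(workflow, "The whale and the sea and the whale")
print(result)
```

The point of the sketch is the division of labor: each function does one task, and the "sliding around the bench" happens inside `run_workflow` instead of being done by hand.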

In this grant we have specifically deemphasized UI development in favor of output, in favor of “results” in the humanities sense of the word. As we write in the proposal, “The main emphasis of the project will be on developing, coordinating, and investigating the research questions posed by the participating humanities scholars.” The scholars in this project include Franco Moretti and me at Stanford, Dan Cohen at GMU, Tanya Clement at the University of Maryland, and Ted Underwood and John Unsworth, both of UIUC. On the technical end, we have Michael Welge and Loretta Auvil of the Automated Learning Group at the National Center for Supercomputing Applications.

As the project gets rolling, I will have more to post about the specific research questions we are each addressing and the ongoing results of our work. . .