Building Digital Classics

Contributed by Hugh Cayless, Programmer/Analyst at NYU Digital Libraries Technology Services
May 07, 2011
Part of the Cluster: Making Room

Digital Classics

“Digital Classics” is not a recognized sub-discipline of classics; it is more of an underground movement. The Digital Classicist site deliberately avoids defining it, though it could be characterized as the application of digital methods, tools, and “ways of knowing” to the field of classics, broadly understood.

The Digital Divide in Classics

Classics has the appearance to outsiders of being an extremely "plugged in" discipline, thanks to some very high-profile centers and projects. The truth is more complex, and much more problematic for the digital future of the discipline. The distribution of technical expertise and interest is extremely uneven in classics. There are a few practitioners who are far (decades) ahead of everyone else, but the typical attitude toward digital methods as ways of knowing in the discipline is downright hostile in many quarters. Mainstream classics is largely concerned with literary criticism, and while there are sub-disciplines where digital methods are a better fit, the focus of the discipline has long since turned away from "discovery" towards "insight". Put another way, much modern classical scholarship works mainly on the interpretation of texts, and such research (while it benefits greatly from searchable digital copies of the canon) does not require or have much use for digital methods.[1] So you have a situation where digital praxis meshes very well with a praxis that has been, since the 1980s at least, rather deprecated in mainstream classics. Much of the work necessary to implement digital research projects looks like "Lower Criticism" to established practitioners.[2] The practical upshot of this situation is that mainstream scholars working in classics don't need to engage very deeply with the digital world in order to do their work, and indeed, focusing on digital praxis might well be seen by one's peers to detract from "real" research. This situation is perhaps made paradoxically worse by the highly visible successes some digital classics projects have had. Since it is possible for a small number of people (and many of those traditional classics faculty) to make a great deal of progress without much engagement from the community, that community can wait to reap the benefits of their research and development without much pressure to engage with it.

The typical classicist is quite happy to use tools that are built (or are offshoots of research) by the digital few, but probably does not view that kind of effort or experimentation as real “research.” As a result, graduate students interested in digital classics are discouraged from that sort of work. They do not go on to become classics faculty, though alternate-academic careers are certainly open to them. I don't know of anyone of my generation who had a digital classics focus in graduate school and who now has a faculty job. We all have #alt-ac or non-academic careers. Alumni of the Perseus Project (one of the flagship centers of digital classics) seem to go into computer science instead of classics, which is telling.

This kind of tension between the traditional humanities and digital humanities is far from unusual. It exists across many disciplines, but the relatively small size of classics means that there are fewer spaces for "liminal" people, like digital classicists, to exist in a traditional setting. It is becoming ordinary to find faculty in English or History with a digital orientation, but this is much rarer in classics. Most of the innovative digital work in classics comes out of centers rather than departments, places like Perseus at Tufts, the Harvard Center for Hellenic Studies, and the Institute for the Study of the Ancient World at NYU. Because this work is concentrated around a few individuals, there is inherent instability. The death of Ross Scaife in 2008, for example, probably set digital classics back by 5-10 years. One reason for the primacy of centers in digital classics is that the majority of research in classics very much fits the "lone scholar in an ivory tower" mold. Collaboration is a rarity, and since digital humanities projects are nearly always collaborative in nature, they do not fit the mode of work classicists are used to. Perhaps centers lack the cultural baggage that impedes the adoption of collaborative work in the departments.

There are sub- and related disciplines of classics where the digital has found a foothold. In classical archaeology, for example, digital techniques are manifestly useful (and collaboration is inevitable); epigraphy and papyrology have long relied on databases as research tools and are clear-headed enough to realize that the creation of these requires people comfortable with both scholarship and technology. My day job involves working on digital papyrology, and the reception the project is getting from the community seems universally positive.

My own perspective should be made clear here: I gave up on pursuing a faculty position after I earned my Ph.D. for a variety of reasons, some of them the usual ones: I didn't want to move around the country randomly for a few years, I was burned out and frankly having trouble thinking of what to do next, research-wise, and I was ready to settle down and start a family. Some of them were probably less usual: I'm a hacker. I have an engineer's perspective on theory: present me with a research problem, and rather than theorize about it, I'm immediately going to start figuring out methods to solve it. If there were digital classics positions out there that I could have applied for, I probably would have. So my perspective should be taken with a grain of salt: the academy didn't have a place for me to do exactly what I wanted to do. I'm a frustrated outsider looking in and wishing I could do all sorts of research that my job as a programmer doesn't give me scope for. On the other hand, I love what I do: I get to hack on ancient texts all day, and I have more ideas than I know what to do with. Whatever regrets I might have for not going after that classics faculty job, I think I'm much better off.

So what should a current graduate student interested in digital classics do? The first thing I'd say is that you should feel free to tilt at the windmills; just realize that that is what you're doing. Be aware that if you really want to use digital methods in your work, you'll have to justify their usefulness to people who don't understand them (and might be actively hostile to them). Know that you'll be facing the same reaction when search committees read your sample dissertation chapter. But also be aware that there's life outside classics: there are plenty of careers where your digital skills will be useful (some of them even involving doing digital classics) and you will have something many of your peers lack: a good fallback position. To put it in perspective: of my (small) year of matriculating classics grad students, only one of us works as a professor, and not on the tenure track. Your chances of success were slim to start with.

This all sounds a bit glum, and I don’t really mean to be. Upcoming classicists who want their primary focus to be the application of digital tools and methods to classical philology will have a hard time getting jobs in classics departments until the current scholarly fashion loosens its sway. This may take another generation or two, but it will happen. And that doesn't mean they can't be productive members of the academy. Classicists are particularly well-suited to becoming digital humanists. They have already faced the tough intellectual battle of mastering ancient languages, so Java or Python need hold no terror for them. They will already have confronted complex issues of character encoding, just to be able to type in Greek. And since classics is inherently multidisciplinary, they will be used to applying knowledge from multiple domains in the pursuit of solutions to scholarly questions.
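To make the point about character encoding concrete, here is a minimal sketch in Python (the choice of language and the helper name same_text are assumptions made for illustration, not anything prescribed above). It shows why polytonic Greek is tricky: the same visible letter can be stored either as one precomposed code point or as a base letter followed by combining marks, and the two forms do not compare as equal until they are normalized.

```python
import unicodedata

# The letter alpha with smooth breathing and acute accent, stored two ways.
precomposed = "\u1f04"              # one code point: GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA
decomposed = "\u03b1\u0313\u0301"   # alpha + combining comma above + combining acute accent


def same_text(a, b):
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two strings look identical on screen but differ code point by code point.
looks_equal_raw = precomposed == decomposed            # False
looks_equal_normalized = same_text(precomposed, decomposed)  # True
```

Normalizing everything to one form before searching or comparing is one common way to avoid missed matches in a corpus typed by different people on different keyboards.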

So what skills should a DH developer possess?

Becoming a Digital Humanities Developer

I will start by saying that I don’t believe there is any particular single “entry card” into DH development. Projects are built in many languages and on many platforms. To say “you must learn Java,” or PHP, or XML, in order to become a DH developer is just wrong. What I would say, however, is that flexibility and the ability quickly to acquire new skills are crucial. What will help more than anything is to learn more than one programming language. Classicists tend to have a facility with human languages, since a Ph.D. in the subject typically requires acquiring at least a pair of ancient languages (usually Greek and Latin) and a pair of modern languages (French and German are the usual suspects). Programming languages are not human languages, but the process of learning to understand and use them is not wholly dissimilar, particularly since a graduate student is usually interested in learning to read, rather than speak, these languages.

While I don’t believe it really matters where one starts, some advice on where to begin is likely to be helpful. It is very easy to feel overwhelmed and give up when confronted with the universe of possibilities in programming. The best thing to do is to pick a project, and figure out what you need to learn in order to make it happen. You will be helped in this by the realization that you don’t need to know everything about a language or piece of software in order to make it work. A lot of programming involves making different software packages work together. For all the arguments about the superiority of one language over another, most programming languages are broadly equivalent in their capabilities, though there are certainly differences in syntax and approach. Acquiring your first programming language will be the hardest task. Look at what’s available in your environment. If you want to do a web-facing project, you might have one or more of PHP, Ruby, or Python available to you. Ruby and Python are, in my opinion, better designed languages, but PHP is easy to set up, and is perhaps still slightly more likely to be available in a standard web server setup. It takes about a year to become “fluent” in a language. When you begin, you will be constantly looking at a language reference, and will frequently know what you want to do, but not how to look it up. It is just like learning a new human language, when you must constantly resort to the dictionary when reading, or know what you want to say, but not how to say it idiomatically. Plough through it. Fluency will come.

Once you’ve mastered a language, you should try learning another. If you’re doing web development, you may not have a choice: you will likely have to learn some JavaScript, for example, while you’re developing your PHP project. It is easy to fall into a programming rut once you see how much stuff you can do with your new language. You will feel proud of your abilities, and will resent it when other programmers of different languages question your language’s syntax, power, speed, etc. Languages are tools, like screwdrivers and hammers are tools. Both screwdrivers and hammers are great at what they are designed to do, but neither of them works very well in the other’s domain. Programming languages, and software tools in general, are more flexible than most physical tools, but they are not universal in their application. Once you’ve learned PHP, you will be able to see how you would do nearly anything in PHP. But PHP may not be the best tool for the job. You won’t learn how to choose a better tool until you acquire some breadth. So pick another language to learn. Something different. If you learned PHP first, maybe pick Java next. The same process will apply, but you will acquire some facility with the language faster than you did the first time. After you’ve learned a couple of languages, you’ll find you can pick up a new one in a day (though fluency still takes time).

Classics is largely concerned with text, and there is a strong tradition of using Text Encoding Initiative (TEI) XML markup to encode those texts for publication. So a good place to engage with DH for a budding digital classicist might be to work with the texts published by (for example) the Perseus Project, or papyri.info. Looking at XML is a good way to unpack some of the complexity involved in DH development. Any language you choose to work with will have decent to great support for XML processing, but the picture is complicated by the nature of the medium. XML by itself is merely a way to mark up a document, whether for publication or to serve as a kind of database that you can ask questions of. For publication, your targets will probably be HTML or PDF. In order to convert XML to something like HTML, the most common method is to use XSLT (Extensible Stylesheet Language Transformations), which is a programming language in its own right. For asking questions of a corpus of XML documents, there is another language, called XQuery. The most widely used XML tooling for PHP, Ruby, and Python relies on a library written in C called libxml2, and there is an associated library called libxslt that handles XSLT. While these are excellent, they support only the 1.0 version of XSLT, and XQuery not at all. The only Open Source implementations of XSLT 2.0 run on Java and the .NET platform. So if you want to use the latest technology, you have to use something based either on Microsoft’s proprietary development platform or on the Java Virtual Machine. Interestingly, this does not necessarily mean you can’t use your favorite language: there are Python and Ruby implementations for .NET and the JVM. Even with this flexibility, however, you are looking at learning TEI, XSLT, and another programming language as a basis for doing fairly simple web publishing of texts. For something more complex, you might have to learn XQuery and how to run an XML database, like eXist.
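As a rough illustration of what working with a TEI text can look like before you commit to XSLT or XQuery, here is a minimal sketch. It deliberately swaps in a different technique: Python's standard-library ElementTree module rather than XSLT, XQuery, or the libxml2-based tooling discussed above. The sample document is invented for illustration (it borrows the first two lines of the Aeneid and is far smaller than a real Perseus or papyri.info file), and the function names numbered_lines and lines_to_html are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# TEI P5 documents put their elements in this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# An invented, heavily simplified TEI document (an assumption for illustration).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample text</title></titleStmt>
      <publicationStmt><p>Unpublished sample</p></publicationStmt>
      <sourceDesc><p>Invented for illustration</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="edition">
        <l n="1">Arma virumque cano, Troiae qui primus ab oris</l>
        <l n="2">Italiam fato profugus Laviniaque venit</l>
      </div>
    </body>
  </text>
</TEI>"""


def numbered_lines(tei_xml):
    """Return (line number, text) pairs for every TEI <l> element."""
    root = ET.fromstring(tei_xml)
    return [
        (line.get("n", ""), "".join(line.itertext()).strip())
        for line in root.iterfind(".//tei:l", TEI_NS)
    ]


def lines_to_html(lines):
    """Render the pairs as a small HTML ordered list, escaping the text."""
    items = "\n".join(
        f'  <li value="{html.escape(n)}">{html.escape(text)}</li>'
        for n, text in lines
    )
    return f"<ol>\n{items}\n</ol>"


if __name__ == "__main__":
    print(lines_to_html(numbered_lines(SAMPLE)))
```

This covers both uses named above in miniature: asking a question of the document (which lines does it contain?) and producing HTML from it. For anything beyond a toy, a real XSLT stylesheet or an XML database is likely to be the better tool, which is the point of the paragraph above.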

One of the most useful qualities a DH developer can possess is one that is nurtured by the grad school experience: being able quickly to pick up enough knowledge about an unfamiliar subject to do something useful in it. I’m certainly not downplaying the value of expertise and depth, but if you need, for example, to acquire enough background in Hellenistic Philosophy to write a paper for a class in two weeks, or present on it to your colleagues, then you do it. This is not a dissimilar intellectual activity from learning enough about a piece of software to modify it for your own purposes—not to thoroughly understand all its depths—just enough to get something done. Study in classics certainly helps develop the mindset you need in order to do this, and it also makes one familiar with the experience of tackling and mastering subjects that are genuinely hard.

This ability will help you with our hypothetical XML project: you don’t need to learn all of XSLT, because there are already stylesheets out there on the web that you can pick up and customize. You need only figure out what you need to change. You will need to learn how to install, configure, and query an XML database, but you won’t have to write one from scratch. If you are using a language you’re familiar with on the JVM, such as JRuby, you will need to learn how to get it running, and read Java API documentation to find out what libraries to call, and how to call on them, but you needn’t necessarily become a Java expert. Being good, as graduate students have to become, at acquiring sufficient knowledge quickly, is a tremendous boon.

Since classics tends to deprecate the development (though not the use) of digital tools and methods, acquiring programming skills in graduate school may be difficult. I did it in two ways. First, I was already a hacker (in a small way). I learned to program in BASIC on my first computer, when I was twelve. But I really cut my digital classics teeth on building an elaborate HyperCard Greek and Latin flashcard system, starting in my senior year as an undergraduate, and continuing through graduate school. I used it to study for my MA comprehensive exams. Second, and later, I worked as a technical trainer for the university’s IT organization, did technical support for the Psychology department, built databases for various departments on campus and for the National Humanities Center Library, and worked on image databases for the classics and history departments. The latter project led to a job working for the College of Arts and Sciences as an “Academic Applications Developer.” Formal programming classes may or may not help you very much. I’ve certainly found them helpful, but not life-changing.

My digital side projects and jobs brought in a good deal of income during the latter part of my graduate career, enough that I did not have to go into debt to finance my Ph.D. On the other hand, they probably did affect my focus somewhat. Given the way my subsequent career developed, it was clearly a good thing that I spent so much time on digital projects, but I have to acknowledge that that focus probably pushed me away from a career as a classics professor.

Being a Digital Humanities Developer

Working for Libraries

In general, libraries are a really good place to work as a programmer, especially doing DH projects. I've spent the last three years working in digital library programming groups. There are some downsides to be aware of: libraries can be very hierarchical organizations, and if you are not a librarian then you are probably in a lower "caste." You will likely not get consistent (or perhaps any) support for professional development, conference attendance, etc. Librarians, as faculty, have professional development requirements as part of their jobs. You, whose professional development is not mandated by the organization (merely something you have to do if you want to stay current and advance your career), may not get an adequate level of support and may not get any credit for publishing articles, giving papers, etc. This is infuriating when it happens, and is in my opinion self-defeating on the part of the institution, but it is an unfortunate fact.

There do exist librarian/developer jobs, and this would be a substantially better situation from a professional standpoint, but since librarian jobs typically require a Master's degree in Library and/or Information Science, libraries may make the calculation that they would be excluding perfectly good programmers from the job pool by putting that sort of requirement in. These are not terribly onerous programs on the whole, should you want to get an MLIS degree, but it does mean obtaining another credential.

It's not all bad though: in a lot of ways, being a DH developer in a library is a DH developer's nirvana. You will typically have a lot of freedom, loose deadlines, shorter than average work-weeks, and the opportunity to apply your skills to really interesting and hard problems. If you want to continue to pursue your academic interests however, you'll be doing it as a hobby. Many libraries don't want your research agenda unless you're a librarian.

Working for a .edu IT Organization

My first full-time, permanent position post-Ph.D. was working for an IT organization that supports the College of Arts and Sciences at UNC Chapel Hill. I was one of a handful of programmers who did various kinds of administrative and faculty project support. It was a really good environment to work in. I got to try out new technologies, learned Java, truly understood XSLT for the first time, got good at web development, and had a lot of fun. I also learned to fear unfunded mandates, that projects without institutional support are doomed, and that if you're the last line of support for a web application, you'd better get good at making it scale.

IT organizations typically pay a bit better than, say, libraries and since it's an IT organization they actually understand technology and what it takes to build systems. There's less sense of being the odd man out in the organization. That said, if you're the academic/DH applications developer it's really easy to get overextended, and I did a bad job of avoiding that fate, "learning by suffering" as Aeschylus wrote.

Working in Industry

Working outside academia as a developer is a whole other world. Again, DH work is likely to have to be a hobby, but depending on where you work, it may be a relevant hobby. You will be paid (much) more, will probably have a budget for professional development, and may be able to use it for things such as attending DH conferences. Downsides are that you'll probably work longer hours and you'll have less freedom to choose what you do and how you do it, because you're working for an organization that has to make money. The capitalist imperative may strike you as distasteful if you've spent years in academia, but in fact it is a wonderful feedback mechanism. Doing things the right way (in general) makes the organization money, and doing them wrong (again, in general) doesn't. It can make decision-making marvellously straightforward. Companies, particularly small ones, can make decisions with a speed that seems bewilderingly quick when compared to libraries, which thrive on committees and meetings and change direction with all the flexibility of a supertanker.

Another advantage of working in industry is that you are more likely to be part of a team, all working on the same stuff. In DH we tend to only be able to assign one or two developers to a job. You will likely be the lone wolf on a project at some point in your career. Companies have money, and they want to get things done, so they hire teams of developers. Being on a team like this is nice, and I often miss it.

There are lots of companies that work in areas you may be interested in as someone with a DH background, including the semantic web, text mining, linked data, and digital publishing. In my opinion, working on DH projects is great preparation for a career outside academia.

Funding

As a DH developer, you will more likely than not end up working on grant-funded projects, where your salary is paid with "soft money." What this means in practical terms is that your funding will expire at a certain date. This can be good. It's not uncommon for programmers to change jobs every couple of years anyway, so a time-limited position gives you a free pass at job-switching without being accused of job-hopping. If you work for an organization that's good at attracting funding, then it's quite possible to string projects together and/or combine them. However, there can be institutional impedance mismatch problems here, in that it might be hard to renew a time-limited position, or to convert it to a permanent job without re-opening it for new applicants, or to fill in the gaps between funding cycles. So some institutions have a hard time mapping funding streams onto people efficiently. These institutions aren't too hard to spot because they go through "boom and bust" cycles, staffing up to meet demand and then losing everybody when the funding is gone. This doesn't mean "don't apply for this job": just do it with your eyes open. Don't go in with the expectation (or even much hope) that it will turn into a permanent position. Learn what you can and move on. The upside is that these are often great learning opportunities.

In sum, being a DH developer is very rewarding. But I'm not sure it's a stable career path in most cases, which, if nothing else, is a shame for DH as a "discipline." It would be nice if there were more senior positions for DH "makers" as well as "thinkers" (not that those categories are mutually exclusive). I suspect that the institutions that have figured this out will win the lion's share of DH funding in the future, because their brain trusts will just get better and better. The ideal situation (and what you should look for when you aim to settle down) is a place

  • that has a good track record of getting funded,
  • where developers are first-class members of the organization (i.e. have "researcher" or similar status),
  • where there's a team in place and it's not just you, and
  • where there's some evidence of long-range planning.

Programming is often viewed as a young person’s game, though there are many examples that refute the stereotype. It remains to be seen whether DH development is for junior people only. The limiting factors—dependency on funding cycles, lack of institutional support, and lack of status—may tend to push away senior developers. There is some danger that DH development could become just another way in which the academy exploits graduate labor. It will be a shame if that becomes an established pattern. The antidote is almost certainly the digital humanities center, with its ability to bring together and support faculty, researchers, and developers alike. In the future, academic departments in the humanities may develop in the way their counterparts in the sciences have, and encompass, alongside full-time faculty, staff who can support (or even lead) research efforts.

  • 1. See W. R. Connor, "Scholarship and Technology in Classical Studies", in Scholarship and Technology in the Humanities, Mary Katzen ed. (1990), for some insightful remarks on the reaction of classics to the new possibilities of digital research.
  • 2. Jerome McGann’s analysis in "Our Textual History", Times Literary Supplement No. 5564 (20 November 2009): 13-15, while not directed at classics, provides a useful update to Connor.