Collaboration and the Construction of Archives

Tim Lockridge's picture

If you have read Steven Levy’s (1984) Hackers: Heroes of the Computer Revolution or similar histories of the computing industry, you might know the story of Bill Gates’ “Open Letter to Hobbyists.” If not, the short version goes something like this: In 1975 a young Bill Gates struck a deal to write a BASIC programming language interpreter for the Altair 8800—a then-popular hobbyist computer. A pre-release version of that interpreter found its way to a meeting of the Homebrew Computer Club (a hobbyist group), at which point the BASIC interpreter was promptly copied and distributed. The hobbyists, as Levy describes them, were early free software advocates, and they believed that the code for the interpreter should be shared. Gates disagreed and responded with the now-famous “Open Letter to Hobbyists” in which he argued that the theft/sharing of code might prevent the development of quality software. “Without good software and an owner that understands programming,” Gates wrote, “a hobby computer is wasted. Will quality software be written for the hobby market?”

Screenshot of a github page

While Gates’ concerns remain relevant nearly 40 years later, several communities and start-ups have built systems to address the concerns of sharing, ownership, and citation. Today, for example, Github (a site for “social coding”) offers a formalized structure for the sharing of code, and the service is very much tied to identity: Github contributors have profile pictures and graphs of recent activities, and their changes to code are situated in discussion and are attributed to the author who composed and submitted them. Github, as a site, seems to argue that even (and especially) in a world of shared code and collaborative work, authorship and acknowledgment matter.

I invoke the Gates narrative not because I see code repositories like Github as the specific solution to his concerns, but rather because I think collaborative code repositories can teach us how a digital community might address a technical and historical problem. Via centralized code sharing, hobbyist (and professional) developers can collaborate, license, and gain credit for their work. Additionally, because a service like Github contains a record of each “commit” to the service, it also functions as a robust digital archive, connecting identity to a history of collaboration.

Here, I offer my question for this MediaCommons thread: What might the scholarly parallel of Github look like?

Connected to that question are, I think, two significant issues. The first, as Dave Parry has noted, is that a tremendous amount of scholarship is locked up in “knowledge cartels.” Parry writes, “If we are to imagine that what we at the university do is attempt to serve the public, then we are called upon to ask whether the current system of knowledge distribution serves that public interest or benefits a small protected group.” Like the Bill Gates of 1975, many individuals connected to scholarship (publishers, authors, and readers alike) are thinking through the professional and institutional repercussions of dramatically shifting our means of storing and transmitting research. This is a complicated issue full of challenges and stakeholders, and I know that I can’t do it justice in the space of short-form text. Yet if we are to consider the social, professional, and legal stakes of sharing online, we must begin by looking at how our own work is stored and circulated. [1]

Beyond the space of archives, however, we must also take a critical look at our tools for composing. The .doc(x) and PDF have become the default currency of the humanities, and our work isn’t documented at a granular level. Instead, revisions often run through Microsoft’s track changes and are flattened upon publication. At best, our current archives could be populated with marked-up print copies or multiple files created with the “Save As…” command. In comparison, code archives like Github are viable because the tools for composing code work in tandem with the archive. If we examine the typical tool for humanities textual production (Microsoft Word), it is no surprise that services like JSTOR or ERIC are our default archives. To imagine a different approach to sharing and archiving, we need to rethink our means of composing. [2]

I should note, of course, that there are many spaces (communities like HASTAC, multimodal/open-access publications like Kairos) that challenge the traditional norms. However, the Github model moves me to think even more dramatically about how we store and share texts. Given the right tools, a new approach to the textual archive could be grounded in collaboration, citation, and a richer understanding of revision.

Just as Gates once wondered, “Will quality software be written for the hobby market?”, I would like to ask today whether we can imagine and build a living repository of research that extends beyond the printed (or print-like) artifact.

[1]: Kathleen Fitzpatrick’s Planned Obsolescence is, I think, an important point of departure for such a conversation.

[2]: I should also note that the use of MS Word isn’t all-inclusive, and I know a number of colleagues composing in a variety of environments: Scrivener, Markdown, and LaTeX, among others. I don’t believe, however, that these tools are the norm, and there is a direct correlation between our archival problems and our tools of production.


Kristopher Purzycki's picture

Individual Responsibility and Copyleft

While typing up the copyright page of my master's thesis, I initially applied a boilerplate statement of ownership to my work. But I immediately wondered if there was a standard for opening up the use of these texts through a Creative Commons license. I changed the license to one that permits attributed sharing, in the hope that this will better spark conversations with future researchers (presuming that my work is of interest). The statement below was most resonant:

"This license opens up many possibilities in the academic world such as free online course readers, zero cost educational multimedia, gratis online tutorials-even the price of paper textbooks could be drastically reduced. Perhaps more important than cost, however, by using Creative Commons you are essentially “paying it forward” by sharing your intellectual output with the academic community because future generations of scholars will have greater access to your work."

Perhaps it already exists, but I look forward to a platform (I haven't had a chance to explore HASTAC) where publishable, rhizomatically hyperlinked collaborations flourish and new partnerships are found. This would be an incredible stage for promoting new thought and getting in touch with academic heavy-hitters.

Matthew Beale's picture

Participatory archiving

This piece encouraged me to think about digital archiving in relation to last month's topic, the maintenance of digital communities. As online collaboration becomes easier with tools such as Google Docs, it becomes more difficult to keep track of "who did what when" in the way that the 'track changes' and 'comment' options did across multiple .doc file drafts. It seems to me that the future digital archive, whatever form it takes, should emphasize the collaborative nature of scholarship rather than each individual's contributions. I couldn't say what this algorithm should look like, but ideally it would prioritize the participatory components of building scholarly work and move beyond the tracking of drafts.

Tim Lockridge's picture

Granular Archiving

I also wonder how archives and composing processes might change if our tools offered something similar to the "commit-based" system of a code repository. Revision and archiving would then be organized not around a completed "draft" but around smaller, more incremental units: a paragraph, a sentence, an outline. The history functions of tools like Google Docs and MediaWiki already offer something like this granular look, but that history is typically constrained within a timeline. Chronos frames process. (Likewise, I think it's interesting that Google Docs determines what is a "major" and "minor" edit.) Rather than relying on algorithmically generated history, I wonder how my composing would change if I had to determine when to "commit" an idea.
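
As a rough illustration only, an author-driven commit history for prose might be sketched in Python along the following lines. Everything here is an assumption made for the sake of the example: the names DraftArchive, Commit, and changed_paragraphs are hypothetical, the paragraph is arbitrarily chosen as the unit of revision, and none of this describes how Github, Google Docs, or any existing tool actually works.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    """One snapshot of a draft, recorded when the author decides to commit.

    Hypothetical structure for illustration; not modeled on a real tool.
    """
    author: str
    message: str
    paragraphs: Tuple[str, ...]
    parent: Optional[str]  # id of the previous commit, None for the first

    @property
    def commit_id(self) -> str:
        # Hash the text together with its author, message, and parent id,
        # so that an id is tied to both the writing and its history.
        payload = "\n".join(
            [self.author, self.message, self.parent or "", *self.paragraphs]
        )
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()[:10]


@dataclass
class DraftArchive:
    """An append-only list of commits for a single piece of writing."""
    commits: List[Commit] = field(default_factory=list)

    def commit(self, author: str, message: str, text: str) -> Commit:
        # Assumption: paragraphs are separated by blank lines.
        paragraphs = tuple(p.strip() for p in text.split("\n\n") if p.strip())
        parent = self.commits[-1].commit_id if self.commits else None
        entry = Commit(author, message, paragraphs, parent)
        self.commits.append(entry)
        return entry

    def changed_paragraphs(self, index: int) -> List[str]:
        """Paragraphs present in commit `index` but not in the one before it."""
        current = self.commits[index].paragraphs
        previous = self.commits[index - 1].paragraphs if index > 0 else ()
        return [p for p in current if p not in previous]

    def log(self) -> List[Tuple[str, str, str]]:
        """A record of who committed what, in order."""
        return [(c.commit_id, c.author, c.message) for c in self.commits]


archive = DraftArchive()
archive.commit(
    "Tim",
    "Outline the argument",
    "Gates and the hobbyists.\n\nWhat would a scholarly Github look like?",
)
archive.commit(
    "Kristopher",
    "Add a note on licensing",
    "Gates and the hobbyists.\n\nWhat would a scholarly Github look like?"
    "\n\nCreative Commons as a default.",
)

for commit_id, author, message in archive.log():
    print(commit_id, author, message)
print(archive.changed_paragraphs(1))
```

The point of the sketch is that nothing is recorded until a writer calls commit with a message, so the history reflects decisions about when an idea is ready rather than a timeline of keystrokes, and each entry keeps the contributor's name attached to the change.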