Great news for EMiC scholars: The MLA has released its guidelines for evaluating digital scholarship. I encourage everyone to look through this important document as a way of thinking about your own projects. http://www.mla.org/guidelines_evaluation_digital. Does your individual project meet these basic guidelines? How can EMiC help you attain the goals set forth in this document?
In other news: in the next few weeks we’ll be launching a new “resources” page put together by Kaarina Mikalson. We’ll make sure these guidelines are a part of this new resource.
I have been a Postdoctoral Research Fellow with Editing Modernism in Canada for just over a year now, so it gives me great pleasure at this midpoint in my position to announce two major partnership agreements signed last week. First, EMiC has finalized its contract with Islandora at the University of Prince Edward Island to build our very own Digital Humanities module. Second, EMiC has partnered with another DH project with which I am involved: The Modernist Versions Project. Both partnerships promise to provide resources, training, and infrastructure not only to EMiC scholars, but to the DH community as a whole.
1. Integrated Digital Humanities Environments: Islandora
Anyone who has been in DH for a while knows that there is a long history of tool-creation for our scholarly endeavours. Some of these projects have been successful (The Versioning Machine, Omeka, etc.), and some, unfortunately, have not. One “problem” we face as DH’ers is that there is simply so much to do. Some of us are interested in visualization software and network relations (Proust Archive), some are interested in preserving disintegrating archives (Modernist Journals Project), and others of us are firmly rooted in TEI and textual markup. Moreover, with the growth of GIS software, mapping texts has become a great way to have students interact with texts in spatial terms and to communicate with a non-academic public using a language most of us are familiar with: maps.
But what happens in DH when we move into the classroom?
I recently read a stunning syllabus created by Brian Croxall at Emory University, in which he provides his students with a solid (and diverse) introduction to the Digital Humanities. But one challenge that researchers and teachers like Brian, or any other DH’er, face is providing students with integrated learning environments where they can edit texts in a common repository AND have all the tools they need at their disposal in the browser. If you want to teach TEI right now, you have to buy Oxygen (a life-saving program when it comes to XML markup); for versioning, you must install Juxta or The Versioning Machine; for publication/exhibition, you must install Omeka. But what if we had ALL of those things in one learning environment, in one common and open system? This is what we’re trying to accomplish with the EMiC Digital Humanities Sprout.
EMiC Digital Humanities Sprout
An issue EMiC faces in providing tools for our researchers is the sheer diversity of work being undertaken right now by EMiC scholars who have varying levels of experience with digital environments. EMiC needed to find a way to allow its members to preserve, edit, and publish digital editions of archival material in an intuitive way, and we wanted to make sure our archival practices conformed to international standards. Moreover, most of us are teachers too. How do we teach our students what we are doing in our research? Enter Islandora.
Islandora
Nine months ago, I Googled the phrase “TEI, ABBYY, XSLT” on a whim (actually, I was being lazy: I was looking for an XSLT sheet that would transform ABBYY HTML to simple TEI). The first result listed was a page from the University of Prince Edward Island, just down the road, so to speak. Not knowing much about Prince Edward Island outside of L. M. Montgomery, I kept browsing and, to my amazement, found that the library at UPEI had created a project called “Island Lives,” a resource developed using the home-grown Islandora digital repository. Mark Leggott, Donald Moses, and others had built precisely what I was looking for: a digital asset management system using a Fedora Commons repository wrapped in a Drupal shell. Islandora allows users to easily upload an image of text to its database, edit that image (TEI), and then “publish” a complete text (book, pamphlet, etc.) to the web. Dean Irvine and I realized that if we could expand this system to fit EMiC’s needs, we could create a Digital Humanities module that would serve our members perfectly. We decided to focus on the core issues facing EMiC editors: Ingestion (including OCR based on Tesseract), Image Markup, TEI editing, Versioning, and Publication (for the full list of what we’re building, see below*). Moreover, Islandora is tried and true and is being used by NASA and the Smithsonian, among many other institutions.
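For readers curious about what "HTML to simple TEI" involves, here is a minimal sketch in Python rather than XSLT. It is not the Islandora workflow or the stylesheet I was searching for. It assumes, purely for illustration, that the OCR output is an HTML page whose text sits in `<p>` elements, and it wraps each paragraph in a bare-bones TEI document. The function name `html_to_tei` and the placeholder header text are my own inventions.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # the standard TEI P5 namespace


class ParagraphCollector(HTMLParser):
    """Collect the whitespace-normalized text of each <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None means we are currently outside a <p>

    def _flush(self):
        # Store the buffered paragraph, skipping ones that are empty.
        if self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # HTML allows <p> without a closing tag
            self._buffer = []
        elif tag == "br" and self._buffer is not None:
            self._buffer.append(" ")  # treat a line break as a space

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def html_to_tei(html, title):
    """Return a minimal TEI document (as a string) built from HTML paragraphs."""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    collector._flush()

    def q(name):
        return f"{{{TEI_NS}}}{name}"

    ET.register_namespace("", TEI_NS)
    tei = ET.Element(q("TEI"))
    file_desc = ET.SubElement(ET.SubElement(tei, q("teiHeader")), q("fileDesc"))
    ET.SubElement(ET.SubElement(file_desc, q("titleStmt")), q("title")).text = title
    # Placeholder header text: a real edition would describe its source here.
    ET.SubElement(ET.SubElement(file_desc, q("publicationStmt")), q("p")).text = "Unpublished draft."
    ET.SubElement(ET.SubElement(file_desc, q("sourceDesc")), q("p")).text = "Converted from OCR HTML."
    body = ET.SubElement(ET.SubElement(tei, q("text")), q("body"))
    for paragraph in collector.paragraphs:
        ET.SubElement(body, q("p")).text = paragraph
    return ET.tostring(tei, encoding="unicode")


if __name__ == "__main__":
    page = "<html><body><p>A first paragraph.</p><p>A second one.</p></body></html>"
    print(html_to_tei(page, "Sample page"))
```

Real OCR output carries far more structure (columns, page breaks, font changes) than this sketch preserves, which is exactly why an integrated ingestion workflow is attractive.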
Thank You, DH.
We have years of successful work to emulate for this DH module. And just as the DH community has given to us, we expect to give back to the DH community by keeping the DH module open to use. Yes, we plan on creating an EMiC/Islandora DH install that you can download and use in your classrooms.
If you’re interested in what we’re building, please email Dean Irvine or Matt Huculak with your questions.
As part of this initiative, I have moved to Prince Edward Island to work with the Islandora crew as we develop this module. There’s some other news about what I’ll be digitizing there to “test” our system, but you’ll have to wait to hear about that. In the meantime, we are planning to unveil our functioning module at DHSI 2012.
2. Modernist Versions Project
If you haven’t been to the Digital Humanities Summer Institute hosted by Ray Siemens at the University of Victoria, do plan on going! It is an incredible week of DH training, and it is one of the most memorable “unconferences” I have ever attended. One wonderful result of this year’s camp was the creation of the Modernist Versions Project (MVP), an international initiative to provide online resources for the editing and display of multiple witnesses of modernist texts. In what was truly a conversation over coffee, Stephen Ross shared with me his desire to create the MVP. Having served the Modernist Journals Project (MJP) at the University of Tulsa and Brown University for over six years, I said, “Stephen, let’s do this!” And we did. With the help of James Gifford, Jentery Sayers, and Tanya Clement (who along with Stephen and me serve as the Board of the MVP), we have secured tremendous support for a major SSHRC application this fall. The MVP promises to be an important project in the field of Digital Humanities and modernism.
But what does this have to do with EMiC?
I am impressed by two aspects of EMiC. First, the recovery of modernist Canadian texts in our project is truly spectacular. Second, the training EMiC facilitates at the University of Alberta, Dalhousie University, The University of Victoria, and Trent University (among many other institutions) is edifying. Just look at our graduate student editors who are engaged in serious textual editing projects across Canada: http://editingmodernism.ca/about-us/. We are really building the future of Canadian studies here.
As an international scholar, I am concerned, like many of you, with the networking of Canadian modernism across the globe. How does Canadian modernism fit into the greater narrative of modernity across the world? (this is a topic we’ll be exploring in Paris 2012: http://editingmodernism.ca/events/sorbonne-nouvelle/).
The Modernist Versions Project is one way of creating networks of modernist textual criticism and production across the world; that is, the MVP is interested in the editing and visualization of multiple textual witnesses no matter where those witnesses were created. Though located in Canada, the MVP’s scope is much larger, and EMiC’s partnership with the MVP will allow EMiC scholars interested in “versioning” to use MVP resources as they are developed. The MVP has already developed partnerships with the Modernism Lab at Yale University, Modernist Networks at Chicago, and NINES, which is letting us use and develop their Juxta software for periodicals and books.
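For anyone new to “versioning,” the core operation is collation: lining up two witnesses of the same passage and recording where they differ. The sketch below is not Juxta or any MVP tool. It is a minimal illustration using Python's standard `difflib` module, comparing word by word, and the function name `collate` and the sample sentences are invented for the example.

```python
from difflib import SequenceMatcher


def collate(witness_a, witness_b):
    """Return the word-level variants between two witnesses of a passage.

    Each variant is a tuple (kind, reading_a, reading_b), where kind is
    "replace", "delete", or "insert" as reported by difflib.
    """
    words_a = witness_a.split()
    words_b = witness_b.split()
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    variants = []
    for kind, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if kind != "equal":  # keep only the places where the witnesses disagree
            variants.append(
                (kind, " ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return variants


if __name__ == "__main__":
    first = "the quick brown fox jumps"
    second = "the quick red fox jumps high"
    for variant in collate(first, second):
        print(variant)
```

Dedicated collation software goes much further (normalizing spelling and punctuation, handling transpositions, aligning more than two witnesses), but the output of every such tool is at heart a list of variants like this one.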
Dean Irvine has been very generous in allocating my Postdoctoral hours towards the formation of the MVP. Once again, EMiC is nurturing young projects and helping create a truly global network of digital modernist studies. And I think I’ll end on this note: EMiC’s primary focus has been collaboration: collaboration among peers, and now collaboration among projects. And by collaborating with other projects around the world, we hope to create tools that will last, be useful, and really change the face of modernist studies.
Welcome to EMiC. Let’s go build something.
*Details of the EMiC Digital Humanities Sprout
Existing Islandora Code
1. Islandora Core
a. Integration with the Fedora repository and Drupal CMS
b. Islandora Book Workflow
c. Islandora Audio/Video
d. Islandora Scholarly Citations
New/Enhanced Functionality for the EMiC Module
1. Smart Ingest
a. Use open source Tesseract OCR engine
b. Integration of TIKA
2. Image Markup Tool
Proofs of concept and models:
Image Markup Tool (IMT)
Text-Image Linking Environment (TILE)
3. TEI Editor
Proofs of concept and models:
Canadian Writing Research Collaboratory (CWRC) – CWRC Writer
Humanities Research Infrastructure and Tools (HRIT) – Editor
4. Collation Tool
Proofs of concept and models for development:
5. Version Visualization Tool
Proofs of concept and models:
On the Origin of Species: The Preservation of Favoured Traces
6. Dynamic Version Viewer
Models:
Hypercities database: Transparent layers interface
7. Digital Collection Visualization Tool
Proof of concept:
On June 9, 2010, Wired.com ran a story announcing the intention of DARPA, the experimental research arm of the United States Department of Defense, to create “mission planning software” based on the popular tax-filing software, Turbotax.
What fascinated the DoD was that Turbotax “encoded” a high level of knowledge expertise into its software allowing people with “limited knowledge of [the] tax code” to negotiate successfully the complex tax-filing process that “would otherwise require an expert-level” of training (Shachtman). DARPA wanted to bring the power of complex “mission planning” to the average soldier who might not have enough time/expertise to make the best decision possible for the mission.
I start with this example to show that arcane realms of expertise, such as the U.S. Tax Code, can be made accessible to the general public through sound interface design and careful planning. This is especially pertinent to Digital Humanities scholars who do not always have the computer-science training of other disciplines but still rely on databases, repositories, and other computer-mediated environments to do their work. This usually means that humanities scholars spend hours having awkward phone conversations with technical support or avoid computer-mediated environments altogether.
With the arrival of new fields like Periodical Studies, however, humanities scholars must rely on databases and repositories for taxonomy and study. As Robert Scholes and Clifford Wulfman note in Modernism in the Magazines, the field of periodical studies is so vast that editors of print editions have had to make difficult choices in the past as to what information to convey since it would be prohibitively expensive to document all information about a given periodical (especially since periodicals tended to change dramatically over the course of their runs). Online environments have no such limitations and thus provide an ideal way of collecting and presenting large amounts of information. Indeed, Scholes and Wulfman call for “a comprehensive set of data on magazines that can be searched in various ways and organized so as to allow grouping by common features, or sets of common features” (54).
What DARPA and Turbotax realize is that computer-mediated environments can force submission compliance with existing “best practices” in order to capitalize on the uneven expertise levels of the general population. Wulfman and Scholes call for the creation of a modernist periodical database where modernist scholars can work together and map the field of periodical studies according to agreed-upon standards of scholarship. By designing a repository on a Turbotax model of submission compliance, the dream of a community-generated periodical database that conforms to shared bibliographic standards is readily attainable.
Because of the vastness of its subject matter, Periodical Studies is inherently a collaborative discipline: no one scholar has the capacity to know everything about every periodical (or everything about one magazine for that matter). Thus, the creation of a periodical database is necessary to map the field and gather hard data about modernist periodical production. The problem is that not every periodical scholar has the computer expertise to create or even navigate the complexities of database/repository systems. Nor does every scholar know how to follow the best metadata and preservation practices of archival libraries. We are now at a point where we can utilize the interests and expertise of humanities scholars by creating a repository that forces proper “input” along the lines of Turbotax.
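To make “submission compliance” concrete, here is a small sketch in Python of a repository front end that refuses a periodical record until it meets a shared standard and tells the contributor exactly what to fix. Everything in it is hypothetical: the field names, the date rule, and the `Repository` class are illustrative assumptions, not an existing schema or system.

```python
import re

# Hypothetical minimum standard for a periodical issue record.
REQUIRED_FIELDS = ("title", "issue_date", "place_of_publication")
# Accept YYYY, YYYY-MM, or YYYY-MM-DD so partial dates remain sortable.
DATE_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def validate_record(record):
    """Return a list of plain-language problems; an empty list means compliant."""
    errors = []
    for name in REQUIRED_FIELDS:
        if not str(record.get(name, "")).strip():
            errors.append(f"missing required field: {name}")
    date = str(record.get("issue_date", "")).strip()
    if date and not DATE_PATTERN.match(date):
        errors.append("issue_date must look like YYYY, YYYY-MM, or YYYY-MM-DD")
    return errors


class Repository:
    """A toy in-memory repository that only accepts compliant records."""

    def __init__(self):
        self.records = []

    def submit(self, record):
        """Store the record if it is compliant; otherwise return what to fix."""
        errors = validate_record(record)
        if not errors:
            self.records.append(dict(record))
        return errors


if __name__ == "__main__":
    repo = Repository()
    # Sample data for illustration only.
    print(repo.submit({"title": "The Little Review", "issue_date": "March 1918"}))
    print(repo.submit({
        "title": "The Little Review",
        "issue_date": "1918-03",
        "place_of_publication": "New York",
    }))
```

The design point is that the contributor never needs to know the metadata standard in advance: the interface encodes the expertise and walks the user toward a compliant record, one correction at a time.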
Challenge
I use the example of periodical studies to challenge the greater field of Digital Humanities. Our discipline has now reached a mature age, and I think we can all agree that the battle between “Humanities Computing” and “Digital Humanities” should be put to rest as we move into the next phase of the field: designing user-friendly interfaces based on a Turbotax model of user input. For example, even at this stage of Digital Humanities, there doesn’t appear to be a web-based TEI editor that can link with open repositories like Fedora Commons. In fact, the best (and most stable) markup tool I’ve used thus far is Martin Holmes’s Image Markup Tool at the University of Victoria. Even this useful bit of software is tied to the Windows OS, and it operates independently of repository systems. That means a certain level of expertise is needed to export the IMT files to a project’s repository system. That is, the process of marking up the text is not intuitive for a project that wishes to harness the power of the many in marking up texts (by far, the most time-consuming process of creating a digital edition). Why not create a Digital-Humanities environment that, once installed on a server, walks a user through the editing process, much like Turbotax walks a user through his/her taxes? I used to work as an editor for the James Joyce Quarterly. I experienced many things there, but the most important thing I learned is that there is a large community of people (slightly insane), who are willing to dedicate hours of their time dissecting and analyzing Joyce. Imagine what a user-generated Ulysses would look like with all of that input! (we would, of course, have to ban Stephen Joyce from using it–or at least not tell him about it).
Digital Humanities Ecosystems
The story of Digital Humanities is one littered with failed or incomplete tools. I suspect, save for the few stalwarts working under labs like Martin Holmes, or our colleagues in Virginia and Georgia, and elsewhere, that tools are dependent on stubborn coders with enough time to do their work. I find this to be a very inefficient way of designing tools and a system too dependent on personalities. I know of a handful of projects right now attempting to design a web-based TEI editor, but I’m not holding my breath for any one of them to be finished soon (goals change, after all). Instead of thinking of Digital Humanities development in these piecemeal terms, I think we need to come together as a federation to design ECOSYSTEMS of DH work–much like Turbotax walks one through the entire process of filing taxes.
I think the closest thing we have to this right now is OMEKA, which through its user-base grows daily. What if we took Omeka’s ease-of-use for publishing material online and made it into a full ingestion and publication engine? We don’t need to reinvent the wheel after all: Librarians have already shown us how we should store our material according to Open Archival Standards. There is even an open repository system in Fedora Commons. We even know what type of markup we should be using: TEI and maybe RDF. And Omeka has shown us how beautiful publication can be on the web.
Now, Digital Humanists, it is our time to take this knowledge and create archives/databases based on the Turbotax model of doing DH work: We need to create living ecosystems where each step of digitizing a work is clearly provided by a front end to the repository. Discovery Garden is working on such an ecosystem right now with the Islandora framework (a Fedora Commons backend with a Drupal front end), and I hope it will truly provide the first “easy-to-use” system that, once installed on a server, will allow all members of a humanist community to partake in digital humanities work. If I’m training students to encode TEI, why can’t I do so online actually encoding TEI for NINES or other projects? I’ve been in this business for years now, and even I get twitchy running shell scripts; my colleagues and students are even more nervous. So let’s build something for them, so they can participate in the digital humanities as well. Everyone has something to gain.
I am attempting to harness the power of the crowd with “the Database of Modernist Periodicals,” to be announced this summer. I’ll let you know how it goes.
I end with this caveat: We need to prepare for the day when the “digital” humanities will simply be “the humanities,” and that means democratizing the digital (especially in our tools). Even I was able to file my taxes this year.
Before I became the newest EMiC Postdoctoral Fellow this past fall, I regularly discussed with my colleagues the lack of a simple editing and publication engine for Digital Humanities scholars and teachers. My field of research, modernist Periodical Studies, is rapidly expanding, and the digital environment promises new ways of archiving and accessing magazines that have been scattered across university libraries around the world. Organizations like the Modernist Journals Project are doing wonders in delivering complete digital editions of periodicals to scholars, but there is no place a professor can go to teach a student how to digitize, OCR (Optical Character Recognition), markup, and publish a magazine or book for a class project. This was a major problem for David Earle at the University of West Florida who was teaching an undergraduate section on modernist magazines and wanted his students to produce a digital edition at the end of the course. Earle realized that he had to negotiate a complex field of proprietary software and web expertise to make his course viable. With a bit of elbow grease, Earle started, with his students, “The Virtual Newsstand from the Summer of 1925.” His class was asked to help “recreate” a 1920s American newsstand—that is, what magazines and papers would the average New Yorker have seen in one of the little kiosks on a warm summer afternoon in 1925? As you can see, the project was a great success, and I hope it is something we can help our EMiC team do too in the classroom.
My primary task this term has been to set up the EMiC Digital Coop and Digital Commons. The Coop will be a closed repository where you will be able to upload everything you have scanned for the EMiC community. The Commons will be the place where you can publish your own digital editions. This will be a public space, so only material that is in the public domain, or material with which you have secured rights, may be published here. I’ve had two questions in mind: what type of system can we use that will be easy to use for the ingestion of material to the EMiC repository, and what system can we use to publish that material once it is ready? We also want to make sure that our repository uses the best open archival practices available to us today. This ensures that EMiC (your work and mine) will be compatible with other university systems and repositories for many years to come; for example, Susan Brown is in the process of creating the CWRC (Canadian Writing Research Collaboratory, or “quirk”), which promises to be one of the greatest archives in Canada once it goes online. Brown will be building the CWRC on the Fedora Commons framework at the University of Alberta, and thus it is important for EMiC to be able to create a repository that will work well with this future archive. To this end, we have decided to build our repository on the Fedora Commons framework as well.
Now that we have a framework, how do we create a system that is convenient and easy-to-use? This has proven to be a very difficult question. As many of you know, the Center for History and New Media at George Mason University has released a powerful publication and exhibition tool called Omeka. Though Omeka is a powerful tool (and it only promises to become even more powerful), it does not provide all the tools we need to run an agile and powerful repository. After much research, I came across Islandora, a content management system created at the University of Prince Edward Island built upon Drupal. Islandora provides us with an easy-to-use system that allows us to upload an image file and have an automated workflow create OCR, PDF and XML files (including TEI) upon ingestion. The team at PEI, including Mark Leggott, Donald Moses, Joe Velaidum, and Kirsta Staplefeldt, are committed to building open tools for the digital humanities community at large (and they have a digitization lab to die for). We are very impressed with their scholarly model, and we hope that they will use their experience with EMiC to collaboratively build a repository specifically geared towards digital humanists (Islandora is already hard at work in museums and universities around the world).
But before we commit to a system, we need to run vigorous tests to make sure the system we build for EMiC will last long into the future. In order to ensure this, Dean Irvine has allocated funds for a study into Islandora and Omeka at the University of Alberta for EMiC use (and if all goes well, perhaps for the CWRC as well). By the end of January, EMiC should be able to announce our findings. Our goal is to provide a complete editing and publication engine not only for our community, but for the world at large as well. How will this happen when we have great tools like TILE, Omeka, and Islandora, which weren’t built specifically to work with one another?
As many of you know, Meagan Timney, the other Postdoctoral Fellow at EMiC, is a talented programmer and teacher committed to the Digital Humanities (and I’m told, she is also the person who championed the idea of EMiC before it was even a proposal in the Director’s eye). She has agreed to code the necessary APIs to allow our new system to work with Fedora Commons (and thus Islandora) and Omeka. Her work will provide the Digital Humanist community with an important plugin so users of Omeka and Islandora will be able to edit images on the web and in the repository. For those of you attending DHSI this summer at the University of Victoria, she will be teaching a course on “Digital Editions” (http://editingmodernism.ca/training/summer-institutes/demic/) where students will get to use this new tool in creating their own digital work. I encourage you to sign up for her course if you would like to learn how to digitize, edit, and publish a text to the web.
So, where does this leave us? Well, we hope that by the end of spring someone like David Earle will never have to look for an editing and publication engine ever again. This also means that we will be ready to start ingesting the material you have all been scanning directly into the repository. We hope EMiC will provide our community with the archive and tools it needs to start producing the texts you want to create from your various archives. We are truly on the cusp of creating an entire framework that will help scholars around the world produce and edit texts that will be nurtured in an open-source and secure repository for many years to come.
The Canadian Writing Research Collaboratory (CWRC) seeks a project manager to play a vital role in developing online infrastructure for humanities scholars. CWRC will produce a virtual research environment for the study of writing in Canada, in partnership with other open-source software initiatives and with literary researchers. It will build a repository, a layer of services for the production, use, and analysis of repository and federated materials, and an interface. More information about the project is available at: http://cwrc.cs.ualberta.ca/index.php/General:CWRC.
Reporting to the Project Leader, the project manager will be a full participant in project development. We require a dedicated team member, with depth of knowledge in literary studies and familiarity with the digital humanities, to play a key role in planning infrastructure development, coordinating user needs analysis, assisting in developing specifications for contract work and partnership agreements, evaluating work in progress and completed, and managing one or more subprojects in the software development process. The project manager will help represent the project to the University, project partners, and external communities. We need someone with strong prioritizing, organizational and problem-solving skills, excellent communication and interpersonal skills, the confidence to work independently, and the ability to establish and maintain quality relationships with CWRC researchers, partners, employees, and contractors.
The position is a full-time Trust/Research academic staff position, with benefits, at the University of Alberta for a minimum of three years. The deadline for application in the form of a c.v. and a letter of interest is Dec. 10.
Full details can be viewed at http://www.careers.ualberta.ca/Competition/A110413063/