Editing Modernism in Canada

Community

August 15, 2011


TEMiC – Week 2, Day 5 – Present and Future

August 12 – The last day of TEMiC at Trent University

Morning Session

For the last day, we started with a discussion based on a forthcoming essay by Zailig Pollock and Emily Ballantyne entitled “Respect des fonds and the Digital Page.” To begin, Zailig told us that the essay was ready for press over a year ago and much has changed since then. For instance, his practice of genetic coding has been modified and now differs from the examples offered in the essay. The essay was inspired by Zailig’s SSHRC proposal and conversations with Catharine Hobbs (at Library and Archives Canada). Many archival practices and theories, of which editors should be aware but are not, require contemplation. This understanding led Zailig to explore mediation and archaeology in relation to archives. Discursive remarks may be included in the online project to highlight the physical aspect of the archive and the mediation involved in making a digital version of the physical materials.

To reflect on some past practices and current best practices, we selected some websites to view, comment on, and critique. We looked at Woolf Online (www.woolfonline.com) – which is about three or four years old – and discussed the clean design and user-friendly interface. The title is misleading because the website is not comprehensive and only includes a part of “Time Passes” (indicated in the subtitle at the bottom of the home page). The transcription attempts to recreate the appearance of the page. When we looked at the digital images of the manuscript pages, Zailig pointed out that the images of the manuscript pages, typescript pages, and the transcriptions are not well integrated, because users are forced to move back and forth between two screens. No final reading text or genetic transcription is available. The website provides the material for an edition, but doesn’t actually do the editorial work itself. It is a good example of prototyping, of producing a digital edition at one phase (i.e. in terms of functionality). Other things we considered included: users must pass through many levels to get at the materials; the splash page works very well, suggesting a design consultant was involved, but the content management lacks the same detailed focus; usability studies will improve the functionality of the site; website prototypes become the finished products, rather than early test stages that are later redesigned after use; intuitive navigation is crucial (e.g. why is “Stephen Family” placed next to “Gallery”?); and a person unfamiliar with Woolf’s writings may have difficulty navigating the site.

Next, we looked at the website, based on Versioning Machine, for Baroness Elsa von Freytag-Loringhoven (www.lib.umd.edu/dcr/collections/EvFL-class/bios.html). The original versioning software dates back about a decade. While different versions of a text may be displayed side by side, there isn’t enough space on the screen to have more than two versions. Again, Zailig pointed out that the website doesn’t offer an edition, but only a transcription that doesn’t do the work of comparison. We looked at the latest version of the site (http://digital.lib.umd.edu/transition) that uses Versioning Machine 4.0. A basic problem is whether or not it is possible to read the display offered by Versioning Machine. It may be that tabs simply offer a superior view. In the newest version, highlighting the text in one version highlights the corresponding or comparable texts in the other displayed versions. Marking the points of revision may be a better way of displaying the versions, and users need some sense of direction when navigating among them.

We looked at the Shakespeare Quartos Archive (http://quartos.org/main.php) that is developed out of the University of Virginia. There is a proof of concept for a transparency viewer. One question was why none of the tools are available in full-screen. This version doesn’t allow the transparency viewer to be functional, making it an attractive feature that isn’t altogether useful yet.

We also considered another intriguing idea that might, in the future, be modified for textual editing. At Hypercities Beta 2 (http://hypercities.ats.ucla.edu) multiple historical maps may be overlaid as transparencies. While looking at the site, Zailig told us of a simple tool he often uses: a wooden dowel that holds a paper version of the text directly under an electronic version on the screen. By rotating the dowel line by line in front of the lines on the screen, he avoids eye skip and catches more errors. A screen version of this technique may use a second frame very narrowly configured.

One of the key problems with using the word “archive” is the difficulty of representing an author’s entire fonds online: a single scholar is inadequate to the task. Modularity – selecting one particular part of the whole – is a key to developing a satisfactory and usable online version. Working on minutiae, on a very small set of texts, allows for the step-by-step development of technology and tools that work well and meet fundamental needs. These smaller prototypes will contribute to the development of larger and interconnected projects.

 

Afternoon Session – Dean Irvine’s presentation

What follows here are some important points Dean made during his presentation.

Every editorial project should begin with a sound theoretical reflection on the practices on which the process is based. The process should be to theorize, then to design, and then to implement.

The Commons: EMiC is not producing digital libraries or archives, but a different kind of repository – the digital commons. This is a collective and social production of resources, not attributable to a single creator, but rather to the many different workers who participate in its making. This understanding draws on the work done by Michael Hardt and Antonio Negri in their books Empire (2000), Multitude (2004), and Commonwealth (2009). The commons is predicated on sociality and the production of social goods. The products result from “immaterial labour,” an excess that is available to the commons, for redistribution among those who participate in the commons. Approximately 80% of the participants in EMiC are women and we must think about the gendering of the commons. Commonwealth addresses the gendering of work and argues that the division of productive and reproductive labour breaks down. Hardt and Negri borrow Foucault’s term “biopolitical” – the conditions under which, in capitalism, all elements of life are contained in biopolitics. In the commons the materiality is subordinate to its sociality. All of the immateriality of the work going into the commons is indicative of a broader global process of the feminization of labour.

The large number of women collaborating in EMiC is in no way representative of modernist studies in general. Previous projects have been dominated by male intellectuals. In Canada, female scholars produce the majority of the work in modernist studies. While the majority of co-applicants or leaders of projects are male, the emerging scholars and graduate students are female. In terms of the distribution and allocation of resources, 80% of EMiC’s funds are going to support the work of women. This signals a change in the field because a generation of workers will come into positions of power, thereby transforming modernist studies and digital humanities.

The problem with the term “archive”: The commons is not an archive, though it might reproduce some of the contents of archives. It is two removes away from the institutional archive. We have an intermediary space – a repository of raw materials for the production of editions. This space will be called the “coop” – produced by cooperative labour and openly shared among the participants. From the coop, users may produce digital editions published in the commons. All the participants are buying in: users must contribute to the commons to make use of the material in the commons. Digital objects cannot be exclusively restricted by any one individual or group of scholars; they are available for use by any and all the participants. However, there are issues of permissions and access – some specific objects may be closed, for good reasons. For example, for the P.K. Page project, the availability of the correspondence is crucial for all the other aspects of the project: access to the letters will be restricted to the editors on the Page project. A person who accesses the archive will be expected to contribute by adding new content.

The material of the commons can be transformed and the product can become the intellectual property of the individual user. The restrictions and permissions will be transparent (and roughly equivalent to those usually found in archives).

The Workflow: how we get material into the commons. Here is a brief overview:

1. Scanning: EMiC has little control over the actual scanning – it cannot provide uniform structures or facilities and each institution may be different – which indicates that the beginning is one of its weakest points. There is infinite variability in terms of infrastructure. EMiC is a networking and training project, not an infrastructure project. Everyone should have access through the university libraries to some scanning instrument.

2. Image Correction: the scanned image may not be exactly ready for production or reproduction. Most often, a person is working with Photoshop and correcting the image by cutting unnecessary parts and/or straightening the image and/or adjusting brightness.

3. Ingest Process Diagram: (On the screen Dean displayed an Ingest Process Diagram, which is roughly described in the following comments.) The repository has two parts: the interface and the database. The database is Fedora, a broadly used open-access repository for storing digital objects. The interface is based on a content management system (CMS) called Drupal. Drupal is a PHP interface that allows users to manage the content in the Fedora repository. The integration of Drupal and Fedora has been undertaken by digital librarians working out of the University of PEI: it is called Islandora (a project partnered with EMiC). The Ingest Process Diagram represents the current capability of Islandora. A user starts by creating a parent object, an originator, a corrected image file. This is saved in TIFF format, which has the least amount of data degradation. There is no lossless digital object. (And all digital objects are slowly degrading.)

An object in the repository must be made locatable by attaching metadata to it. This allows users to search for and recall the object from the repository. Users will fill in a form to create the metadata. File upload sounds easy, but it is not. The system is sensitive to the orientation of the objects the user is uploading. At the ingest stage, there must be a level of image detection and rejection. The repository calls upon the image and creates a new digital object. The original digital object will always exist as a backup: the user is never altering the original. The child object goes through an automated character recognition system. Open access Optical Character Recognition (OCR) software should be available by the end of the year. At the metadata stage, users identify the files that go through OCR or bypass the process. Then the user does proof-reading and correcting, basic markup, and contextual markup, followed by preview on the web and finally publication on the web. With an authority list, the user might automate the markup of some aspects of the image. Currently, the original image is placed beside the transcription on the screen. This is all web-based.
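The ingest steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Islandora's actual code or API: the class names, the `ingest` and `next_stage` functions, the required `title` field, and the portrait-orientation check are all hypothetical. The sketch shows three ideas from the presentation: the parent object is never altered, ingest rejects files that lack metadata or arrive in the wrong orientation, and the user can choose to bypass OCR.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ParentObject:
    """The corrected TIFF as uploaded; frozen so it can never be altered."""
    filename: str
    width: int
    height: int
    metadata: tuple  # sorted (key, value) pairs from the metadata form


@dataclass
class ChildObject:
    """Working copy derived from a parent; all later work happens here."""
    parent: ParentObject
    use_ocr: bool
    stages_done: list = field(default_factory=list)


# Stage names follow the order given in the presentation (hypothetical labels).
STAGES = ["ocr", "proofread", "basic_markup",
          "contextual_markup", "preview", "publish"]


def ingest(filename, width, height, metadata, use_ocr=True, expect_portrait=True):
    """Reject unusable uploads, then create a parent and return its child."""
    if not metadata.get("title"):
        raise ValueError("metadata needs a title so the object can be found")
    is_portrait = height >= width
    if is_portrait != expect_portrait:
        raise ValueError("unexpected orientation; rotate the image and retry")
    parent = ParentObject(filename, width, height,
                          tuple(sorted(metadata.items())))
    return ChildObject(parent=parent, use_ocr=use_ocr)


def next_stage(child):
    """Return the next pending stage, skipping OCR if the user bypassed it."""
    for stage in STAGES:
        if stage == "ocr" and not child.use_ocr:
            continue
        if stage not in child.stages_done:
            return stage
    return None
```

Keeping the parent frozen and doing all work on the child mirrors the point that the original always survives as a backup.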

4. Image Markup Tool: the workflow will have a web-based Image Markup Tool (IMT). A desktop-based tool has been integrated into the workflow. This eliminates the necessity of leaving the web-based environment to complete the IMT.

5. Text Editor – CWRC Writer: CWRC Writer is a web-based TEI editor being developed out of the U of Alberta. In an intuitive interface, the user will be able to do the TEI markup, at least partly by using an authority list. An authority list includes everything that the project editors are tagging in the documents. When a certain word has already been tagged, the authority list provides a suggestion based on previously used terms appearing in other documents. The authority list can be pre-populated for all users of the editorial team. CWRC Writer is an intuitive interface that helps users with TEI and saves time.
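The idea of an authority list that suggests markup for terms already tagged elsewhere can be illustrated with a small Python sketch. It is not CWRC Writer's implementation: the `AuthorityList` class, the `tag` function, and the `#page_pk` identifier are assumptions made for the example. `persName` and `ref` are standard TEI names, used here only to make the output look familiar.

```python
from xml.sax.saxutils import escape, quoteattr


class AuthorityList:
    """Maps a surface form to the TEI element and reference used before."""

    def __init__(self, entries=None):
        self._entries = {}
        # Pre-populate the list, e.g. with terms shared by the editorial team.
        for surface, (element, ref) in (entries or {}).items():
            self.record(surface, element, ref)

    def record(self, surface, element, ref):
        """Remember how a term was tagged so it can be suggested later."""
        self._entries[surface.casefold()] = (element, ref)

    def suggest(self, surface):
        """Return (element, ref) for a known term, or None if it is new."""
        return self._entries.get(surface.casefold())


def tag(surface, authority):
    """Wrap a term in the suggested element; leave unknown terms untouched."""
    hit = authority.suggest(surface)
    if hit is None:
        return surface
    element, ref = hit
    return f"<{element} ref={quoteattr(ref)}>{escape(surface)}</{element}>"


# Usage: a shared list with one entry (the identifier is made up).
shared = AuthorityList({"P.K. Page": ("persName", "#page_pk")})
print(tag("P.K. Page", shared))
```

Matching on a case-folded key is a design choice for the sketch; a real project would also need to handle variant spellings and ambiguous names.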

6. Collation Tool: An intuitive interface will compare multiple versions of texts. JUXTA (www.juxtasoftware.org) allows users to compare texts and mark up variants. Users can also produce a list of variants. The collation tool might do up to 80% of the work now done manually (and save time). This desktop-based tool will be made into a web version that will be integrated into the Islandora workflow. (Long lists of genetic textual variants are unreadable to most readers of a scholarly critical edition. The assumption until now has been that readers can reconstruct other versions of poems from the printed list of variants in the book.)
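To make the collation step concrete, here is a minimal sketch of producing a list of variants between a base text and one witness. It uses Python's standard `difflib` module and compares word by word; it is not JUXTA's algorithm, and the `collate` function and the sample lines are invented for the example.

```python
from difflib import SequenceMatcher


def collate(base, witness):
    """Return (word_index, base_reading, witness_reading) for each variant.

    An empty base_reading means the witness adds words at that position;
    an empty witness_reading means it omits them.
    """
    base_words = base.split()
    witness_words = witness.split()
    matcher = SequenceMatcher(None, base_words, witness_words, autojunk=False)
    variants = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue  # identical stretches are not variants
        variants.append((i1,
                         " ".join(base_words[i1:i2]),
                         " ".join(witness_words[j1:j2])))
    return variants


# Usage: two invented versions of one line.
for position, base_reading, witness_reading in collate(
        "the cold sea was grey", "the cold sea was gray and still"):
    print(position, repr(base_reading), "->", repr(witness_reading))
```

A list like this is the raw material for the printed apparatus mentioned above; comparing more than two versions would mean running each witness against the same base text.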

Visual representations of the commons will be useful. At http://benfry.com/traces/ Ben Fry has created a visual representation of the transformation through time of six different versions of On the Origin of Species. The visualization exists because a variorum edition of Darwin’s works is already available online. The software is open-access. Dean is asking Islandora to include at least one visualization tool. An abstract visualization tool can be supplemented with a literal representation of textual transformation. Stefanie Posavec – www.itsbeenreal.co.uk – has produced visual representations of the transformation of Darwin’s text. The commons can be represented in terms of visual abstraction, such as this example in a video of the Visible Archive: http://visiblearchive.blogspot.com/.

What will an image viewer for the commons look like? Dean presented a sketch of a proposal for a viewing environment. The same viewer might be used for multiple purposes at different stages of the process of ingestion, markup, etc. Users will use one interface and change the function it performs depending on the task at hand.

For every piece of the puzzle one part has already been developed. Leveraging code allows EMiC to work with five different institutions and a dozen collaborators, all of whom are contributing parts to a much larger project. Digital librarians, working in collaboration with software companies, are central to the development of the EMiC commons.

One of our imperatives may be to generate new practices of reading. The field is unstable and quickly changing because of the multitudinous processes of creation, making it difficult for any one person to comprehend its many strands. Each individual will structure a different narrative comprised of a selection of these strands. How does all of this change our pedagogical principles and the way we present literary studies to our students in the classroom?

EMiC is channeling resources to its principal base: a community of scholars and researchers. Rather than an empire, EMiC is building a community collecting the ideas of many or most of its participants and contributors.

The proofs of concept are not set in stone. The prototypes can be shaped to our own ends.

And the presentation ends.

