Editing Modernism in Canada


Posts Tagged ‘TEI’

August 23, 2010

TEI & the bigger picture: an interview with Julia Flanders

I thought those of us who had been to DHSI and who were fortunate enough to take the TEI course with Julia Flanders and Syd Bauman might be interested in a recent interview with Julia, in which she puts the TEI Guidelines and the digital humanities into the wider context of scholarship, pedagogy and the direction of the humanities more generally. (I also thought others might be reassured, as I was, to see someone who is now one of the foremost authorities on TEI describing herself as being baffled by the technology when she first began as a graduate student with the Women Writers Project …)

Here are a few excerpts to give you a sense of the piece:

[on how her interest in DH developed] I think that the fundamental question I had in my mind had to do with how we can understand the relationship between the surfaces of things – how they make meaning and how they operate culturally, how cultural artefacts speak to us. And the sort of deeper questions about materiality and this artefactual nature of things: the structure of the aesthetic, the politics of the aesthetic; all of that had interested me for a while, and I didn’t immediately see the connections. But once I started working with what was then what would still be called humanities computing and with text encoding, I could suddenly see these longer-standing interests being revitalized or reformulated or something like that in a way that showed me that I hadn’t really made a departure. I was just taking up a new set of questions, a new set of ways of asking the same kinds of questions I’d been interested in all along.

I sometimes encounter a sense of resistance or suspicion when explaining the digital elements of my research, and this is such a good response to it: to point out that DH methodologies don’t erase considerations of materiality but rather can foreground them by offering new and provocative optics, and thereby force us to think about them, and how to represent them, with a set of tools and a vocabulary that we haven’t had to use before. Bart’s thoughts on versioning and hierarchies are one example of this; Vanessa’s on Project[ive] Verse are another.

[discussing how one might define DH] the digital humanities represents a kind of critical method. It’s an application of critical analysis to a set of digital methods. In other words, it’s not simply the deployment of technology in the study of humanities, but it’s an expressed interest in how the relationship between the surface and the method or the surface and the various technological underpinnings and back stories — how that relationship can be probed and understood and critiqued. And I think that that is the hallmark of the best work in digital humanities, that it carries with it a kind of self-reflective interest in what is happening both at a technological level – and it’s what is the effect of these digital methods on our practice – and also at a discursive level. In other words, what is happening to the rhetoric of scholarship as a result of these changes in the way we think of media and the ways that we express ourselves and the ways that we share and consume and store and interpret digital artefacts.

Again, I’m struck by the lucidity of this, perhaps because I’ve found myself having to do a fair bit of explaining of DH in recent weeks to people who, while they seem open to the idea of using technology to help push forward the frontiers of knowledge in the humanities, have had little, if any, exposure to the kind of methodological bewilderment that its use can entail. So the fact that a TEI digital edition, rather than being some kind of whizzy way to make bits of text pop up on the screen, is itself an embodiment of a kind of editorial transparency, is a very nice illustration.

[on the role of TEI within DH] the TEI also serves a more critical purpose which is to state and demonstrate the importance of methodological transparency in the creation of digital objects. So, what the TEI, not uniquely, but by its nature brings to digital humanities is the commitment to thinking through one’s digital methods and demonstrating them as methods, making them accessible to other people, exposing them to critique and to inquiry and to emulation. So, not hiding them inside of a black box but rather saying: look this, this encoding that I have done is an integral part of my representation of the text. And I think that the — I said that the TEI isn’t the only place to do that, but it models it interestingly, and it provides for it at a number of levels that I think are too detailed to go into here but are really worth studying and emulating.

I’d like to think that this is a good description of what we’re doing with the EMiC editions: exposing the texts, and our editorial treatment of them, to critique and to inquiry. In the case of my own project involving correspondence, this involves using the texts to look at the construction of the ideas of modernism and modernity. I also think the discussions we’ve begun to have as a group about how our editions might, and should, talk to each other (e.g. by trying to agree on the meaning of particular tags, or by standardising the information that goes into our personographies) are part of the process of taking our own personal critical approaches out of the black box, and holding them up to the scrutiny of others.

The entire interview – in plain text, podcast and, of course, TEI format – can be found on the TEI website here.

July 12, 2010

TEI @ Oxford Summer School: Intro to TEI

Thanks to the EMiC project, I am very fortunate to be at the TEI @ Oxford Summer School for the next three days, under the tutelage of TEI gurus including Lou Burnard, James Cummings, Sebastian Rahtz, and C. M. Sperberg-McQueen. While I’m here, I’ll be providing an overview of the course via the blog. The slides for the workshop are available on the TEI @ Oxford Summer School Website.

In the morning, we were welcomed to the workshop by Lou Burnard, who is clearly incredibly passionate about the Text Encoding Initiative, and is a joy to listen to. He started us off with a brief introduction to TEI and its development from 1987 through to the present (his presentation material is available here). In particular, he discussed the relevance of the TEI to digital humanities, and its facilitation of the interchange, integration, and preservation of resources (between people and machines and between different media types in different technical contexts). He argued that the TEI makes good “business sense” for the following reasons:

  • re-usability and repurposing of resources
  • modular software development
  • lower training costs
  • ‘frequently answered questions’ — common technical solutions for different application areas

As a learning exercise, we will be encoding for the Imaginary Punch Project, working through an issue of Punch magazine from 1914. We’ll be marking up both texts and images over the course of the 3-day workshop.

    After Lou’s comprehensive summary of some of the most important aspects of TEI, we moved into the first of the day’s exercises: an introduction to oXygen. While I’m already quite familiar with the software, it is always nice to have a refresher, and to observe different encoding workflows. For example, when I encode a line of poetry, I almost always just highlight the line, press cmd-e, and then type a lower case “L”. It’s a quick and dirty way to breeze through the tedious task of marking up lines. In our exercise, we were asked to use the “split element” feature (Document –> XML Refactoring/Split Element). While I still find my own way more efficient, the latter also works quite nicely, especially if you’re using the shortcut key (visible when you select XML Refactoring in the menu bar).
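
For anyone who would rather script the repetitive part, here is a minimal sketch in Python of the same idea: wrapping each line of a stanza in <l> and grouping the lines in <lg>. It is not part of the workshop materials. The function name wrap_lines and the placeholder stanza are assumptions for illustration, and the TEI namespace is left out to keep the example short.

```python
import xml.etree.ElementTree as ET


def wrap_lines(stanza: str) -> ET.Element:
    """Wrap each non-empty line of a stanza in <l>, grouped under one <lg>."""
    lg = ET.Element("lg", {"type": "stanza"})
    for raw in stanza.splitlines():
        text = raw.strip()
        if text:  # skip blank lines rather than emitting empty <l/> elements
            line = ET.SubElement(lg, "l")
            line.text = text
    return lg


# Placeholder text, not taken from any EMiC edition.
sample = """First placeholder line
Second placeholder line
"""

lg = wrap_lines(sample)
print(ET.tostring(lg, encoding="unicode"))
```

The result still needs checking in oXygen against the project schema; the script only saves the keystrokes.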

    Customizing the TEI
    In the second half of the morning session, Sebastian provided an explanation of the TEI Guidelines and showed us how to create and customize schemas using the ROMA tool (see his presentation materials). Sebastian explained that TEI encoding schemes consist of a number of modules, and each module contains element specifications. See the W3Schools definition of an XML element.

    How to Use the TEI Guidelines
    You can view any of these element specifications in the TEI Guidelines under “Appendix C: Elements”. The guidelines are very helpful once you know your way around them. Let’s look at the TEI element, <author>, as an example. If you look at the specification for <author>, you will see a table with a number of different headers, including:

    <author>
    gives the name and a description of the element

    Module
    lists the modules in which the element is located

    Used By
    notes the parent element(s) in which you will find <author>, such as in <analytic>:

    <analytic>
      <author>Chesnutt, David</author>
      <title>Historical Editions in the States</title>
    </analytic>

    May contain
    lists the child element(s) for <author>, such as “persName”:

    <author><persName>Elizabeth Smart</persName></author>

    A list of classes to which the element belongs (see below for a description of classes).

    Example and Notes
    Shows some accepted uses of the element in TEI and any pertinent notes on the element. On the bottom right-hand side of the Example box, you can click “show all” to see every example of the use of <author> in the guidelines. This can be particularly useful if you’re trying to decide whether or not to use a particular element.

    TEI Modules
    Elements are contained within modules. The standard modules include TEI, header, and core. You create a schema by selecting various modules that are suited to your purpose, using the ODD (One Document Does it all) source format. You can also customize modules by adding and removing elements. For EMiC, we will employ a customized—and standardized—schema, so you won’t have to worry too much about generating your own, but we will welcome suggestions during the process. If you’re interested in the inner workings of the TEI schema, I recommend playing around with the customization builder, ROMA. I won’t provide a tutorial here, but please email me if you have any questions.
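
To make the idea of selecting modules a little more concrete, here is a hedged sketch of what a small ODD customization can look like, with a few lines of Python that list the modules it pulls in and the elements it removes. The ident value emic_sketch, the choice of modules, and the deleted element are assumptions for illustration only; this is not the EMiC schema, and ROMA will generate something fuller.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical customization: four modules, one element deleted from core.
ODD_FRAGMENT = f"""
<schemaSpec xmlns="{TEI_NS}" ident="emic_sketch" start="TEI">
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <elementSpec ident="mentioned" module="core" mode="delete"/>
</schemaSpec>
"""


def summarize(odd_xml: str):
    """Return (modules included, elements deleted) for a schemaSpec fragment."""
    root = ET.fromstring(odd_xml.strip())
    ns = {"tei": TEI_NS}
    modules = [m.get("key") for m in root.findall("tei:moduleRef", ns)]
    deleted = [
        e.get("ident")
        for e in root.findall("tei:elementSpec", ns)
        if e.get("mode") == "delete"
    ]
    return modules, deleted


modules, deleted = summarize(ODD_FRAGMENT)
print("modules:", modules)
print("deleted:", deleted)
```

Reading a customization this way is a quick check on what a schema actually allows before you start encoding against it.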

    TEI Classes
    Sebastian also covered the TEI class system. For a good explanation of what is meant by a “class”, see this helpful tutorial on programming classes (from Oracle), as well as Sebastian’s presentation notes. The TEI contains over 500 elements, which are organized using two kinds of classes: attribute classes and model classes. The most important attribute class is att.global, which supplies attributes that every element can carry, including xml:id, n, xml:lang, and rend.

    All new elements are members of att.global by default. Elements in the same model class can appear in the same places, and are often semantically related (for example, the model.pPart class comprises elements that appear within paragraphs, and the model.pLike class comprises elements that “behave like” paragraphs).
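
As a small illustration of what membership in att.global means in practice, the sketch below puts the global attributes xml:id and n on three different elements and reads them back with Python. The sample markup and identifiers are invented for the example, and the helper name global_attrs is an assumption.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented sample: the same global attributes appear on <div>, <lg>, and <l>.
SAMPLE = f"""
<div xmlns="{TEI_NS}" xml:id="poem1" n="1" xml:lang="en">
  <lg xml:id="poem1.lg1" n="1" rend="indent">
    <l xml:id="poem1.l1" n="1">Placeholder line</l>
  </lg>
</div>
"""


def global_attrs(xml_text: str):
    """List (element name, xml:id, n) for every element in the fragment."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for el in root.iter():
        local_name = el.tag.split("}")[-1]  # drop the namespace prefix
        rows.append((local_name, el.get(f"{{{XML_NS}}}id"), el.get("n")))
    return rows


for row in global_attrs(SAMPLE):
    print(row)
```

Because every element can carry xml:id, any part of an edition can be pointed at from elsewhere, which matters once editions start linking to shared personographies.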

    We ended with an exercise on creating a customized schema. In the afternoon, I attended a session on Document Modelling and Analysis.

    If you’re interested in learning more about TEI, you should also check out the TEI by Example project.

    Please email me or post to the comments if you have any questions.

    June 11, 2010

    Standardisation & its (dis)contents

    At lunch today a few of us met to talk with Meagan about strategies for standardising our projects, including personographies and placeographies, so as to make our various editions as interoperable as possible and to avoid duplicating each other’s labour. By happy chance we were joined by Susan Brown, who mentioned that CWRC is also working towards a standardised personography template which it might make sense for us to use too, given that EMiC will be one of the projects swimming around in the CWRC ‘fishtank’ (or whatever the term was that Susan used in her keynote).

    One outcome of doing this is that our EMiC editions and authors could then be more easily connected by researchers to literatures outside Canada – e.g. through the NINES project – which would be brilliant in terms of bringing them to the attention of wider modernist studies.

    Meagan and Martin are, unsurprisingly, way ahead of TEI newbies such as me to whom this standardisation issue has only just occurred, and they are already working on it, in the form of a wiki. But, as Meagan said, they would like to hear from us, the user community, about what we would like to see included. Some things will be obvious, like birth and death dates, but might we also want to spend time, for example, encoding all the places where someone lived at all the different points in their life? That particular example seems to me simultaneously extremely useful, and also incredibly time-consuming. It also seems important to encode people’s roles – poet, editor, collaborator, literary critic, anthologist etc – but we need to have discussions about what that list looks like, and how we define each of the terms. Then there are the terms used to describe the relationships between people. What does it mean that two people were ‘collaborators’, for instance? (New Provinces has six people’s names on the cover but the archive makes it very clear that two of them had much more editorial sway than the others.) And how granular do we want to get with our descriptions?
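
To give the discussion something concrete to push against, here is a rough sketch of what one personography entry might look like, with a short Python check that flags missing fields and roles outside an agreed list. The person is invented, the list of roles is simply the one floated above, and the function name check_person is an assumption; none of this is the template that Meagan and Martin or CWRC are developing.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Assumed controlled vocabulary, taken from the roles mentioned in this post.
AGREED_ROLES = {"poet", "editor", "collaborator", "literary critic", "anthologist"}

# Invented person and dates, for illustration only.
PERSONOGRAPHY = """
<listPerson xmlns="http://www.tei-c.org/ns/1.0">
  <person xml:id="example_jane">
    <persName>Jane Example</persName>
    <birth when="1900"/>
    <death when="1980"/>
    <occupation>poet</occupation>
    <occupation>editor</occupation>
    <residence from="1925" to="1931"><placeName>Montreal</placeName></residence>
    <residence from="1931" to="1940"><placeName>Toronto</placeName></residence>
  </person>
</listPerson>
"""


def check_person(person: ET.Element) -> list:
    """Return a list of problems found in one <person> entry."""
    problems = []
    if person.get(XML_ID) is None:
        problems.append("missing xml:id")
    for required in ("persName", "birth", "death"):
        if person.find(TEI + required) is None:
            problems.append(f"missing <{required}>")
    for occ in person.findall(TEI + "occupation"):
        role = (occ.text or "").strip()
        if role not in AGREED_ROLES:
            problems.append(f"unlisted role: {role}")
    return problems


root = ET.fromstring(PERSONOGRAPHY.strip())
for person in root.findall(TEI + "person"):
    print(person.get(XML_ID), check_person(person))
```

A check like this only works once the list of roles has been agreed, which is exactly the conversation we need to have.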

    As for placeographies: as I’ve already said on the #emic twitterfeed, one very easy way to standardise these is to ensure we all use the same gazetteer for determining the latitude and longitude of a place when we put in our <geo> codes. I suggest this one at The Atlas of Canada. Once you have the latitude and longitude, there are plenty of sites that will convert them to decimals for you (one example is here).
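
The conversion itself is simple arithmetic: decimal degrees = degrees + minutes/60 + seconds/3600, negated for southern latitudes and western longitudes. Here is a minimal Python sketch that does the conversion and formats the result for a <geo> element, which holds latitude then longitude separated by a space. The function names and the sample coordinates are assumptions for illustration; the coordinates are round numbers, not values from the gazetteer.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    return -value if hemisphere in ("S", "W") else value


def geo_element(lat: float, lon: float) -> str:
    """Format a latitude/longitude pair as the content of a TEI <geo> element."""
    return f"<geo>{lat:.4f} {lon:.4f}</geo>"


# Illustrative round-number coordinates, not gazetteer values.
lat = dms_to_decimal(45, 30, 0, "N")
lon = dms_to_decimal(73, 30, 0, "W")
print(geo_element(lat, lon))
```

Four decimal places is an arbitrary choice here; whatever precision we pick, it would help to pick the same one across editions.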

    As Paul pointed out, it’s worth making the most of times when we meet face-to-face, because as we go along, our projects will change and our analytical interests will be clarified, and the things we need to encode will only make themselves clear gradually. So let’s take advantage of the summer institutes and conferences to talk about the changing needs of our projects, and our evolving research questions, because it’s often quicker to have these conversations in person.

    Perhaps others who were around the table could chime in with things I’ve forgotten or misrepresented. And for everyone: what are your wish lists of things that you’d like to see included in our -ographies?

    June 8, 2010

    Thoughts so far on TEI

    After two days of TEI fundamentals, I have come to a few conclusions.

    First, the most interesting thing about TEI is not the things that you can do, but the things that you cannot do. TEI is, as far as I understand it, only concerned with the content of the text, ignoring everything else (paratextual elements, marginalia, interesting layout, etc.) – stuff some of us find extremely valuable. On top of this, the coding for variants is messy, complicated, and would be next to impossible for complicated variants. For an example of TEI markup for variants, check out: http://www.wwp.brown.edu/encoding/workshops/uvic2010/presentations/html/advanced_markup_09.xhtml.
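
For readers who have not seen variant markup, here is a deliberately tiny sketch, separate from the workshop example linked above: one line with a single point of variation between two witnesses, encoded with <app> and <rdg>, and a few lines of Python that rebuild the text of each witness. The witness labels, the sample text, and the function name reading_for are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented line with one variant between hypothetical witnesses A and B.
LINE = f"""
<l xmlns="{TEI_NS}">The <app><rdg wit="#A">quick</rdg><rdg wit="#B">swift</rdg></app> fox</l>
"""


def reading_for(line_xml: str, witness: str) -> str:
    """Rebuild the text of one witness from a line containing <app> entries."""
    root = ET.fromstring(line_xml.strip())
    parts = [root.text or ""]
    for child in root:
        if child.tag == f"{{{TEI_NS}}}app":
            # Keep only the reading attributed to the requested witness.
            for rdg in child:
                if witness in (rdg.get("wit") or "").split():
                    parts.append("".join(rdg.itertext()))
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


print(reading_for(LINE, "#A"))
print(reading_for(LINE, "#B"))
```

Even this toy case takes more markup than text, and it handles only flat, non-overlapping variants, which gives some sense of why heavily revised texts get difficult quickly.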

    As a result, I am glad that EMiC is developing an excellent IMT – this should solve many of the limitations of TEI, at least from my perspective.

    Finally, while participants can all learn basic TEI and encoding, the next step of course would be to establish a CSS stylesheet. It seems to me that EMiC, like all publishing houses, should establish a single organizational style, and design a stylesheet that all EMiC participants are free to use. This would ensure a consistent design and look across all EMiC-orientated projects in their digital form. Maybe this could be something discussed at a roundtable at next year’s proposed EMiC-orientated course at DHSI.

    Something to ponder.