DDHI Encoding Guidelines

Applying the Basic Layer of the DDHI Encoding Schema to Oral History Interview Transcripts

These guidelines provide users of the Dartmouth Digital History Initiative with a method to digitally encode the textual content of the transcript of an oral history interview. This method involves the application of an encoding schema designed specifically for use with oral histories. Members of the DDHI team, including our Dartmouth student research associates, began developing this schema during the fall of 2019. Our schema is based on the Text Encoding Initiative (TEI), an XML-based encoding standard used by many digital humanists. Since our schema remains very much a work in progress, these guidelines will be revised and updated as we learn more about TEI and how to apply it to oral history interviews.

In designing the DDHI oral history encoding schema, we have adopted what we refer to as a "layered" approach. Rather than trying to encode all of the interesting and relevant data contained in an oral history interview--a hopeless task!--our schema is based on the application of multiple layers of encoding, with different layers for different categories of data. DDHI users will be able to select the particular layers that they think will be most useful for their research and archiving purposes. 

The DDHI encoding schema begins with what we refer to as the basic layer. The basic layer begins with the encoding of utterances--the individual strings of words and sentences spoken by interview participants during the interview. In addition to utterances, the basic layer also includes the encoding of five categories of data contained in the text of the interview: people, places, organizations, dates, and events.

One important caveat: the encoding of an oral history interview is more than just the straightforward application of an objective set of rules. As oral historians know well, oral history transcripts are complex texts. Particular words and phrases in a transcript often carry ambiguous or multivalent meanings. Moreover, historians often find meaning in the absences or silences in a transcript; the things unspoken during an oral history interview may have great significance. As a result, encoders will inevitably have to use their subjective judgment when making decisions about whether or how to encode particular portions of a transcript. They will also have to pay close attention to the context in which a given word or phrase appears in the text of an interview. Thus, users of these guidelines should keep in mind that they are in fact guidelines, rather than a collection of hard and fast rules that will always and only be susceptible to a single "correct" interpretation.


Turning an interview transcript into a TEI file: the OH Encoder

To apply the DDHI encoding schema to an oral history transcript, the file containing the transcript must first be transformed or converted into an XML file. To do this, the DDHI has developed a tool known as the OH Encoder (where "OH" stands for "oral history"). The OH Encoder is actually a bundle of software tools that uses an oral history transcript stored in Microsoft Word format (.doc/.docx) or in plain-text format (.txt) to create a new file in XML format (.xml). The XML version of the transcript is the one that will be encoded; the original transcript file is left unchanged. 

In addition to creating an .xml version of the transcript, the OH Encoder also ensures that the newly created file is well-formed according to the TEI encoding standard. This means that the file is configured in such a way that the document can be compared to the general TEI encoding rules. The process of checking the XML document to see whether it actually conforms to the TEI rules is known as validation.

In configuring the XML file as a well-formed TEI document, the OH Encoder adds new data that was not present in the original transcript file. Some of this new data appears in an entirely new section of the document known as the TEI Header that appears at the top of the document. The actual text of the interview is contained in the body section of the document, which appears after the header.
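
As a rough sketch of this layout, the skeleton below shows how a TEI document separates the header from the body. It follows the general TEI document structure rather than the exact output of the OH Encoder, and the comments stand in for content that is not reproduced here:

    <TEI xmlns="http://www.tei-c.org/ns/1.0">
      <teiHeader>
        <!-- metadata about the interview and the encoding -->
      </teiHeader>
      <text>
        <body>
          <!-- the text of the interview, one utterance after another -->
        </body>
      </text>
    </TEI>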


The TEI Header

The TEI header of a DDHI-encoded transcript contains valuable metadata about the oral history interview, as well as other important information needed for the encoding process. The header provides general information about the DDHI basic layer schema; it also identifies the title of the interview, the participants in the interview, the length of the audio recording of the interview, and the date on which the interview was conducted. 

One important category of data contained in the header is specified by the TEI attribute xml:id. An xml id is a digital marker that is used to refer to a particular person, place, or other entity referenced in a text. The OH Encoder assigns xml ids to all of the participants in the interview. Many oral history interviews have just two participants (an interviewer and a narrator/interviewee), but in some cases there may be more than two participants. To assign the xml ids, the OH Encoder uses the <particDesc> element in conjunction with the <person> element. In the case of an interview conducted by an interviewer named Tim Harrison with a narrator named James Zien, the code in the header could look like this:


    <person xml:id="ZIEN"/>

    <person xml:id="HARRISON"/>


By including this code in the TEI header, the OH Encoder establishes the xml ids that can then be used throughout the text of the interview to identify the text spoken by each of the two interview participants.
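
For illustration, the two <person> elements might sit inside the header as sketched below. The placement of <particDesc> within <profileDesc> follows general TEI practice; the role values and the nested <persName> elements are assumptions added for this example, and the actual output of the OH Encoder may differ:

    <profileDesc>
      <particDesc>
        <!-- role values are illustrative, not prescribed by the schema -->
        <person xml:id="HARRISON" role="interviewer">
          <persName>Tim Harrison</persName>
        </person>
        <person xml:id="ZIEN" role="narrator">
          <persName>James Zien</persName>
        </person>
      </particDesc>
    </profileDesc>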


The Body Section: Encoding the text of the interview

When working with an XML file produced by the OH Encoder, you will spend most of your time on the body section of the file. This is the section of the file which contains the actual text of the interview. As you will see when you look at the file, the OH Encoder automatically inserts TEI elements ("tags") into the text of the interview. In addition to encoding each utterance of spoken text recorded in the transcript, the OH Encoder also attempts to identify and encode five additional types of data entities as they appear in the text. 



TEI Element: <u>

An oral history transcript can be thought of as a textual record of a series of utterances--the strings of words that were spoken by the interview participants to each other during the course of the interview. In our schema, the <u> element is used to encode each utterance recorded in the interview transcript. To identify the speaker of the utterance, the <u> element is always used in conjunction with the who attribute. By using this attribute, the utterance can be connected back to the xml ids that were defined in the TEI header to establish the identity of the participants in the interview. In the example below, taken from the Harrison-Zien interview referenced earlier, Harrison is identified as the speaker of the first utterance and Zien as the speaker of the second:

<u who="HARRISON">What did your parents do—what did your father do while you were growing up?</u>

<u who="ZIEN">He joined the family company, which was a mechanical contracting company that my grandfather had started as a journeyman plumber back in the ’20s. This was a standard-issue company that did plumbing, heating, air conditioning and was—it was a growth time after the war with the suburbs being built and a lot of people needing help with their basic systems in their homes. So my father and his three brothers ultimately took over the company from my grandfather, their father, and ran that company for many, many, many years.</u>

If the original oral history transcript clearly identifies the speakers of each utterance, the OH Encoder should be able to encode each utterance with a high degree of accuracy.


Other Data Entities encoded in the basic layer: The Big Five

In conceptualizing the basic layer of the DDHI encoding schema, we wanted to focus on categories of data that are likely to be found in almost any oral history interview. The five categories that we selected for inclusion in this layer are: (1) real people who are identified by name; (2) real places on Earth which can be geolocated; (3) calendar dates; (4) organizations or institutions; and (5) well-known events. In selecting and defining these five categories, we were guided by our experience in the field of oral history and by our knowledge of some of the types of data that oral historians often examine in their research. In addition to paying attention to the references to the particular people, places, and institutions mentioned in interviews, oral historians are often interested in what oral history interviews reveal about time, chronology, space, and memory--hence our decision to include dates and events.

The five categories included in the basic layer (and the corresponding TEI elements used to encode them) are summarized in the following table:


Data Category    TEI Element/Attribute       Example

People           <persName>                  <persName>Colin Powell</persName>
Places           <placeName>
Organizations    <orgName>                   <orgName>Vietnam Veterans Against the War</orgName>
Dates            <date when="yyyy-mm-dd">    <date when="1968-01-31">January 31, 1968</date>
Events           <name type="event">         <name type="event">Vietnam War</name>



Guidelines for applying the five elements in the basic layer


TEI Element: <persName>

In our schema, the <persName> tag is used to encode references to a real person identified by name in an interview. 

In order to use this tag, the interview must contain the individual’s legal name and/or the name that is associated with that person in formal, official, or popular discourse. The names "Bob Dylan" and "Jimmy Carter" would be tagged, even though the legal names of those individuals are Robert Allen Zimmerman and James Earl Carter, respectively. 

The <persName> tag should be applied only to references to single individuals. Thus, "Kim Kardashian" could be tagged at the basic layer, but "the Kardashians" would not be tagged in the basic layer. If an individual is identified by full name and then identified on subsequent reference by a nickname or abbreviated version of the name, all forms of the name can be tagged. The <persName> tag will not be applied to pronouns, even when the pronoun refers unambiguously to a single person.
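
A hypothetical pair of sentences (not drawn from an actual interview) shows how these rules might play out. The named individual is tagged, while the pronoun and the reference to the family as a group are left untagged:

    I watched an interview with <persName>Kim Kardashian</persName> last night. She talked about growing up with the Kardashians.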

The <persName> tag can be applied both to famous individuals and to individuals who are relatively unknown. However, the interview must contain enough information such that the individual in question can be identified with a reasonably high degree of precision. If an interview narrator states "I had a friend in high school named Julie" but does not provide Julie’s family name or otherwise identify her in the interview, the reference should probably not be tagged. On the other hand, if the narrator or interviewer did provide Julie’s full name later in the interview, or if Julie’s full name and identity can be inferred or determined from other available information, then all named references to Julie throughout the interview can be tagged. 

Named references to actual historical figures such as Cleopatra or John F. Kennedy can be tagged at the basic layer. In general, fictional characters will not be tagged in this layer, even if they are well known.  Characters who exist only in literature or legend (for example: King Arthur) will not be tagged in this layer. References to an individual mentioned in religious scripture or in sacred traditions will generally not be tagged, unless the individual’s historical existence has been documented via other sources. Thus, figures such as Jesus Christ or the Virgin Mary would not be tagged in the basic layer.

When using the <persName> tag, all components of a person’s name should be included in the tag. If a transcript contains added information about a person’s name placed in brackets, the bracketed portion of the name would be included. For example: 

<persName>Cybil [Parker]</persName>

If a title, class, prefix, or suffix is associated with the name in the interview, include it in the tag. For example: 

<persName>Dartmouth College President Phil Hanlon</persName>. 



TEI Element: <placeName>

In the basic layer, the <placeName> tag is applied to a real-world place that is geo-locatable--that is, a place which can be located on the surface of the earth using latitude and longitude coordinates.

Many different kinds of places are taggable in the basic layer. These include: human settlements (cities, towns, villages, hamlets, etc.); countries and their territorial administrative elements (provinces, states, districts, etc.); some kinds of infrastructure (named or numbered roads, buildings, canals, the Great Wall of China, etc.); and named landforms (mountains, plateaus, glaciers, lakes, rivers, bays, etc.).
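
The invented sentences below sketch how several of these kinds of places might be tagged: a city, a named landform, a named road, and a state. They are illustrations only, and an encoder would still need to confirm that each reference is geo-locatable in context:

    I grew up in <placeName>Milwaukee</placeName>, a few blocks from <placeName>Lake Michigan</placeName>.

    We drove <placeName>Route 66</placeName> all the way to <placeName>California</placeName>.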

In general, we avoid tagging continents and oceans at the basic layer, because their large size relative to the entire surface of the earth makes it difficult to include them in visualizations containing data on smaller geographic features and places. (Australia can be tagged on the grounds that it is a country as well as a continent.) We also avoid tagging regions, including both transnational regions (East Asia) and regions of individual countries (the US Midwest). However, when the boundaries of a region are legally or administratively defined, it can be tagged; for example, a reference to the region of Tonkin in French Indochina should be tagged, since it was one of the administrative units of the French colonial governance structure. It is also acceptable to tag references to nation-states whose de jure claims of territorial control were larger than the territory under their de facto administration. Thus, references to "South Vietnam" or "East Germany" could be tagged and associated with the boundaries of the territory that they actually administered during the period that they existed as separate states.

One recurring challenge associated with applying the basic layer of our schema to an interview has to do with distinguishing between places and organizations--especially when an organization or institution is based in (or associated with) a particular location. This problem can be seen, for example, in references to universities. When an interview narrator refers to "Dartmouth," is she referring to the Dartmouth College campus as a physical/geographical location, or to Dartmouth College’s identity as an educational institution? In such situations, the encoder must use context and inference to decide which tag is most appropriate. For example, if the narrator mentions Dartmouth while referring to an event that occurred on Dartmouth’s campus, the <placeName> tag may best capture the meaning of the term. Alternatively, if the narrator is referring to "Dartmouth culture," then the <orgName> tag may be a better choice. 

Another example of the above: narrators may use the colloquial shorthand term "the VA" to refer either to the Veterans Administration (which is a U.S. government-sponsored organization) or to a specific Veterans Administration-affiliated hospital (which, depending on the context, could be a place).

Such decisions will often turn on the judgment of the encoder, and on the encoder’s interpretation of how the entity in question was used in the interview.
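
To make the Dartmouth case concrete, here is one way the two readings might be encoded. Both sentences are invented for this example, and a different encoder could reasonably reach a different decision depending on the surrounding context:

    The rally took place at <placeName>Dartmouth</placeName> that spring.

    I never really felt at home in <orgName>Dartmouth</orgName> culture.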


TEI Element: <date>

  • Attribute: when=""
  • Format: <date when="YYYY-MM-DD">, <date when="YYYY-MM">, or <date when="YYYY">

In the basic layer of our schema, the <date> tag is applied to any reference to a specific year or to any reference to a specific month or day within a specific year. This tag is modified with the "when" attribute so that the actual date is included within the tag.

A reference to a year can be tagged with <date> when it is in conventional four-digit form or abbreviated two-digit form:

I started college in <date when="1975">1975</date>.

I started college in <date when="1975">’75</date>.

If an interview references a span of time within a year other than a day or a month then only the year will be tagged. For example, seasons will generally not be tagged:

I started college in the fall of <date when="1975">1975</date>.
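
By contrast, a reference to a specific month within a specific year can be tagged using the YYYY-MM format listed above. The sentence below is a hypothetical variation on the previous examples:

    I started college in <date when="1975-09">September 1975</date>.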

In order to tag a reference to a day, the interview must contain sufficient information to determine the day, month, and year. In some cases, the day, month, and year may all appear within a single tag. For example:

My first day of junior high school was <date when="1967-09-14">September 14, 1967</date>. I remember that because it was my birthday.

In other cases, some relevant information about a date may not be stated explicitly in the reference, but can still be inferred from information provided elsewhere in the interview.

I graduated from high school in the spring of <date when="1966">1966</date> and I enlisted in the army about a week after my graduation. I went to boot camp to begin my training later that summer. After completing boot camp, I had a few weeks of leave at home before I had to report for duty in Fort Benning in Georgia. A few months after I arrived at Fort Benning, I got word that my unit was shipping out to Vietnam. We arrived in Vietnam on <date when="1966-11-25">November 25</date>, which was exactly six months to the day after my graduation.


TEI Element: <orgName>

In the basic layer, the <orgName> tag is applied to named organizations or institutions that meet the following criteria:

  1. An entity with a membership (or a body of affiliates or supporters) that is defined by more than just the self-identification of its members;
  2. An entity that engages in administrative practices. Such practices may include budgeting/monetary expenditures; elections; governance; recruitment; communications; establishment/maintenance of a bureaucratic structure; or organized advocacy on behalf of some cause, idea, or policy.

Examples of entities that may be included in this category in the basic layer include: educational institutions such as schools or universities; business enterprises; non-profit organizations; governments; military services; political parties; and sports teams.

Note that this definition of organization does not include broadly defined identity categories organized around notions of race, gender, religion, nationality, or other markers of human difference. For example, references to "Catholicism" or "Catholics" would generally not be tagged with <orgName> in the basic layer. However, a reference to "the Catholic Diocese of Miami" would likely be tagged, since the diocese is an organization that engages in a range of administrative and bureaucratic practices. Similarly, a reference to "African Americans" would likely not be tagged at the basic layer, but a reference to the "National Association for the Advancement of Colored People" would be tagged.

With regard to companies and businesses: the names of those entities will be tagged when they are referring to the business enterprise, its operations, or its employees and/or managers. However, we generally do not tag company or business names when they are referring to a product of the company or business. In the sentence, "my father worked in a Ford plant for 30 years," the name "Ford" would be tagged, since the reference is to someone’s employment at the company. But in the sentence "My first car was a Ford," the name "Ford" would not be tagged.
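
Encoded, the two "Ford" sentences would look roughly like this, with the tag applied only where the name refers to the business enterprise:

    My father worked in a <orgName>Ford</orgName> plant for 30 years.

    My first car was a Ford.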

When applying the basic layer, a user may tag organizations that are part of larger organizations. For example, most national military forces exist as organizations within national governments; moreover, most military forces are themselves composed of a hierarchical array of individual units, agencies, and offices, each of which has its own administrative structure. In the case of the United States military: the U.S. Department of Defense is an organization within the U.S. government, and military services such as the Army, Navy, Air Force, and Marine Corps are organizations within the Department. (The Coast Guard is also a military service, but it is housed in the Department of Homeland Security rather than the Department of Defense.) Each service contains a shifting array of military units; these units include divisions, fleets, commands, regiments, squadrons, battalions, companies, platoons, etc. In addition, the services also contain various offices and agencies, many of which are staffed by a combination of uniformed and civilian personnel. Since virtually all of these entities will meet the definitional criteria of an "organization" specified above, they can be tagged at the basic layer. 

Note: some very small military units--such as a "squad" or a "detachment" of soldiers that does not have any dedicated administrative or support elements associated with it--will not meet the basic layer definition of an organization, and therefore should not be tagged at this layer.

References to a person’s membership in an organization (or the affiliation of a group of people with an organization) are often taggable with <orgName>. For example, in the sentence "I looked over and saw twenty Marines advancing up the next hill," the term "Marines" would be taggable, since it specifies the membership of a group of men in the United States Marines, a taggable organization. Similarly, in the sentence "my mother’s family was a very Republican family," the term "Republican" would be taggable, if it is deemed to refer to affiliation with the Republican Party.
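
Applied to the two sentences quoted above, the tagging might look like this (the second tag assumes the encoder has concluded that "Republican" refers to the Republican Party):

    I looked over and saw twenty <orgName>Marines</orgName> advancing up the next hill.

    My mother’s family was a very <orgName>Republican</orgName> family.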

Leagues, alliances and conferences are taggable if they meet the definition of "organization" (for example: the Ivy League).

Ships or other watercraft can be considered an organization if they (1) have formal names or official designations; and (2) house an on-board crew or staff. These types of organizations may include military vessels (the HMS Ark Royal) or civilian ships (the Titanic). 

In most cases, musicians or music groups (such as the Beatles or TLC) will not be tagged as organizations. There may be other music-related entities which do meet the criteria for tagging as an organization, such as a record label or a music association. (For example, the Indiana State School Music Association is probably taggable.) Note that references to individual musicians (Paul McCartney) can often be tagged as people, using <persName>.



TEI Element: <name>

  • Attribute: type="event"
  • Format: <name type="event">
  • Note: the type attribute here is used to specify that the <name> element is being applied to an event.

In the basic layer, we will only tag events which are considered to be "well-known." We will avoid tagging events that are particular to the narrator’s experience and would not be familiar to a larger audience. For example, "World War II" will be tagged, but "my high school graduation" will not. 

If a well-known event is commonly known by reference to a date, it should be tagged as an event, not a date. For example, both "9/11" (as a reference to the terrorist attacks of September 11, 2001) and "the October Revolution" (as a reference to the events that took place in Russia in October 1917) should be tagged as events, not dates. 
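
In an invented sentence, a date-named event of this kind would be encoded with the <name> element rather than with <date>:

    I was living in <placeName>New York</placeName> on <name type="event">9/11</name>.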

The "event" category is often challenging to apply, because participants in oral history interviews will often refer to events using vague, nondescript nouns. In some cases, a speaker may use a more specific name to reference the event when they first mention it, and then use a more generic term for subsequent references. For example, an interview narrator might mention "the Vietnam War" at the beginning of an interview but then refer thereafter to the same event as simply "the war." In most cases, only the specific reference to the event ("the Vietnam War") will be tagged, and the more generic reference ("the war") will not be tagged.
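
The following hypothetical passage sketches this pattern. The specific reference is tagged, following the table example above in placing the article outside the tag, while the later generic reference is left untagged:

    My brother was drafted during the <name type="event">Vietnam War</name>. After the war, he never talked about it.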

Note that there may be cases in which the names of places are used in a way that refers to an event. In such cases, the encoder will have to investigate the context in which the place name is used to determine how to code it. For example, in the sentence "His uncle was killed at Pearl Harbor," the term "Pearl Harbor" could plausibly be tagged as an event, insofar as there is evidence that the term refers to the famous attack that took place on December 7, 1941.
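
The contrast can be sketched with two sentences. The first is the sentence discussed above, read as a reference to the 1941 attack; the second is an invented sentence in which the same name seems to refer to the location. Either decision would depend on the context of the actual interview:

    His uncle was killed at <name type="event">Pearl Harbor</name>.

    We were stationed at <placeName>Pearl Harbor</placeName> for two years.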

Special cases and judgment calls

While the five categories of data specified above might appear to be easy to distinguish from one another, in practice such distinctions can be difficult to draw. In such cases, encoding decisions are sometimes hard to make. The sections below call attention to some of the more difficult recurring problems that come up in the application of the DDHI basic layer, along with some suggestions of how to handle them.

Paying attention to grammar: adjectives, nouns, and noun phrases

When applying the DDHI basic layer to an interview, it is important to pay attention not only to the meaning of individual words but also to grammar and parts of speech. The encoder should take particular care when encountering a noun phrase consisting of an adjective and a noun.

In the case of a noun phrase, the encoder must first determine if the entire noun phrase is a taggable entity, according to the rules of the basic layer. If it is, then the entire noun phrase--both the adjective and the noun--should be included in the tag. Consider the sentence "My father served for four years in the United States Army." This sentence contains the noun phrase "United States Army" consisting of both an adjective (United States) and a noun (Army). Since the United States Army meets the basic layer definition of an organization, the sentence should be tagged as follows:

My father served for four years in the <orgName>United States Army</orgName>.

Note that in this case, the adjective portion of the noun phrase "United States" would not be tagged as a place, since the entire noun phrase of which it is a part is a taggable entity.

The principle above applies even when the adjective is part of a prepositional phrase. For example, in the phrase "the army of the United States," the entire noun phrase "army of the United States" would be taggable as an organization and "United States" would not be tagged as a place.
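
In a hypothetical sentence, the prepositional form would therefore be tagged as a single organization, with no nested place tag:

    He enlisted in the <orgName>army of the United States</orgName>.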

In some cases, a noun phrase will not refer to a taggable entity but the adjective contained in the noun phrase will be taggable (according to the basic layer rules). For example, in the sentence "My father was a World War II veteran," the noun phrase "World War II veteran" is not a taggable entity, since a generic reference to "veteran" does not fit with any of the five categories included in the basic layer. However, the adjective "World War II" is taggable, because it refers to a well-known event. Thus, the sentence would be tagged as follows:

My father was a <name type="event">World War II</name> veteran.

In the example above, the adjective "World War II" is easily identifiable as a reference to a well-known event, and the tagging decision is relatively straightforward. However, there may be other cases involving adjectives in noun phrases in which the encoding decision may be much more context specific.

Consider the adjective "Japanese." In some use cases, it may be appropriate to tag this term as a place, insofar as the speaker is referring in reasonably straightforward fashion to the country known as Japan. For example, in the sentence "Domestic Japanese rice production was going up during the 1960s," one could make a good case that "Japanese" is referring to rice production in the geo-locatable place known as Japan, and that therefore "Japanese" in this case is taggable as a place. 

However, in many other cases, the term "Japanese" might not be making such a direct reference to the country of Japan. For example, in cases in which "Japanese" is used to refer to the Japanese language (such as "she spoke Japanese well") the encoder would probably choose not to tag "Japanese" as a place, since the Japanese language is not spoken exclusively in Japan. 

Yet another example is provided by the noun phrase "Japanese families." Depending on how this phrase is used in an interview, the word "Japanese" may or may not be taggable. For example, if a narrator said "There were a lot of Japanese families in the neighborhood of Los Angeles where I grew up," the encoder might conclude that "Japanese" was a reference to ethnic or racial identity rather than a reference to Japan as a place.

In general, if the encoder concludes that a place-derived adjective such as "Japanese" or "European" is being used to specify a racial, ethnic, national, or cultural identity, the term should not be tagged as a place in the basic layer.
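
Encoded, two of the "Japanese" sentences discussed above might look as follows. In the first, the adjective is read as a reference to the country; in the second, it is read as a marker of identity and left untagged, while "Los Angeles" is tagged as a place. These readings are judgment calls rather than fixed rules:

    Domestic <placeName>Japanese</placeName> rice production was going up during the 1960s.

    There were a lot of Japanese families in the neighborhood of <placeName>Los Angeles</placeName> where I grew up.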


Brackets and other added text

In oral history transcripts, bracketed text is often inserted by a transcriptionist to clarify the meaning of spoken text, or to provide additional factual information. If the bracketed information is adjacent to a taggable entity and is relevant to understanding the meaning of that entity, it should be included in the tag. Examples: 

<persName>President [Dwight D.] Eisenhower</persName>; 

<placeName>[Dartmouth] Green</placeName>

We do not tag bracketed non-verbal utterances, such as [chuckle] or [laughter], at the basic layer. Also, if an entity appears only within brackets--that is, if there is no explicit reference to the entity in the actual spoken record of the interview--the entity will generally not be tagged.
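
Two invented sentences illustrate these last points. In the first, the bracketed clarification is included in the place tag and the bracketed laughter is left untagged; in the second, the name appears only inside brackets, so it is not tagged:

    We all gathered on the <placeName>[Dartmouth] Green</placeName> after the game [laughter].

    He served under the general [William Westmoreland] for a year.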