The OH Encoder

To make encoding interviews more efficient, the DDHI is working with Agile Humanities to develop the OH Encoder, a software bundle that turns a “raw” plain-text transcript into a well-formed TEI document. The OH Encoder also partially automates encoding by recognizing and tagging the five categories of data specified in the basic layer of our schema (people, places, organizations, events, and dates). Used together with a custom-designed API (also developed by Agile Humanities), the OH Encoder will let users of the DDHI encode interviews quickly according to the DDHI encoding schema.
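To give a rough sense of what this kind of automated encoding involves, here is a minimal Python sketch. It is not the OH Encoder itself. The gazetteer `ENTITIES`, the functions `tag_line` and `encode`, the sample sentence, and the choice of TEI elements (`persName`, `placeName`, `orgName`, `name type="event"`, `date`) are all assumptions made for illustration; the elements the DDHI schema actually uses may differ, and a real tool would likely rely on named-entity recognition rather than a fixed lookup table.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical gazetteer: surface form -> (TEI element, optional type attribute).
ENTITIES = {
    "Jane Doe": ("persName", None),
    "Springfield": ("placeName", None),
    "Example College": ("orgName", None),
    "Woodstock": ("name", "event"),
}

# Assumption: a bare four-digit year between 1800 and 2099 counts as a date.
YEAR = r"\b(?:1[89]|20)\d{2}\b"


def tag_line(text, entities):
    """Wrap known entities and years in TEI tags, escaping everything else."""
    names = sorted(entities, key=len, reverse=True)  # try longer names first
    alternatives = [r"\b" + re.escape(n) + r"\b" for n in names] + [YEAR]
    pattern = re.compile("|".join(alternatives))
    out, pos = [], 0
    for m in pattern.finditer(text):
        out.append(escape(text[pos:m.start()]))
        surface = m.group(0)
        if surface in entities:
            element, kind = entities[surface]
            attr = f' type="{kind}"' if kind else ""
            out.append(f"<{element}{attr}>{escape(surface)}</{element}>")
        else:
            out.append(f'<date when="{surface}">{surface}</date>')
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)


def encode(transcript, entities, title="Untitled interview"):
    """Turn a plain-text transcript into a small TEI document (one <p> per line)."""
    paragraphs = "\n".join(
        f"      <p>{tag_line(line.strip(), entities)}</p>"
        for line in transcript.splitlines()
        if line.strip()
    )
    return (
        f'<TEI xmlns="{TEI_NS}">\n'
        "  <teiHeader><fileDesc>"
        f"<titleStmt><title>{escape(title)}</title></titleStmt>"
        "<publicationStmt><p>Unpublished sketch</p></publicationStmt>"
        "<sourceDesc><p>Plain-text transcript</p></sourceDesc>"
        "</fileDesc></teiHeader>\n"
        "  <text>\n    <body>\n"
        f"{paragraphs}\n"
        "    </body>\n  </text>\n</TEI>"
    )


doc = encode(
    "Jane Doe left Springfield in 1969 & enrolled at Example College.",
    ENTITIES,
)
root = ET.fromstring(doc)  # raises ParseError if the result is not well-formed
print(doc)
```

Parsing the output with `ET.fromstring` checks only that the document is well-formed XML; validating it against a TEI schema would be a separate step.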

The current version of the OH Encoder produces well-formed TEI documents with automated tags that adhere to our basic layer schema. Future iterations will support linking the tagged entities in an encoded interview to external sources of data, which will facilitate the creation of the authority lists needed for data visualization.
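One way such linking could work is sketched below in Python. This is a guess at the general shape of the feature, not a description of the planned implementation. The `EXTERNAL_IDS` table, the `example.org` URIs, the `link_entities` function, and the use of the TEI `ref` attribute to carry the link are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

# Hypothetical encoded fragment with entities already tagged.
SAMPLE = (
    f'<TEI xmlns="{TEI_NS}"><text><body>'
    "<p><persName>Jane Doe</persName> left <placeName>Springfield</placeName>. "
    "<persName>Jane Doe</persName> returned later.</p>"
    "</body></text></TEI>"
)

# Hypothetical lookup table: (element, name) -> URI in an external data source.
EXTERNAL_IDS = {
    ("persName", "Jane Doe"): "https://example.org/authority/person/jane-doe",
}

LINKABLE = ("persName", "placeName", "orgName", "name")


def link_entities(tei_xml, external_ids):
    """Add ref attributes where a match exists and build an authority list."""
    root = ET.fromstring(tei_xml)
    authority = {}
    for element in root.iter():
        local = element.tag.split("}", 1)[-1]  # drop the namespace prefix
        if local not in LINKABLE:
            continue
        key = (local, "".join(element.itertext()).strip())
        uri = external_ids.get(key)
        if uri:
            element.set("ref", uri)
        # One entry per distinct entity; None marks an entity not yet linked.
        authority.setdefault(key, uri)
    return ET.tostring(root, encoding="unicode"), authority


linked, authority = link_entities(SAMPLE, EXTERNAL_IDS)
for (category, name), uri in authority.items():
    print(category, name, uri)
```

The resulting `authority` dictionary holds one entry per distinct entity, so repeated mentions of the same person collapse into a single record that a visualization could draw on.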