Digital Edition

Creating a Digital Edition of Cassiodorus' Variae

Creating a digital edition of Cassiodorus’ Variae in TEI/XML (the encoding format of the Text Encoding Initiative) means providing scholars with a standardized version of the text. This allows the letters not only to be read and analysed by various digital tools but also to be converted into different formats (e.g. PDF or HTML).

For this purpose, all of Cassiodorus’ 470 texts will be lemmatized, annotated, tagged with metadata and converted into TEI/XML.

The starting point of the JGU’s work is a digitized edition of the Variae, originally produced by Theodor Mommsen (MGH Auct. Ant. XII, Berlin 1894). The plain text is being lemmatized with the help of the EHuDesktop and with the support of the Computational Historical Semantics team of the Johann Wolfgang Goethe-Universität Frankfurt (headed by Prof Dr Bernhard Jussen).

Lemmatizing means defining the semantic and morphological properties of every word form and linking it to its base form (lemma or superlemma). For instance, Lat. regem (Engl. “the king”, accusative) is linked to Lat. rex (Engl. “the king”, nominative).

EHuDesktop, Lemmatisation Editor

Among other aspects, defining a word’s semantic properties is necessary to distinguish homographs, such as:

  • Lat. pōpulus (Engl. “the poplar”) and Lat. populus (Engl. “the people”)
  • Lat. regis (Engl. “of the king”) and Lat. regis (Engl. “you rule”)
  • Lat. legis (Engl. “of the law”) and Lat. legis (Engl. “you read”)
  • Lat. mālus (Engl. “the apple tree”), Lat. mālus (Engl. “the mast”) and Lat. malus (Engl. “bad”)
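
The point of the list above can be sketched in a few lines of Python: a written form on its own may match several lexicon entries, so each candidate needs its own lemma, part of speech and meaning. The `Analysis` class, the `LEXICON` table and both helper functions are illustrative assumptions and do not reflect the EHuDesktop data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analysis:
    lemma: str  # base form the word form is linked to
    pos: str    # part of speech
    gloss: str  # English meaning, used here to tell homographs apart


# Hypothetical mini-lexicon keyed by the written form (without length marks).
# The entries repeat examples from the list above.
LEXICON = {
    "regis": [
        Analysis("rex", "noun", "of the king"),
        Analysis("regere", "verb", "you rule"),
    ],
    "legis": [
        Analysis("lex", "noun", "of the law"),
        Analysis("legere", "verb", "you read"),
    ],
    "populus": [
        Analysis("pōpulus", "noun", "the poplar"),
        Analysis("populus", "noun", "the people"),
    ],
}


def candidates(form: str) -> list[Analysis]:
    """Return every analysis recorded for a written form (empty if unknown)."""
    return LEXICON.get(form.lower(), [])


def is_ambiguous(form: str) -> bool:
    """A form is a homograph here if more than one analysis matches it."""
    return len(candidates(form)) > 1


if __name__ == "__main__":
    for analysis in candidates("regis"):
        print(analysis.lemma, analysis.pos, analysis.gloss)
```

In such a setup the lookup only proposes candidates; choosing the right one for a given passage remains an editorial decision.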

Defining morphological properties means linking word forms to their base form and determining their part of speech as well as case, number and gender or person, tense, mood and voice…

  • Lat. artium: genitive, plural, feminine of Lat. ars (Engl. “the art”)
  • Lat. iunxerunt: 3rd person, plural, perfect, indicative, active of Lat. iungere (Engl. “to connect”)

…it also means to take spelling variations into account:

  • Lat. popvlvs and Lat. populus (Engl. “the people”)
  • Lat. copo and Lat. caupo (Engl. “the innkeeper”)
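
As a rough illustration of the two steps just described, the following Python sketch first maps a spelling variant to a normalized form and then looks up its morphological description. The `VARIANTS` and `MORPHOLOGY` tables, the feature names and the blanket v-to-u rule are assumptions made for this example only; they are not the project's actual rules or data.

```python
from typing import Optional

# Hypothetical table for variants that cannot be derived by a simple rule.
VARIANTS = {"copo": "caupo"}


def normalize(form: str) -> str:
    """Map a spelling variant to the spelling used as lexicon key."""
    # Assumption: the lexicon is keyed in u-spelling, so every v becomes u
    # (popvlvs -> populus). This also turns consonantal v into u (vir -> uir).
    form = form.lower().replace("v", "u")
    return VARIANTS.get(form, form)


# Hypothetical morphological lexicon with the two examples from the text.
MORPHOLOGY = {
    "artium": {
        "lemma": "ars", "pos": "noun",
        "case": "genitive", "number": "plural", "gender": "feminine",
    },
    "iunxerunt": {
        "lemma": "iungere", "pos": "verb",
        "person": "3", "number": "plural", "tense": "perfect",
        "mood": "indicative", "voice": "active",
    },
}


def analyse(form: str) -> Optional[dict]:
    """Return the morphological description of a form, or None if unknown."""
    return MORPHOLOGY.get(normalize(form))


if __name__ == "__main__":
    print(normalize("popvlvs"))
    print(analyse("iunxerunt"))
```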

So far, two of the twelve books of the Variae as well as its praefatio (Liber I and III, 100 documents) have been completely lemmatized. As all texts have been converted into TEI/XML, the entire Variae are ready for further editing and for various steps of analysis: https://hudesktop.hucompute.org/index.jsp (Computational Historical Semantics of the Johann Wolfgang Goethe-Universität Frankfurt).

Completely lemmatised letter, in this example Cassiod. Var. 1.1 (EHuDesktop, Lemmatisation Editor)

With the help of the virtual research environment TextGridLab and the application QAnnotate, all 470 letters of the Variae will be commented on and annotated with images, digitized manuscripts, geographical data, and historical/philological information.

Commentary Cassiod. Var. 3.1 (TextGridLab)

Commentary on person “Alaric II” (QAnnotate)

Furthermore, the documents are being tagged with metadata, including information as to the letters’ senders, addressees, dates of issue, and topics. We also tag the places, regions and persons mentioned in the works as well as the ethnic, religious and social groups in question. So far, the Liber Tertius of the Variae (comprising 53 letters) has been completely tagged with metadata.

Metadata for Cassiod. Var. 3.1 (QAnnotate)
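
How such letter metadata might be held in a record and written out as TEI can be sketched as follows. The `LetterMetadata` class and the `to_corresp_desc` function are hypothetical, the choice of the TEI `correspDesc` element is an assumption about the encoding rather than the project's documented practice, and the sender and addressee values are filled in for illustration only and are not taken from the project's data.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LetterMetadata:
    ref: str                              # citation, e.g. "Cassiod. Var. 3.1"
    sender: str
    addressee: str
    date_of_issue: Optional[str] = None   # left empty when not yet determined
    topics: list[str] = field(default_factory=list)
    persons_mentioned: list[str] = field(default_factory=list)


def to_corresp_desc(meta: LetterMetadata) -> ET.Element:
    """Build a TEI correspDesc element from sender, addressee and date."""
    desc = ET.Element("correspDesc")
    sent = ET.SubElement(desc, "correspAction", type="sent")
    ET.SubElement(sent, "persName").text = meta.sender
    if meta.date_of_issue:
        ET.SubElement(sent, "date").text = meta.date_of_issue
    received = ET.SubElement(desc, "correspAction", type="received")
    ET.SubElement(received, "persName").text = meta.addressee
    return desc


if __name__ == "__main__":
    # Illustrative values only.
    meta = LetterMetadata(
        ref="Cassiod. Var. 3.1",
        sender="Theoderic",
        addressee="Alaric II",
    )
    print(ET.tostring(to_corresp_desc(meta), encoding="unicode"))
```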

All files (Latin text with metadata, lemmatization and annotations) will be converted into TEI/XML.

Cassiod. Var. 3.1 in TEI/XML-coded form (Oxygen-Editor)
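
To give an idea of what lemmatized text can look like in TEI/XML, the sketch below wraps each word form in a TEI `w` element that carries its lemma and part of speech as attributes. The function name `encode_tokens`, the use of `seg` as container and the choice of the `lemma` and `pos` attributes are assumptions for illustration, not the project's actual encoding scheme. The tokens are the isolated example words used earlier on this page, not a passage from the Variae.

```python
import xml.etree.ElementTree as ET


def encode_tokens(tokens: list[tuple[str, str, str]]) -> ET.Element:
    """Wrap (form, lemma, part of speech) triples in TEI <w> elements."""
    seg = ET.Element("seg")
    for index, (form, lemma, pos) in enumerate(tokens):
        w = ET.SubElement(seg, "w", lemma=lemma, pos=pos)
        w.text = form
        # Separate words by a space, except after the last one.
        if index < len(tokens) - 1:
            w.tail = " "
    return seg


if __name__ == "__main__":
    # Isolated example words, not running text.
    tokens = [
        ("regem", "rex", "noun"),
        ("artium", "ars", "noun"),
        ("iunxerunt", "iungere", "verb"),
    ]
    print(ET.tostring(encode_tokens(tokens), encoding="unicode"))
```

Keeping the word form as element content and the analysis in attributes leaves the Latin text readable while tools can still query it by lemma.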