While building my XML tree model, I realized that some recurring elements of the letters would be harder to encode, mainly because they are neither mentioned nor explained in the TEI Guidelines. Although the Guidelines contain many sections and cover many kinds of text (manuscripts, speech, drama, etc.), information about the encoding of correspondence is still scarce. The situation improved several years ago with the creation of the TEI Special Interest Group (SIG) on Correspondence. Among the things the group helped create are the <correspDesc> element [1] and the Correspondence Metadata Interchange Format (CMIF), which led to correspSearch [2], a service that provides a correspondence network of editions and projects and allows better research, linking and analysis. The group also developed the use of basic elements of correspondence, which can be explored further in Encoding Correspondence.
The part of the letters in my corpus that I struggle with most is the opener, which, according to the TEI, “groups together dateline, byline, salutation, and similar phrases appearing as a preliminary group at the start of a division, especially of a letter” [3].
Paul d’Estournelles de Constant put some information at the beginning of his letters that is not covered in the correspondence section of the TEI Guidelines, so I had to do some extra thinking and research to find a way to encode the more complicated parts of the opener. I used what I gathered from the websites mentioned earlier to encode my transcription, and although some of it helped, I still had trouble with other parts. Ultimately, the encoding comes down to individual choices, which I will simply have to discuss and verify with my project managers.
Five issues arose while I made the transcription: two were fixed fairly easily, and three required more research to find the right solution. Four of the five were due to d’Estournelles’s letter-writing style, and one was due to an action by the archivist in charge of the collection. They can all be seen in the picture below; I will go through them one by one, each with an illustration and the solution I came up with.

Opener from the letter 9bis of November 16th, 1914
The title
Throughout his correspondence, Paul d’Estournelles de Constant sent Murray Butler many letters, each about a definite subject that d’Estournelles indicated clearly with a title. The title is present on every letter, is consistently written the same way (uppercase and underlined) and is always positioned in the same place (just before the salutation).
Although titles are uncommon in letters, they are a very common component of texts in general, and the TEI Guidelines accordingly allow a <title> element among the elements that can appear in the opener.
The encoding of this part is therefore straightforward and looks like this:
<opener>
...
<title rend="align(center)"><hi rend="underline">FRIED – DAVID-STAR-JORDAN – HERIOT et<lb/> NOTRE ENQUETE DANS LES BALKANS – LES CONVENTIONS DE LA HAYE. <lb/> L’ENVOI d’une COMISSION D’ENQUETE AMERICAINE EST-ELLE POSSIBLE?<lb/> LA NEUTRALITE des ETATS-UNIS</hi></title>
...
</opener>
The numbering
Paul d’Estournelles de Constant and Nicholas Murray Butler started their correspondence in 1902, at the beginning of the 20th century, and it developed and evolved over the years. World War I, however, represents a shift: it becomes a war correspondence, much more solemn and official. On August 15th, 1914, Paul d’Estournelles sent what is labelled as the first letter of this correspondence, and over the ten years until his death in 1924, 1500 carefully numbered letters were sent.
This numbering is precious when working on a corpus because it guarantees a well-preserved order that does not rely only on the dates of the letters, which are sometimes identical and can then create confusion. It will be encoded both in the metadata and in the body. Its place in the metadata is easy to settle because we decided to use it to name the letter in the <fileDesc> element. It also serves to name the XML file, and it appears in the <correspDesc>, where it gives the letter’s position in the <correspAction>. Positioning it in the body is more complex because I have to decide whether it can be considered part of the opener. While the title makes sense in the opener, the numbering seems less appropriate there, since it can be seen as a heading for the letter as part of a whole corpus. I will therefore use the <head> element, placed after the <pb> element (because the heading is located on the facsimile page) and before the <opener>.
<pb n="1" facs=" .jpg"/>
<head rend="center">LETTRE N°9 bis</head>
<opener>
…
</opener>
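For the metadata side of the numbering, a header fragment might look roughly like the sketch below. This is only an illustration under stated assumptions: the wording of the <title>, the use of @n on <correspAction> to carry the letter number, and the placeholder paragraphs are hypothetical choices, not the project's confirmed encoding. The names and the date come from the letter discussed here (letter 9bis of November 16th, 1914).

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <!-- Assumption: the letter number is used as the title of the file -->
      <title>Lettre n°9 bis</title>
    </titleStmt>
    <publicationStmt>
      <p><!-- publication details, not covered in this post --></p>
    </publicationStmt>
    <sourceDesc>
      <p><!-- in practice this would hold the msDesc of the letter --></p>
    </sourceDesc>
  </fileDesc>
  <profileDesc>
    <correspDesc>
      <!-- Assumption: @n records the position of the letter in the series -->
      <correspAction type="sent" n="9bis">
        <persName>Paul d’Estournelles de Constant</persName>
        <date when="1914-11-16"/>
      </correspAction>
      <correspAction type="received">
        <persName>Nicholas Murray Butler</persName>
      </correspAction>
    </correspDesc>
  </profileDesc>
</teiHeader>
```

Keeping the number in the header as well as in the <head> of the body means the letter can be sorted and retrieved without parsing the transcription itself.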
The letterhead
Paul d’Estournelles de Constant was a senator from 1904 to 1924. Consequently, the supplies at his disposal included letterhead paper, which he used for most of his correspondence. The letterhead only appears on the first page of the letter, but it still needs to be encoded. This issue was easy to fix, even though it involves an individual choice. The use of letterheads is not mentioned in the TEI Guidelines, but it is among the elements studied by the SIG Correspondence and addressed in Encoding Correspondence, in the article on pre-printed text [4]. The article suggests several ways to encode the letterhead, with their pros and cons, and I decided to use the one it prefers: “Using <layoutDesc>”.
This solution has two parts: one in the metadata and one in the body. In the metadata, the letterhead is described in the <layoutDesc> element, within the <physDesc> of the <msDesc>, and given an @xml:id:
<layoutDesc>
<layout xml:id="lh-senat">A letterhead is printed in the top left of the first page of the letter. The letterhead is constituted by "Sénat" written in uppercase and underlined. </layout>
</layoutDesc>
It is then referenced in the body, usually in the <opener>, because the encoding has to represent the layout of the text on the page and the letterhead is mostly placed between the dateline and the title. In the body, we use a <fw> (forme work) element because it is used to encode a “running head (e.g. a header, footer), catchword, or similar material appearing on the current page” [5] and it can be contained inside the <opener> block. The <fw> element has a @corresp attribute that links it to the metadata:
<opener>
...
<fw corresp="#lh-senat"><hi rend="underline">SÉNAT</hi></fw>
...
</opener>
The status
This element is the last peculiarity of Paul d’Estournelles de Constant’s style and also the rarest: some letters are marked with the inscription “Personnelle”. It does not appear on many letters, but it still needs to be encoded. First, its position on the letter makes it clear that it belongs in the <opener> block. Second, since it is not among the elements mentioned for the <opener>, it requires more research. No element in the TEI Guidelines seems to fit this particular information, and the choice is further restricted by the need for an element that is allowed inside the <opener>. I therefore decided to use the same technique as for the letterhead, the <fw> element, except that this time it is not linked to an element in the metadata.
<opener>
…
<fw><hi rend="underline">PERSONNELLE</hi></fw>
…
</opener>
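Putting the pieces discussed so far together, a complete opener might be arranged roughly as in the sketch below. It is an illustration rather than the final encoding: the letterhead is placed between the dateline and the title, and the title just before the salutation, as described above, but the position of the “Personnelle” mark relative to the other elements is an assumption, and the contents of the dateline, title and salutation are left as comments instead of being guessed.

```xml
<opener>
  <dateline><!-- place and date as written on the letter --></dateline>
  <!-- Pre-printed letterhead, linked to its description in the layoutDesc -->
  <fw corresp="#lh-senat"><hi rend="underline">SÉNAT</hi></fw>
  <!-- Assumption: the status mark is placed here; adjust to the page layout -->
  <fw><hi rend="underline">PERSONNELLE</hi></fw>
  <title rend="align(center)"><hi rend="underline"><!-- title text as transcribed --></hi></title>
  <salute><!-- salutation as written on the letter --></salute>
</opener>
```

Because both <fw> and <title> are allowed inside <opener>, the order of the elements can simply follow the order in which they appear on the page.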
The stamp
Finally, the last issue resulted from an action by the departmental archive when it received the collection. As a means of indicating that the collection now belongs to it, the archive put a stamp on almost every page of the letters in our corpus. The stamp was placed in a convenient spot, where it does not hinder the reading of the document, and with more or less ink, as we can see here. This element raises several questions: can it be considered part of the text? Does it have to be in the body? How can it be recorded in the metadata?
After a lot of thinking, I decided that, considering its presence on almost every page of a letter and the fact that it can’t be considered an addition to the text, I would not integrate it into the body but only into the metadata. I then had to find where to place it: my project managers suggested that it belongs in the <msDesc>, since it is part of the “manuscript”. Consequently, I recorded it in two elements of the <msDesc>: <handNote> and <additions>. The first describes the hand that put the stamp on the document (the archivist) and the second mentions and explains the stamp.
<handDesc>
...
<handNote xml:id="stamp" medium="black_ink" scope="minor" scribe="archivist" scribeRef="#ADSarthe"> Hand of the archivist that collected and stamped the letters.</handNote>
</handDesc>
<additions>
<p>A stamp has been put on almost every page of the correspondence by the institution in charge of the collection. On the stamp is written: <stamp resp="#stamp">Archives de la Sarthe <lb/>Propriété publique</stamp></p>
</additions>
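To show where these fragments sit in relation to one another, here is a rough skeleton of the <msDesc>, with the descriptive text replaced by comments. It is a sketch under assumptions: the <msIdentifier> is required by the TEI but is not discussed in this post, so the repository name given here is hypothetical, and the <layoutDesc> is nested in an <objectDesc>, which is where the TEI content model places it inside <physDesc>.

```xml
<msDesc>
  <msIdentifier>
    <!-- Assumption: repository name inferred from the stamp, not confirmed -->
    <repository>Archives départementales de la Sarthe</repository>
  </msIdentifier>
  <physDesc>
    <objectDesc>
      <layoutDesc>
        <layout xml:id="lh-senat"><!-- description of the letterhead --></layout>
      </layoutDesc>
    </objectDesc>
    <handDesc>
      <handNote xml:id="stamp" scribe="archivist"><!-- hand that stamped the letters --></handNote>
    </handDesc>
    <additions>
      <p><!-- explanation of the stamp, with its text in a stamp element --></p>
    </additions>
  </physDesc>
</msDesc>
```

Everything that concerns the physical object (letterhead, stamping hand, stamp) thus stays in one place in the header, while the body only carries what d’Estournelles himself put on the page.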
To conclude, encoding a document is likely to turn up elements that create difficulties, and the lack of documentation on how to handle some of them makes things even more complicated. However, it is possible to extrapolate from what we have at our disposal and find an adequate solution, even if it relies on common elements that are easy to use and to incorporate into an encoding.
In this post, I only discussed five issues that I frequently encountered while reading the corpus documents and that were concentrated at the beginning of the letter. It is likely that the transcription of my corpus will raise many more issues in other parts of the letters, and I will have to find ways to fix them. Nevertheless, these examples show that, with research, it is possible to find simple solutions.
[1] Peter Stadler, Marcel Illetschko and Sabine Seifert, “Towards a Model for Encoding Correspondence in the TEI: Developing and Implementing <correspDesc>”, Journal of the Text Encoding Initiative [Online], Issue 9 | September 2016 – December 2017, online since 24 September 2016, accessed 3 April 2020. URL: http://journals.openedition.org/jtei/1433
[2] Stefan Dumont, “correspSearch – Connecting Scholarly Editions of Letters”, Journal of the Text Encoding Initiative [Online], Issue 10 | December 2016 – July 2019, online since 14 February 2018, accessed 3 April 2020. URL: http://journals.openedition.org/jtei/1742
[3] TEI Guidelines, <opener>. URL: https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-opener.html
[4] Sabine Seifert and Nicolas Schenk, “Pre-printed parts: Letterheads and forms”, in: Encoding Correspondence. A Manual for Encoding Letters and Postcards in TEI-XML and DTABf, edited by Stefan Dumont, Susanne Haaf and Sabine Seifert, Berlin 2019. URL: https://encoding-correspondence.bbaw.de/v1/pre-printed-parts.html
[5] TEI Guidelines, <fw>. URL: https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-fw.html