Till Grallert
5 June 2015
The slides are based on those supplied by the various Digital Humanities Summer Schools at the University of Oxford under the Creative Commons Attribution license and have been adapted to the needs of the 2015 Introduction to TEI at DHSI.
Slides were produced using MultiMarkDown, Pandoc, Slidy JS, and the Snippet jQuery Syntax highlighter.
Increasingly, people want to produce not just ‘text’ editions but text editions with facing-page (or otherwise linked) facsimile images. Indeed, some people want to have only images and create an electronic facsimile (perhaps with a view to eventual later transcription). The <facsimile> element, a sibling of <teiHeader> and <text>, is provided to accommodate this desire.
<facsimile>
contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text.
<surface>
defines a written surface in terms of a rectangular coordinate space.
@start
points to an element which encodes the starting position of the text.
<zone>
defines a rectangular area contained within a <surface> element.
@facs
(facsimile) points directly to an image, or to a part of a facsimile element which corresponds with this element.
@facs
If a digital text contains one image per page or column (or similar unit), and no more complex mapping between text and image is envisaged, then the @facs attribute may be used to point directly to a graphic resource.
This image of the first page of a consular report is found at “../images/pro-fo/618-3/DSCN9874_150dpi.jpg”
first page of PRO FO 618/3, Damascus 30, 3 Aug 1908, Devey to Lowther
<pb n="1" facs="../images/pro-fo/618-3/DSCN9874_150dpi.jpg"/>
<div>
<head>Ex-Mushir Fuad P. released</head>
<dateline><date>August 3, 1908</date></dateline>
<dateline>Dft<lb/>
<persName>Sir Gerald A. Lowther</persName><lb/> K.C.M.G.,C.B., <lb/>
<placeName><choice>
<abbr>Cple</abbr>
<expan>Constantinople</expan>
</choice></placeName>
<lb/> No. <del>30</del>
<del>29</del> 30</dateline>
<dateline>42 Emb.</dateline>
<p>Sir,
<lb/>I have the honour to report to Y.E. that the General Amnesty granted by
<persName><choice>
<abbr>H.I.M.</abbr>
<expan>His Imperial Majesty</expan>
</choice> the Sultan</persName> to <del>all</del> Political
<del>criminals was after some hesitation taken to</del>
<add>exciles was after some hesitation taken to</add> include the Ex-Mushir
<persName>Fuad Pasha</persName> who has been imprisoned under the
strictest confinement for the last six and a half years
<!-- ... -->
</p>
</div>
@facs in conjunction with <facsimile>, <surface>, and <zone>
Using these attributes and elements together enables an editor to
<facsimile> element in the TEI document
The <facsimile> element is used to represent a digital facsimile. It appears within a TEI document along with, or instead of, the <text> element introduced earlier in this course. When this module is selected, a valid TEI document may thus comprise any of the following:
<facsimile>
<TEI>
<teiHeader>
<!-- ... -->
</teiHeader>
<facsimile>
<graphic xml:id="facs_1" url="../images/pro-fo/371-560/DSCN7960_150dpi.jpg"/>
<graphic xml:id="facs_2" url="../images/pro-fo/371-560/DSCN7961_150dpi.jpg"/>
<graphic xml:id="facs_3" url="../images/pro-fo/371-560/DSCN7962_150dpi.jpg"/>
</facsimile>
<text>
<!-- ... -->
</text>
</TEI>
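A document that holds images only, with no <text> element, is one of the permitted combinations. The following is a minimal sketch rather than a complete file: it reuses two image paths from the example above, leaves the header content elided, and assumes a schema that includes the transcr module.

```xml
<TEI>
  <teiHeader>
    <!-- ... -->
  </teiHeader>
  <!-- images only: no <text> element yet; a transcription can be added later -->
  <facsimile>
    <graphic xml:id="facs_1" url="../images/pro-fo/371-560/DSCN7960_150dpi.jpg"/>
    <graphic xml:id="facs_2" url="../images/pro-fo/371-560/DSCN7961_150dpi.jpg"/>
  </facsimile>
</TEI>
```

Starting this way keeps the @xml:id values stable, so a later transcription can point at the same images through @facs.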
<surface>
<TEI>
<teiHeader>
<!-- all the header elements -->
</teiHeader>
<facsimile>
<surface xml:id="facs_1">
<graphic url="../images/pro-fo/371-560/DSCN7960_150dpi.jpg"/>
<graphic url="../images/pro-fo/371-560/DSCN7960_300dpi.jpg"/>
</surface>
<surface xml:id="facs_2">
<graphic url="../images/pro-fo/371-560/DSCN7961_150dpi.jpg"/>
</surface>
<surface xml:id="facs_3">
<graphic url="../images/pro-fo/371-560/DSCN7962_150dpi.jpg"/>
</surface>
</facsimile>
<text>
<!-- the body of the document, i.e. digital text -->
</text>
</TEI>
The actual dimensions of the object represented are not documented by the surface element; instead, the surface is located within an abstract coordinate space, which is defined by the following attributes, supplied by the att.coordinated class:
@ulx
gives the x coordinate value for the upper left corner of a rectangular space.
@uly
gives the y coordinate value for the upper left corner of a rectangular space.
@lrx
gives the x coordinate value for the lower right corner of a rectangular space.
@lry
gives the y coordinate value for the lower right corner of a rectangular space.
<surface>
<facsimile>
<surface ulx="0" uly="0" lrx="820" lry="1182" xml:id="facs_1"/>
</facsimile>
Note: if no further unit of measurement is provided, pixels are presumed.
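One way to make the relation between the coordinate space and the image explicit is to record the image size on the <graphic> itself, using the @width and @height attributes. The sketch below is illustrative only: the file name is borrowed from the earlier example, and the pixel dimensions are an assumption chosen to match the coordinate space, not values measured from the real file.

```xml
<facsimile>
  <!-- the coordinate space is chosen to match the assumed pixel size of the image -->
  <surface ulx="0" uly="0" lrx="820" lry="1182" xml:id="facs_1">
    <!-- width and height are assumed values for illustration -->
    <graphic url="../images/pro-fo/371-560/DSCN7960_150dpi.jpg" width="820px" height="1182px"/>
  </surface>
</facsimile>
```

With matching values, a coordinate read off the image in an image editor can be copied into a <zone> without conversion.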
Kawkab America #55, 28 Apr 1893, p1. (English)
<zone> elements
The rectangular shapes from the previous slide can be modelled as <zone> elements on the larger surface of the page for analytical purposes:
<surface xml:id="facs_v2-i55_4">
<graphic url="../images/kawkab/dds-54634_Page_09_Image_0001_2R.tif" xml:id="facs_v2-i55_4_source"/>
<graphic url="../images/kawkab/dds-54634_Page_09_Image_0001_2R.jpg" xml:id="facs_v2-i55_4_web"/>
<zone ulx="71" uly="33" lrx="1344" lry="270" xml:id="facs_v2-i55_4_z1" facs="#facs_v2-i55_4_web"/>
<zone ulx="59" uly="253" lrx="326" lry="964" xml:id="facs_v2-i55_4_z2" facs="#facs_v2-i55_4_web"/>
<zone ulx="574" uly="39" lrx="804" lry="208" xml:id="facs_v2-i55_4_z3" facs="#facs_v2-i55_4_web"/>
<zone ulx="71" uly="33" lrx="1344" lry="270" xml:id="facs_v2-i55_4_z4" facs="#facs_v2-i55_4_web"/>
</surface>
Note: as there are two image files, each zone uses the @facs attribute to indicate the graphic on which it was drawn.
The <desc> element may be used within either <surface> or <zone> to provide some further information about the area being defined.
<surface xml:id="facs_v2-i55_4">
<desc>A printed page</desc>
<zone ulx="71" uly="33" lrx="1344" lry="270" xml:id="facs_v2-i55_4_z1" facs="#facs_v2-i55_4_web">
<desc>The issue's masthead</desc>
</zone>
</surface>
@xml:id and @facs
Give each <surface>, <graphic>, or <zone> an @xml:id identifier.
Use the @facs attribute to point from the transcription into the <facsimile>.
<facsimile> again
<surface xml:id="facs_v2-i55_4">
<graphic url="../images/kawkab/dds-54634_Page_09_Image_0001_2R.tif" xml:id="facs_v2-i55_4_source"/>
<graphic url="../images/kawkab/dds-54634_Page_09_Image_0001_2R.jpg" xml:id="facs_v2-i55_4_web"/>
<zone ulx="71" uly="33" lrx="1344" lry="270" xml:id="facs_v2-i55_4_z1" facs="#facs_v2-i55_4_web"/>
<zone ulx="59" uly="253" lrx="326" lry="964" xml:id="facs_v2-i55_4_z2" facs="#facs_v2-i55_4_web"/>
<zone ulx="574" uly="39" lrx="804" lry="208" xml:id="facs_v2-i55_4_z3" facs="#facs_v2-i55_4_web"/>
<zone ulx="71" uly="33" lrx="1344" lry="270" xml:id="facs_v2-i55_4_z4" facs="#facs_v2-i55_4_web"/>
</surface>
<text corresp="#v2-i55_ar" n="55en" type="issue" xml:id="v2-i55_en" xml:lang="en">
<pb facs="#facs_v2-i55_4" n="1" xml:id="pb_v2-i55_1en"/>
<front facs="#facs_v2-i55_4_z1" xml:id="v2-i55_en_front">
<!-- the masthead -->
<!-- the main body, which usually does not change between issues -->
<div style="border-bottom:double black">
<bibl xml:lang="en"><title level="j">Kawkab America</title><lb/>
<title level="j" type="sub">"The Star of America"</title></bibl>
<bibl rend="center" xml:lang="ar" facs="#facs_v2-i55_4_z3"><title level="j">كوكب اميركا</title><lb/>
<title level="j" type="sub">جريدة سياسية علمية تجارية ادبية</title></bibl>
</div>
<!-- the bottom line, commonly containing dating information -->
<div style="border-bottom:solid black" xml:lang="en" facs="#facs_v2-i55_4_z4">
<bibl xml:lang="en"><biblScope n="2" unit="volume">Vol. 2.</biblScope>
<biblScope n="55" unit="issue">No. 55</biblScope>
<cb/><pubPlace>New York</pubPlace>, <date when="1893-04-28">Friday, April 28, 1893</date>.</bibl>
</div>
</front>
<body>
<!-- the body of the issue -->
</body>
</text>
From <facsimile> to the transcription using @start
It is also possible to point in the other direction, from a <surface> or <zone> to the corresponding text. This is the function of the @start attribute, which supplies the identifier of the element containing the transcribed text found within the <surface> or <zone> concerned.
Consider this truncated text:
<text corresp="#v2-i55_ar" n="55en" type="issue" xml:id="v2-i55_en" xml:lang="en">
<pb facs="#facs_v2-i55_4" n="1" xml:id="pb_v2-i55_1en"/>
<front facs="#facs_v2-i55_4_z1" xml:id="v2-i55_en_front">
<!-- ... -->
</front>
<!-- ... -->
</text>
And this facsimile linking to it:
<surface start="#pb_v2-i55_1en" xml:id="facs_v2-i55_4">
<graphic url="../images/kawkab/dds-54634_Page_09_Image_0001_2R.jpg" xml:id="facs_v2-i55_4_web"/>
<zone start="#v2-i55_en_front" ulx="71" uly="33" lrx="1344" lry="270" xml:id="facs_v2-i55_4_z1" facs="#facs_v2-i55_4_web"/>
<!-- ... -->
</surface>
The linking and transcr modules provide a wide range of tools to let you describe relationships between parts of your text. If you use these techniques, remember:
Avoid @type attributes with undefined meanings everywhere.
@type
Despite <facsimile> being part of TEI P5 from the very beginning, there are almost no software implementations available that provide an easy and convenient (graphical) interface for drawing shapes on image files, recording them as zones, and linking them to the transcription. In consequence, most large projects have written their own software, while small projects use only the most basic function of linking one image per page to the <pb/> elements.
Equally, there is no out-of-the-box software to transform and display the encoded links between parts of the transcription and zones on the images for presentation purposes.
<zone> elements and the link to the text to a new TEI P5 file, using its own schema
<zone> elements and the link to the text to a new TEI P5 file, using SVG (Scalable Vector Graphics) and its own schema
<zone> elements
Literature: