Digital Arabic Periodical Editions: presentation at Books in Motion
Till Grallert
2016-05-07
1. The journal of al-Muqtabas
al-Muqtabas / المقتبس
- “monthly” journal published by Muḥammad Kurd ʿAlī between 1906 and 1918/19 in Cairo and, from 1908 onwards, in Damascus.
- 9 volumes, 96 issues (at least 2 double issues), c. 7000 pages
- Muḥammad Kurd ʿAlī (1876-1952): Ottoman bureaucrat, journalist, president of the Syrian Academy of Sciences, minister of education.
- available at c. 30 libraries (North America, Europe, Middle East):
- original prints (mostly incomplete)
- some copies of a “gray” reprint
- a number of microfiche copies from a single source
1.1 Importance of mundane texts / periodicals
- They are at the core of various discourses
- Modernity / -ism at the end of empire
- Arabic renaissance
- Arab nationalism
- Islamic reform movement
- They form large corpora with an even distribution along a temporal axis (al-Muqtabas: 12 yrs, al-Manār: 43 yrs, al-Muqtaṭaf: 76 yrs)
- linguistic analysis
- historical semantics
- data sets for social history
1.2 A two-fold problem
- Preservation:
- Active destruction of cultural artifacts: iconoclasm, neoliberalism
- Neglect: fragile materiality
- Access:
- Absence / destruction of infrastructure / channels of knowledge transmission: lack of access to institutions, hardware, software, internet connections
- widely-dispersed collections
- technologies: absence of reliable OCR
- technical skills: lack of basic scripting skills
The consequence is a focus on “high” culture and canonical texts
2. Suggested solution: unite facsimile and transcription
- aims
- validate the transcription against the facsimiles
- improve the transcription with the help of the “crowd”
- make everything citable for scholars, linkable for machines
- provide the new edition with the broadest possible licence to facilitate access and re-use
- principles
- re-purpose available and established tools, technologies, and material
- preference for open and simple formats and tools
3. Test case: digital Muqtabas
- Basis:
- XML/TEI edition of all 96 issues (c. 7000 pages) of Muḥammad Kurd ʿAlī’s Majallat al-Muqtabas
- The text links to open-access digital facsimiles
- licenced as CC BY-SA 4.0
- Core feature:
- social digital edition: gradually improve text and mark-up
- Sugar on top:
- Static web-view (doesn’t require a permanent internet connection)
- bibliographic metadata for all issues and articles (MODS, BibTeX)
- access to bibliographic metadata through a public Zotero group
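The per-article BibTeX metadata mentioned above could be produced along the following lines. This is only a minimal sketch: the function name `bibtex_article`, the citation key, and all field values are hypothetical placeholders, not the project's actual conversion code or data.

```python
def bibtex_article(key, fields):
    """Render a dict of fields as a BibTeX @article entry."""
    lines = [f"@article{{{key},"]
    for name, value in fields.items():
        # Each field is written as: name = {value},
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical citation key and placeholder values, for illustration only.
entry = bibtex_article(
    "muqtabas-example",
    {
        "journal": "al-Muqtabas",
        "volume": "1",
        "number": "1",
        "title": "Placeholder article title",
    },
)
print(entry)
```

In practice the field values would be read from the TEI files rather than typed by hand, so that the metadata stays in sync with the edition.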
3.1 Basis: Generate the TEI edition
3.1 Basis: Is this legal?
Copyright depends on the jurisdiction of creators, distributors, etc.
- text of al-Muqtabas
- is in the public domain: transcription and imaging are legal.
- the transcribers do not / cannot claim copyright: copying is legal
- images of al-Muqtabas
- digital files are protected by copyright: use is subject to licence, linking is legal
- download and redistribution: almost certainly illegal
- digital edition of al-Muqtabas
3.2 Core feature: Continuous improvement
- Improvements depending on human labour (probably a “crowd”)
- correct the transcription
- add structural mark-up
- add semantic mark-up
- Automatic improvements:
- provide reliable bibliographic metadata based on the facsimile
- mark-up of named entities with links to external reference files (e.g. personal names, toponyms)
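Automatic mark-up of names with links to a reference file could be approached roughly as follows. This is a sketch under stated assumptions: the gazetteer (a mapping from name strings to reference URIs), the example URI, and the function `tag_names` are hypothetical, and exact string matching is a simplification that would not cope with spelling variants or Arabic morphology.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def tag_names(text, gazetteer, tag="persName"):
    """Wrap every gazetteer name found in plain text in a TEI element with @ref."""
    if not gazetteer:
        return escape(text)
    # Try longer names first so that a short name never pre-empts a longer one.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in names))
    out = []
    pos = 0
    for match in pattern.finditer(text):
        name = match.group()
        out.append(escape(text[pos:match.start()]))
        out.append(f"<{tag} ref={quoteattr(gazetteer[name])}>{escape(name)}</{tag}>")
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)


# Hypothetical gazetteer entry; the URI is a placeholder.
gazetteer = {"Kurd ʿAlī": "https://example.org/person/1"}
tagged = tag_names("Kurd ʿAlī & co", gazetteer)
print(tagged)
```

For toponyms the same function can be called with `tag="placeName"`. Output of this kind would still need human review before being committed to the edition.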
3.2 Core feature: how to contribute
3.3 Sugar on top: web-view
- Adaptation of TEI Boilerplate XSLT stylesheets
- human-readable and static web-view (either rawgit or gh-pages)
- generated on-the-fly by the user’s browser using XSLT to transform the TEI XML files.
- can be run without an internet connection and with local facsimiles.
- parallel display of text and facsimile
- simple changes to display different facsimiles
- link to metadata on the article level (MODS, BibTeX)
- the code is shared with a CC BY-SA 4.0 licence on GitHub
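The parallel display relies on each page break in the transcription pointing to a facsimile image. How such links can be resolved is sketched below in Python rather than XSLT. The encoding shown (`<pb>` with a `facs` attribute pointing to a `<surface>` that holds a `<graphic>`) is standard TEI, but that the edition encodes its links in exactly this way is an assumption, and the sample document and URLs are made up.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Made-up sample; the URLs are placeholders.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface xml:id="facs_1"><graphic url="https://example.org/img/p1.jpg"/></surface>
    <surface xml:id="facs_2"><graphic url="https://example.org/img/p2.jpg"/></surface>
  </facsimile>
  <text><body>
    <pb n="1" facs="#facs_1"/><p>first page</p>
    <pb n="2" facs="#facs_2"/><p>second page</p>
  </body></text>
</TEI>"""


def page_images(tei_xml):
    """Return (page number, image URL) pairs for every <pb> in document order."""
    root = ET.fromstring(tei_xml)
    # Map each surface's xml:id to the URL of its first graphic.
    surfaces = {}
    for surface in root.iterfind(".//tei:surface", TEI_NS):
        graphic = surface.find("tei:graphic", TEI_NS)
        if graphic is not None:
            surfaces[surface.get(XML_ID)] = graphic.get("url")
    pages = []
    for pb in root.iterfind(".//tei:pb", TEI_NS):
        target = pb.get("facs", "").lstrip("#")
        # The URL is None when the page break has no resolvable facsimile.
        pages.append((pb.get("n"), surfaces.get(target)))
    return pages


print(page_images(SAMPLE))
```

Swapping in a different set of facsimiles (for instance local copies) then only requires changing which `<graphic>` URL is picked for each surface.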
3.3 Sugar on top: Zotero group
3.4 Use cases: reviewed works
4. To do, ongoing work
- Editorial decisions: TEI schema design
- mark-up of some text features has not yet been decided
- Editorial work:
- mark-up of page breaks (1-2 h per issue)
- correcting transcriptions
- add non-Arabic words omitted by shamela.ws
- add footnotes
- correct publication dates for all issues
- Web-display:
- needs some polishing
- search functions beyond the Zotero group and individual issues (project can be searched on GitHub)
5. Experiences: simple, fast, sustainable
- Simple technologies and relatively little coding needed: initial set-up took less than four weeks of after-hours labour
- Hosting with GitHub is free
- Core (but simple) features cannot be automated:
- all c. 7000 page breaks must be manually tagged
- Code can be re-purposed:
- We set up the sister project Digital Ḥaqāʾiq as a digital edition of ʿAbd al-Qādir al-Iskandarānī’s monthly journal al-Ḥaqāʾiq (1910–12, Damascus) in a single day.
Summary
- open scholarly digital editions of [Majallat] al-Muqtabas and al-Ḥaqāʾiq providing
- TEI XML files (transcription and links to facsimiles)
- plain text files
- BibTeX files for every article
- customised version of TEI Boilerplate (XSLT and CSS) with stable URLs for every element
- within a framework (git and GitHub) that allows for
- collaborative, open, version-controlled improvements of the edition
- re-use of the text
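As an illustration of re-use, plain text can be derived from a TEI file with a few lines of code. The following is a minimal sketch, not the project's own conversion: the function `tei_to_plain_text` and the sample document are hypothetical, and the sketch assumes that paragraphs are encoded as non-nested `<p>` elements inside `<body>`.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Made-up sample document.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><div>
    <head>Title</head>
    <p>First <persName>name</persName> here.</p>
    <pb n="2"/>
    <p>Second
       paragraph.</p>
  </div></body></text>
</TEI>"""


def tei_to_plain_text(tei_xml):
    """Return the text of all <p> elements in <body>, one blank line between them."""
    root = ET.fromstring(tei_xml)
    body = root.find(f".//{TEI}body")
    if body is None:
        return ""
    paragraphs = []
    # Only <p> is collected here; headings and other elements are skipped.
    for p in body.iter(f"{TEI}p"):
        # Drop inline mark-up and collapse runs of whitespace.
        text = " ".join("".join(p.itertext()).split())
        if text:
            paragraphs.append(text)
    return "\n\n".join(paragraphs)


print(tei_to_plain_text(SAMPLE))
```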