Digital Arabic Periodical Editions: presentation at Dixit2

Till Grallert

2016-03-16

#freebassel

Bassel Khartabil / باسل خرطبيل

The journal al-Muqtabas between Shamela.ws, HathiTrust, and GitHub: producing open, collaborative, and fully-referenceable digital editions of early Arabic periodicals, with almost no funds

Project URL: https://www.github.com/tillgrallert/digital-muqtabas

Slides: https://tillgrallert.github.io/Slides/Dixit2

Twitter: @tillgrallert

Email:

1.1 Importance of mundane texts / periodicals

1.2 A two-fold problem

The consequence is a focus on “high” culture and canonical texts

1.3 State of digitisation

  1. Grey online libraries / “crowd”-sourced transcriptions, e.g. al-Maktaba al-Shāmila, Mishkāt, Ṣayd al-Fawāʾid, al-Waraq etc.
  2. Digital imagery, e.g. Endangered Archives Programme (EAP), HathiTrust

1.3.1 State of digitisation: text

Grey online libraries / “crowd”-sourced transcriptions, e.g. al-Maktaba al-Shāmila, Mishkāt, Ṣayd al-Fawāʾid, al-Waraq etc.

al-Muqtabas on al-Maktaba al-Shāmila

1.3.2 State of digitisation: images

Digital imagery, e.g. Endangered Archives Programme (EAP), HathiTrust

al-Muqtabas 6 on EAP

al-Muqtabas 6 on HathiTrust without US IP

al-Muqtabas 6 on HathiTrust with US IP

al-Muqtabas 6 on HathiTrust, state of OCR (only visible to US IPs)

2. Suggested solution: unite facsimile and transcription

  1. aims
    • validate the transcription against the facsimiles
    • improve the transcription with the help of the “crowd”
    • make everything citable for scholars, linkable for machines
    • provide the new edition with the broadest possible licence to facilitate access and re-use
  2. principles
    • re-purpose available and established tools, technologies, and material
    • preference for open and simple formats and tools
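One way to unite facsimile and transcription is to let every page break in the transcription point at the image of that page, which TEI supports through the facs attribute on pb and the facsimile, surface, and graphic elements. The following is a minimal sketch in Python, not the project's actual encoding: the sample document, its identifiers, and the example.org image URLs are invented for illustration, and page_images is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical sample: identifiers, page numbers, and image URLs are invented.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <facsimile>
    <surface xml:id="facs_1"><graphic url="https://example.org/images/page-1.jpg"/></surface>
    <surface xml:id="facs_2"><graphic url="https://example.org/images/page-2.jpg"/></surface>
  </facsimile>
  <text><body>
    <div>
      <pb n="1" facs="#facs_1"/>
      <p>transcribed text of the first page</p>
      <pb n="2" facs="#facs_2"/>
      <p>transcribed text of the second page</p>
    </div>
  </body></text>
</TEI>"""


def page_images(tei_xml: str) -> dict[str, str]:
    """Map each page number (pb/@n) to the URL of its facsimile image."""
    root = ET.fromstring(tei_xml)
    ns = {"tei": TEI_NS}
    # Index every surface by its xml:id so that pb/@facs pointers can be resolved.
    surfaces = {}
    for surface in root.iterfind(".//tei:surface", ns):
        graphic = surface.find("tei:graphic", ns)
        if graphic is not None:
            surfaces[surface.get(f"{{{XML_NS}}}id")] = graphic.get("url")
    # Follow each page break's pointer to the matching surface.
    pages = {}
    for pb in root.iterfind(".//tei:pb", ns):
        target = pb.get("facs", "").lstrip("#")
        if target in surfaces:
            pages[pb.get("n")] = surfaces[target]
    return pages


if __name__ == "__main__":
    for number, url in page_images(SAMPLE).items():
        print(number, url)
```

With such pointers in place, a reader or a script can move from any transcribed page to its image and check the text against the facsimile.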

3. Test case: digital Muqtabas

al-Muqtabas / المقتبس

  1. Basis: Generate and share a TEI edition of all 96 issues (c. 7000 pages) of Muḥammad Kurd ʿAlī’s Majallat al-Muqtabas with a CC BY-SA 4.0 licence
  2. Core feature: gradually improve the digital edition (text and mark-up)
  3. Sugar on top:
    • Static web-view (doesn’t require a permanent internet connection)
    • access through bibliographic metadata in a public Zotero group

Project scheme

3.1 Basis: Generate the TEI edition

3.2 Core feature: Continuous improvement

  1. Improvements depending on human labour (probably a “crowd”)
    • correct the transcription
    • add structural mark-up
    • add semantic mark-up
  2. Automatic improvements:
    • provide reliable bibliographic metadata based on the facsimile
    • mark-up of named entities (e.g. personal names, toponyms) with links to external reference files
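The automatic mark-up of personal names with links to a reference file could be sketched as follows. This is an illustration under stated assumptions rather than the project's own tooling: the AUTHORITY dictionary, the persons.xml file name, and the pers_1 identifier are hypothetical, and simple string matching stands in for whatever matching a real workflow would use.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical authority list: the name forms are real, the identifiers are invented.
AUTHORITY = {
    "Muḥammad Kurd ʿAlī": "persons.xml#pers_1",
    "Kurd ʿAlī": "persons.xml#pers_1",
}


def tag_person_names(text: str, authority: dict[str, str]) -> str:
    """Wrap every known name in a TEI persName element that points at the authority file."""
    if not authority:
        return escape(text)
    # Try longer name forms first so the full name wins over the short form.
    names = sorted(authority, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text so the result stays well-formed XML.
        parts.append(escape(text[last:match.start()]))
        ref = quoteattr(authority[match.group(0)])
        parts.append(f"<persName ref={ref}>{escape(match.group(0))}</persName>")
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(tag_person_names("edited by Muḥammad Kurd ʿAlī & others", AUTHORITY))
```

Because both name forms share one identifier, every tagged occurrence resolves to the same person record, which is what makes the mark-up linkable for machines.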

3.2 Core feature: how to contribute

3.3 Sugar on top: web-view

Display of al-Muqtabas 6(2)

3.3 Sugar on top: Zotero group

Zotero group “digital-muqtabas”: list view

Zotero group “digital-muqtabas”: item view

4. To do, ongoing work

5. Experiences: simple, fast, sustainable

Summary
