Disclaimer 1: many of the ideas have been inspired by the work of Alex Gil (@elotroalex) and Dennis Tenen (@dennistenen)1 and discussions with the two of them and others at DHI Beirut, THATCamp Beirut, and DHSI.
Disclaimer 2: I am teaching short workshops on some of the ideas outlined in this post at Digital Humanities Institute - Beirut on 10 March 2017 and at DH Abu Dhabi on 10 April 2017. Basic slides are available here and here.
In the world of (academic) publishing, large aggregators and indexers have turned into and acquired publishing presses and generate obscene profits by charging the public (every tax-payer worldwide) multiple times over. First by charging the predominantly publicly-funded academic for publishing the results of her publicly-funded research and by enforcing a culture of pro bono labour among academic reviewers and editors; second by selling this content to equally predominantly publicly-funded libraries, which then increasingly demand access fees from members of the public, who want to access their collections; and third by offloading the cost of long-term preservation to, again, publicly-funded institutions. This system not only created a hierarchy of academics and institutions in the relatively well-off “West”—two classes divided by their ability to pay for being published and accessing publications (their own and others). It also increasingly prevents anybody outside western academia from accessing cutting-edge research and participating in intellectual discourse.
One of the opportunities afforded by the digital humanities and the stated goal of this endeavour is to remove the middlemen—be they technical or entrepreneurial—between authors, readers, and the library-cum-archive. There are two main obstacles to this aim:
We will not be able to change copyright legislation and the vested business interests in sustaining and expanding the regime of profit-generating copy and distribution rights in the foreseeable future, but we can all provide our knowledge under a Creative Commons licence. In order to do so, we need to reclaim the means of production. We argue that by doing so the main argument for restrictive copyright (namely, the provision of allegedly expensive services such as quality control and metadata curation) collapses. At the time of writing, the costs of printing and of globally distributing heavy and voluminous books are already negligible, as the main avenue of scholarly publication, the journal, has already moved to digital online publication.
The main principles in our effort to (re)claim the means of production are: accessibility, simplicity, sustainability, and credibility. They shall pertain both to the intellectual endeavour and to the tools employed.
Looking at the most broadly-employed software in academic contexts and beyond, Microsoft’s Word, a piece of bloated and expensive proprietary software, what are the functions we need to replace?
To avoid overly complex software and proprietary formats, form and content must be separated. Structural / semantic and representational information as well as metadata must be embedded in the text itself in order to be inseparable. We suggest using plain text with rudimentary markup following the conventions of Markdown and a short block of metadata written in YAML as the format of choice.
TXT), this string of letters happens to be human readable. Plain text has been with us since the early days of computing and TXT files could be viewed and edited with 1980s hard- and software. We can therefore assume that this basic format will remain accessible for the years to come.
Writing is a process and subject to change. We need to be able to try out different structures and formulations and, more often than one would like to, we discover that yesterday’s deletions would have been worth keeping. Not to mention an external editor or collaborators that quickly make any approach involving ever-longer file names futile (we all have our folders full of text.docx, text-new-version.docx, text-new-version-2004-01-01.docx, text-new-version-2004-01-01-comments-by-tg-2.docx etc.).
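To make the suggested format concrete, here is a minimal sketch in Python of how a plain-text file with a YAML metadata block and a Markdown body can be pulled apart again. The function name `split_front_matter`, the sample text, and its metadata values are made up for illustration, and the parser only understands flat `key: value` lines; a real workflow would rather hand the block to a proper YAML library or to Pandoc.

```python
def split_front_matter(text):
    """Split a plain-text document into (metadata dict, Markdown body).

    Assumption: the metadata block is delimited by lines containing only
    '---' (or '...' as closing line) and holds flat 'key: value' pairs.
    This is NOT a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no metadata block at the top of the file
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() in ("---", "..."):
            break
    else:
        return {}, text  # opening delimiter without a closing one
    metadata = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            metadata[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return metadata, body


# Hypothetical sample document: metadata and text live in one file.
SAMPLE = """---
title: "A sample essay"
author: Jane Doe
---

# Introduction

Plain text with *rudimentary* markup.
"""

metadata, body = split_front_matter(SAMPLE)
print(metadata)
print(body)
```

Because metadata and text live in the same file, neither can get lost when the file is copied, e-mailed, or put on a USB key.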
{»We need additional collaborative tools«}
While it would be absolutely sufficient to publish / distribute the plain-text files by means of a USB key, a graphical user interface (GUI) that translates the structural information into a formatted and aesthetically pleasing layout is often {==advisable==}{»better wording?«}—be they a printed paper copy or a website:
HTML, DOCX, and PDF. Pandoc, which is under active development by John MacFarlane, a professor of philosophy at UC Berkeley, also supports the formatting of references using BibTeX (another plain text format for storing structured bibliographic information) and CSL (citation style language).
HTML and CSS: HTML and CSS are well-established standards to separate form and content maintained by the World Wide Web Consortium (W3C). While the hyper text markup language (HTML) carries the content of our text as well as all the structural and formatting information and the metadata, cascading stylesheets (CSS) provide the actual layout. This combination allows us to provide different layouts for different contexts and devices using the same content.
HTML5 supports semantic tags and machine-readable metadata following various standards, such as those provided by Dublin Core (DC) or schema.org. If one includes Dublin Core metadata in the head of HTML files, aggregators, search engines, and reference managers can find and extract the structured information on author, title, publication date, keywords, etc.{»mention copyright«}
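As an illustration of the Dublin Core point, the following Python sketch turns a small metadata dictionary into the `meta` elements one would place in the head of an HTML file. It follows the common `DC.` prefix convention with a `schema.DC` link; the function name `dublin_core_head` and the sample values are assumptions for the sake of the example, not part of any particular toolchain.

```python
from html import escape


def dublin_core_head(metadata):
    """Return HTML <meta> lines for a dict of Dublin Core elements.

    Keys are DC element names (e.g. 'title', 'creator'); values are a
    string or a list of strings (repeated elements such as 'subject').
    """
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, values in metadata.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append(
                '<meta name="DC.{}" content="{}">'.format(
                    element, escape(value, quote=True)
                )
            )
    return "\n".join(lines)


# Hypothetical sample metadata for a single essay.
sample = {
    "title": 'A "sample" essay',
    "creator": "Jane Doe",
    "subject": ["digital humanities", "publishing"],
}
print(dublin_core_head(sample))
```

The same dictionary could come straight from the YAML block at the top of the plain-text source, so the metadata is written once and travels with the text.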
A licence is a formal agreement that specifies the rights and duties of both the licensor (e.g. us as authors) and the licensee (e.g. us as readers). Its most important purpose within our discussion is to assure the readers of our texts of their rights to read, copy, and cite them. An open licence might for instance allow reproduction of the text but might prohibit charging for accessing the reproduction.
Formulating one’s own licence text is a challenge and one might not be familiar enough with the necessary “legalese” to write a text readers can rely on. In consequence, we suggest looking at established licences and, having made a case for open access {»open science, open knowledge etc.«}, starting with Creative Commons licences.5
Websites of more than a single page of text tend to be technically complex and require an infrastructure of data storage, internet connections, web addresses, some content management system (CMS), and databases that rarely come for free and without the need of maintenance. In addition, contemporary dynamic websites are almost impossible to archive or download in their entirety. To reduce complexity and the number of technologies, we suggest using static, self-contained websites without a content management system or a database.
HTML and CSS and nested files and folders. These can either be hosted online or distributed on a USB key.
How to deal with sensitive data / material, that should not be publicly accessible, such as ethnographic field notes?
{»add relevant literature«}
Tenen, Dennis and Grant Wythoff. “Sustainable Authorship in Plain Text Using Pandoc and Markdown.” ↩
Markdown really is only a convention and John Gruber’s (and Aaron Swartz’ [yes, the Aaron Swartz]) canonical description of the Markdown syntax is at least partially ambiguous and lacks some core functionality for academic writing, such as support for tables and footnotes. In consequence, a plethora of formats (MultiMarkdown, GitHub flavored markdown, etc.) and software implementations inspired by and based on Markdown have proliferated, which makes the actual rendering of Markdown in HTML rather unpredictable beyond the core functionality. In recent years, a group of people involving John MacFarlane, professor of philosophy at UC Berkeley and author of Pandoc, proposed and developed a more rigid standard, which they call CommonMark. ↩
There is a great list of open source licences, including links to their full texts, at opensource.org ↩
As always, I have made the slides available on GitHub.
I am currently preparing my thesis for publication and, in the process of revision, I am again turning to Ottoman legal texts and their translations. Today I want to come briefly back to a question I have extensively dealt with in my thesis: the difficulty of dating printed sources from the late Ottoman Bilād al-Shām. Consider the following image of the imprint for the second volume of Nawfal Niʿmat Allah Nawfal’s translation of Ottoman laws edited by Khalīl Khūrī and published by al-Maṭbaʿa al-Adabiyya in Beirut1.
The date of publication is clearly stated as the year 1301. The calendar of this dating could either be Muslim 1301 (hijrī), which would translate to 1883/84 Gregorian, or Ottoman 1301 (mālī), which began on 13 March 1885. So far, so common, and without further ado (and, I strongly suspect, without further thought) the world’s libraries catalogued the book as having been published in 1883. The content of the book as well as the publishing house (al-Maṭbaʿa al-Adabiyya was a venture by Khalīl Sarkīs2, the Greek Orthodox owner of Beirut’s most successful periodical and only daily newspaper Lisān al-Ḥāl) raise the probability of mālī reckoning.
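For readers who want to check the hijrī side of this arithmetic themselves, here is a rough sketch in Python. It uses the tabular (arithmetical) Islamic calendar, which is an approximation: dates actually observed in the nineteenth century can differ by a day or two. The function names are my own and the mālī calendar is deliberately left out.

```python
from datetime import date, timedelta
from math import ceil


def tabular_hijri_to_gregorian(year, month=1, day=1):
    """Convert a date of the tabular Islamic calendar to a Gregorian date.

    Assumption: civil epoch (1 Muharram 1 aH = 19 July 622 in the
    proleptic Gregorian calendar) and the common 30-year leap cycle.
    """
    days = (
        day
        + ceil(29.5 * (month - 1))   # days in the preceding months
        + (year - 1) * 354           # ordinary years of 354 days
        + (3 + 11 * year) // 30      # leap days accumulated so far
    )
    # 227014 is the offset between this day count and Python's ordinals.
    return date.fromordinal(days + 227014)


def hijri_year_span(year):
    """Return the first and last Gregorian day of a tabular hijrī year."""
    start = tabular_hijri_to_gregorian(year)
    end = tabular_hijri_to_gregorian(year + 1) - timedelta(days=1)
    return start, end


start, end = hijri_year_span(1301)
print(start.isoformat(), end.isoformat())
```

Under this reckoning hijrī 1301 runs from 2 November 1883 to 20 October 1884, which is why a catalogue entry of "1883" silently assumes both the hijrī calendar and a printing within the first two months of that year.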
Then I came across announcements of its publication in the Beiruti press. Both Lisān al-Ḥāl and Thamarāt al-Funūn ran adverts for the new publication on their front pages in May 1887.3 Against the backdrop of the book’s publisher printing announcements in his own newspaper it now seemed likely that the second volume had inherited the date of publication from the first volume and was indeed printed only in 1887.
Unlike the first volume, scans of the second cannot be found online, but I was lucky to locate a copy at the library of the American University of Beirut. To my surprise the volume carried an ownership stamp on its last page:
It reads in French and Arabic:
Librairie Universelle 1883 Beyrout
li-l-maktaba al-jāmiʿa li-Khalīl al-Khūrī 1883 Bayrūt
The stamp seemingly indicates that the copy at AUB once belonged to the editor of the book, Khalīl al-Khūrī, himself. The stamp also records a Gregorian date: 1883. If this was the date of acquisition, the stamp could prove that the volume was indeed published in 1301 hijrī. Going through my research notes, however, it appeared that the Librairie Universelle was a publishing press and bookstore rather than a library run by the brothers Amīn and Khalīl al-Khūrī in Beirut. It is unclear when they had established the printing press, but at least by 1887 they had adopted the more common Arabic term for a publishing press: al-maṭbaʿa al-jāmiʿ.4 But why would a bookstore stamp its merchandise?
For the moment this question must remain as open as the opening date of the endeavour.
I was lucky to have a one-month trial access to Gale’s new “Early Arabic Printed Books from the British Library” platform and eagerly browsed and searched for books from Beirut and Damascus. They hold a number of Niqūlā Efendi Naqqāsh’s translations of Ottoman laws. One of them, a translation of Orhan Vahan Efendi’s comment on the Commercial Code,5 carried another ownership stamp:
Librairie générale
A. Sader Beyrouth
al-maktaba al-ʿumūmiyya
li-Ibrāhīm Ṣādir Bayrūt
But this time we are better informed about the publisher al-Maktaba al-ʿUmūmiyya. This publishing house and bookstore was set up by Ibrāhīm Efendi Ṣādir in 1863. Soon the company also included his sons and operated under “Ibrāhim Ṣādir wa-Awlāduhu”. As “Sader” the company is still active and the leading publisher of Lebanese legal compendia. It seems that most of its operations have shifted online these days.
In May this year, I participated in a conference titled “Books in Motion” in Beirut and had a chance to finally meet Hala Auji and listen to her talk on “Visual Translations: The Shifting Material Dimensions of 19th-Century Printed Editions of Arabic Classics” — which was based on research conducted for her recently published book.6 Her talk focussed on the gradual shift from manuscript to print culture between Cairo, Beirut, and India and its visual aspects. Talking about the continuous popularity of al-Mutanabbī’s Dīwān and the plethora of editions published during the late 19th century, she projected the image of an edition held at Harvard’s Widener Library. According to Hala Auji this Dīwān was printed in Calcutta but the frontispiece carries the same stamp of Ibrāhīm Ṣādir’s al-Maktaba al-ʿUmūmiyya:
It is not entirely clear how Auji arrived at the conclusion that this edition was printed in Calcutta. Comparing the frontispieces it seems that a copy at University of Michigan, freely available through HathiTrust, is indeed the same edition. Its final page (292) states that Shaykh ʿUmar al-Rāfiʿī confirms the veracity of this print edition that was completed in 1283 aH [1866]. According to Ilyān Sarkīs’ union catalogue of Arabic printed works (Muʿjam al-maṭbūʿat al-ʿarabiyya wa-l-muʿarraba) this edition of 292 pages was printed on a lithographic printing press in Cairo.7
Nawfal, Nawfal Efendi Niʿmat Allāh. Al-dustūr: Tarjamahu min al-lughat al-turkiyya ilā al-ʿarabiyya Nawfal Niʿmat Allāh Nawfal bāshkātib kamāruk ʿArabistān sābiqan; bi-murājaʿa wa tadqīq Khalīl al-Khūrī mudīr maṭbūʿāt Wilāyat Sūriyya. Edited by Khalīl Efendi al-Khūrī. Vol.2. Bayrūt: al-Maṭbaʿa al-adabiyya, 1301. ↩
cf. MWT Salname Suriye 13 1298 aH [Dec. 1880]:247, UBTüb Salname Suriye 17 1302 aH [Oct. 1884]:250. ↩
Lisān al-Ḥāl 26 May 1887 (#959):1, Thamarāt al-Funūn 30 May 1887 (#633):1 advertised the book at a price of 2 mecidiye or Ps 40. ↩
e.g. Lisān al-Ḥāl 13 Oct. 1887 (#999):4. ↩
Vāḥān, Ohan. Sharḥ Qānūn al-Tijārah. Translated by Niqūlā Efendi Naqqāsh. Bayrūt: al-Maṭbaʿa al-ʿUmūmiyya, 1880. ↩
Auji, Hala. Printing Arab Modernity: Book Culture and the American Press in Nineteenth-Century Beirut. Leiden: Brill, 2016. ↩
Sarkīs, Yūsuf Ilyān. Muʿjam al-maṭbūʿat al-ʿArabiyya wa-l-muʿarraba: wa-huwa shāmil li-asmāʾ al-kutub al-maṭbūʿa fī al-aqtar al-sharqiyya wa-l-gharbiyya, maʿa dhikr asmāʾ muʾallifiha wa-lumʿa min tarjamātihim; wa-dhalik min yawm ẓuhūr al-ṭabaʿa ilā nihāyat al-sanat al-Hijriyya 1339 al-muwāfiqa li-sanat 1919 milādiyya. 2 vols. Vol.2. Miṣr: Maṭbaʿat Sarkīs, 1928; p.1616. ↩
We are not allowed to share digital galley proofs of our essays, but I will make the text available here in the near future. The maps accompanying my essay and the code are already available on GitHub.
Muhanna, Elias (ed.). Digital Humanities and Islamic & Middle East Studies. Boston, Berlin: De Gruyter, 2016; Grallert, Till. “Mapping Ottoman Damascus Through News Reports: A Practical Approach.” In Digital Humanities and Islamic & Middle East Studies. Edited by Elias Muhanna. Boston, Berlin: De Gruyter, 2016: 175–98. ↩
My paper was titled: “Majallat al-Muqtabas between gray online libraries, large-scale scanning efforts, and programming tools: producing fully open, collaborative, and scholarly editions of early Arabic periodicals” and you can find the abstract below. It was part of a panel on digital remediation of the book on Saturday, 7 May, which I shared with David Wrisley (AUB) and Torsten Wollina (OIB). Torsten spoke on the challenges posed by the current state of digitization of books, manuscripts, and catalogues to researchers of the Islamicate world. David presented the fascinating results of a course he taught on mapping Beirut’s publishing industry. The abstract to his paper is online as is the project website.
As always, I have made the slides available on GitHub.
Moving from the material to the seemingly immaterial, digitisation offers remedies for some of the Middle East’s most pressing issues when it comes to books as texts and cultural artifacts: protection, discovery, and access—particularly in times of war and iconoclasm, borders (between territories, linguistic communities, classes etc.), and highly dispersed audiences and artifacts. Yet, digitisation and the infrastructure to deliver digital artifacts is expensive and thus we have not a single scholarly digital edition of early Arabic printed books or periodicals—despite their importance for the history of the nahḍa, Arab political nationalism, and the Islamic reform movement; and despite the apparent promises for new methodological approaches to the book.
Some of the largest scanning projects, such as HathiTrust, the Endangered Archives Programme (EAP), or MenaDoc, produce digital facsimiles for tens of thousands of Arabic books; but facsimiles cannot be searched and reliable OCR of Arabic script is not even available to Google. Gray online-libraries of Arabic literature, namely shamela.ws, provide access to a vast body of transcriptions of unknown provenance, editorial principles, and quality; but the transcriptions can be neither trusted nor referenced.
With the open digital edition of Muḥammad Kurd ʿAlī’s Majallat al-Muqtabas (1906–18) we want to show that through re-purposing well-established open software and by bridging the gap between immensely popular but non-academic online-libraries of volunteers and academic scanning efforts as well as editorial expertise, one can produce scholarly editions that remedy the short-comings of either world with very small funds: We use digital texts from shamela.ws, transform them into TEI XML—the quasi-standard for digital scholarly editions—add light structural mark-up, bibliographic meta-data, and link each page to facsimiles provided through EAP and HathiTrust. The digital edition (TEI XML and a basic web display) is then hosted as a public GitHub repository with a Creative Commons BY-SA 4.0 licence. Improvements can be crowd-sourced with clear attribution of authorship and version control using GitHub’s core functionality. Editions are referencable down to the word level for scholarly citations, annotation layers, and web-applications through a documented URI scheme. The web-display can be downloaded and run locally without an internet connection—a necessity for societies outside the global North, which again transforms the book into a highly mobile cultural artifact to be shared among intellectual networks across borders.
They do not provide a digital, machine-readable text.
The focus is on cultural and scientific journals of the 20th century but they also have some journals of the late 19th and early 20th centuries, among them:
As one would imagine, I was excited to see a seemingly complete scan of al-Muqtabas among the journals hosted by archive.sakhrit. I am currently working on a digital scholarly and collaborative edition of this journal (see the project’s GitHub repository and blog)1 and only found accessible scans of volumes 1 to 8. Thus, the prospect of an additional and potentially complete scan, including volume 9, was exciting. But after my initial enthusiasm, I was in for a serious disappointment.
As with other gray libraries, such as al-Maktaba al-Shāmila (shamela.ws), archive.sakhrit is quiet about the personnel or company behind it. It remains unclear where the originals came from, who scanned them, who transcribed the heads, authors, and page numbers seemingly available for every article. The rather illegal / gray nature of the endeavour becomes clear from the shift from a .com to a .co domain (country code top-level domain for Colombia) documented by the watermark in the imagery that still refers to the http://Archivebeta.Sakhrit.com domain.
I have assessed the quality of their “scans” of al-Muqtabas. Some volumes/ issues have been scanned from the original or a facsimile edition. Others, such as at least volumes 4 and 5, were indeed rendered from a modern digital text, namely shamela’s transcription. This is supported by the strikingly similar absence of all footnotes and non-Arabic script; a modern interpunction not present in the original; paragraph breaks that mirror shamela’s transcription; and the ellipsis between the two sections of a bayt, as provided by shamela (e.g. archive.sakhrit and shamela). The final evidence to prove this argument is that an uncommented gap of almost three pages in shamela’s transcription of volume 5(7) is reproduced in archive.sakhrit’s supposed facsimiles (compare the issue on digital-muqtabas, shamela, and archive.sakhrit).
Another, rather common, problem is archive.sakhrit’s bibliographic metadata on both the article and the issue level. The first is obviously compromised by the reference to image renderings of shamela’s transcription, whose pagination does not correspond to the printed original. In addition, the tables of content provide only an eclectic selection of articles and sections and many articles are mis-attributed (for an example compare the MODS file for our digital edition of Muqtabas 4(1) with archive.sakhrit’s fihris of the same issue). The second issue relates to the publication dates. For al-Muqtabas, archive.sakhrit assumes a publication schedule in which volumes correspond to Gregorian years and issues correspond to Gregorian months (i.e. according to archive.sakhrit Muqtabas 1(1) was published on 1 January 1906). This is despite the fact that al-Muqtabas clearly states its publication schedule on the front page of every volume as adhering to the hijrī calendar for both volumes and issues (e.g. archive.sakhrit’s facsimile of this issue’s first page). As a consequence, bibliographic data obtained from archive.sakhrit cannot be considered reliable in any sense.
Therefore, archive.sakhrit is even more problematic than shamela in terms of scholarly use.2 The user is always aware of reading a derivative with an unknown relation to an assumed original while accessing a text from shamela. At archive.sakhrit, on the other hand, the user is deceived by a seemingly faithful representation of a fake original.
In addition to the user interface of the website, which has a severe 1990s look and feel but otherwise seems to be fully functional, the collection can be accessed in a number of ways that would ease automated access for other applications.
http://archive.sakhrit.co/newPreview.aspx?ISSUEID=5649
ISSUEID is a single variable across the entire website and not specific to any one single journal.
ISSUEIDs.
http://archive.sakhrit.co/MagazinePages/Magazine_JPG/AL_moqtabs/AL_moqtabs_1906/Issue_1/001.JPG
http://archive.sakhrit.co/MagazinePages/Magazine_JPG/AL_moqtabs/AL_moqtabs_1906/Issue_11/553.JPG
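The two image addresses above suggest a regular pattern that could be generated programmatically. The following Python sketch is inferred from nothing more than these two examples: the function name `image_url`, the three-digit zero padding, and the reading of the path segments as journal, year, issue, and page are assumptions and may well fail for other journals or years on the site.

```python
# Base path copied from the two example URLs above.
BASE = "http://archive.sakhrit.co/MagazinePages/Magazine_JPG"


def image_url(journal, year, issue, page):
    """Build the address of a single page image.

    Assumed pattern: BASE/<journal>/<journal>_<year>/Issue_<issue>/<page>.JPG
    with the page number zero-padded to three digits.
    """
    return f"{BASE}/{journal}/{journal}_{year}/Issue_{issue}/{page:03d}.JPG"


# Reproduce the two addresses quoted above.
print(image_url("AL_moqtabs", 1906, 1, 1))
print(image_url("AL_moqtabs", 1906, 11, 553))
```

Note that the year in the path is archive.sakhrit's assumed Gregorian year of publication, not the hijrī date printed in the journal itself.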
I discovered that, contrary to my expectations, libraries around the world hold numerous unmarked editions and print-runs of tertib-i evvel of Düstur. Copies vary in pagination, spelling, and content. Yet, neither the people I asked nor the scholarly works citing copies of Düstur, seem to be aware of significant differences between copies of the same volume—which is not too unexpected an outcome when one considers that most scholars would not consult more than a single copy of every work at a single library; and once you read and / or copied a work, you would not consult another copy at another library with the explicit purpose of comparing the two for dissimilarities. I had also noted that an 1891 index to the first series of Düstur1 does not contain any information on divergent print-runs and editions.
In consequence, it is almost impossible to confirm references found in scholarly literature. Over the past years I had come to consider the many seemingly wrong references provided, for instance, by the two foremost contemporaneous French translations of Ottoman laws by Aristarchi2 and Young3 as, well, erroneous references caused by careless printers, copy-editors, even the translators themselves. But as it stands, they could have just used a different copy than the one available to me.
To illustrate the issue, I had quickly built a simple website providing imagery for different versions of the table of contents of the first, third, and fourth volume of tertib-i evvel of Düstur.
As I could not readily find any concordance or works dealing with this issue (which by the way also pertains to nineteenth-century Arabic monthly journals), I wondered whether anybody on various mailing lists could point me to relevant information. Düstur was the official collection of Ottoman legal texts at the time and the differences between the various print-runs had potentially grave consequences. Yet, to my surprise (again) almost nobody in the scholarly community of Ottomanists seemed to be aware of these puzzling divergences and no reply had any answers to offer.
In late 2012, I had access to what I thought were four different copies of volume 1 of tertib-i evvel of Düstur. Two were held at the Hakkı Tarık Us Collection (HTU) at the Beyazıt Devlet Kütüphanesi[5], and one each at the University of California, Berkeley and the School of Oriental and African Studies (SOAS), London. I grouped these four copies into two editions, based on the substantial differences in both layout and content. I further subdivided the second edition into three print-runs, which differ in spelling and printing errors (for the lack of a better term). Safa Saraçoğlu of Bloomsburg University, PA, solved at least part of the riddle in early 2013. He pointed out via email that, before Düstur became a series from 1872 onwards, three independent volumes were published under the same title of Düstūr in 1851[6], 1863[7], and 1866[8]. Hence, what I thought of as the first edition of the first volume of the first series (tertib-i evvel) of Düstur turned out to be the 1863 volume[7]. Nevertheless, the issue remains that there are at least three print-runs of the first volume that differ in spelling and printing errors. Two of them are available online: one at HTU and the UC Berkeley copy through HathiTrust (if one has an American IP, that is). A fourth copy is available through the digital collections of Türk Büyük Millet Meclisi Kütüphanesi Açık Erişim Koleksiyonu (TBMM) but I have not yet checked it against the other three.
I currently have three digital copies of the second volume of tertib-i evvel of Düstur: from HTU, UC Berkeley, and TBMM. In addition, I have seen the physical copy at SOAS. The difference in the shape of the number 6 (or rather “٦”) and the different font width in the UC Berkeley copy indicate two independently type-set print-runs. Pagination is seemingly identical.
There are at least two editions or print-runs of the third volume of tertib-i evvel of Düstur that differ in spelling and a marginally different page layout. The first can be found at SOAS and HTU and the second at UC Berkeley and TBMM.
At a workshop on Ottoman municipalities at the İstanbul Şehir Üniversitesi in November 2015, I finally met Safa Saraçoğlu in person. We had long and interesting discussions on Ottoman legal history, digitisation efforts, and translations of Ottoman legislation into the various languages of the empire. We also shamelessly shared our private copies of Ottoman texts, among them Mīltiyādī Ḳārāvokīros’s Ottoman legal dictionary. To my surprise, the entry on expropriation of real-estate (istimlāk) references two print-runs of the fourth volume of the first series of Düstur with different paginations.12 According to this entry, among the copies I have seen, the one at SOAS would have originated in the first print-run, while those at Staatsbibliothek zu Berlin (SBB) and TBMM are part of the second print-run.
Ḳaraḳoç, Sarkiz. Miftāḥ-i Ḳavānīn-i ʿOŝmāniye. Der-i Saʿādet: Maḥmūd Bey Maṭbaʿası, 1309 aH [1891]. ↩
Aristarchēs, Grēgorios Bey. Législation Ottomane, Ou Recueil des Lois, Réglements, Ordonnances, Traités, Capitulations et Autres Documents Officiels de L’Empire Ottoman. Edited by Dēmētrios Nikolaides. Vol.1-7. Constantinople: Imprimerie Frères Nicolaides / Bureau du Journal Thraky, 1873–88. ↩
Young, George. Corps de Droit Ottoman: Recueil des Codes, Lois, Règlements, Ordonnances et Actes les Plus Importants du Droit Intérieur, et D’Études Sur le Droit Coutumier de L’Empire Ottoman. Vol.I-VII. Oxford: Clarendon Press, 1905–06. ↩
N.N. Düstur: Kavanin Ve Nizamat Ve Muahedat Ile Umuma Ait Mukavelat Ve Iradat-i Seniyeyi Muhtevidir. Vol.1 Tertip I. Der-i Saʿādet: Maṭbaʿa-yi ʿĀmire, 1289 aH [1872]. ↩
Large parts of this collection were digitised in a cooperation with the Tokyo University of Foreign Studies. ↩
N.N. Düstūr. Vol.[1]. [Der-i Saʿādet]: Taḳvīmḫāne-yi ʿĀmire, 15 Rab II 1267 aH [17 Feb. 1851]. ↩
N.N. Düstur: Ḳavānīn Ve Niẓāmātıñ Münderic Olduġu Mecmūʿa. Vol.[2]. Der-i Saʿādet: Maṭbaʿa-yi ʿĀmire, Shaʿ 1279 aH [Feb. 1863]. This volume is available online from the Hakkı Tarık Us Collection, where it is wrongly catalogued as volume four of the first series. ↩ ↩2
N.N. Düstur: Ḳavānīn Ve Niẓāmātıñ Münderic Olduġu Mecmūʿa. Vol.[3]. Der-i Saʿādet: Maṭbaʿa-yi ʿĀmire, 1866. ↩
N.N. Düstur: Kavanin Ve Nizamat Ve Muahedat Ile Umuma Ait Mukavelat Ve Iradat-i Seniyeyi Muhtevidir. Vol.2 Tertip I. Der-i Saʿādet: Maṭbaʿa-yi ʿĀmire, 1289 aH [1872]. ↩
N.N. Düstur: Kavanin Ve Nizamat Ve Muahedat Ile Umuma Ait Mukavelat Ve Iradat-i Seniyeyi Muhtevidir. Vol.3 Tertip I. Der-i Saʿādet: Maṭbaʿa-yi ʿĀmire, 1289 aH [1876]. ↩
N.N. Düstur: Kavanin Ve Nizamat Ve Muahedat Ile Umuma Ait Mukavelat Ve Iradat-i Seniyeyi Muhtevidir. Vol.4 Tertip I. Der-i Saʿādet: Maṭbaʿa-yi ʿĀmire, 1295 aH [1879]. ↩
Ḳārāvokīros, Mīltiyādī. Lüġat Ḳavānīn-i ʿOŝmāniye. Istānbūl: “A. Aṣādūriyān” Şereket-i Mürettebe Maṭbaʿası, 1310 R [1894/95], p.79 ↩
In the context of the current onslaught that cultural artifacts in the Middle East face from the iconoclasts of the Islamic State, from the institutional neglect of states and elites, and from poverty and war, digital preservation efforts promise some relief as well as potential counter narratives. They might also be the only recourse for future education and rebuilding efforts once the wars in Syria, Iraq or Yemen come to an end.
Early Arabic periodicals, such as Butrus al-Bustānī’s al-Jinān (Beirut, 1876–86), Yaʿqūb Ṣarrūf, Fāris Nimr, and Shāhīn Makāriyūs’ al-Muqtaṭaf (Beirut and Cairo, 1876–1952), Muḥammad Kurd ʿAlī’s al-Muqtabas (Cairo and Damascus, 1906–16) or Rashīd Riḍā’s al-Manār (Cairo, 1898–1941) are at the core of the Arabic renaissance (al-nahḍa), Arab nationalism, and the Islamic reform movement. Due to the state of Arabic OCR and the particular difficulties of low-quality fonts, inks, and paper employed at the turn of the twentieth century, they can only be digitised by human transcription. Yet despite their cultural significance and unlike for valuable manuscripts and high-brow literature, funds for transcribing the tens to hundreds of thousands of pages of an average mundane periodical are simply not available. Consequently, we still have not a single digital scholarly edition of any of these journals. But some of the best-funded scanning projects, such as HathiTrust, produced digital imagery of numerous Arabic periodicals, while gray online-libraries of Arabic literature, namely shamela.ws, provide access to a vast body of Arabic texts including transcriptions of unknown provenance, editorial principles, and quality for some of the mentioned periodicals. In addition, these gray “editions” lack information linking the digital representation to material originals, namely bibliographic meta-data and page breaks, which makes them almost impossible to employ for scholarly research.
With the GitHub-hosted TEI edition of Majallat al-Muqtabas we want to show that through re-purposing available and well-established open software and by bridging the gap between immensely popular, but non-academic (and, at least under US copyright laws, occasionally illegal) online libraries of volunteers and academic scanning efforts as well as editorial expertise, one can produce scholarly editions that remedy the short-comings of either world with very small funds: We use digital texts from shamela.ws, transform them into TEI XML, add light structural mark-up for articles, sections, authors, and bibliographic metadata, and link them to facsimiles provided through the British Library’s “Endangered Archives Programme” and HathiTrust (in the process of which we also make first corrections to the transcription). The digital edition (TEI XML and a basic web display) is then hosted as a GitHub repository with a CC BY-SA 4.0 licence.
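To give an idea of what this workflow amounts to, here is a small Python sketch that wraps a transcription in light TEI mark-up and links each page to a facsimile. It is an illustration only: our actual code is mostly XSLT, and the function name `build_article`, the sample text, and the example.org addresses are placeholders. The element and attribute names (`div`, `head`, `pb`, `p`, `facs`) come from TEI P5, but the project's own schema may impose further rules.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialise TEI as the default namespace


def tei(tag):
    """Return a tag name qualified with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_article(title, pages):
    """Wrap transcribed text in a TEI <div> with page breaks.

    pages: list of (page number, facsimile URL, list of paragraphs).
    Each page starts with a <pb/> whose @facs points to the image, so
    readers can check the transcription against the original.
    """
    div = ET.Element(tei("div"), {"type": "article"})
    ET.SubElement(div, tei("head")).text = title
    for number, facs_url, paragraphs in pages:
        ET.SubElement(div, tei("pb"), {"n": str(number), "facs": facs_url})
        for text in paragraphs:
            ET.SubElement(div, tei("p")).text = text
    return div


# Hypothetical input: one article spanning two pages.
article = build_article(
    "Sample article",
    [
        (1, "https://example.org/facsimiles/001.jpg", ["First paragraph."]),
        (2, "https://example.org/facsimiles/002.jpg", ["Second paragraph."]),
    ],
)
print(ET.tostring(article, encoding="unicode"))
```

The page breaks are the crucial addition: they are exactly the information that the gray transcriptions lack and that ties the digital text back to the printed page.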
By linking images to the digital text, every reader can validate the quality of the transcription against the original, thus overcoming the greatest limitation of crowd-sourced or gray transcriptions and the main source of disciplinary contempt among historians and scholars of the Middle East. Improvements to the transcription and mark-up can be crowd-sourced with clear attribution of authorship and version control using git and GitHub’s core functionality. Editions will be referenceable down to the word level^[currently we provide stable URLs down to the paragraph level] for scholarly citations, annotation layers, as well as web-applications through a documented URI scheme. The web display is implemented through a customised adaptation of the TEI Boilerplate XSLT stylesheets; it can be downloaded, distributed, and run locally without any internet connection, which is a necessity for societies outside the global North. Finally, by sharing all our code (mostly XSLT) in addition to the XML files, we hope to facilitate similar projects and digital editions of further periodicals, namely Rashīd Riḍā’s al-Manār.
The purpose and scope of the project is to provide an open, collaborative, referenceable, scholarly digital edition of Muḥammad Kurd ʿAlī’s journal al-Muqtabas, which includes the full text, semantic mark-up, and digital imagery.
The digital edition will be provided as TEI P5 XML with its own schema. All files are hosted on GitHub.
The project will open avenues for re-purposing code for similar projects, i.e. for transforming full-text transcriptions from some HTML or XML source, such as al-Maktaba al-Shamela, into TEI P5 XML, linking them to digital imagery from other open repositories, such as EAP and HathiTrust, and generating a web display by, for instance, adapting the code base of TEI Boilerplate.
The most likely candidates for such follow-up projects are
Muḥammad Kurd ʿAlī published the monthly journal al-Muqtabas between 1906 and 1914 (1916). The publication schedule followed the Muslim hijrī calendar. After the Young Turk Revolution of July 1908, publication moved from Cairo to Damascus in the journal’s third year.
There is some confusion as to the counting of issues and their publication dates. According to the masthead and the cover sheet, al-Muqtabas was published following the Islamic hijrī calendar (from the journal itself it must remain open whether the recorded publication dates were the actual publication dates). Sometimes the printers made errors: issue 2 of volume 4, for instance, carries Rab I 1327 as publication date on the cover sheet, but Ṣaf 1327 in its masthead. The latter would correspond to the official publication schedule.
Samir Seikaly argues that Muḥammad Kurd ʿAlī was wrong in stating in his memoirs that he published 8 volumes of 12 issues each and two independent issues.^[{Seikaly 1981@128}] But the actual hard copies at the Orient-Institut Beirut and the digital facsimiles from HathiTrust show that Kurd ʿAlī was right insofar as volume 9 existed and comprised 2 issues only. As it turns out, al-Muqtabas published a number of double issues: Vol. 4 no. 5/6 and Vol. 8 no. 11/12.
In addition to the original edition, at least one reprint appeared: in 1992 Dār Ṣādir in Beirut published a facsimile edition, which is entirely unmarked as such but for the information on the binding itself. When I checked this reprint against the original, it proved to be a facsimile reprint: pagination, font, and layout are all identical. But since Samir Seikaly remarked in 1981 that he used “two separate compilations of al-Muqtabas […] in this study”, there must be at least one other print edition that I have not yet seen.^[{Seikaly 1981@128}]
Image files are available from the al-Aqṣā Mosque’s library in Jerusalem through the British Library’s “Endangered Archives Programme” (vols. 2-7), HathiTrust (vols. 1-6, 8), and Institut du Monde Arabe. Because of EAP’s open access licence, preference is given to its facsimiles.
Public Domain or Public Domain in the United States, Google-digitized: In addition to the terms for works that are in the Public Domain or in the Public Domain in the United States above, the following statement applies: The digital images and OCR of this work were produced by Google, Inc. (indicated by a watermark on each page in the PageTurner). Google requests that the images and OCR not be re-hosted, redistributed or used commercially. The images are provided for educational, scholarly, non-commercial purposes. Note: There are no restrictions on use of text transcribed from the images, or paraphrased or translated using the images.
Somebody took the pains to create fully searchable text files and uploaded everything to al-Maktaba al-Shamela and WikiSource.
It seems that somebody took the pains to upload the text from shamela to WikiSource. Unfortunately it is impossible to browse the entire journal. Instead one has to address each individual, consecutively numbered issue, e.g. Vol. 4, No. 1 is listed as No. 37.
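The consecutive numbering can be reproduced with simple arithmetic. The following is a minimal sketch, assuming that every volume counts exactly 12 issue numbers and that a double issue consumes two numbers; both assumptions, and the function names, are mine and should be checked against the actual WikiSource listing.

```python
def consecutive_issue_number(volume: int, issue: int, issues_per_volume: int = 12) -> int:
    """Map a volume/issue pair to a running issue number.

    Assumption: every volume has exactly `issues_per_volume` numbers
    and counting starts at 1.
    """
    if volume < 1 or not 1 <= issue <= issues_per_volume:
        raise ValueError("volume must be >= 1 and issue within the volume")
    return (volume - 1) * issues_per_volume + issue


def volume_and_issue(consecutive: int, issues_per_volume: int = 12) -> tuple[int, int]:
    """Inverse mapping: running issue number back to (volume, issue)."""
    if consecutive < 1:
        raise ValueError("consecutive number must be >= 1")
    volume, offset = divmod(consecutive - 1, issues_per_volume)
    return volume + 1, offset + 1


print(consecutive_issue_number(4, 1))  # 37
print(volume_and_issue(37))            # (4, 1)
```

Under these assumptions, three complete volumes account for 36 numbers, so the first issue of volume 4 becomes No. 37, which matches the example above.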
The main challenge is to combine the full text and the images in a digital XML edition following the TEI. As al-maktabat al-shāmila did not reproduce page breaks true to the print edition, every single one of the more than 6000 page breaks must be added manually and linked to the digital image of the page.
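To illustrate what adding a single page break involves, here is a minimal Python sketch that splits the text of a paragraph and inserts a TEI `<pb/>` with `@n` and `@facs` attributes. The helper names, the character offset, and the facsimile URL are hypothetical; the sketch only handles paragraphs without child elements and is not the workflow actually used for the edition.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def make_page_break(page: int, facsimile_url: str) -> ET.Element:
    """Create a TEI <pb/> carrying the page number and a link to the image."""
    pb = ET.Element(f"{{{TEI_NS}}}pb")
    pb.set("n", str(page))
    pb.set("facs", facsimile_url)
    return pb


def insert_page_break(paragraph: ET.Element, char_offset: int,
                      page: int, facsimile_url: str) -> ET.Element:
    """Insert a <pb/> into a plain-text paragraph at `char_offset`.

    Simplification: paragraphs that already contain child elements
    are rejected instead of being handled.
    """
    if len(paragraph):
        raise ValueError("this sketch only handles paragraphs without child elements")
    text = paragraph.text or ""
    pb = make_page_break(page, facsimile_url)
    paragraph.text = text[:char_offset]  # text before the page break
    pb.tail = text[char_offset:]         # text after the page break
    paragraph.append(pb)
    return pb


# Hypothetical example: the page turns after the 16th character.
p = ET.Element(f"{{{TEI_NS}}}p")
p.text = "end of one page start of the next"
insert_page_break(p, 16, 2, "https://example.org/facsimile/page-0002.jpg")
print(ET.tostring(p, encoding="unicode"))
```

Finding the correct offset still requires a human reader comparing the text with the facsimile; the code merely records that decision in the mark-up.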
The edition should be conceived of as a corpus of TEI files that are grouped by means of XInclude. This way, volumes can be constructed as single TEI files containing a <group/> of TEI files and a volume-specific <front/> and <back/>.
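A volume file of this kind could be generated along the following lines. This is a rough Python sketch rather than the project's own tooling: the file names are hypothetical, the mandatory `<teiHeader>` is omitted for brevity, and a schema-valid volume would probably need each include to point at the `<text>` element of the issue file rather than at the whole file.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XI_NS = "http://www.w3.org/2001/XInclude"
ET.register_namespace("", TEI_NS)
ET.register_namespace("xi", XI_NS)


def build_volume(issue_files: list[str]) -> ET.Element:
    """Build a skeleton volume: <front/>, a <group/> of includes, <back/>."""
    tei = ET.Element(f"{{{TEI_NS}}}TEI")
    text = ET.SubElement(tei, f"{{{TEI_NS}}}text")
    ET.SubElement(text, f"{{{TEI_NS}}}front")  # volume-specific front matter
    group = ET.SubElement(text, f"{{{TEI_NS}}}group")
    for href in issue_files:
        # One xi:include per issue file; resolved later by an XInclude-aware processor.
        ET.SubElement(group, f"{{{XI_NS}}}include", {"href": href})
    ET.SubElement(text, f"{{{TEI_NS}}}back")   # volume-specific back matter, e.g. indexes
    return tei


# Hypothetical file names for two issues of one volume.
volume = build_volume(["issue-01.TEIP5.xml", "issue-02.TEIP5.xml"])
print(ET.tostring(volume, encoding="unicode"))
```

Keeping the issues as separate files and assembling volumes only by reference means that a correction to an issue file never has to be repeated in the volume file.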
A detailed description of the mark-up and notes on it are kept in a separate file in the GitHub repository.
A simple way of controlling the quality of the basic structural mark-up would be to cross-check any automatically generated table of contents or index against the published tables of contents at the end of each volume and against the index of al-Muqtabas published by Riyāḍ ʿAbd al-Ḥamīd Murād in 1977.
To allow for a quick review of the mark-up and for reading the journal’s content, I decided to customise TEI Boilerplate for a first display of the TEI files in the browser without the need for pre-processed HTML and to host this heavily customised boilerplate view as a separate branch of the GitHub repository. For a first impression see here.
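Such a cross-check might look like the following sketch, which compares two lists of (title, page) pairs and reports what is missing on either side. The function names and the sample entries are invented for illustration; real titles would need more careful normalisation (e.g. of diacritics and orthographic variants), and duplicate titles within one volume would collapse into a single entry here.

```python
import unicodedata


def normalise(title: str) -> str:
    """Unicode-normalise, collapse whitespace, and casefold a title for comparison."""
    return " ".join(unicodedata.normalize("NFC", title).split()).casefold()


def compare_toc(generated: list[tuple[str, int]],
                published: list[tuple[str, int]]) -> dict[str, list[str]]:
    """Report differences between a generated and a published table of contents."""
    gen = {normalise(title): page for title, page in generated}
    pub = {normalise(title): page for title, page in published}
    shared = gen.keys() & pub.keys()
    return {
        "missing_in_markup": sorted(pub.keys() - gen.keys()),
        "not_in_published_toc": sorted(gen.keys() - pub.keys()),
        "page_mismatch": sorted(t for t in shared if gen[t] != pub[t]),
    }


# Invented sample data: titles and page numbers are placeholders.
generated = [("Article A", 1), ("Article  B", 17), ("Article D", 40)]
published = [("Article A", 1), ("Article B", 18), ("Article C", 33)]
report = compare_toc(generated, published)
for problem, titles in report.items():
    print(problem, titles)
```

Every entry in the report points to a place where either the mark-up or the published table of contents needs a second look by a human editor.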
Looking through this vast body of images, a large part of which documents the ruins and antiquities in the Levant and Egypt, one can find an astonishing record of late-nineteenth-century graffiti: names and dates scribbled on walls and columns to document the visit of local and foreign travellers. The urge to record one’s existence was not restricted to the new middle-class traveller. Famously the Ottoman Sultan ʿAbdülḥamīd II and the German Kaiser Wilhelm II had marble plaques affixed to the ruins of Baalbek to commemorate the latter’s visit to the site in 1898. But looking through the images I stumbled over another graffito:
Shāhīn Makāriyūs 1878
This is the same Shāhīn Makāriyūs (also Chahin Macarius, Shaheen Makarius), who, together with Yaʿqūb Ṣarrūf and Fāris Nimr, published the monthly journal al-Muqtaṭaf from 1876 onwards—first in Beirut and, from 1884 onwards, in Cairo. He was also a prominent member of masonic lodges and authored a number of books on the history of freemasonry in Arabic and English. In 1890, al-Bashīr from Beirut already mentioned al-Muqtaṭaf and al-Laṭāʾif, another journal Shāhīn Makāriyūs edited between 1886–96, as being masonic papers.^[Bashīr 4 Nov. 1890 (#1037):3 and Bashīr 26 Nov. 1890 (#1040):1, which ran a front page article titled “The slander of al-Laṭāʾif” (iftirāʾ al-laṭāʾif).] Al-Laṭāʾif al-Muṣawwara, an illustrated successor journal to al-Laṭāʾif, that had most likely stopped being published around 1896, was edited by his son (?) Iskandar Makāriyūs in Cairo from 1915 onwards.
But it is not just visitors who left their traces for posterity; some images reveal that the walls of the ruins were also used for advertisements. Several photos of the entrance to the cella of Baalbek’s main temple of Jupiter, dating from the 1870s onwards, show that the photo studio Bonfils itself had announced its services on the wall to the right:
BONFILS Photogra[phie …]
a Beyro[uth …]
Vues de Balbek […]
1871