Arabic Romanization at PUL


ALA-LC Romanization Tables: Transliteration Schemes for Non-Roman Scripts

Princeton University Library uses the Romanization Tables approved by the American Library Association and the Library of Congress (ALA-LC) to transliterate Arabic, Hebrew, Judeo-Arabic, Ottoman Turkish, Persian, Syriac, Urdu, and Yiddish into Roman script.

If you are having trouble locating works in a Main Catalog search, check that you are using the correct transliteration. A link to PDFs of the current ALA-LC approved Romanization Tables for Arabic is listed here.


Problems with using non-Roman script (mainly Arabic) when searching in our catalog

Specific Arabic characters that are problematic in WorldCat

The diacritics of Romanized characters are not consistent. Records mix ALA Romanization entered in both Unicode and non-Unicode character sets, along with diacritics that do not follow ALA rules at all. Non-Unicode characters can be detected because two characters are combined to form one Romanized character with a diacritic: for example, ā entered as a + ¯ (macron), versus the single Unicode character ā. Compare Daqāʼiq al-ʻArabīyah with Daqāʼiq al-ʻArabīyah. The two look the same, but you can discover the difference by deleting a character with the Backspace key: the two-character form takes two keystrokes to erase. This does not affect the search results.

The problematic characters, in both small and capital forms, are: ā á ḍ ḥ ṣ ṭ ū ẓ Ā Ḍ Ḥ Ī Ṣ Ṭ Ū Ẓ
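The difference between the two-character and one-character forms described above can also be checked programmatically. The following is a minimal Python sketch, not part of any catalog system. It assumes the two-character form is a base letter followed by a Unicode combining macron (U+0304) and the one-character form is the precomposed letter. The function names `to_nfc` and `has_combining_marks` are made up for this illustration.

```python
import unicodedata

# Two-character form: base letter followed by U+0304 COMBINING MACRON.
decomposed = "Daqa\u0304\u02bciq al-\u02bbArabi\u0304yah"
# One-character form: precomposed U+0101 (a with macron) and U+012B (i with macron).
precomposed = "Daq\u0101\u02bciq al-\u02bbArab\u012byah"


def to_nfc(text: str) -> str:
    """Compose base letter + combining mark pairs into single characters."""
    return unicodedata.normalize("NFC", text)


def has_combining_marks(text: str) -> bool:
    """Return True if any character in the text is a combining mark."""
    return any(unicodedata.combining(ch) for ch in text)


# The strings look identical on screen but are not equal as stored.
print(decomposed == precomposed)          # False
print(len(decomposed), len(precomposed))  # 22 20
# After normalization they compare equal.
print(to_nfc(decomposed) == precomposed)  # True
```

Normalizing a search string this way before comparing it with a record is one way to make the two forms interchangeable. Whether a given catalog does this internally is not something this sketch can confirm.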

Use or non-use of Arabic stop words

Al-taarif (the Arabic definite article, al-) is a very controversial issue. Some library systems, such as Millennium, omit al-taarif automatically so that titles can be retrieved with or without it. It may be useful to adopt the same practice in WorldCat.

It is advisable not to skip Arabic stop words (such as min, ilá, ʻalá) in a Title search. You can skip them in a Keyword search. As for waw atf (the conjunction "and"), it is advisable to separate it with a space from the following word if that word includes al-taarif, because during indexing the system disregards any stop word such as waw atf as a first step, then al-taarif as a second step.

For example: الحقيقة و الخيال
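The two-step indexing described above can be sketched in a few lines of Python. This is an illustration only, assuming titles are split on spaces and that a stop word is recognized only when it stands alone as a token. The names `STOP_WORDS`, `DEFINITE_ARTICLE`, and `index_tokens` are hypothetical and do not come from any catalog system.

```python
# Assumed stop-word list for this sketch: only a free-standing waw atf.
STOP_WORDS = {"و"}
# al-taarif as it appears at the start of an Arabic word.
DEFINITE_ARTICLE = "ال"


def index_tokens(title: str) -> list[str]:
    """Drop stand-alone stop words, then strip a leading al-taarif."""
    # Step 1: split on whitespace and disregard stop-word tokens.
    tokens = [t for t in title.split() if t not in STOP_WORDS]
    # Step 2: remove al-taarif from the start of each remaining token.
    stripped = []
    for token in tokens:
        if token.startswith(DEFINITE_ARTICLE) and len(token) > len(DEFINITE_ARTICLE):
            token = token[len(DEFINITE_ARTICLE):]
        stripped.append(token)
    return stripped


# waw atf separated by a space: both steps apply.
print(index_tokens("الحقيقة و الخيال"))
# waw atf attached to the word: it is not seen as a stop word,
# so the al-taarif behind it is not stripped either.
print(index_tokens("الحقيقة والخيال"))
```

Under these assumptions the spaced form is indexed without waw atf and without al-taarif, while the attached form keeps both in the second word, which is why the spacing matters.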

In a Title search, the number of hits with al-taarif is not always equal to the number of hits without it.

Examples of discrepancies in the number of Title search hits due to al-taarif:

الخيال = 409

والخيال = 84

و الخيال = 33

خيال = 471