=Paper=
{{Paper
|id=Vol-3301/Paper5
|storemode=property
|title=On the Awakening of the Buddhological Epigraphy and Philology from the AI
|pdfUrl=https://ceur-ws.org/Vol-3301/paper5.pdf
|volume=Vol-3301
|authors=Haiyan Hu-von Hinüber,Sylvia Melzer
|dblpUrl=https://dblp.org/rec/conf/ki/HinuberM22
}}
==On the Awakening of the Buddhological Epigraphy and Philology from the AI==
Haiyan Hu-von Hinüber (1,2), Sylvia Melzer (3,4)

1 Max-Weber-Kolleg, Steinplatz 2, 99085 Erfurt, Germany
2 Shandong-Universität, Shanda-Nanlu 27, 250100 Jinan, China
3 Universität Hamburg, Centre for the Study of Manuscript Cultures, Warburgstr. 26, 20354 Hamburg, Germany
4 University of Lübeck, Institute of Information Systems, Ratzeburger Allee 160, 23562 Lübeck, Germany

Humanities-Centred AI (CHAI), Workshop at the 45th German Conference on Artificial Intelligence, September 19-20, 2022, Trier, Germany
EMAIL: haiyan.hu.von.hinueber@orient.uni-freiburg.de (H. Hu-von Hinüber)
ORCID: 0000-0002-5284-9001 (H. Hu-von Hinüber)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

===Abstract===
This paper aims to define the requirements for the systematic study, with the help of natural language processing and schema matching techniques, of the inscribed Buddhist bronzes scattered around the world. It concerns a pilot project dealing with 50-60 ancient Buddhist bronzes. Their inscriptions are written in Sanskrit in two different types of handwriting. On the basis of paleographic and historical studies, scholars have been able to assign these artifacts to the era of the royal family Paḷola Ṣāhi, which ruled the area of Gilgit/Chilas and beyond, today in northern Pakistan, during the 6th–8th centuries.

===Keywords===
Indology, information system, Buddhism, Post-Gandhāra, Sanskrit, Epigraphy, Paleography, Pakistan, Tibet, Beijing

===1. Introduction===
The royal family Paḷola Ṣāhi belonged to a dynasty of Buddhist kings in the Gilgit kingdom in the northern part of the Indian subcontinent in the 6th-8th centuries [1]. During this period roughly 50-60 Buddhist bronze statues were manufactured with inscriptions written in Sanskrit (see Figure 1). The statues were sponsored on behalf of a person in order to improve the quality of that person’s rebirth. Buddhism is an Indian religion and philosophical tradition. Buddha means “awakened” and “is conferred on an individual who discovers the path to nirvana, the cessation of suffering, and propagates that discovery so that others may also achieve nirvana.” [2] However, in the history of Buddhism, the goal of attaining nirvana was often considered unattainable in one lifetime. Therefore, the focus has been more on accumulating good karma to improve the quality of rebirth. The quality of rebirth depends on the merits or demerits one has acquired through one's actions, as well as on the merits a family member has acquired on one's behalf [3,4].

According to the paleographic and historical studies published in the last 20 years [5], scholars are now able to assign these special objects of bronze casting to the royal family Paḷola Ṣāhi, which was of Iranian descent. From a historical point of view, this area had been under the influence of Indian culture for several centuries. Therefore, it can be taken as certain that these bronzes originate from historical Northwest India.

In order to make statements about who sponsored which bronze statue for whom, how the statues were made, and what religious significance they had, data was often analyzed by hand in meticulous, patient work and brought together from various fields of the humanities, as far as the connections could be made. In addition, a major goal of Indologists is to trace the history of the Buddhist bronze statues themselves. However, this turns out to be very difficult because the available information is often incomplete.
The current research question to be answered here is why there are some bronze statues in Beijing whose origin can be attributed to the royal family Paḷola Ṣāhi. For this purpose, it is necessary to evaluate the inscriptions of the above-mentioned Buddhist bronze statues, e.g. from the perspective of epigraphy and philology. Epigraphy is the study of inscriptions in order to, e.g., “clarify the meanings, classify their uses according to dates and cultural contexts, and drawing conclusions about the writing and writers.” [6] The inscriptions on the Buddhist bronze statues are written in Sanskrit in two different types of the so-called “Gandhāra-Brāhmī” handwriting: the “round type” was in use from the 2nd century to 630 AD, and the “rectangular type” from 630 to the 8th century. The study of written language is called philology; its aim here is to determine the meaning of the inscriptions. If the history of the 4th-6th centuries and the inscriptions on the Buddhist bronze statues are studied more closely, it is found that even the Tibetans considered the statues holy but could no longer read or understand the writing. Therefore, in this paper we present the requirements that have to be met, using AI methods, in order to answer the research question by linking different research data sources.

[Figure 1, four panels]
• Front of the Bodhisattva bronze from the Trashilünpo Monastery (Shigatse, Central Tibet); 12 cm high; pedestal measures 8.5 cm x 5.5 cm
• Back with the inventory number of the monastery, bKra-“2588”
• Left/beginning of the inscription: deyadharmo yaṃ
• Right/end of the inscription: bandhuprabhāsasya

Figure 1: A bronze statue with inscription kept in the Trashilünpo Monastery (Tibet) reading “This (statue) is a donation (given) by Bandhuprabhāsa”.

===2. Representation and Retrieval of Research Data of Scholars from Different Disciplines and Countries===
One of the special difficulties for the continuation of this project is that only very few scholars with expertise in Old Indian epigraphy have worked on the Sanskrit inscriptions engraved on the roughly 50-60 bronzes, namely on their reading, dating, and attribution by means of comparative analysis. Furthermore, the epigraphic and historical studies of the inscriptions should be supported by, or carried out in cooperation with, archaeologists who excavate or conduct field research in northern Pakistan and Tibet. Thus, the relevant disciplines include Sanskrit Philology and Paleography, Archaeology (Pakistan and Tibet), Early History of Buddhism, as well as Buddhist Art and Epigraphy. The researchers studying the inscriptions and the related bronzes currently work in different countries such as Germany, China, Switzerland, Italy, Japan, and the Netherlands (publishers). Nevertheless, these scholars from distant countries have to cooperate more closely, e.g. by being linked via an information system:
• Oskar von Hinüber (Germany: Indologist; one book and a dozen articles [5])
• Haiyan Hu-von Hinüber (Germany: Indologist and editor of a new bronze information system)
• Luo, Wenhua & Team (China: archaeologists excavating in Tibet)

Some other cooperating scholars and institutions:
• Ulrich von Schroeder (Switzerland: pioneer; three volumes from 1987 and 2001)
• Kudo, Noriyuki (Japan: one of the main publishers)
• Jonathan Silk (Netherlands: one of the publishers)
• Elisa Iori & Luca Olivieri (Italy: archaeologists excavating in Pakistan)
• Different (private) art collectors and public museums

As far as the state of research is concerned, there are mainly four major projects/stages (2001, 2004, 2007–2018, and from 2022 on) to be considered [7-12]:
1. A number of inscriptions deciphered and translated into English by O. von Hinüber were published by U. von Schroeder in his two volumes Buddhist Sculptures in Tibet, 2001. Most of the inscriptions read are also documented with photos.
2. The chapter “Inscriptions on Bronzes (no. 11–16)” and the “Addendum” in O. von Hinüber’s monograph Die Palola Ṣāhis (2004, pp. 28–42 & 190). The photos of all inscriptions discussed are published in this book.
3. From 2007 to 2018, O. von Hinüber published a total of seven articles concerning inscribed bronzes originating from Northwest India in the Annual Report of The International Research Institute for Advanced Buddhology at Soka University, vol. 10, 12–15, 18 and 21. The photos of the inscriptions are included in each respective volume.
4. In a close collaboration between Beijing and Freiburg, which started in 2016, a number of inscribed Indo-Tibetan bronzes have been investigated from a paleographic and historical perspective. The collaboration concerns about a dozen newly found bronze sculptures bearing Sanskrit inscriptions. As “first-hand material”, these new finds, examined in forthcoming publications (from 2022 on), can be regarded as supplements to the above-mentioned corpora documenting the Indo-Tibetan sculptures so far, especially the group of so-called “portable statues”, which do not always receive the greatest attention despite their large number. This German-Chinese collaboration forms a sub-project of the extensive survey project carried out by the Research Institute for Tibetan Buddhist Heritage of the Palace Museum (Beijing), which was set up ten years ago under the leadership of the institute’s director Luo Wenhua, one of the co-authors of this article.

During the cooperation with China in recent years (2016-2022), it turned out that the publications from Europe are either very expensive (von Schroeder 2001) or already out of print (von Hinüber 2004).
It therefore makes sense to gather all relevant publications in the database, be they book chapters, monographs, or individual articles. Over the years, the researchers listed here have collected and analyzed data and stored the results in a wide variety of formats, e.g. printed books in PDF, XML, TIFF, DOCX, and CSV, or in databases with a non-standard, project-specific data model. From a technical point of view, the challenge now is either to store all data in a common form or to create an interface so that the data becomes exchangeable. From a humanities perspective, it must be determined which data can be mapped to each other and whether intelligent information retrieval (IR) systems can deliver additional data that complement the data of the bronze statues.

For the retrieval of additional data, we have already developed the Compl-IR algorithm [13], which returns additional IR results based on a similarity computation of data types of an entity. Entity types are, e.g., person, sponsor, organization, place, and date. The identification of named entities in a text and their classification into predefined categories (entity types) can be computed automatically using natural language processing (NLP) techniques such as named entity recognition (NER). The Open Information Extraction (OpenIE) annotator [14] can assign the entities mentioned in the texts, e.g. the family Paḷola Ṣāhi, to the entity types in which users are mainly interested. As a result, the Compl-IR algorithm can be used to obtain additional information for reconstructing the history of the royal family Paḷola Ṣāhi in more detail from the data of the Buddhist bronze statues, but also from other artifacts.

Before the search algorithms can be used, the data must first become accessible. Thus, the first step is to build an information system that contains the data of the 50-60 bronze statues made in the 5th–8th centuries, with the location Pakistan and inscriptions in the Gandhāra-Brāhmī script.
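As a rough illustration of what one record in such an information system could hold, the following Python sketch models a bronze with its inscription and typed entities. The class and field names (`BronzeRecord`, `Entity`, `holding_place`, `find_by_entity`) and the record identifier are assumptions made for this example and are not taken from the project's actual data model; the sample values are those given in Figure 1, and the entity types are the ones listed above.

```python
from dataclasses import dataclass, field
from typing import List

# Entity types named in the text; the tuple itself is an illustrative choice.
ENTITY_TYPES = ("person", "sponsor", "organization", "place", "date")


@dataclass
class Entity:
    text: str          # surface form of the entity
    entity_type: str   # one of ENTITY_TYPES

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type}")


@dataclass
class BronzeRecord:
    record_id: str                 # hypothetical identifier
    holding_place: str             # where the statue is kept today
    language: str                  # language of the inscription
    inscription: str = ""          # transliterated inscription text
    entities: List[Entity] = field(default_factory=list)


def entities_of_type(record: BronzeRecord, entity_type: str) -> List[str]:
    """Return the surface forms of all entities of one type in a record."""
    return [e.text for e in record.entities if e.entity_type == entity_type]


def find_by_entity(records: List[BronzeRecord], entity_type: str, text: str) -> List[str]:
    """Return the ids of all records that mention the given typed entity."""
    return [
        r.record_id
        for r in records
        if any(e.entity_type == entity_type and e.text == text for e in r.entities)
    ]


# Sample record built only from the information shown in Figure 1.
figure_1 = BronzeRecord(
    record_id="example-1",
    holding_place="Trashilünpo Monastery (Tibet)",
    language="Sanskrit",
    inscription="deyadharmo yaṃ bandhuprabhāsasya",
    entities=[
        Entity("Bandhuprabhāsa", "sponsor"),
        Entity("Trashilünpo Monastery", "place"),
    ],
)

print(entities_of_type(figure_1, "sponsor"))
print(find_by_entity([figure_1], "place", "Trashilünpo Monastery"))
```

In practice the entity annotations would come from an NER step rather than being entered by hand; they are hard-coded here only to keep the sketch self-contained.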
A first version of an information system has already been built with the database management tool Heurist [15] (see Figure 2). Heurist is an open-source database management system with a web front end. Heurist allows researchers without prior IT knowledge to develop data models and to store, search, and publish data on a website. The idea is to link this information system with other existing information systems. For example, it is possible to link the database Indoskript 2.0 [7], which contains the letters from the South Asian region. However, only the letters used in Pakistan should be considered. Which letters are relevant for the analysis of the inscriptions should be determined automatically.

Figure 2: Information System built with the database management tool Heurist

===3. An information system for paleographic analysis===
Up until today, dating the Buddhist sculptures from historical Northwest India has always been a demanding challenge and in most cases even impossible, so that many questions concerning the date and place of origin as well as the tradition of the artistic style remain unanswered or insufficiently explained. Paleographic analysis is therefore all the more significant as an aid in dating the bronzes, alongside the comparison of artistic style and iconography. Paleography is the study of ancient writing and inscriptions as well as the deciphering and interpretation of historical manuscripts and writing systems [16]. It is concerned with the forms and processes of writing, not with the textual content of documents.
In the period from the 2nd to the 8th century, two main types of writing were used in Northwest India:
• 2nd century to 630 CE: the Gandhāra-Brāhmī or the so-called “round type”
• 630 CE to 8th century: the Proto-Śāradā or the so-called “rectangular type”

Figure 3: One folio from the Saṃghāṭasūtra found in Gilgit (Pakistan), written in the old type Gandhāra-Brāhmī (left); one folio from the Saṃghāṭasūtra found in Gilgit (Pakistan), written in the later type Proto-Śāradā (right)

In connection with setting up a database for paleographic analysis, the following key points regarding the development of the script types found in historical Northwest India should be considered:
1. The abrupt transition from the Gandhāra-Brāhmī “round type” (approx. 2nd–7th centuries) to the Proto-Śāradā type took place almost exactly around 630 CE.
2. This radical change of script type seems to go hand in hand with the change in the title of the ruling family Paṭola Ṣāhis.
3. The script Proto-Śāradā gradually fell out of use, probably in connection with the decline of the Paṭola Ṣāhis family from around 740 onwards.

Starting in 2005, Harry Falk (Professor of Indology, FU Berlin) and Walter Slaje (Professor of Indology, University of Halle) established a paleographic database covering different types of handwriting used in India from the 3rd century BCE up to today: Indoskript 2.0 [7]. However, there are many gaps that need to be filled, particularly in relation to the inscriptions found in Northwest India. Therefore, it would be useful to set up a database "Writings of Northwest India (2nd - 8th centuries)", which could benefit research in the longer term. Below are two examples:

[Figure 4, four panels]
• Old type “ya” consisting of three parts
• Later type “ya” consisting of only two parts
• Old type of the ligature “ndhu”
• Later type of the ligature “ndha”

Figure 4: Two examples of handwriting that occurred in Northwest India (6th-7th centuries)

===4. Federated Indology Bronze Database System (FIndo BDBS)===
The more databases exist, the more likely it is that additional data will be found to enrich the existing local information. To this end, federated searches and a federated database system (FDBS) provide users with additional information [17]. Users send their queries to the FDBS, which then forwards the queries to the individual (relevant) database nodes [18]. In order to be able to write down the history of the royal family Paḷola Ṣāhi in detail, the data of the bronze research are used together with additional information. This additional information includes the scripts used as well as the research data on wall paintings and manuscripts from the period in which the royal family ruled. A federated Indology bronze database system (FIndo BDBS) is created by federating databases from the field of Indology. In this paper, we want to highlight the requirements for systematically developing such a FIndo BDBS using AI methods.

• Requirement 1: The various research data sources need to be put into an analyzable form so that they can be shared with other research data.
Approach 1: Approaches such as transforming documents into a standard format for databasing on demand can help ensure that data can be transferred to a database in a short time. In [19] it has been shown how to successfully transform a DOCX document to EpiDoc, a widely used data format in the field of epigraphy, and then build a customized project-specific information system on demand.

• Requirement 2: Four types of heterogeneity (syntax, semantics, model, access) must be considered if the data schemas are to be matched and the data of these schemas are to be mapped to each other.
Approach 2: When research data is stored in different databases, non-standardized, project-specific data models are often used. If a federated search is to be performed, it must be ensured that the data schemas can be matched and the data mapped to each other.
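A minimal sketch of one such matching step is given below, assuming a purely name-based comparison of field names at the element level. It is not the matcher used in the cited work: it relies on `difflib.SequenceMatcher` from the Python standard library, the two field lists are invented for illustration, and the threshold of 0.8 is an arbitrary choice.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lower-case a field name and drop separators such as '_' or '-'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def name_similarity(a: str, b: str) -> float:
    """Similarity of two field names in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_schemas(source_fields, target_fields, threshold=0.8):
    """Map each source field to its most similar target field.

    A pair is kept only if its similarity reaches the threshold;
    all other source fields are left unmatched.
    """
    mapping = {}
    for s in source_fields:
        best = max(target_fields, key=lambda t: name_similarity(s, t))
        if name_similarity(s, best) >= threshold:
            mapping[s] = best
    return mapping


# Hypothetical field names of two project-specific data models.
schema_a = ["Inscription_Text", "donor_name", "find_place"]
schema_b = ["inscriptionText", "DonorName", "date", "place_of_find"]

print(match_schemas(schema_a, schema_b))
```

With these inputs the first two fields are matched because they differ only in spelling conventions, while `find_place` and `place_of_find` stay unmatched although they mean the same thing. A high threshold therefore favours precision at the cost of recall, and purely syntactic matching has to be complemented by other techniques when names differ in form but not in meaning.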
There are a number of schema matching approaches that perform the process automatically, e.g. schema-only matching at the structure or element level (linguistic, constraint-based), or content-based matching at the element level (linguistic, constraint-based) [20]. Deciding which approach fits best depends on the form in which the data is finally available. We have already simulated schema matching of different data models [18] and found that precision must be very high in humanities projects. Words can have a different syntax but the same semantics, and vice versa. Within the schema matching approaches, we deal with topics such as name similarity, graph matching, and information retrieval. In [13] and [21], we have already shown that, when retrieving similar documents, an increase in precision can be achieved within a reasonable time frame, taking recall into account.

• Requirement 3: A technical infrastructure must be created that allows the various research data sources to be linked together.
Approach 3: In general, database federation provides logical centralization of data without the need to change physical database implementations. A common interface helps to create a basis for the exchange of data. In this case, it is a broker federation, which allows the creation of messaging networks in which messages from one broker are automatically forwarded to another broker. RabbitMQ is an open-source message broker that provides broker federation between clients and servers. It has already been shown that cross-domain information systems can be generated using RabbitMQ and that federated search queries can also be executed [22].

If these requirements are implemented, a FIndo BDBS is realized. Some of the approaches have been tested individually and with other data sets from different projects. For the new project, new data sets (Gilgit Buddhist Inscribed Bronzes) will be used to obtain new information on the history of the royal family Paḷola Ṣāhi via AI methods.

===5. Conclusion===
This paper gives an overview of the requirements for the systematic study of the Buddhist bronzes that bear a Sanskrit inscription and originate from historical Northwest India (today Pakistan). An information system was built with the tool Heurist to describe and represent the history of the royal family Paḷola Ṣāhi and their culture. By linking it to other sources of information, a FIndo BDBS can be realized in order to gain new knowledge. Realizing a FIndo BDBS requires dealing with different schema matching techniques and applying them in this new overall context. The components already implemented suggest that the approach is promising for the new project.

===6. Acknowledgements===
This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures’, project no. 390893796.

===7. References===
[1] von Hinüber, Oskar, “Bronzes of the Ancient Buddhist Kingdom of Gilgit”, University of Freiburg, https://www.metmuseum.org/metmedia/video/collections/asian/bronzes-of-ancient-gilgit

[2] Siderits, Mark, “Buddha”, The Stanford Encyclopedia of Philosophy, Metaphysics Research Lab, Stanford University, 2019.

[3] Buswell, Robert E. Jr., Lopez, Donald Jr., “The Princeton Dictionary of Buddhism”, Princeton University Press, 2003.

[4] Neufeldt, Ronald Wesley, “Karma and Rebirth: Post Classical Developments”, State University of New York Press, pp. 123–131, 1986, ISBN 978-0-87395-990-2.

[5] von Hinüber, Oskar, “Publikationen”, http://www.iriaabs-freiburg.de/index.php/2-uncategorised/8-oskar-von-hinueber#publikationen

[6] “Epigraphy”, 2022, https://en.wikipedia.org/wiki/Epigraphy

[7] Falk, Harry, Slaje, Walter, “Eine elektronische indische Paläographie” (Programmierung: O. Hellwig; Dateneingabe: K. Einicke, K. Hoffmann, J. Neuß), Berlin 2005. http://userpage.fu-berlin.de/falk

[8] Wang, Xin, Ahonen, Tapani, Nurmi, Jari, “Applying CDMA technique to network-on-chip”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 15(10), 2007, pp. 1091-1100.

[9] Abril, P. S., Plant, R., “The patent holder’s dilemma: Buy, sell, or troll?”, Communications of the ACM 50, 2007, pp. 36–44. doi:10.1145/1188913.1188915.

[10] Hellwig, Oliver, “Dating Sanskrit texts using linguistic features and neural networks”, in: Indogermanische Forschungen 2019, https://www.degruyter.com/document/doi/10.1515/if-2019-0001/html?lang=de

[11] von Hinüber, Oskar, Luo, Wenhua, “The inscribed Buddha image donated by Vappaṭa and Dhruvabhaṭā kept in the Sakya Monastery”, in: Annual Report of The International Research Institute for Advanced Buddhology at Soka University for the Academic Year 2021, Vol. 25, Tokyo 2022: 3-9, plates 1-4 (8 figures).

[12] Hu-von Hinüber, Haiyan, Luo, Wenhua, “Two Newly Found Bronze Statues with Sanskrit Inscriptions originating from Historical Northwest India”, Connecting the Art, Literature, and Religion in South and Central Asia. Studies in Honour of Monika Zin, ed. by I. Konczak-Nagel, S. Hiyama and A. Klein, Delhi 2022: 161-170 (with 10 figures).

[13] Melzer, S., Schiff, S., Möller, R., “Complementary Document Representations for Information Retrieval”, The 34th International FLAIRS Conference, North Miami Beach, Florida, USA, 2021.

[14] Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., McClosky, D., “The Stanford CoreNLP natural language processing toolkit”, Proceedings of 52nd Annual Meeting of the ACL: System Demonstrations, pp. 55–60, Association for Computational Linguistics, 2014.

[15] Heurist (2022), “A unique solution to the data management needs of Humanities researchers.” Software available: https://heuristnetwork.org/

[16] “Palaeography, n.”, Oxford English Dictionary (Online ed.). Oxford University Press. (Subscription or participating institution membership required.)

[17] Melzer, Sylvia, “Federated Search in Manuscript Databases”, March 2022. http://doi.org/10.25592/uhhfdm.10289

[18] Melzer, Sylvia, Thiemann, Stefan, Möller, Ralf, “Modeling and Simulating Federated Databases for early Validation of Federated Searches using the Broker-based SysML Toolbox”, The 15th Annual IEEE International Systems Conference (SYSCON 2021), virtual conference, 2021.

[19] Melzer, S., Schiff, S., Weise, F., Harter, K., Möller, R., “Databasing on demand for research data repositories explained with a large epidoc dataset”, CENTERIS 2022 - Conference on ENTERprise Information Systems, 2022 (will be published).

[20] Rahm, E., Bernstein, P., “A survey of approaches to automatic schema matching”, The VLDB Journal 10, pp. 334–350, 2001. https://doi.org/10.1007/s007780100057

[21] Melzer, S., “Semantic Assets: Latent Structures for Knowledge Management”, University of Lübeck, 2018, PhD thesis.

[22] Melzer, S., Peukert, H., Wang, H., Thiemann, S., “Model-based Development of a Federated Database Infrastructure to support the Usability of Cross-Domain Information Systems”, The 16th Annual IEEE International Systems Conference (SYSCON 2022), Montreal, Canada, 2022.