=Paper=
{{Paper
|id=Vol-3301/Paper2
|storemode=property
|title=Understanding the Zhangzhung Nyengyu tsakali Collection using Computational Pattern Analysis
|pdfUrl=https://ceur-ws.org/Vol-3301/paper2.pdf
|volume=Vol-3301
|authors=Hussein Mohammed,Agnieszka Helman-Wazny
|dblpUrl=https://dblp.org/rec/conf/ki/MohammedH22
}}
==Understanding the Zhangzhung Nyengyu tsakali Collection using Computational Pattern Analysis==
Hussein Mohammed<sup>1,*,†</sup>, Agnieszka Helman-Ważny<sup>1,†</sup>

<sup>1</sup> Cluster of Excellence: Understanding Written Artefacts, Universität Hamburg, Germany

===Abstract===
This research is part of a larger study aiming to recover the little-known story of the production and usage of the Zhangzhung Nyengyu tsakali collection, a set of “Initiation Cards” used in numerous Tibetan rituals. The complete study will include not only the pattern analysis of digitised images, but also the application of advanced material-analysis techniques in order to analyse several physical aspects of these artefacts. In this work, several pattern-analysis methods have been applied to the digital images of this collection in order to help answer the aforementioned research questions. The preliminary results of this research demonstrate the potential of pattern analysis and its applicability to manuscript research. The methods used are briefly described, and the preliminary results of each method are presented and discussed.

Keywords: Pattern Analysis, Tibetology, Manuscript Research

===1. Introduction===
The Zhangzhung Nyengyu tsakali collection is a set of approximately 50 Tibetan initiation cards (tsakali) belonging to the Bon religion. The tsakalis are known as initiation cards precisely because they are used in initiation rituals to empower neophytes into the particular domain of Buddhism or Bon that they represent. Typically, the initiate is shown each of the cards in turn before being blessed by means of the officiant touching his or her head with the whole set. It is a genre of miniature painting rarely mentioned in the liturgical literature and little known beyond the Tibetan cultural sphere. The cards are made of paper, and each measures 9.4 cm in width × 20.2 cm in height.
On the recto side they bear the polychrome image of a divinity, a saint, or a sacred object, and on the verso a passage of text of varying length citing a scripture related to the corresponding image. The origin of the collection is unknown, but it is said to have been created in Dolpo, in Nepal, before being taken to Tibet, where the cards were concealed and saved from destruction during the Cultural Revolution before being brought to the Bonpo monastery of Triten Norbutse, in Kathmandu, in 1986. On stylistic grounds they have been provisionally dated to the fifteenth century.

45th German Conference on Artificial Intelligence (KI2022): Humanities-Centred AI (CHAI) workshop, September 19-23, 2022, Trier, Germany.

<nowiki>*</nowiki> Corresponding author. † All authors contributed equally.

Email: hussein.adnan.mohammed@uni-hamburg.de (H. Mohammed); agnieszka.helman-wazny@uni-hamburg.de (A. Helman-Ważny)

ORCID: 0000-0001-5020-3592 (H. Mohammed); 0000-0003-1525-767X (A. Helman-Ważny)

© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

In the past few years, automatic pattern analysis has proved to be a powerful and useful tool for the study of written artefacts [1, 2, 3] when it is developed and used properly. Therefore, we used three of the Pattern Analysis Software Tools (PAST) [4] in this research in order to analyse the handwriting styles, the writing support and the drawing instances of this collection. The preliminary results of this research demonstrate the potential of pattern analysis and the breadth of its applicability in the field of manuscript research.

This research is part of a larger study aiming to recover the little-known story of the production and usage of the aforementioned collection, a set of “Initiation Cards” used in numerous Tibetan rituals.
In addition to the pattern analysis, state-of-the-art material-analysis techniques will also be applied to this collection in order to gain new insight into the writing support and pigment composition from a seldom-sampled period in this part of Asia. At the same time, we have a good chance of obtaining new information about the historical context and provenance of the studied collection by comparing the results of our material analyses with available reference samples dated to an approximately similar period, such as the Dunhuang manuscripts, the Dolpo and Mustang collections, and the early manuscripts from the La Stod area in Central Tibet, dated from the tenth to the fifteenth centuries [5, 6].

===2. Handwriting Analysis===
The tsakali manuscripts have a very specific ritual function. Each card bears both an image on the recto side and a text on the verso. The text describes the painting and is thus inseparable from the image. It varies in length from card to card and is written in the headless ume (dbu med) script, in black and red ink. This type of script is usually not standardised and shows the idiosyncrasies of individual hands. Therefore, analysing the handwriting styles with HAT [7] can be helpful in determining whether all the cards were written by the same individual.

The tsakalis are not bound together, and they can easily get mixed with other sets of the same size. Furthermore, the objects and figures painted on tsakalis are selected for specific types of performances; therefore, the content of each collection is unique. This means that the identification of different hands can help reveal more information about the production process and the individuals behind it. The results of this analysis will be further verified by other approaches, such as ink analysis.

====2.1. Analysis Method====
The Handwriting Analysis Tool (HAT) [7] is used for this analysis in order to measure the similarity between handwriting styles on different pages.
This software tool is based on the training-free NLNBNN classifier [8], which makes it possible to analyse handwriting styles without any labelled data; none was available in this case. FAST keypoints [9] are used to detect local features, and SIFT [10] descriptors are used to create the feature vectors. This classifier calculates the distances between detected local features in handwriting images as follows:

:<math>\mathit{Dist}_N(d, c) = \frac{\mathit{Dist}(d, c)}{K_c},</math> (1)

where <math>\mathit{Dist}_N(d, c)</math> is the normalised distance between the features <math>d</math> detected in the test image and class <math>c</math>, using the distance calculation presented in [8]. Each handwriting sample is considered as a class, <math>K_c</math> is the number of features from the labelled samples in class <math>c</math>, and <math>\mathit{Dist}(d, c)</math> is the Local NBNN distance [11], which has been reformulated in [8] as follows:

:<math>\mathit{Dist}(d, c) = \sum_{i=1}^{n} \left[ \left( \| d_i - \varphi(\mathrm{NN}_c(d_i)) \|^2 - \| d_i - \mathrm{N}_{k+1}(d_i) \|^2 \right) \right],</math> (2)

where

:<math>\varphi(\mathrm{NN}_c(d_i)) = \begin{cases} \mathrm{NN}_c(d_i) & \text{if } \mathrm{NN}_c(d_i) \le \mathrm{N}_{k+1}(d_i) \\ \mathrm{N}_{k+1}(d_i) & \text{if } \mathrm{NN}_c(d_i) > \mathrm{N}_{k+1}(d_i), \end{cases}</math>

and <math>\mathrm{N}_{k+1}(d_i)</math> is the <math>(k+1)</math>-th nearest neighbour of <math>d_i</math>. In a similar way to the work in [11], we used the distance to the <math>(k+1)</math>-th nearest neighbour (<math>k = 10</math>) as a “background distance” to estimate the distances of classes which were not found among the <math>k</math> nearest neighbours.

====2.2. Analysis Results====
A similarity score is calculated by the software for each style (scribe) so that the user can make a relative comparison between the styles with respect to a given unknown handwriting. In this test, the first page in every image has been compared against the second page of every image. The results show that the handwriting on all pages from all images is very similar in general. Nevertheless, the similarity value of the handwriting on one particular page, namely "g-h-v-PSC-P2", is always half (or less) of the value obtained for all other instances.
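
The distance of Equations 1 and 2 in Section 2.1 can be illustrated with a minimal Python sketch. It is not the HAT implementation: the function names, the brute-force neighbour search and the use of plain coordinate tuples in place of SIFT descriptors are assumptions made for illustration, and the comparison inside <math>\varphi</math> is interpreted here as a comparison of squared distances to <math>d_i</math>.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalised_local_nbnn(
    query: List[Vector],
    classes: Dict[str, List[Vector]],
    k: int = 10,
) -> Dict[str, float]:
    """Hypothetical helper: Dist_N(d, c) of Eq. 1 for every class c.

    `query` holds the features d_i of the test image; `classes` maps a
    class label to its (non-empty) list of reference features.
    """
    all_feats = [f for feats in classes.values() for f in feats]
    if len(all_feats) <= k:
        raise ValueError("need at least k + 1 reference features")

    totals = {label: 0.0 for label in classes}
    for d_i in query:
        # Background distance: squared distance to the (k+1)-th nearest
        # neighbour of d_i among the features of all classes.
        background = sorted(sq_dist(d_i, f) for f in all_feats)[k]
        for label, feats in classes.items():
            # Nearest neighbour of d_i inside class c.
            nn_c = min(sq_dist(d_i, f) for f in feats)
            # phi: fall back to the background distance when the class
            # neighbour is farther away than the (k+1)-th neighbour,
            # so such a class contributes 0 to the sum of Eq. 2.
            totals[label] += min(nn_c, background) - background

    # Eq. 1: normalise by K_c, the number of features in class c.
    return {label: totals[label] / len(classes[label]) for label in classes}
```

In this sketch every contribution is zero or negative, so the class with the most negative value is the closest one; how HAT maps such distances to the similarity scores it reports is not specified here.
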
A second test was therefore carried out in order to measure the similarity of page "g-h-v-PSC-P2" to the second page of all other images. No significant similarity to any image was found; see Fig. 1.

Figure 1: Example of the results produced by HAT. (a) The detected features in page "g-h-v-PSC-P2" by HAT. (b) The similarity values calculated by HAT. Page "g-h-v-PSC-P2" is always the least similar, with a large score gap detected by HAT.

===3. Writing-Support Analysis===
The tsakalis are made of paper typically produced in the Himalayas. However, two types of papermaking-sieve print were detected during preliminary observation of the analysed set, which suggests that more than one type of sieve was used during the papermaking process. This is why we used the Line Detection Tool (LDT) [12] to find out how many types of paper were used in this collection. The presence of laid paper in the collection could support the hypothesis that this set of tsakalis was produced outside Tibet, where a woven type of paper was usually used.

====3.1. Analysis Method====
The Line Detection Tool (LDT) [12] is used to analyse the writing supports in these images. This tool is based on the method described in [1], as follows: the contrast of the selected images is first enhanced using Contrast Limited Adaptive Histogram Equalisation (CLAHE) [13], then a vertical projection is calculated. These projections are smoothed using a Gaussian filter in order to construct a histogram like the one in Fig. 2, part (c). Lines are detected from the resulting histogram as follows:

:<math>H_{col} = \sum_{i=1}^{n} I(col, i),</math> (3)

:<math>L_{col} = \begin{cases} 1 & \text{if } H_{min} < (H_{col} \times T_{min}) \text{ and } H_{col} > (H_{max} \times T_{max}) \\ 0 & \text{otherwise.} \end{cases}</math> (4)

<math>L_{col}</math> is the line to be detected at column <math>col</math> in the image. <math>H_{max}</math> and <math>H_{min}</math> are the maximum and minimum values of the histogram, respectively. <math>T_{max}</math> and <math>T_{min}</math> are threshold values which can be changed by the user.
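
The projection and thresholding of Equations 3 and 4 can be sketched in a few lines of Python. This is an illustration only, not the LDT code: the function names, the hand-rolled Gaussian smoothing, the default `sigma` and the representation of the image as a list of pixel rows are assumptions, and the CLAHE contrast enhancement is assumed to have been applied beforehand.

```python
import math
from typing import List


def gaussian_kernel(sigma: float) -> List[float]:
    """Normalised 1-D Gaussian kernel truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values: List[float], kernel: List[float]) -> List[float]:
    """Convolve values with kernel, treating out-of-range samples as 0."""
    radius = len(kernel) // 2
    out = []
    for c in range(len(values)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - radius
            if 0 <= idx < len(values):
                acc += w * values[idx]
        out.append(acc)
    return out


def detect_line_columns(image: List[List[float]], t_min: float,
                        t_max: float, sigma: float = 1.0) -> List[int]:
    """Hypothetical helper: indices of columns where L_col = 1 (Eq. 4)."""
    n_cols = len(image[0])
    # Eq. 3: vertical projection, i.e. the sum of each column's pixels.
    profile = [sum(row[col] for row in image) for col in range(n_cols)]
    hist = smooth(profile, gaussian_kernel(sigma))
    h_min, h_max = min(hist), max(hist)
    # Eq. 4: keep columns satisfying both user-chosen thresholds.
    return [col for col, h in enumerate(hist)
            if h_min < h * t_min and h > h_max * t_max]
```

For a synthetic 20-row image with three bright columns, `detect_line_columns(image, 0.5, 0.9)` keeps only the columns at the peaks of the smoothed histogram; suitable threshold values for real scans would have to be chosen by inspecting the histogram.
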
The thresholds <math>T_{max}</math> and <math>T_{min}</math> depend on the regularity, contrast and texture of the image, but can be determined visually from the histogram. <math>H_{col}</math> is the histogram value at column <math>col</math>, and <math>I(col, i)</math> is the pixel value of image <math>I</math> at position <math>(col, i)</math>.

====3.2. Analysis Results====
A square region of 30 × 30 mm from each of three different samples has been analysed using the LDT. Several measurements have been calculated automatically for all samples, as shown in Fig. 2. The lines in sample "Tsakalis-d-e-f-v-PSC-P1" have a slightly lower density, which might indicate the use of a different papermaking source. Integrating results from different types of analysis and comparing them with results from other collections can lead to a better interpretation of these findings.

Figure 2: Example of the results produced by LDT. (a) The calculated measurements by LDT for three pages from the collection. (b) Cropped part of the writing support. (c) Detected lines.

===4. Drawing-Elements Analysis===
The tsakali cards are used in numerous ritual situations, such as empowerment, ritual mandalas, transmission of teachings, substitutes for ceremonial items, visualization aids and funerals. The subjects depicted in tsakali cover a vast range, from main deities and protectors to their various power attributes and appropriate offerings. Detecting these visual elements in different instances, and possibly in other collections, can greatly facilitate the retrieval of relevant semantic content.

====4.1. Analysis Method====
The Visual-Pattern Detector (VPD) [14] is used in order to detect and locate visual patterns (small parts of images) without the need for any ground-truth annotations. This tool is based on the method proposed in [2], and the recall-precision balance of detected patterns can be visually controlled. In the general approach of this tool, every detected local feature votes for a proposed centre of a pattern hypothesis.
FAST keypoints and SIFT descriptors are used for this experiment. A detection matrix <math>M^d(L_{i,c})</math> per class is created for the image, where the vote of each feature in the matrix is calculated from the distance to features of the corresponding class using the Normalised Local NBNN distance calculation presented in Equation 1, as follows:

:<math>M^d(L_{i,c}) = M^d(L_{i,c}) + \mathit{Dist}_N(d_i, c),</math> (5)

where <math>M^d(L_{i,c})</math> is the detection matrix of class <math>c</math>, and <math>i</math> is the current feature index. Each detection matrix is convolved with a kernel in order to produce the final detections. The detection kernel can be described as follows:

:<math>K_c^{d_i}(x, y) = \begin{cases} 1 & \text{if } \mathrm{Offset}_x^2 + \mathrm{Offset}_y^2 < R_c \\ 0 & \text{otherwise,} \end{cases}</math> (6)

where <math>K_c^{d_i}(x, y)</math> is the detection kernel of class <math>c</math> for the detected feature <math>d_i</math> centred at location <math>(x, y)</math>. <math>\mathrm{Offset}_x</math> and <math>\mathrm{Offset}_y</math> are the differences along the x- and y-axes, respectively, between the kernel centre and the current location <math>(x, y)</math>.

====4.2. Analysis Results====
Only one example is used per pattern in this analysis, as the method used is training-free. The VPD detects similar visual patterns in this collection automatically, without the need for any annotations. The pattern in Fig. 3 is a bowl made of a human skull. Such human-skull bowls were often used in Tibetan rituals.

===5. Conclusions and Future Work===
In this paper, we presented preliminary results of an ongoing interdisciplinary collaboration between computer science and the humanities. This work aims at answering research questions from the field of Tibetology with the help of automated pattern-analysis methods. The research questions have been presented along with the proposed means to answer them. Furthermore, preliminary results of pattern analysis have been provided and discussed for handwriting styles, writing support, and drawing instances. The current results demonstrate the potential of pattern analysis and the breadth of its applicability in the field of manuscript research.
As a second step, we intend to apply similar analyses to other tsakali collections in order to better understand and interpret our findings. Once all the required analyses have been carried out, proper conclusions can be reached based on careful interpretation of the quantitative measurements.

Figure 3: Detection of three different instances of the same pattern (pointed at by a green arrow) by VPD.

===Acknowledgments===
The research for this work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures’, project no. 390893796. The research was conducted within the scope of the Centre for the Study of Manuscript Cultures (CSMC) at Universität Hamburg. In addition, we thank Charles Ramble for making the tsakalis collection available for our research, Ivan Shevchuk and Kyle Ann Huskin for digitising the collection, and Aneta Yotova for the image preparation for this analysis.

===References===
[1] H. Mohammed, A. Helman-Wazny, C. Colini, W. Beyer, S. Bosch, Pattern analysis software tools (PAST) for written artefacts, in: S. Uchida, E. Barney, V. Eglin (Eds.), Document Analysis Systems, Springer International Publishing, Cham, 2022, pp. 214–229.

[2] H. Mohammed, V. Märgner, G. Ciotti, Learning-free pattern detection for manuscript research, International Journal on Document Analysis and Recognition (IJDAR) (2021) 1–13.

[3] H. Mohammed, V. Märgner, T. Seidensticker, A Comparison of Arabic Handwriting-Style Analysis Using Conventional and Computational Methods, manuscript cultures 15 (2020) 77–86.

[4] H. Mohammed, The Pattern Analysis Software Tools (PAST), 2022. URL: https://www.csmc.uni-hamburg.de/publications/software.html.

[5] A. Helman-Ważny, S. Van Schaik, Witnesses for Tibetan craftsmanship: bringing together paper analysis, palaeography and codicology in the examination of the earliest Tibetan manuscripts, Archaeometry 55 (2013) 707–741.

[6] S. Van Schaik, A. Helman-Ważny, R. Nöller, Writing, painting and sketching at Dunhuang: assessing the materiality and function of early Tibetan manuscripts and ritual items, Journal of Archaeological Science 53 (2015) 110–132.

[7] H. Mohammed, Handwriting Analysis Tool (HAT), 2020. doi:10.25592/uhhfdm.900.

[8] H. Mohammed, V. Märgner, T. Konidaris, H. S. Stiehl, Normalised local naïve Bayes nearest-neighbour classifier for offline writer identification, in: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), IEEE, 2017, pp. 1013–1018.

[9] E. Rosten, R. Porter, T. Drummond, Faster and better: A machine learning approach to corner detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (2010) 105–119.

[10] D. G. Lowe, Distinctive image features from scale-invariant keypoints, Int. Journal of Computer Vision 60 (2004) 91–110.

[11] S. McCann, D. G. Lowe, Local Naive Bayes Nearest Neighbor for image classification, 2012 IEEE Conf. on Computer Vision and Pattern Recognition (2012) 3650–3656. doi:10.1109/CVPR.2012.6248111.

[12] H. Mohammed, Line Detection Tool (LDT), 2020. doi:10.25592/uhhfdm.1042.

[13] S. M. Pizer, E. P. Amburn, J. D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. ter Haar Romeny, J. B. Zimmerman, K. Zuiderveld, Adaptive histogram equalization and its variations, Computer Vision, Graphics, and Image Processing 39 (1987) 355–368.

[14] H. Mohammed, Visual-Pattern Detector (VPD), 2021. doi:10.25592/uhhfdm.8832.