<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Artificial Intelligence-Driven Text-to-Tactile Graphics Generation for Visual Impaired People</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Yehor</forename><surname>Dzhurynskyi</surname></persName>
							<email>y.a.dzhurynskyi@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Ukrainian Academy of Printing</orgName>
								<address>
									<addrLine>19, Pid Holoskom Str</addrLine>
									<postCode>79020</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Volodymyr</forename><surname>Mayik</surname></persName>
							<affiliation key="aff1">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>28a, Stepan Bandera Str</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lyudmyla</forename><surname>Mayik</surname></persName>
							<email>ludmyla.maik@gmail.com</email>
							<affiliation key="aff1">
								<orgName type="institution">Lviv Polytechnic National University</orgName>
								<address>
									<addrLine>28a, Stepan Bandera Str</addrLine>
									<postCode>79013</postCode>
									<settlement>Lviv</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Artificial Intelligence-Driven Text-to-Tactile Graphics Generation for Visual Impaired People</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">9B8B0E4139998848A27B3B10D0FEF927</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T19:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Artificial intelligence</term>
					<term>tactile graphics</term>
					<term>visual impairment</term>
					<term>natural language processing</term>
					<term>model</term>
					<term>machine learning</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This research presents the development of a text-conditional tactile graphics generation model using the Bidirectional and Auto-Regressive Transformer (BART) and Vector Quantized Variational Auto-Encoder (VQ-VAE). The model leverages a modified organization of the latent space, divided into two independent components: textual and graphic. The study addresses the challenge of the limited availability of tactile graphics samples by expanding the training dataset with custom samples, enhancing the model's capability to convert textual information into graphical representations. The proposed method improves the creation of tactile graphics for visually impaired individuals, offering increased variability, controllability, and quality in synthesized tactile graphics. This advancement enhances both the technical and economic aspects of the production process for inclusive educational materials.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>The dynamics of modern inclusive society development emphasize the need to integrate people with visual impairments into active social life. The problem of socializing individuals with visual impairments involves various aspects that complicate their education, training, and full participation in society <ref type="bibr" target="#b0">[1]</ref>. Specifically, people with visual impairments have limited access to information, as many materials are produced only in the usual printed or digital formats. This issue is further exacerbated by the increasing prevalence of information in graphic form, designed for more effective perception by readers. The aforementioned problems hinder the ability of individuals with visual impairments to receive quality education and professional development <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4,</ref><ref type="bibr" target="#b4">5,</ref><ref type="bibr" target="#b6">7</ref>].</p><p>An analysis <ref type="bibr" target="#b5">[6]</ref> of the activities of publishing and printing industry enterprises that produce educational and methodological literature (textbooks, manuals, etc.) for people with visual impairments revealed problems related to the creation or adaptation of images and illustrative materials, which are particularly crucial for this type of publication. When creating or adapting graphic materials, enterprises encounter the following issues: an insufficient number of trained specialists with specific competencies related to the technical implementation of tactile graphics; additional time and financial costs for training specialists; and the high labor intensity and cost of the process of creating or adapting tactile graphics. 
Consequently, the production issues surrounding tactile graphics remain one of the primary factors contributing to the low level of access to graphical information for people with visual impairments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Researchers are addressing the problem of producing tactile graphics by developing models for the automatic generation of tactile images <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b10">11,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b12">13]</ref>. Most existing models aim to transform the content of a photographic image into a tactile one.</p><p>Models <ref type="bibr" target="#b7">[8,</ref><ref type="bibr" target="#b8">9,</ref><ref type="bibr" target="#b9">10,</ref><ref type="bibr" target="#b10">11]</ref> that attempt to convert the content of an image directly into a tactile format usually rely on computer vision and have the following disadvantages: they violate the requirements for tactile graphics <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15]</ref>, and they render redundant image elements that are difficult to read and interfere with the overall interpretation of the graphic material.</p><p>Models <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref> detect and recognize individual image elements and replace them with tactile representations drawn from a limited set; as a result, the synthesized images lack variability (new samples cannot be synthesized, and the attractiveness of the synthesized images for people with visual impairments decreases). 
Despite the mentioned drawback, it should be noted that the method effectively conveys the content of the original photo at a high level in compliance with the requirements for tactile graphics.</p><p>Additionally, such methods require supplementary source graphic information (e.g., photographs), the search for or creation of which slows down the process of preparing material for the production of tactile images.</p><p>The development of information technologies, particularly in the field of deep machine learning, has opened new opportunities for addressing the aforementioned problems. Recently, significant advancements have been demonstrated by information technologies based on artificial intelligence <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b16">17,</ref><ref type="bibr" target="#b17">18]</ref>, which enable the generation of images based on user text prompts. However, according to the analysis <ref type="bibr" target="#b18">[19,</ref><ref type="bibr" target="#b19">20]</ref>, confirmed by a series of experiments, the information technologies built upon these mathematical models have proven ineffective for creating tactile graphics. Despite this, the concept of text-guided image generation was chosen as the foundation for this work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Text-conditional tactile graphics generation model</head><p>The text-conditional tactile graphics generation model is built upon the Bidirectional and Auto-Regressive Transformer (BART) <ref type="bibr" target="#b20">[21]</ref> and the Vector Quantized Variational Auto-Encoder (VQ-VAE) <ref type="bibr" target="#b21">[22]</ref>. It models the process of converting textual information into graphical information. To this end, the embedding space of the transformer, formed during language modeling on the pretraining task, was divided into two independent embedding spaces, textual and graphic, instead of a single shared one. The parameters of the graphic embedding space were adjusted so that its dimension equals the size of the "codebook" <ref type="bibr" target="#b21">[22]</ref> and the dimensionality of its vectors equals the dimensionality of the latent space vectors of the variational image synthesis model. The parameters of the text embedding space remained the same as during language modeling.</p><p>Before text tokens are obtained with the BPE <ref type="bibr" target="#b21">[22,</ref><ref type="bibr" target="#b22">23]</ref> tokenization model, the original text components are normalized to a uniform format (uppercase letters are converted to lowercase).</p><p>Formally, the process of converting text tokens into graphic tokens with the text-conditional tactile graphics generation model is described in successive stages.</p><p>The first step is to generate a bounded sequence of text tokens from a text prompt:</p><formula xml:id="formula_0">\bar{t} = \{ t_i \in V \}_{i=1}^{Seq_{max,t}},</formula><p>where \bar{t} is a sequence of text tokens of length Seq_{max,t} = 64; V is a dictionary of tokens. 
If the generated sequence of text tokens exceeds this length, it is truncated to the maximum value by discarding the excess tokens. If it is shorter, it is padded to the maximum value with utility tokens \langle PAD \rangle that do not affect the modeling result.</p><p>In the next step, the text tokens forming the sequence \bar{t} are mapped to vectors of the text embedding space, forming a subset of it:</p><formula xml:id="formula_1">\bar{e}^{t} = \{ e_k^{t} \in E^{t} \mid k = t_i \in \bar{t} \}_{i=1}^{Seq_{max,t}}; \quad \bar{e}^{t} \subseteq E^{t},<label>(1)</label></formula><p>where \bar{t} is the sequence of text tokens; E^{t} is the text embedding space; e_k^{t} are elements of the text embedding space. The elements \bar{e}^{t} reflect the semantic meaning of the text tokens in the embedding space.</p><p>Next, the vectors of the text embedding space E^{t} are transformed by the transformer's bidirectional encoder, which consists of several layers, forming hidden states \bar{h}^{t}. The bidirectionality of the encoder means that it analyzes the full context of each vector of the embedding space, considering both the preceding and the following elements of the sequence:</p><formula xml:id="formula_2">\bar{h}^{t} = Encode(\bar{e}^{t}); \quad \bar{h}^{t} \subseteq E^{t},<label>(2)</label></formula><p>where \bar{h}^{t} is the hidden state of the encoder; Encode(\cdot) is the transformer's encoding operation defined within <ref type="bibr" target="#b20">[21]</ref>.</p><p>The hidden state of the encoder \bar{h}^{t} is then converted by linear layers and a nonlinear activation function into the hidden state of the decoder (i.e., graphic information), forming a subset of the graphic embedding space E^{g}:</p><formula xml:id="formula_3">\bar{h}^{g} = Linear_2 \circ ReLU \circ Linear_1(\bar{h}^{t}),<label>(3)</label></formula><p>where \bar{h}^{t} \subseteq E^{t} is the hidden state of the encoder; \bar{h}^{g} \subseteq E^{g} is the hidden state of the decoder; Linear_i is a linear layer; ReLU(x) = \max(0, x) is a nonlinear activation function. At the next stage, an autoregressive <ref type="bibr" target="#b24">[25,</ref><ref type="bibr" target="#b25">26]</ref> transformer decoder is used: the decoder generates one graphic token per iteration, considering the context of the previously generated graphic tokens. Thus, during decoding the model performs calculations based on the hidden state \bar{h}^{t} and the previously generated elements of the vector sequence of the graphic embedding space:</p><formula xml:id="formula_4">e_i^{g} = Decode(\bar{h}^{t}, e_1^{g}, e_2^{g}, \ldots, e_{i-1}^{g}); \quad i \le d_z,<label>(4)</label></formula><p>where e_i^{g} is the i-th element of the vector sequence of the graphic embedding space E^{g}; e_j^{g}, j &lt; i, are previously generated vectors of the graphic embedding space; \bar{h}^{g} is the hidden state of the decoder; d_z is the size of the final sequence \bar{e}^{g} \subseteq E^{g}; Decode(\cdot) is the transformer's decoding operation defined within <ref type="bibr" target="#b20">[21]</ref>.</p><p>Decoding proceeds iteratively until the sequence \bar{e}^{g} reaches size d_z (i.e., the size of the latent vector sequence of the VQ-VAE model). Once decoding is complete, the resulting sequence of vectors of the graphic embedding space \bar{e}^{g} is converted by a linear layer and a Softmax function into a sequence of probability distributions, from which the element with the highest probability is selected, determining the chosen graphic token:</p><formula xml:id="formula_5">\bar{g} = \{ g_i \}_{i=1}^{d_z}; \quad g_i = \arg\max(Softmax \circ Linear(e_i^{g})),<label>(5)</label></formula><p>where \bar{g} is the generated sequence of graphic tokens of size d_z; e_i^{g} \in \bar{e}^{g} is an element of the vector sequence of the graphic embedding space E^{g}. 
In the next step, on the basis of the graphic tokens (5), a sequence of latent quantized vectors \bar{z} is formed, defined by formula (6). Each graphic token g_i, 1 \le g_i \le K, i = 1 \ldots d_z, is the positional index of a quantized vector in the "codebook" of the VQ-VAE model, where Z is the set of latent quantized vectors, or "codebook"; \bar{z} \subseteq Z is a sequence of latent quantized vectors; g_i \in \bar{g} is a graphic token; d_z is the size of the sequence of latent quantized vectors.</p><p>The final step is the synthesis of tactile graphics by the decoder of the variational image synthesis model applied to the sequence (6):</p><formula xml:id="formula_7">Y = ImDecode(\bar{z}),<label>(7)</label></formula><p>where \bar{z} is the sequence of latent quantized vectors; ImDecode(\cdot) is the image decoding operation from a latent representation defined within <ref type="bibr" target="#b21">[22]</ref>; Y is the generated tactile image. The diagram of the text-conditional tactile graphics generation model is shown in Figure <ref type="figure" target="#fig_0">1</ref>. </p></div>
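The staged conversion in formulas (1)-(7) can be sketched end to end. The following minimal NumPy illustration uses the dimensions reported in the paper (text sequence length 64, codebook size 512, latent vector dimension 16) but random weights, a toy tokenizer, and a non-autoregressive stand-in for the encoder and decoder; the names pad_or_truncate and bridge, and the assumption of 64 graphic tokens, are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_T, D_MODEL, K, D_Z = 64, 512, 512, 64   # Seq_max,t; layer dim; codebook size; graphic tokens
PAD = 0

E_t = rng.normal(size=(8192, D_MODEL))      # text embedding space E^t (Table 1 dictionary size)
codebook = rng.normal(size=(K, 16))         # VQ-VAE "codebook" Z (Table 2)

def pad_or_truncate(tokens, size=SEQ_T, pad=PAD):
    """Clamp the text-token sequence to a fixed length, as described for Seq_max,t."""
    return (list(tokens)[:size] + [pad] * size)[:size]

def bridge(h_t, W1, W2):
    """Linear -> ReLU -> Linear mapping into the graphic embedding space, formula (3)."""
    return np.maximum(h_t @ W1, 0) @ W2

W1 = 0.01 * rng.normal(size=(D_MODEL, 1024))
W2 = 0.01 * rng.normal(size=(1024, D_MODEL))
W_out = 0.01 * rng.normal(size=(D_MODEL, K))  # linear layer before Softmax in formula (5)

tokens = pad_or_truncate([17, 42, 9])         # toy text-token ids
h_t = E_t[tokens]                             # stand-in for Encode(.), formula (2)
h_g = bridge(h_t, W1, W2)                     # hidden state of the decoder

# Greedy token choice of formula (5); a real decoder would generate these
# autoregressively, one token at a time, as in formula (4).
g_bar = (h_g[:D_Z] @ W_out).argmax(axis=-1)   # graphic tokens, ids in [0, K)

z_bar = codebook[g_bar]                       # formula (6): codebook lookup
print(g_bar.shape, z_bar.shape)               # (64,) (64, 16)
```

In the actual model, z_bar would then be passed to the VQ-VAE image decoder to produce the tactile graphic Y of formula (7).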
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiment</head><p>In this experiment, the proposed model was trained using the parameters presented in Tables <ref type="table" target="#tab_1">1 and 2</ref> for the BART and VQ-VAE models, respectively. Note that the size of the decoder's dictionary and the length of its sequence are each increased by one compared to the original values. This adjustment introduces an additional image service token (an SOS token), which is prepended to the sequence to enable autoregressive image generation. Language modeling was performed on the BrUK corpus <ref type="bibr" target="#b26">[27]</ref>, which consists of Ukrainian texts from various sources. Unlike textual datasets (i.e., corpora), which are widely accessible, tactile graphics samples are much less common. A significant obstacle to modeling tactile graphics generation with machine learning is the insufficient number of publicly available samples, as the tactile graphics production industry is far smaller than traditional printing.</p><p>Nevertheless, a collection of plant and animal images stored in the APH Tactile Graphics Library <ref type="bibr" target="#b27">[28]</ref> was chosen as the original set of images for the model to learn to reproduce. Additionally, the training dataset was expanded with 41 custom tactile image samples, increasing the total number of samples to 179. The custom samples were created from simple images of animals and have been used at a Ukrainian institution that provides preschool education for children with visual impairments. The results of the experiment include samples of generated tactile graphics based on various types of text prompts, such as monosyllabic prompts, prompts with numerals, and prompts with epithets. These samples are presented in Figure <ref type="figure" target="#fig_1">2</ref>.</p></div>
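The one-unit increase in the decoder's dictionary and sequence sizes can be made concrete with a small sketch. Using the codebook size itself as the id of the SOS token is an assumption for illustration; the paper only states that a service token is prepended to the sequence.

```python
K = 512                 # VQ-VAE codebook size (Table 2)
SOS = K                 # one extra service id -> decoder dictionary size 513 (Table 1)

def prepare_decoder_input(graphic_tokens):
    """Prepend the SOS token so autoregressive generation has a fixed start symbol."""
    tokens = list(graphic_tokens)
    assert all(0 <= g < K for g in tokens)   # real tokens stay within the codebook
    return [SOS] + tokens

seq = prepare_decoder_input(range(64))
print(len(seq), seq[0])   # 65 512 -> matches the decoder sequence size in Table 1
```

During training, the decoder then learns to predict token i from the SOS token and tokens 1..i-1, exactly the conditioning described in formula (4).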
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Figure 2 shows samples generated for the prompts "a daisy", "a cow", "Top view of butterfly", "a tree", "three daisies", "a spotted cow", "a butterfly (side view)", "a naked tree", "a leaf", "a dog", "a turkey", and "a deer (side view)".</p><p>The model's performance was evaluated separately for each component: BART and VQ-VAE. The results of this evaluation are presented in Table <ref type="table" target="#tab_2">3</ref>. The Cross-Entropy metric reflects how well the model converts text prompts into appropriate graphic tokens, and Perplexity represents the uncertainty in the model's predictions. Lower values indicate better performance, meaning the model is more confident in its generation process. For tactile graphics, FID measures how similar the generated tactile images are to real ones in the latent space of the model. A lower FID score indicates that the generated tactile graphics are closer to real tactile images in terms of visual and tactile features.</p><p>Additionally, the overall performance of the model was evaluated using the CLIP Score metric <ref type="bibr" target="#b28">[29]</ref>, which reflects the model's capability to convert textual information into graphical information. The average CLIP Score of the developed model is 23.7. </p></div>
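The Cross-Entropy and Perplexity values in Table 3 are mutually consistent under the standard relation Perplexity = exp(Cross-Entropy), which can be verified directly:

```python
import math

ce = 5.709            # Cross-Entropy reported in Table 3
ppl = math.exp(ce)    # perplexity as the exponential of cross-entropy
print(round(ppl, 1))  # ~301.6, matching the reported 301.662 up to rounding
```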
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Limitations</head><p>The current dataset used for training includes relatively simple images (e.g., animals, plants, basic objects). One limitation of the model is its potential difficulty in scaling to more complex images, such as those with intricate details (e.g., architectural blueprints, detailed scientific diagrams). The model's ability to capture fine details may be limited by the size of the latent space and the number of hidden layers used in the VQ-VAE model. Complex tactile graphics might require a more fine-grained representation, which could lead to inefficiencies or inaccuracies in generation if the model architecture remains unchanged.</p><p>Furthermore, while the model performs well on simpler prompts (e.g., "a cow," "a tree"), more complex and nuanced prompts (e.g., "a group of children playing soccer with a spotted ball") might pose challenges. This is because the Transformer's encoding of textual information becomes more demanding as the semantic richness and length of the prompt increase. The model may struggle to disentangle and appropriately represent all components of a complex scene in tactile graphics form, leading to loss of information or oversimplification.</p><p>Regarding computational requirements, the training process of the proposed model, which integrates both the BART Transformer and the VQ-VAE, requires significant computational resources. Due to the autoregressive nature of the model and the need to process both textual and graphical latent spaces, training is computationally expensive. It requires powerful GPUs or TPUs, large memory capacity, and extended training time, particularly as the dataset grows. 
This makes scaling to larger datasets or higher-dimensional image outputs challenging without access to advanced computing infrastructure.</p><p>One of the key ethical concerns in the development of tactile graphics is ensuring that the generated images do not misrepresent the information. For visually impaired users, the tactile graphic is a primary means of understanding visual content, and any distortion or inaccuracy could lead to misunderstandings. For example, if a generated tactile graphic oversimplifies or omits important details, users might receive an incomplete or misleading representation of the intended information. To mitigate this risk, it's important to validate the model outputs rigorously against established standards for tactile graphics and seek feedback from visually impaired users to ensure that the tactile representations are both accurate and understandable.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>As a result of this research, a text-conditional tactile graphics generation model was developed using BART and VQ-VAE. The model employs a modified organization of the latent space, divided into two independent components: textual and graphic.</p><p>The method of creating tactile graphics for publications aimed at individuals with visual impairments has been improved. This enhancement increases the variability, controllability, and quality of synthesized tactile graphics, thereby improving the technical and economic aspects of the production process.</p><p>This technology can bridge the gap in access to educational materials, allowing visually impaired individuals to better engage with subjects that rely heavily on visual content, such as science, mathematics, and geography. The availability of automated tactile graphics can facilitate greater independence in learning and enhance participation in inclusive classrooms and professional environments.</p><p>An important direction of further research is to increase the size and diversity of the training sample to improve the general ability of the model to generalize and ensure its stable operation in various scenarios.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Structural and functional diagram of the text-conditional tactile graphics generation model</figDesc><graphic coords="4,81.15,220.87,441.08,243.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Samples of generated images determined by a text prompt given below the corresponding image</figDesc><graphic coords="5,81.12,633.60,88.80,88.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc></figDesc><table><row><cell>BART's parameters</cell><cell></cell><cell></cell></row><row><cell>Parameter</cell><cell>Encoder's value</cell><cell>Decoder's value</cell></row><row><cell>Dictionary size</cell><cell>8192</cell><cell>513</cell></row><row><cell>Sequence size</cell><cell>64</cell><cell>65</cell></row><row><cell>Number of layers</cell><cell>3</cell><cell>3</cell></row><row><cell>Layer dimension</cell><cell>512</cell><cell>512</cell></row><row><cell>FFN dimension</cell><cell>1024</cell><cell>1024</cell></row><row><cell>Number of attention heads</cell><cell>8</cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 VQ-VAE's parameters</head><label>2</label><figDesc></figDesc><table><row><cell>Parameter</cell><cell>Value</cell></row><row><cell>Image dimension</cell><cell>256 × 256 × 1</cell></row><row><cell>"Codebook" size</cell><cell>512</cell></row><row><cell>Latent vectors size</cell><cell>16</cell></row><row><cell>Number of hidden layers</cell><cell>5</cell></row><row><cell>Hidden layers dimension</cell><cell>16</cell></row></table></figure>
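The role of the "codebook" in Table 2 can be illustrated with the vector-quantization step of VQ-VAE: each encoder output vector is replaced by its nearest codebook entry, and that entry's index serves as a discrete graphic token. A minimal sketch with a random (untrained) codebook; the 8x8 latent grid is an assumed shape:

```python
import numpy as np

rng = np.random.default_rng(1)
codebook = rng.normal(size=(512, 16))   # Table 2: "codebook" size 512, latent vector size 16

def quantize(latents):
    """Map each latent vector to the index and value of its nearest codebook entry."""
    # pairwise squared Euclidean distances, shape (n_latents, 512)
    d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)              # discrete graphic-token ids
    return idx, codebook[idx]           # quantized vectors z_q

latents = rng.normal(size=(64, 16))     # e.g. a flattened 8x8 latent grid (assumption)
idx, zq = quantize(latents)
print(idx.shape, zq.shape)              # (64,) (64, 16)
```

Because quantization is a lookup, the generation model only needs to predict the integer ids; the image decoder reconstructs the tactile graphic from the corresponding codebook vectors.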
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc></figDesc><table><row><cell>Evaluation results</cell><cell></cell><cell></cell></row><row><cell>Component</cell><cell>Metric</cell><cell>Value</cell></row><row><cell>BART</cell><cell>Cross-Entropy</cell><cell>5.709</cell></row><row><cell></cell><cell>Perplexity</cell><cell>301.662</cell></row><row><cell>VQ-VAE</cell><cell>MSE (image space)</cell><cell>0.0144</cell></row><row><cell></cell><cell>MSE (latent space)</cell><cell>0.0058</cell></row><row><cell></cell><cell>FID</cell><cell>0.242</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0">\bar{z} = \{ e_k \in Z \mid k = g_i \in \bar{g} \}_{i=1}^{d_z}; \quad \bar{z} \subseteq Z, (6)</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Blindness and Vision Impairment Collaborators; Vision Loss Expert Group of the Global Burden of Disease Study. Trends in prevalence of blindness and distance and near vision impairment over 30 years: an analysis for the Global Burden of Disease Study</title>
		<author>
			<persName><surname>Gbd</surname></persName>
		</author>
		<idno type="DOI">10.1016/S2214-109X(20)30425-3</idno>
	</analytic>
	<monogr>
		<title level="j">Lancet Glob Health</title>
		<imprint>
			<biblScope unit="page" from="e130" to="e143" />
			<date type="published" when="2019">2019. 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">World blindness and visual impairment: Despite many successes, the problem is growing</title>
		<author>
			<persName><forename type="first">P</forename><surname>Ackland</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Serge</forename><surname>Resnikoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bourne</surname></persName>
		</author>
		<idno type="PMID">29483748</idno>
	</analytic>
	<monogr>
		<title level="j">Community Eye Health Journal</title>
		<imprint>
			<biblScope unit="page" from="71" to="73" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Graphic Reading Performance of Students with Visual Impairments and Its Implication for Instruction and Assessment</title>
		<author>
			<persName><forename type="first">K</forename><surname>Zebehazy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wilton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Visual Impairment &amp; Blindness</title>
		<imprint>
			<biblScope unit="volume">115</biblScope>
			<biblScope unit="page" from="215" to="227" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">A Systematic Literature Review on the Automatic Creation of Tactile Graphics for the Blind and Visually Impaired</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mukhiddinov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Soon-Young</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The Effect of Tactile Illustrations on Comprehension of Storybooks by Three Children with Visual Impairments: An Exploratory Study</title>
		<author>
			<persName><forename type="first">F</forename><surname>Bara</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Visual Impairment &amp; Blindness</title>
		<imprint>
			<biblScope unit="volume">112</biblScope>
			<biblScope unit="page" from="759" to="765" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">An Approach Towards Vacuum Forming Process Using PostScript for Making Braille</title>
		<author>
			<persName><forename type="first">V</forename><surname>Mayik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Dudok</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mayik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Lotoshynska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Izonin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kusmierczyk</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-03877-8_4</idno>
	</analytic>
	<monogr>
		<title level="m">Advances in Computer Science for Engineering and Manufacturing</title>
				<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="38" to="48" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Analysis of the process of preparing illustrations for inclusive literature</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Dzhurynskyi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mayik</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Qualilogy of the book</title>
		<imprint>
			<biblScope unit="volume">41</biblScope>
			<biblScope unit="page" from="7" to="15" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Towards Automatic Generation of Tactile Graphics</title>
		<author>
			<persName><forename type="first">T</forename><surname>Way</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Barner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Rehabilitation Engineering and Assistive Technology Society of North America</title>
		<imprint>
			<biblScope unit="page" from="161" to="163" />
			<date type="published" when="1996">1996</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Automatic visual to tactile translation -Part I: Human factors, access methods, and image manipulation</title>
		<author>
			<persName><forename type="first">T</forename><surname>Way</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Barner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Rehabilitation Engineering</title>
		<imprint>
			<biblScope unit="page" from="81" to="94" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Automatic visual to tactile translation. II. Evaluation of the TACTile image creation system</title>
		<author>
			<persName><forename type="first">T</forename><surname>Way</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Barner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Rehabilitation Engineering</title>
		<imprint>
			<biblScope unit="page" from="95" to="105" />
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Automatic image conversion to tactile graphic</title>
		<author>
			<persName><forename type="first">T</forename><surname>Ferro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Pawluk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility</title>
				<meeting>the 15th International ACM SIGACCESS Conference on Computers and Accessibility<address><addrLine>Bellevue Washington</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Communicating Photograph Content Through Tactile Images to People With Visual Impairments</title>
		<author>
			<persName><forename type="first">K</forename><surname>Pakėnaitė</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nedelev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kamperou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Proulx</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hall</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Frontiers in Computer Science</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Pic2Tac: Creating Accessible Tactile Images using Semantic Information from Photographs</title>
		<author>
			<persName><forename type="first">K</forename><surname>Pakėnaitė</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kamperou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Proulx</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Hall</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eighteenth International Conference on Tangible, Embedded, and Embodied Interaction</title>
				<meeting>the Eighteenth International Conference on Tangible, Embedded, and Embodied Interaction<address><addrLine>Cork</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">Instructions for creating and adapting illustrations and typhlographic materials for blind students</title>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note>Polish association of the blind</note>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<title level="m" type="main">Guidelines and Standards for Tactile Graphics</title>
		<ptr target="https://www.brailleauthority.org/guidelines-and-standards-tactile-graphics" />
		<imprint>
			<date type="published" when="2022">2022, accessed 20 April 2024</date>
			<publisher>Braille Authority of North America &amp; Canadian Braille Authority</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">The Creativity of Text-to-Image Generation</title>
		<author>
			<persName><forename type="first">J</forename><surname>Oppenlaender</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th International Academic Mindtrek Conference</title>
				<meeting>the 25th International Academic Mindtrek Conference<address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>Academic Mindtrek &apos;22</note>
</biblStruct>

<biblStruct xml:id="b16">
	<monogr>
		<title level="m" type="main">Hierarchical Text-Conditional Image Generation with CLIP Latents</title>
		<author>
			<persName><forename type="first">A</forename><surname>Ramesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Dhariwal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Nichol</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Chu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Chen</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2204.06125</idno>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">High-Resolution Image Synthesis with Latent Diffusion Models</title>
		<author>
			<persName><forename type="first">R</forename><surname>Rombach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Blattmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lorenz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Esser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ommer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<meeting><address><addrLine>New Orleans, Louisiana</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Preparation of illustrations for inclusive literature using artificial intelligence models of image synthesis from text</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Dzhurynskyi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Mayik</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings</title>
		<imprint>
			<biblScope unit="volume">66</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="155" to="163" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Generation of illustrations for inclusive literature using Midjourney artificial intelligence model</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Dzhurynskyi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the II International Scientific and Theoretical Conference (SCIENTIA)</title>
				<meeting><address><addrLine>Zagreb</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>collection of scientific papers</note>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Diverse Image Inpainting with Bidirectional and Autoregressive Transformers</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Zhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Cui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Miao</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2104.12335</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Neural Discrete Representation Learning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Van Den Oord</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Vinyals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kavukcuoglu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1711.00937</idno>
	</analytic>
	<monogr>
		<title level="j">CoRR</title>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Influence of Hadamard matrices canonicity on image processing</title>
		<author>
			<persName><forename type="first">Kh</forename><surname>Kulchytska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Semeniv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Kovalskyi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Pysanchyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Selmenska</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-03877-8_29</idno>
	</analytic>
	<monogr>
		<title level="m">ISEM &apos;21</title>
				<editor>
			<persName><forename type="first">Z</forename><surname>Hu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Petoukhov</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Yanovsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>He</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">463</biblScope>
			<biblScope unit="page" from="329" to="338" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A Formal Perspective on Byte-Pair Encoding</title>
		<author>
			<persName><forename type="first">V</forename><surname>Zouhar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Meister</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">L</forename><surname>Gastaldi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Vieira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Sachan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cotterell</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: ACL 2023</title>
				<meeting><address><addrLine>Toronto</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Autoregressive Models: What Are They Good For?</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dalal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Taori</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1910.07737</idno>
	</analytic>
	<monogr>
		<title level="j">CoRR</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Generating Sequences With Recurrent Neural Networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Graves</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1308.0850</idno>
	</analytic>
	<monogr>
		<title level="j">CoRR</title>
		<imprint>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<monogr>
		<title level="m" type="main">LanguageTool API NLP UK</title>
		<author>
			<persName><forename type="first">A</forename><surname>Rysin</surname></persName>
		</author>
		<ptr target="https://github.com/brown-uk/nlp_uk" />
		<imprint>
			<date type="published" when="2022">2022, accessed 21 April 2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<monogr>
		<ptr target="https://imagelibrary.aph.org/portals/aphb/#page/welcome" />
		<title level="m">Tactile Graphic Image Library</title>
				<imprint>
			<publisher>American Printing House</publisher>
			<date type="accessed" when="2024-04-21">Accessed 21 April 2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">CLIPScore: A Reference-free Evaluation Metric for Image Captioning</title>
		<author>
			<persName><forename type="first">J</forename><surname>Hessel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Holtzman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Forbes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">Le</forename><surname>Bras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Choi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2104.08718</idno>
	</analytic>
	<monogr>
		<title level="j">CoRR</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
