=Paper=
{{Paper
|id=Vol-3396/paper23
|storemode=property
|title=Peculiarities of an Information System Development for Studying Ukrainian Language and Carrying out an Emotional and Content Analysis
|pdfUrl=https://ceur-ws.org/Vol-3396/paper23.pdf
|volume=Vol-3396
|authors=Taras Basyuk,Andrii Vasyliuk
|dblpUrl=https://dblp.org/rec/conf/colins/BasyukV23
}}
==Peculiarities of an Information System Development for Studying Ukrainian Language and Carrying out an Emotional and Content Analysis==
Peculiarities of an Information System Development for Studying Ukrainian Language and Carrying out an Emotional and Content Analysis Taras Basyuk, Andrii Vasyliuk Lviv Polytechnic National University, Bandera str.12, Lviv, 79013, Ukraine Abstract This article analyzes the existing methods and well-known systems that provide tools for learning the Ukrainian language and describes the mechanisms for evaluating language skills. The technologies and software tools for conducting emotional content analysis were analyzed, which made it possible to identify the main shortcomings of the existing approaches and showed the relevance of the research. Models of structuring and formalization of knowledge and information presented in the content of the educational environment have been developed in order to provide a basis for the development and programmatic implementation of methods of individualized user access to the requested educational materials. The design of the software system was carried out using a structural approach, with the created diagrams presented in accordance with the IDEF0 standard. The study presents a functional model and its decomposition, which created a basis for understanding how the Ukrainian language learning system functions and how emotional and content analysis is implemented. The mobile application is built using the Flutter framework and written in the Dart language, providing fast compile times and dynamic reloading without the need for a restart. Natural language processing is implemented by a separate module that provides tokenization and parsing, lemmatization/stemming, part-of-speech tagging, and identification of semantic relationships.
The process of implementing emotional content analysis is described, which includes the following stages: selection of a working algorithm, selection/creation of the sentiment dictionary, creation of rules, collection, cleaning and storage of data, emotional content analysis, output and visualization of results. A software tool has been created that works in prototype mode and implements the described functionality. Keywords Ukrainian language, education, skills assessment, information system, emotional content analysis
COLINS-2023: 7th International Conference on Computational Linguistics and Intelligent Systems, April 20–21, 2023, Kharkiv, Ukraine EMAIL: Taras.M.Basyuk@lpnu.ua (T. Basyuk); Andrii.S.Vasyliuk@lpnu.ua (A. Vasyliuk) ORCID: 0000-0003-0813-0785 (T. Basyuk); 0000-0002-3666-7232 (A. Vasyliuk) © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)
1. Introduction Learning any language is relevant in the context of business, literature, culture, or politics. Knowing a language enables people to follow events from relevant primary sources, draw conclusions, and form a personal opinion and position regarding the chosen topic. In the modern pace of life, the world requires a person to have global knowledge, the best and most innovative of which can be obtained only by using various primary sources of information. At the same time, mastering any language makes it possible to communicate with a broader circle of people and to learn about culture, customs, and values from native speakers [1]. As for the Ukrainian language, the problem of learning it today is widespread, both among foreign citizens traveling for business, and among residents of our country who, due to objective circumstances, want to improve their language skills. The Ukrainian language ranks 44th in the world in terms of the number of speakers, which is 45 million people who use it in everyday life or consider it their native language [2]. As you know, the most popular way to master it is to attend language courses or classes with tutors. However, these methods are costly and time-consuming and require changing schedules, which is not always convenient. Because of that, given the popularity of information technologies, the decision to make language learning accessible to anyone using software applications [3,4] is promising. Another aspect of the research is the automated analysis of emotions expressed in the form of Internet comments about goods and services in the Ukrainian segment of the global network. The mentioned problem is relevant since more and more Ukrainian-language resources are being created on the territory of Ukraine, which do not include means for the automated analysis of reviews. Today, thanks to the Internet, customers share their experiences and opinions about products and services much more openly, and companies, in order to receive feedback, create a significant number of means of communication: Telegram channels, hotlines, forms for evaluating customer satisfaction, chatbots, etc. However, dissatisfied customers often do not use these resources but share their experience on social networks, online resources with reviews, or personal blogs. These reviews can significantly affect the company's reputation, therefore it is essential to track various mentions of the brand and determine customer satisfaction depending on the emotional and meaningful analysis [5]. Therefore, an urgent task is to develop an information system for learning the Ukrainian language and to introduce means of emotional and meaningful analysis into it, which will provide additional means of overcoming the language barrier and analyzing user satisfaction.
1.1 Analysis of recent research and publications 1.1.1 Known methods of language learning After analyzing the literary sources, we can conclude that there are many ways of learning a language, and everyone can choose the most convenient one, but there are three main approaches: structuralist (represents the language as a system of structurally interconnected elements with some hidden meaning), functional (presupposes language as a mechanism for performing various functions, for example, expressing one's opinion or requesting a certain service) and interactive (considers language as a means for creating and maintaining social connections, focusing around negotiations, activities, everyday communication, etc.). Among the most famous structuralist methods, we can single out [6]: ● Grammatical translation method. The main thing in this method is that listeners focus on learning the rules of grammar and increase their vocabulary by learning literal translations by heart. The main purpose of the application is to study classical languages. At the same time, one of the tasks of the method is that after completing the study, the students should be able to freely use language techniques such as spelling, grammar, and vocabulary, for comfortable reading, writing, and understanding of texts in various contexts. Learning grammatical rules increases the understanding that language can be represented as a system to be analyzed. ● Audio-language method. The purpose of this method is to develop language skills in a strictly defined order: listening-speaking-reading-writing. Learning consists of certain skills formation with the help of multiple, mechanical material repetitions. The task is the study of standard structures and the ability to apply them according to the situation. The result of language learning is monitored by using specially created test tasks. 
The advantage of this method is the development of situational techniques for the new grammatical material presentation and a specific set of tasks. As for functional methods, the oral approach to situational language learning has become the most widespread among them. This method was developed by applied linguists from Britain, Harold E. Palmer and A. S. Hornby, who conducted significant research on language learning, found that the most attention should be paid to reading skills, and defined the concept of "vocabulary control". According to this, to learn a language, it is necessary to memorize a vocabulary of about 2,000 words that are most often found in written texts, and it was assumed that understanding these words allows you to create a basis for understanding texts in the process of reading. At the same time, the concept of "grammatical control" appeared, which focused attention on grammatical constructions that are most often encountered in conversation. After that, these constructions were added to dictionaries and guides for listeners. A significant difference between the direct method and the oral approach was that the methods formed based on this approach became theoretical foundations that indicate how to choose content, structure the complexity of tasks, and present materials. It is believed that this approach allows students to acquire certain skills that are repeated in certain situations: presentation (introduction of new material in context), practice (controlled practical phase), and production (classes are built on less controlled practice) [7]. As for interactive methods, it is worth noting among them: ● The direct method, which is often called the natural method, consists of the listeners avoiding their native language and communicating only in the language being studied [8].
This method is based on the idea that learning another language should be similar to learning a native language: a child never pays attention to another language in order to learn the native language, and in the same way, the native language is not necessary when learning a foreign language. The method emphasizes correct pronunciation. According to it, listeners should focus on conversations and avoid text until they acquire speaking skills. Anything related to grammar and writing should be avoided, as it slows down the acquisition of speaking skills. Classes begin with the student learning simple words such as window, pen, table, etc. This motivates the listener, who feels that the language is being mastered immediately. Eventually, the lessons progress to verb forms and other grammatical constructions, with the goal of learning about twenty new words in each lesson. ● The scenario method is a type of direct method when teaching is directly related to the language being studied. According to François Gouin, listeners learn the language faster when it is presented in chronological order. Namely, individual expressions are studied based on activities, in the order in which they occur, for example, leaving the house, going to the bus stop, getting on public transport. He noticed that if this sequence is broken, memorizing sentences becomes almost impossible. Another of his observations regarding memory was the concept of "incubation", which determines the time of "memorization", namely, the period of time during which linguistic ideas should be stored in the memory under the condition of their use in speech [9]. ● The language "immersion" method is used in the process of teaching other disciplines in the language being studied.
The term "immersion" in the context of learning foreign languages for a professional direction with the use of integrated special courses has two main definitions: first, it is a method of learning a foreign language by teaching one or more disciplines in this language; second, it is a special type of integrated foreign language learning, the purpose of which is to master a foreign language for special purposes. Full immersion in a foreign language, and translation, in particular, occurs gradually. To achieve the effect of language immersion, the teacher carefully selects the educational material, arranges the tasks, and clearly plans the work at seminars and practical classes. The teacher should be guided by the following principles: the content of the analyzed authentic texts should include information that expands professional knowledge; tasks for professionally oriented texts are grouped according to the principles of gradual complexity; the nature of tasks changes from illustrative and reproductive to constructive and creative [10, 11]. The given list of methods used in learning foreign languages is not exhaustive. But the conducted analysis shows that an ideal method does not exist and is unlikely to exist [12], because each person, as an individual, is characterized by different skills that should be taken into account in the learning process. Therefore, it can be concluded that the specified mechanisms can be provided through an adaptive approach, which is planned to be implemented in the constructed system. 1.1.2. Peculiarities of emotional content analysis Emotional content analysis is one of the areas of natural language processing, the purpose of which is to distinguish the thoughts and moods of people from written texts. Different types of analysis try to get different results: feelings, emotions, personal experience, and relation to a product or service.
Features of emotional content analysis are the need to filter a significant amount of data; data analysis is performed in real-time; the application of the "sequence of criteria", namely the labeling of text depending on emotions, is subjective, influenced by personal experiences, thoughts and beliefs. By using a centralized sentiment analysis system, companies can apply the same criteria to all their data, which helps to increase the accuracy of the results obtained [13]. Sentiment analysis is extremely important in many fields, from business to politics. Today, it is widely used both for processing reviews about brands, and for monitoring opinions in social networks about upcoming elections, attitudes towards certain events, or following certain processes or activities. The main areas of application of sentiment analysis are monitoring of social networks, monitoring of brand position, determination of the voice of the customer (Voice of the customer), and customer service [14]: ● Monitoring of social networks. With the proliferation of social media, people's opinions have become more accessible than ever. With the use of social networks people most often share their emotions about services, products and express their opinion about certain social problems. If earlier it was enough for brands to create a good advertisement and a convenient website with a few dozen reviews, thus forming a positive image of their company, now the quick processing of feedback from customers comes first. At the same time, a significant number of mentions and reviews do not always indicate the quality of the company's work. It is important to understand what exactly customers are saying. Using the analysis of emotions in social networks, you can find potential customers, monitor the work of competitors, and identify urgent problems before they get out of control. ● Brand monitoring. 
Brands receive feedback not only from social networks, but also from news sites, blogs, forums, product reviews, and more. Analysis of moods and opinions provides an opportunity to collect feedback and additionally obtain information about the geographical location, gender, and age of customers. This helps to correctly form the target audience and choose the correct development strategy. Analysis of these factors helps to understand how the image of the brand changes over time and to compare it with the image of competitors. It allows you to identify potential PR crises in real time and take action before they become serious problems. ● Voice of the customer. Social media and brand monitoring provide general information about customer sentiment that needs to be further structured. This information can be collected both by extracting (parsing) data from resources and by using surveys. Net Promoter Score (NPS) surveys are one of the most popular ways for companies to get feedback by asking a simple question: Would you recommend a company, product and/or service to a friend or family member? These surveys yield a single score on a numerical scale. Numerical survey data are easy to summarize and evaluate, but they do not explain the reasons for choosing a particular score. Adding a follow-up question, such as "Why did you leave such a rating?", makes it possible to clarify possible problems both in the company itself and in its external policy. Such questions provide more information, but their answers are harder to process manually, which makes their automated emotional and content analysis an urgent task. ● Customer service. A high level of service is as important as the product itself. Customers expect that working with the company will be fast, intuitive, customer-oriented, and hassle-free. If the experience turns out negative, the customer will switch to a competitor.
Sometimes a single negative experience is enough for a customer to decide to change companies. Analyzing customer interactions allows you to ensure that employees follow the appropriate protocol, are loyal, and do not make mistakes during service. 1.1.3. Analysis of known systems The main problem in learning any language is that there are many analogous systems, most of which are not comprehensive. Since the given task consists of two subtasks, the analysis of the existing solutions will be carried out in two directions: systems that specialize in learning the Ukrainian language and systems that are used for emotional and content analysis. Well-known systems for learning the Ukrainian language: ● A national platform for studying the Ukrainian language of the Ministry of Culture and Information Policy of Ukraine [15]. This resource contains online and offline resources for learning the Ukrainian language, both for citizens of Ukraine who want to improve their Ukrainian language skills and for foreigners who want to learn it. Among the disadvantages, it is possible to identify significant bulkiness and excessive informativeness, and therefore the difficulty of choosing the necessary tool. ● The project "Language is the DNA of the Nation" [16]. An educational project for those who want to improve their knowledge of the Ukrainian language. Learning takes place in a game form. All the rules, exercises, and other things are presented by the fictional hero Lepetun. He is depicted on each card with rules and tasks, which engages associative memory and helps to learn the language faster. The application has the following main functions: learning the basic rules of spelling; description of phraseological units and synonyms; correct accentuation of words; exercises for mastering the rules; instructions on how to get rid of Russianisms.
The following shortcomings can be identified: a small theoretical base; only spelling rules are present; a small number of exercises; there is no possibility to determine the level. ● R.I.D. mobile application [17] is an application created to improve and deepen knowledge of the Ukrainian language and culture in the country and beyond. A feature of the application is the presence of a set of tasks and exercises that have certain restrictions on use. In general, the application is characterized by functional stability and a developed interface. The lack of learning progress and infrequent updates can be identified among the disadvantages. ● Language Course S.L. [18] is a lexical simulator for learning the Ukrainian language. The main emphasis of the system is on the study of the spoken Ukrainian language for use in travel, business, and education. The application contains a Ukrainian dictionary with flashcards and a translation of 10,000 words and is free to use. Unstable operation and the presence of errors during operation can be identified among the disadvantages. ● Learn languages - Mondly [19] is an application that provides tools for learning about 30 languages, including Ukrainian. A feature of the system is the presence of daily lessons, which focus on memorizing words, constructing sentences, and practicing dialogues. The app uses a special learning technique that focuses on memorizing keywords to build sentences and phrases. Its disadvantages include superficial language constructions and the commercial nature of the application. ● Let's Remember And Learn Ukrainian [20] is a developing application that is a mobile tutor for the independent study of vocabulary and phonetics at the elementary level. The list of words is specially selected from various topics that are used in everyday life.
This self-tutor allows you to effectively learn the correct pronunciation and writing, due to the presence of visual and audio support. Properly organized educational material will help you learn words quickly and easily, and the learning process itself is built in several stages: training (allows you to remember nouns, adjectives, verbs, and the alphabet); testing and verification of learned words are organized in the form of gamification methods: reading and association, visualization and spelling. Among the shortcomings, one can identify the commerciality of the application, the presence of advertising, and the lack of detailed results demonstrating the level of language knowledge. Regarding the systems that are used for emotional content analysis, among the most popular, the following were highlighted: ● Awario [21] is a social network monitoring and analysis tool. It covers major social media networks, news, blogs, and forums. Built-in sentiment analysis sorts brand mentions into positive, negative, and neutral. Next, the program creates a graph that shows how mentions of the brand have changed over time and how sentiment has changed. In addition to emotions and sentiments, you can also see the most frequently used topics related to the company. Among the shortcomings can be identified: lack of support for the Ukrainian language and the commercial nature of the application. ● Talkwalker [22] is one of the leading American software products for analyzing social data, which helps companies and agencies collect and interpret information for reputation management, as well as find ideas for marketing and PR actions. The list of key features of the system includes team communications, channel management, content analysis, automated alerts, and intelligent media processing. Using this software, brand managers can monitor online forum discussions, receive alerts on social activities, access trend forecasting charts, and respond to comments in real time.
The disadvantages of this service are determined by the work in the English-speaking segment of the global network and the significant cost of use. ● Critical Mention [23] - differs from other options in that it analyzes news and other publications containing references to a specific company. With its help, you can quickly find out about all new mentions and react in time. Since news production is now a 24/7 job, such software can help monitor feedback on online resources and social networks. ● Lexalytics [24] is a business analytics solution that analyzes different types of text. Lexalytics works with social media comments, polls, reviews, and other text content. In addition to emotional content analysis, the tool performs categorization, topic extraction, and intent detection, which can make it easier for companies to track extended context and understand business strengths and weaknesses. ● Social Searcher [25] is a social network monitoring platform that includes a sentiment analysis tool. To get started, you need to enter a keyword, hashtag or username, and Social Searcher will tell you what sentiments characterize the specified keyword. This platform forms reports in accordance with the source of communication, which allows you to get detailed information depending on the resource. The peculiarity of these programs is the work with English-language content and the commerciality of the application. As for the Ukrainian language, today there is no industrial model that would carry out an emotional and content analysis of reviews, which is primarily due to the small base of corpora for conducting research. 
The main projects include: ● The corpus of the Ukrainian language GRAK is a representative, structured collection of texts in the Ukrainian language, accompanied by a program that allows you to build your own subcorpora based on the corpus, search for words, grammatical forms, and their combinations, as well as process search results, sort, make balanced samples and receive various statistical information. The corpus is intended for linguistic research on grammar, vocabulary, and history of the Ukrainian literary language, as well as for use when compiling dictionaries and grammar [26]. ● The Lang-uk project [27] is an open community of people (software developers, linguists, and researchers) who are passionate about natural language processing and intelligent text processing. The Lang-uk community is focused on the development of new projects for the collection of Ukrainian corpora and other text resources. The Lang-uk project website features open-source corpora and dictionaries. Among them: vector representations of words (Word2Vec, LexVec, and GloVe 300d vectors), a lemmatized version of the vectors was built using the Ukrainian POS tag dictionary. ● Linguistic portal Mova.info [28]. A corpus of the Ukrainian language that provides means of searching by subcorpus and by morphological features. In this resource, frequency dictionaries for multiple sections and authors, grammar dictionaries, and dictionaries of comparisons have been created separately. The conducted analysis showed that the available software is characterized by significant shortcomings, which makes the task of creating an information system for studying the Ukrainian language with the possibility of emotional and meaningful content analysis an urgent task. 1.2 The main tasks of the research and their significance The purpose of the research is the development of an information system for learning the Ukrainian language and the implementation of emotional content analysis. 
The conducted research will provide means for creating on its basis software for managing informational and educational content, generating an individual learning environment for each user, and providing functions for emotional and meaningful analysis of user feedback. To achieve the goal, the following tasks must be solved: analyze the existing approaches, methods and software tools used in the field of learning and evaluating foreign language skills; determine the main tasks that arise at the same time; develop models of structuring and formalization of knowledge and information presented in the content of the educational environment, in order to provide a basis for the development and programmatic implementation of methods of individualized user access to the requested educational materials; to implement a mobile application for learning the Ukrainian language and providing tools for emotional content analysis. The results of the study solve the actual scientific and practical task of activating the meaningful and motivational side of the educational process and will provide the means to open additional opportunities for individualization and differentiation of Ukrainian language learning and monitoring of emotional content analysis. 2. Major research results In order to present the main aspects of the studied subject area, its conceptual scheme was built, which is shown in Fig. 1. As can be seen from the figure, the key tasks in the creation process are the construction of a database model, the development of methodological and algorithmic foundations of the subsystem for the construction of an individual learning and communication environment, the informatization of the learning management process, and the provision of mechanisms for emotional and meaningful content analysis. 
The user receives individualized access to the system's resources with the help of the subsystems of organizing the educational request, building an individual learning environment, the learning management subsystem, and the emotional content analysis subsystem. Figure 1: Conceptual diagram of the information system Further work was aimed at conducting a systematic analysis of the subject area using the methodology of functional modeling and graphic description of processes. For these purposes, a structural approach and the IDEF0 standard, which is intended for the formalization and description of business processes, were used. A context diagram reflecting the process of forming an individual learning environment for learning the Ukrainian language is presented in Fig. 2. In the specified model, information is received from the listener, which contains individual educational goals, interests, the purpose of using the system, etc., and knowledge of the subject area and educational materials. With the help of internal models, the educational system forms an individual learning environment. Support for individualization and adaptability of the educational process should be implemented on the basis of content intellectualization, which should be ensured at the stage of its creation [29]. In turn, intellectual content is the central entity for knowledge management in the context of the synthesis of an information system for learning the Ukrainian language and the implementation of emotional and substantive analysis in the process of communication. Figure 2: Functional model of the designed system The created database of formalized educational content at the stage of knowledge management can be used as a knowledge portal, which presents informational and educational resources in a structured form for familiarization with the grammatical features of the Ukrainian language. 
On the other hand, such a database, thanks to the application of methods of generating an individual educational environment, serves as a basis for building relevant educational courses that meet the individual educational goals and interests of the student, thereby ensuring individualized access for users to the requested information [30]. A separate function of the system is emotional content analysis, which will be carried out using algorithms and language processing methods. In order to detail the described functionality, we will perform the process of decomposition of the created model (Fig. 2). The decomposition process was carried out taking into account the demarcation of knowledge management processes, the organization of individualized training, and the implementation of emotional and meaningful analysis. As a result, the functional model presented in Fig. 3 was obtained. Figure 3: Functional model decomposition To implement the system based on the presented functional model, it is proposed to use a set of models and methods as the basis of software tools for managing educational content. This complex of models contains such components as a complex content model, a model of educational content organization, a model of knowledge control and diagnostics, a model of educational inquiry, a model of the subsystem of generating an individual learning environment, and a model of conducting emotional content analysis. Thanks to this, the functioning of the learning environment is ensured according to the concept in which knowledge management plays the role of preparing a repository or knowledge portal, and the organization of learning is based on the methods of using this repository as a generator of an individual learning environment. 
This approach is implemented by dividing the work of the system into three levels: the level of knowledge management, the level of organization of individual training, and the level of emotional content analysis. Knowledge management is aimed at forming a didactically oriented knowledge base of the subject area in which training takes place. The organization of individual training is based on the use of formalized knowledge, obtained at the first level, to build an individual learning environment. The organization of emotional content analysis is based on natural language processing and is implemented by a separate module that provides tokenization and syntactic analysis, lemmatization/stemming, part-of-speech tagging, and identification of semantic relationships. After further decomposition of the presented model (Fig. 3), we obtain a detailed functional model of the system (Fig. 4).
Figure 4: Detailed functional model
The level of knowledge management involves two key functions: the distribution of educational material by level of complexity and the formalization of content. These functions are based on the educational process organization model and the comprehensive educational Web-content model, respectively. In this way, knowledge of the subject area is formalized with the help of educational content and its semantic component [31]. The emotional content analysis level uses a set of created rules to determine the main context. These rules rely on emotion dictionaries, which define lists of words with positive or negative connotations. The operation of the subsystem includes the following stages:
● identifying lists of polarized words (negative words, such as bad, terrible, etc., and positive words, such as good, kind, etc.) and assigning numerical values to them;
● adding rules that have the greatest impact on the meaning of the sentence, such as the use of "not" and other negative constructions;
● splitting the input text into tokens and counting the number of positive and negative words in the source text;
● collecting statistics on user sentiments.
The process of implementing emotional content analysis can be divided into several stages: selection of the algorithm for the system, selection/creation of a dictionary of sentiments, creation of rules, collection, cleaning and storage of data, emotional content analysis, and output and visualization of results.
Selection of the system operation algorithm. A sentiment analysis system based on rules and sentiment dictionaries can work in several ways. The simplest and least effective systems use two sentiment dictionaries, positive and negative: the text is broken into words, and the number of words of each polarity is counted. Whichever polarity has more words determines the result. This approach has low accuracy because it does not take into account the order of words in the sentence or negative constructions. Another approach also uses sentiment dictionaries, but they are built according to a different principle: each word is assigned a value within certain limits (most often from -1 to 1) [33]. Neutral words receive 0, words with a negative connotation receive negative values, and words with a positive connotation receive positive values. The value itself is chosen according to the intensity of the emotional coloring. Such systems analyze both the polarity of the words and their order in the sentence, to which the rules apply (for example, "not" before a word reverses its polarity). The second option was used in this research.
Selection/Creation of sentiment dictionary.
There are several sentiment dictionaries for the Ukrainian language, but most of them do not grade words by intensity; words are simply divided into positive and negative [33]. In addition, large dictionaries often contain words that have different meanings in different contexts. For example, the word "high" can be positive in the context of "high quality" and negative in the context of "high price". Such ambiguous words can negatively affect the results of the analysis. Therefore, in this study, it is proposed to create a prototype of an annotated dictionary containing words with a clearly expressed polarity, which are most often used when writing reviews about goods or services. The stages of creating such a dictionary are based on research [33]. To identify the most frequently used words, 500 reviews from various sources (Internet resources and social networks) were analyzed. On this basis, 200 words that determine the emotional coloring of a review were singled out. Each word is tagged pos (positive) or neg (negative) and is rated for polarity from -1 to 1. Words with a clearly defined positive coloring have the highest value, and strongly negative words have the lowest value. Since the evaluation of words is rather subjective, existing dictionaries for the English and Ukrainian languages were analyzed during the creation of the dictionary to establish evaluation criteria, according to which words take the following values: 0.3, 0.5, 0.7 and 1 (with positive and negative signs). The words were entered in the dictionary without endings; the stemming operation was used for this. Today, the literature describes many tools that perform lexical analysis (Stemka, MyStem) or shorten words (Porter stemmer, Paice/Husk Stemmer), but in most cases they do not support the Ukrainian language. In view of this, it is proposed to use as a stemmer the method described by T. Golub [34], which is based on a modification of Porter's algorithm [35] and does not require the use of a generated database, which reduces equipment requirements and the number of performed calculations [31].
Creating rules. As a result of the conducted research, a set of rules was formed, which is described below. In content analysis, one of the main aspects that significantly changes the meaning of a sentence in the Ukrainian language is the use of negation. A negation before a word must change the sign of its value to the opposite. However, it should be noted that "not bad" does not always equal "good". The polarity reverses, but the intensity decreases. Therefore, it is advisable not only to change the sign but also to reduce the intensity. Namely, if there is a negative particle in front of the word to be evaluated, the value of this word must be multiplied by a coefficient, which is set at -0.7. Example: sent_analysis('good'): 0.5, sent_analysis('not good'): -0.35. Reinforcing adverbs (very, extremely, incredibly, etc.) increase the intensity of the word that follows them. Therefore, if the word being analyzed is preceded by a reinforcing word, its value must be multiplied by a coefficient, which is set at 1.5. Example: sent_analysis('benign'): 0.5, sent_analysis('very benign'): 0.75. If a reinforcing adverb stands in front of a negative construction, it strengthens the entire construction, so both coefficients are applied. Example: sent_analysis('good'): 0.5, sent_analysis('very not good'): -0.525, that is, 0.5 × (-0.7) × 1.5. Often in a sentence, the reinforcing word separates "not" and an adjective, for example, "not very kind." The adverb "very" refers to the particle "not", so they must be analyzed together. "Not very good" is a weaker negation than "not good", so if the word being analyzed is preceded by "not" together with a reinforcing word, the value of the word must be multiplied by a coefficient, which is set at -0.5. The structure of the rules is shown as a binary decision tree (Fig. 5).
Figure 5: Decision tree
Collection, cleaning and data storage. The analysis showed that the data must be obtained from specific Internet resources: websites and social networks. For websites, a number of software libraries can be used for the automated downloading of reviews; among the most common is the Requests library. However, the downloaded content in most cases contains many markup elements and therefore requires data cleaning, which should be implemented using the BeautifulSoup library. Cleaning, or pre-processing, of data consists of a set of operations: converting text to lowercase, removing punctuation marks and semantically irrelevant text, which is often found in documents from HTML pages, and stemming. In this study, checking the language of the text and selecting only Ukrainian-language texts is implemented using the langdetect library. Next, the described operations are performed and the result is entered into the storage. After cleaning the data, emotional content analysis is carried out using the set of rules and the corresponding decision tree (Fig. 5). The result of this operation is a Python dictionary whose keys are reviews and whose values are their emotional content values, expressed as numbers from -1 to 1.
Output and visualization of results. Text data can be graphically represented in various ways, depending on the purpose [36].
Pie charts will be used to display the results of the emotional content analysis, showing the ratio of positive, neutral and negative reviews; histograms will show the distribution of the polarity of reviews over a certain time; and line graphs will be used to compare the emotional coloring of reviews in different periods (comparison of data by months, years, etc.).
The level of organization of individualized training contains a sequence of two key functions: the organization of an educational request and the generation of an individual learning environment. Information from the learner (individual learning goals, interests, language constructions) is received at the input of the educational request organization process. The generator of an individual learning environment, using the corresponding model, the structured knowledge and the educational request, outputs a set of individualized content, which, together with a set of learning support functions, forms an individual learning environment. As for the other roles, the administrator mainly performs mentoring functions, and the linguist performs system training.
The next stage was the development of the system using modern software tools. The developed system consists of two parts: a mobile application and a web platform for organizing the educational process and creating a comprehensive model of educational content. The mobile application was created using the Flutter framework [37], an open-source UI toolkit created by Google. Flutter applications are written in the Dart language [38] and run in a virtual machine during development and debugging, which provides fast compile times and hot reload without the need for a restart. The graphical interface is implemented with Flutter widgets.
In addition to the mobile application, a web platform has also been developed, the client and server parts of which are written using frameworks and libraries of the Python language [39]. To create and manage the database of the information system, the MySQL database management system was used in combination with the MySQL Workbench tool [40]. The developed system has an intuitive interface and implements the described functionality. On entering the system from the mobile application, the user is offered a choice of functionality: learning the Ukrainian language or emotional content analysis. If the user chooses to study the Ukrainian language, the main window with the main menu is displayed (Fig. 6).
Figure 6: Screenshots of system windows: main menu, tabs of practical and theoretical classes
This page shows a pull-down menu that allows the user to go to settings and notifications, or to use the main functions, that is, to open practical tasks, theory or level determination. After exiting this menu, the main interaction interface with three tabs is displayed. The first tab contains practical tasks. The system forms them based on the theoretical materials already completed, and they correspond to the level of the learner. The next tab contains theoretical material. If the user selects a topic, for example "Noun", a new card opens with the theoretical material on that topic (Fig. 7). There is also a tab with information about the user's level, namely: how much material remains to be completed to reach the next level, how many practical tasks have been completed at this level, and how much theoretical material has been completed. In the settings, the user can learn more about all the levels that can be achieved.
Figure 7: Windows that show system functionality
A feature of the system is the presence of various types of interactive tasks (Fig. 8), with which the user can comprehensively improve skills in mastering the Ukrainian language. The main categories, shown in the application as a list of cards, are "Categories of words", "Learn words" and "Pronunciation of words" (Fig. 8).
Figure 8: Windows that show advanced system functionality
As for emotional content analysis, the user enters the URL of the resource containing reviews of the product, service or brand, and access to this resource is checked. If the resource is accessible, the system downloads and cleans the reviews and analyzes them according to the described algorithm. The result can be presented in the form of a pie chart, a histogram and a line graph. As an example, we present the results of the emotional content analysis of 50 reviews of product_X from resource_Y during two periods (September-June) (Fig. 9).
Figure 9: The result of emotional content analysis.
For privacy reasons, the resource and product names are not specified. The information generated from the download is stored in the database with the following structure: the serial number of the review, the cleaned text of the review, the numerical value of the analysis, and the date the review was written. The polarity distribution diagram shows the ratio between positive, negative and neutral reviews. The graph of polarity distribution deserves special attention: it shows that in the second half of October, the polarity of reviews changed sharply from positive to negative, which may indicate problems with the existing products or an open conflict between the company and its customers. In any case, such a situation requires attention from the company.
3. Conclusion
As a result of the conducted research, the existing methods and well-known systems that provide tools for learning the Ukrainian language were analyzed, and the mechanisms for evaluating these skills were described.
The technologies and software tools for conducting emotional content analysis were analyzed, which made it possible to determine the features of existing approaches. As the analysis showed, many software systems exist today, but all of them have certain shortcomings, from being commercial products to offering limited functionality, which makes the task of developing an information system for learning the Ukrainian language and conducting emotional content analysis relevant. Models of structuring and formalization of knowledge and information presented in the content of the educational environment have been developed, providing the basis for the development and software implementation of methods for individualized user access to the requested educational materials. The next stage was the design of the software system using a structural approach, with the created diagrams presented in accordance with the IDEF0 standard. The study presents a functional model and its decomposition, which created a basis for understanding the peculiarities of the functioning of the Ukrainian language learning system and the implementation of emotional content analysis. An applied software system has been developed, which provides methodological and algorithmic foundations for building an individual learning and communication environment, supports informatization of the learning management process, and conducts emotional content analysis. During the development of the software product, the modular programming principle was used. This structure allows individual parts of the system to be modified in the future without impairing performance or losing functionality. The constructed prototype of the software system can be useful as an additional tool for anyone who is interested in learning the Ukrainian language and in emotional content analysis.
Further research will be aimed at testing and improving the system, eliminating conflicts and expanding functionality in accordance with defined requirements.
4. References
[1] R. Sushko, The voice and sounds of the native language, Apriori, 2020 (In Ukrainian).
[2] Wordtips - The 100 Most-Spoken Languages in the World, URL: https://word.tips/100-most-spoken-languages/
[3] S. Dixon, 100 Ways to Teach Language Online: Powerful Tools for the Online and Flipped Classroom Language Teacher, Wayzgoose Press, 2020.
[4] S. Kosslyn, Active Learning Online: Five Principles that Make Online Courses Come Alive, Alinea Learning, 2021.
[5] K. Ramandeep, Sentiment Analysis - From Theory to Practice, LAP LAMBERT Academic Publishing, 2017.
[6] C. Chapelle, The Handbook of Technology and Second Language Teaching and Learning (Blackwell Handbooks in Linguistics), Wiley-Blackwell, 1st edition, 2017.
[7] J. Algeo, British or American English?: A Handbook of Word and Grammar Patterns, Cambridge University Press, 2006.
[8] E. Spooner, Interactive Student Centered Learning, Rowman & Littlefield, 2015.
[9] K. Conrad, The Language Teaching Controversy, Rowley, Massachusetts: Newbury House, 1978.
[10] O. Kanishcheva, V. Vysotska, L. Chyrun, A. Gozhyj, Method of Integration and Content Management of the Information Resources Network. In: Advances in Intelligent Systems and Computing, 689, Springer, 2018, pp. 204-216.
[11] T. McConachy, I. Golubeva, M. Wagner, Intercultural Learning in Language Education and Beyond: Evolving Concepts, Perspectives and Practices: 38 (Languages for Intercultural Communication and Education), Multilingual Matters, 2022.
[12] A. Vasyliuk, T. Basyuk, Construction Features of the Industrial Environment Control System, Proceedings of the 5th International Conference on Computational Linguistics and Intelligent Systems (COLINS-2021). Volume I: Main Conference, Kharkiv, Ukraine, April 22-23, 2021, Vol-2870, pp. 1011-1025.
[13] W. Pedrycz, S. Chen, Machine Learning Sentiment Analysis and Ontology Engineering, Springer International Publishing, 2016.
[14] B. Liu, Sentiment Analysis: Mining Opinions, Sentiments, and Emotions, Cambridge University Press, 2020.
[15] National platform for studying the Ukrainian language of the Ministry of Culture and Information Policy of Ukraine, URL: https://speakukraine.net/
[16] "Language - DNA of the Nation", URL: https://ukr-mova.in.ua/
[17] R.I.D. it's time to become familiar with the language, URL: http://rid.ck.ua/
[18] Language Course S.L., URL: https://www.languagecourse.net/
[19] Learn languages - Mondly, URL: https://www.mondly.com/
[20] Let's Remember And Learn Ukrainian, URL: http://www.domosoft.biz/ua/gnlu.html
[21] Awario, URL: https://awario.com/
[22] Talkwalker, URL: https://www.talkwalker.com/
[23] Critical Mention, URL: https://www.criticalmention.com/
[24] Lexalytics, URL: https://www.lexalytics.com/
[25] Social Searcher, URL: https://www.social-searcher.com/
[26] M. Shvedova, R. von Waldenfels, S. Yarygin, A. Rysin, V. Starko, T. Nikolayenko, General regionally annotated corpus of the Ukrainian language (GRAC), Jena, 2017.
[27] Project Lang-uk, URL: https://www.lang.org.ua/uk/
[28] Linguistic portal Mova.info, URL: http://www.mova.info/
[29] D. Nayab, English Teachers' Attitudes in acquiring Grammatical competence: By using Grammar Translation Method and Communicative Language Teaching at Graduate Level (In The Context of Punjab), LAMBERT Academic Publishing, 2020.
[30] O. Naum, L. Chyrun, O. Kanishcheva, V. Vysotska, Intellectual System Design for Content Formation. In: Computer Science and Information Technologies, Proc. of the Int. Conf. CSIT, 2017, pp. 131-138.
[31] T. Basyuk, A. Vasyliuk, V. Lytvyn, Mathematical Model of Semantic Search and Search Optimization, Proceedings of the 3rd International Conference on Computational Linguistics and Intelligent Systems (COLINS-2019). Volume I: Main Conference, Kharkiv, Ukraine, April 18-19, 2019, Vol-2362, pp. 96-105.
[32] L. Manika, M. Margam, Application of sentiment analysis in libraries to provide temporal information service: a case study on various facets of productivity. Social Network Analysis and Mining, 8 (1), 2018, pp. 1-12.
[33] A. Romanyuk, M. Romanishyn, Tonal dictionary of the Ukrainian language based on the sentiment-annotated corpus, URL: http://nbuv.gov.ua/UJRN/Um_2013_43_10
[34] T. Golub, Yu. Tyagunova, The method of Ukrainian language stemming for the classification of documents based on Porter's algorithm. In: Scientific works of the Donetsk National Technical University, vol. 1, 2017, pp. 59-63 (In Ukrainian).
[35] M. Porter, An algorithm for suffix stripping. Program (now Data Technologies and Applications), vol. 40(3), 2006, pp. 211-218.
[36] T. Basyuk, A. Vasyliuk, Approach to a Subject Area Ontology Visualization System Creating, Proceedings of the 5th International Conference on Computational Linguistics and Intelligent Systems (COLINS-2021). Volume I: Main Conference, Kharkiv, Ukraine, April 22-23, 2021, Vol-2870, pp. 528-540.
[37] F. Zammetti, Practical Flutter: Improve Your Mobile Development, Apress L. P., 2019.
[38] S. Ladd, K. Walrath, Dart: Up and Running: a New, Tool-Friendly Language for Structured Web Apps, O'Reilly, Incorporated, 2012.
[39] W. McKinney, Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, O'Reilly Media, 2nd edition, 2017.
[40] D. Nichter, Efficient MySQL Performance: Best Practices and Techniques, O'Reilly Media, 2022.