<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
<title level="a" type="main">An Intelligent Information System for Generating a Scientist&apos;s Scientometrics Using Content Analysis Methods</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mykola</forename><surname>Dyvak</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">West Ukrainian National University</orgName>
								<address>
									<addrLine>11 Lvivska Street</addrLine>
									<postCode>46001</postCode>
									<settlement>Ternopil</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Andriy</forename><surname>Yushko</surname></persName>
							<email>a.yushko@wunu.edu.ua</email>
							<affiliation key="aff0">
								<orgName type="institution">West Ukrainian National University</orgName>
								<address>
									<addrLine>11 Lvivska Street</addrLine>
									<postCode>46001</postCode>
									<settlement>Ternopil</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andriy</forename><surname>Melnyk</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">West Ukrainian National University</orgName>
								<address>
									<addrLine>11 Lvivska Street</addrLine>
									<postCode>46001</postCode>
									<settlement>Ternopil</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tiande</forename><surname>Pan</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">West Ukrainian National University</orgName>
								<address>
									<addrLine>11 Lvivska Street</addrLine>
									<postCode>46001</postCode>
									<settlement>Ternopil</settlement>
									<country key="UA">Ukraine</country>
								</address>
							</affiliation>
						</author>
<title level="a" type="main">An Intelligent Information System for Generating a Scientist&apos;s Scientometrics Using Content Analysis Methods</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">71C9360F30BB8350EE61D1307A853D47</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T20:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>intelligent information system</term>
					<term>scientometrics</term>
					<term>researcher</term>
					<term>content analysis methods</term>
					<term>large language model</term>
<term>irrelevant publications</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The paper proposes methods and software tools for developing a scientometric profile of a researcher using content analysis techniques. A scientometric profile is a system of indicators that assesses a researcher's scientific productivity and influence. The growing volume of scientific information in various databases, such as Scopus and Web of Science, has made it challenging to manually track and analyze individual publishing activities. For scientific and higher education institutions, monitoring both the quantity and quality of publications is crucial. Additionally, understanding researchers' main areas of interest helps support their professional development and foster interdisciplinary collaboration. Existing tools for monitoring scientific metrics typically offer limited functionality, lack the ability to process large volumes of data efficiently, and struggle to filter irrelevant information automatically. This paper presents an approach to building a researcher's scientometric profile using content analysis, supported by large language models, specifically Ollama. A mathematical model was developed to filter out irrelevant publications based on the researcher's scientometric profile. The system for collecting and analyzing scientometric indicators was implemented, and experimental studies were conducted using profiles of researchers from West Ukrainian National University.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the modern scientific environment, the problem of managing the publishing activity of professors and teaching staff is becoming increasingly important. The growth in the volume of scientific information stored in various scientometric databases, such as Scopus, Web of Science, and others, greatly complicates the manual process of tracking and analyzing the publication activity of each individual researcher. For scientific institutions and institutions of higher education, it is important not only to monitor the number and quality of publications, but also to understand the main scientific interests of employees, which makes it possible to support their professional activities and develop interdisciplinary cooperation <ref type="bibr" target="#b0">[1]</ref><ref type="bibr" target="#b1">[2]</ref><ref type="bibr" target="#b2">[3]</ref><ref type="bibr" target="#b3">[4]</ref>.</p><p>Existing tools for monitoring scientific publications, as a rule, have limited functionality and cannot efficiently process large volumes of information or automatically filter irrelevant publications. They often require significant human resources to analyze and select relevant data, which is not always feasible for large scientific teams <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b4">5]</ref>.</p><p>The purpose of this article is to present the developed software system that automates the collection, analysis, and processing of the publishing activity of teaching staff. The system not only collects information from scientific databases about publications, projects, grants, and participation in scientific events, but also forms a profile of a scientist, determining their scientific interests. Using this profile, the system can filter irrelevant publications by automatically assessing their relevance to the scientist's interests. This solution contributes to increasing the efficiency of scientific activity by allowing attention to be focused on genuinely important and relevant scientific achievements.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Algorithms and approaches for selecting keywords and determining the researcher's scientific interests</head><p>Modern research actively uses algorithms for automatic analysis of text data to select keywords that reflect the researcher's main scientific interests. The development of such approaches aims to simplify the collection, analysis, and systematization of scientific materials, which makes it possible not only to determine the main areas of work but also to reveal interdisciplinary connections.</p><p>The main methods used to analyze texts for the purpose of extracting keywords can be divided into several categories:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Statistical methods</head><p>One of the basic approaches is to calculate the frequency of use of terms in texts. The TF-IDF (Term Frequency-Inverse Document Frequency) metric is the most popular among statistical methods; it takes into account both the frequency of a term in a document and its significance in the context of the entire corpus of texts <ref type="bibr" target="#b5">[6]</ref>. This increases the accuracy of extracting significant terms, since frequent but insignificant words are given less weight. Figure <ref type="figure" target="#fig_0">1</ref> shows an example of the implementation of the TF-IDF metric in the Python programming language using the scikit-learn library. As a result of executing the code, we obtain a table with the top 10 keywords and their TF-IDF values.</p><p>The TF-IDF value of each keyword reflects its weight in the context of the article's annotation: the higher the TF-IDF value, the more important the term is for this text.</p></div>
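The computation behind Figure 1 can also be sketched without external dependencies. The paper's code uses scikit-learn's TfidfVectorizer; the pure-Python version below, with whitespace tokenization, standard log-idf weighting, and an illustrative three-document corpus, is only an approximation of that implementation:

```python
import math
from collections import Counter

def tfidf_keywords(docs, doc_index, top_k=5):
    """Score the terms of one document by TF-IDF against a small corpus."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # document frequency: in how many documents each term occurs
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    tokens = tokenized[doc_index]
    tf = Counter(tokens)
    # tf-idf = (term count / doc length) * log(N / df); ubiquitous terms score 0
    scores = {
        term: (count / len(tokens)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

docs = [
    "interval analysis of scientometric indicators",
    "content analysis of scientific publications",
    "neural network model of heart rhythm",
]
print(tfidf_keywords(docs, 1, top_k=3))
```

Terms occurring in every document ("analysis", "of") receive zero weight, so only the distinctive words of the annotation surface as keywords.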
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Rule-based methods</head><p>Rule-based approaches, such as Named Entity Recognition (NER), allow the extraction of certain categories of words, such as names of organizations, names of people, geographic locations, and other important entities <ref type="bibr" target="#b6">[7]</ref>. In the Python programming language, you can use the transformers library from Hugging Face, which allows you to load a pre-trained model for recognizing named entities (Fig. <ref type="figure" target="#fig_1">2</ref>). As the code shows, we use the pipeline method with the pre-trained dbmdz/bert-large-cased-finetuned-conll03-english model, which is tuned specifically for Named Entity Recognition (NER). The aggregation_strategy="simple" parameter aggregates the results for greater convenience.</p><p>The next step is to run NER on the text. This produces a list of entities with their types (e.g., organizations, technology names, scientific concepts).</p><p>After that, keyword filtering is performed by selecting entities that may be relevant, for example ORG (organizations) and MISC (miscellaneous terms such as technologies or scientific concepts).</p><p>After all the stages, we obtain a list of keywords selected from the text; in our case it is: Google Cloud Reduce and MapReduce.</p></div>
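The filtering stage described above can be sketched as follows. The sample entity list and the 0.5 score threshold are illustrative assumptions; only the commented-out lines invoke the actual transformers pipeline, whose output has the same dict structure as the sample:

```python
# Keep only the entity groups treated as keyword candidates (ORG and MISC),
# deduplicated and above a minimum confidence score.
def filter_keywords(entities, wanted=("ORG", "MISC"), min_score=0.5):
    seen, keywords = set(), []
    for ent in entities:
        word = ent["word"].strip()
        if ent["entity_group"] in wanted and ent["score"] >= min_score and word not in seen:
            seen.add(word)
            keywords.append(word)
    return keywords

# Real call (requires downloading the model):
# from transformers import pipeline
# ner = pipeline("ner", model="dbmdz/bert-large-cased-finetuned-conll03-english",
#                aggregation_strategy="simple")
# entities = ner(text)

# A sample with the same structure as the pipeline output:
entities = [
    {"entity_group": "MISC", "score": 0.97, "word": "MapReduce"},
    {"entity_group": "ORG", "score": 0.99, "word": "Google"},
    {"entity_group": "PER", "score": 0.91, "word": "Dean"},
]
print(filter_keywords(entities))  # ['MapReduce', 'Google']
```

The PER entity is dropped because person names are not keyword candidates in this scheme.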
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Natural language processing (NLP) models</head><p>Thanks to the development of natural language processing methods and the emergence of deep models such as BERT, GPT, and others, it has become possible to significantly improve the accuracy of text analysis <ref type="bibr" target="#b7">[8]</ref>. These models take into account the context of words, which makes it possible not only to highlight keywords but also to understand their relationships and semantic meaning.</p><p>Figure <ref type="figure" target="#fig_2">3</ref> shows a code fragment for assigning categories to articles by their annotations using the ready-made facebook/bart-large-mnli model from the Transformers library.</p><p>In Figure <ref type="figure" target="#fig_3">4</ref>, the output shows which categories most closely match each text, together with the model's confidence level for each category.</p></div>
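Selecting the winning category from the classifier output can be sketched as follows. The candidate labels, scores, and the 0.3 acceptance threshold are illustrative assumptions; the zero-shot pipeline returns parallel "labels" and "scores" lists of the shape used in the sample, and only the commented lines invoke the real model:

```python
# Pick the best-matching category from a zero-shot classification result,
# or None when even the top score is below the acceptance threshold.
def best_category(result, threshold=0.3):
    label, score = max(zip(result["labels"], result["scores"]), key=lambda p: p[1])
    return label if score >= threshold else None

# Real call (requires downloading the model):
# from transformers import pipeline
# classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
# result = classifier(abstract_text,
#                     candidate_labels=["machine learning", "medicine", "economics"])

# A sample with the same structure as the pipeline output:
result = {
    "labels": ["machine learning", "medicine", "economics"],
    "scores": [0.82, 0.12, 0.06],
}
print(best_category(result))  # machine learning
```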
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Text vectorization</head><p>To vectorize the text and create its numerical representation, you can use the Word2Vec or Doc2Vec methods from the gensim library in Python. Word2Vec creates vectors for individual words, while Doc2Vec allows you to get a vector representation for an entire document <ref type="bibr" target="#b8">[9]</ref> (Fig. <ref type="figure" target="#fig_4">5</ref>). As a result of executing this code, we will get a vector representation for three annotation texts, which is shown in Figure <ref type="figure">6</ref>.</p></div>
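Figure 5 relies on gensim's Word2Vec/Doc2Vec. As a dependency-free illustration of what a numeric vector per document means, each annotation can be mapped to a term-count vector over a shared vocabulary; this bag-of-words stand-in is not the learned embedding gensim produces, but it yields vectors usable in the same downstream comparisons:

```python
from collections import Counter

def doc_vectors(docs):
    """Map each document to a count vector over the corpus-wide vocabulary."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    # one row per document, one column per vocabulary term
    return vocab, [[Counter(tokens)[term] for term in vocab] for tokens in tokenized]

docs = ["interval data analysis", "analysis of interval data", "heart rhythm model"]
vocab, vectors = doc_vectors(docs)
print(vocab)
print(vectors)
```

Documents about the same topic share nonzero columns, which is what makes similarity comparison between the vectors meaningful.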
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 6: Vector representation of annotations</head><p>The obtained vectors can later be used to compare the similarity between documents or to cluster documents by topic. For example, we can calculate the cosine similarity between vectors to find out how similar documents are to each other (Fig. <ref type="figure" target="#fig_5">7</ref>). A cosine similarity of 0.98 between two documents indicates a very high level of similarity between their vector representations. Cosine similarity measures the angle between the vectors of two texts: a value close to 1 means that the vectors are nearly parallel, indicating a high degree of similarity between the texts.</p><p>These methods can be used both separately and in combination to achieve more accurate results in determining the researcher's key scientific interests. Their use makes it possible to automate the analysis of scientific activity, which, in turn, contributes to the formation of a comprehensive researcher profile capable of reflecting the dynamics of their scientific work and interdisciplinary connections.</p><p>Each of the described methods has its own application and can complement the others in complex text-analysis tasks. In the next section, we will look at how the Ollama platform and its powerful language models can be used to identify keywords in text. This approach applies the latest deep learning capabilities to improve the accuracy of extracting relevant terms and analyzing complex textual data.</p></div>
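The cosine similarity comparison shown in Figure 7 can be computed directly; the two sample vectors below are illustrative:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two document vectors (1.0 = parallel)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

v1 = [1, 1, 0, 1, 0, 0, 0]
v2 = [1, 1, 0, 1, 0, 1, 0]
print(round(cosine_similarity(v1, v2), 2))  # 0.87
```

The two vectors differ in a single component, so the result is close to 1, matching the interpretation given in the text.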
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Methodology for creating a scientometric portrait of a scientist using large Ollama language models</head><p>A researcher's profile is a comprehensive description of the researcher's professional activities, scientific achievements, and interests. It includes such key elements as name, surname, position, academic title, scientific interests, number of published works, and participation in scientific grants and projects. Forming a scientist's profile is an important task, since the profile can be used for tasks such as automated filtering of publications that match the researcher's scientific interests, and optimized selection of a scientific supervisor for young scientists or graduate students whose scientific activity coincides with a given topic.</p><p>To form a profile of a scientist, it is first necessary to collect basic metadata, which becomes the foundation for further processing. The web scraping method can help here, as it allows basic information to be collected from the official website of the organization where the scientist works. This method provides automated extraction of such data as name, surname, position, academic title, range of scientific interests, and links to the author's scientometric profiles (Scopus, Web of Science, ORCID, Google Scholar, DSpace).</p><p>The use of web scraping at the initial stage automatically fills the profile with publicly available information, which significantly reduces the time spent on manual data collection and creates an accurate starting point for further analysis.</p><p>To implement the web scraping process, you can use specialized libraries that automatically read and extract information from web pages <ref type="bibr" target="#b9">[10,</ref><ref type="bibr" target="#b10">11]</ref>. For example, using the Cheerio library in JavaScript, it is possible to retrieve and process the HTML content of a page, extracting the required metadata such as name, title, academic interests, etc. The following example demonstrates the basic code for obtaining information about a scientist from the official website of West Ukrainian National University, focusing on the necessary profile elements.</p><p>Figure <ref type="figure" target="#fig_6">8</ref> shows a fragment of the code for parsing the metadata of scientists from the official website of the organization. Now that we have a basic set of web-scraped metadata, we can move on to the next step: fleshing out a scientist's profile using Ollama's large language models. Large Language Models (LLMs) are a powerful tool for analyzing and processing textual data, thanks to their ability to understand context and extract meaningful units.</p><p>The main advantage of Ollama is the ability to run and manage large language models locally on a computer, without the need for cloud services. This ensures increased confidentiality of data, reduces costs, and allows users to fully control information processing <ref type="bibr" target="#b9">[10]</ref>.</p><p>The models available on the Ollama platform are suited to the processing of scientific texts and have a wide range of applications, such as automatic text classification, extraction of keywords and phrases, identification of scientific interests, and creation of a generalized profile.</p><p>Table <ref type="table" target="#tab_0">1</ref> provides a comparative analysis of the major language models supported by the Ollama platform <ref type="bibr" target="#b11">[12,</ref><ref type="bibr" target="#b12">13]</ref>.</p><p>The above table shows the main features of the models, along with their advantages and disadvantages, and allows you to choose the appropriate model for a particular problem. 
One of the models is characterized as effective in code generation and the analysis of large volumes of data, although its narrow specialization may make it less effective in other tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Code Llama (1.5B, ~3)</head><p>Optimized for programming-related tasks; supports various programming languages. Drawback: narrow specialization; not suitable for general NLP tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Gemma 2 (1.1B, ~2)</head><p>Multimodal model; processes text and images simultaneously. Drawback: requires more resources to process multimodal data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Phi-3 (1.4B, ~2.8)</head><p>Specialized in scientific and research tasks; high accuracy. Drawback: may be less effective in general tasks; requires tuning.</p><p>Figure <ref type="figure" target="#fig_7">9</ref> demonstrates the process of forming a scientist's profile, which includes the main stages: data collection, analysis of text documents using Ollama models, parsing of scientific interests, classification of information, and final creation of the profile. As can be seen from the diagram, the profile formation process can be divided into six main stages:</p><p>1. Request: the process begins with formulating a request to create a scientist's profile; 2. Metadata web scraping: the first stage of information collection, in which web scraping is used to obtain basic metadata, such as name, title, and research interests, from organizations' websites or other available sources; 3. Data pre-processing: the collected data undergo cleaning and structuring to prepare them for further analysis; 4. Prompt: a generated request or command for data analysis using a language model; 5. Analysis of scientific publications: the language model analyzes scientific publications, projects, and dissertations to highlight key topics, scientific interests, and specializations; 6. Update of profile data: based on the analysis of publications and additional data, the scientist's profile is formed or supplemented.</p></div>
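The prompt and analysis stages can be sketched in Python as follows. This is a minimal sketch, not the system's actual code: it assumes a locally running Ollama server on its default port 11434 and a generic model name (llama3); the prompt construction and reply parsing run offline, while the commented line performs the actual call:

```python
import json
import urllib.request

def build_prompt(titles):
    """Compose a keyword-extraction prompt from a scientist's publication titles."""
    joined = "\n".join(f"- {t}" for t in titles)
    return (
        "List the main scientific interests as one comma-separated line "
        "of short keywords, based on these publication titles:\n" + joined
    )

def parse_keywords(reply):
    """Turn the model's comma-separated reply into a clean keyword list."""
    return [kw.strip().lower() for kw in reply.split(",") if kw.strip()]

def ask_ollama(prompt, model="llama3", host="http://localhost:11434"):
    """POST to Ollama's /api/generate endpoint and return the reply text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(host + "/api/generate", data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

titles = ["Interval analysis of scientometric indicators",
          "Content analysis of scientific publications"]
prompt = build_prompt(titles)
# keywords = parse_keywords(ask_ollama(prompt))  # requires a running Ollama server
print(parse_keywords("Interval analysis, Scientometrics, Content analysis"))
```

The parsed keyword list is exactly the material used to fill the "scientific interests" field of the profile in the final stage.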
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">A mathematical model of filtering irrelevant publications based on the profile of a scientist</head><p>In scientometric databases, a problem often arises when, because authors share the same last name, first name, and patronymic, publications that do not belong to a scientist are added to that scientist's profile. This distorts the indicators of scientific activity and complicates an objective assessment of the researcher's contribution. Developing a mathematical model for filtering irrelevant publications based on the profile of a scientist makes it possible to solve this problem effectively.</p><p>To build mathematical models under conditions of a limited data sample, it is advisable to use methods based on interval analysis <ref type="bibr" target="#b13">[14]</ref><ref type="bibr" target="#b14">[15]</ref><ref type="bibr" target="#b15">[16]</ref><ref type="bibr" target="#b16">[17]</ref><ref type="bibr" target="#b17">[18]</ref>. Using detailed profile data, such as the author's research interests, affiliations, and other unique characteristics, it is possible to accurately identify publications that actually belong to a particular scholar. This increases the accuracy of scientometric indicators and contributes to a more objective analysis of scientific activity. The process of building the model can be divided into several steps:</p><p>Step 1. Formulation of the author's scientific interests. The author's scientific interests can be represented as a vector of keywords that describes the main areas of research. Let I = {k_1, k_2, ..., k_n}, where k_i is a keyword or phrase describing the author's interests.</p><p>Step 2. Vector representation of the publication. Each publication can likewise be represented as a vector of keywords P_j for publication j.</p><p>Step 3. Calculating relevance using cosine similarity. To measure the similarity between the scientific interests of the author I and the publication vector P_j, the cosine similarity is used:</p><formula xml:id="formula_0">relevance(I, P_j) = (I · P_j) / (|I| · |P_j|), (<label>1</label>)</formula><p>where · is the scalar product operation and |·| is the Euclidean norm of a vector. The value of relevance(I, P_j) ranges from 0 to 1, where a value close to 1 means high relevance.</p><p>Step 4. Filtering of irrelevant publications. If the value relevance(I, P_j) is less than some threshold T, the publication is considered irrelevant and is filtered out:</p><formula xml:id="formula_2">P_j is relevant if relevance(I, P_j) ≥ T.</formula><p>The resulting model thus makes it possible to automatically filter out irrelevant publications based on the author's scientific profile.</p></div>
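The steps above can be sketched as follows; this is a minimal sketch in which keyword sets are mapped to binary vectors over a shared vocabulary before applying formula (1), and the threshold T = 0.3 is an illustrative assumption:

```python
import math

def to_vector(keywords, vocab):
    """Binary presence vector of a keyword set over a shared vocabulary."""
    kws = {k.lower() for k in keywords}
    return [1 if term in kws else 0 for term in vocab]

def relevance(interests, pub_keywords):
    """Cosine similarity between author interests I and publication vector P_j."""
    vocab = sorted({k.lower() for k in interests} | {k.lower() for k in pub_keywords})
    a, b = to_vector(interests, vocab), to_vector(pub_keywords, vocab)
    dot = sum(x * y for x, y in zip(a, b))
    # for binary vectors the squared norm equals the component sum
    norm = math.sqrt(sum(a)) * math.sqrt(sum(b))
    return dot / norm if norm else 0.0

def filter_relevant(interests, publications, threshold=0.3):
    """Keep only publications whose relevance(I, P_j) meets the threshold T."""
    return [p for p in publications if relevance(interests, p["keywords"]) >= threshold]

I = ["interval analysis", "scientometrics", "content analysis"]
pubs = [
    {"title": "Interval models", "keywords": ["interval analysis", "scientometrics"]},
    {"title": "Crop yields", "keywords": ["agronomy", "soil chemistry"]},
]
print([p["title"] for p in filter_relevant(I, pubs)])  # ['Interval models']
```

A publication with no keyword overlap scores 0 and is dropped, which is exactly the namesake-filtering behavior the model is built for.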
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9.">Software implementation of the system for collecting and analyzing scientific and scientific-pedagogical activities of the academic team</head><p>In the modern conditions of the information society, it is important to have effective tools for collecting and analyzing scientific activity <ref type="bibr" target="#b18">[19]</ref><ref type="bibr" target="#b19">[20]</ref><ref type="bibr" target="#b20">[21]</ref>. The developed system automates the collection of data about the scientific and scientific-pedagogical achievements of the academic staff, filters this data by relevance to their interests, and forms reports by university, faculty, and department, which improves the quality of management and planning of scientific work. The system can be divided into several interacting modules:</p><p>1. Authorization and authentication module: ensures secure user access to the system, access management, and protection of personal data; 2. Data collection module: responsible for obtaining information from scientometric databases (for example, Scopus, Crossref, NRAT) and the scientist's profile; 3. Data processing and analysis module: cleans, normalizes, and pre-processes the collected data in preparation for further analysis; 4. Filtering module: implements the mathematical filtering model using machine learning algorithms and criteria defined on the basis of the scientist's profile; 5. Reporting module: generates a general report on the scientific activity of the university, faculty, or department; 6. User interface: provides user interaction with the system, supporting CRUD operations on the main entities (for example, publications, dissertations, grants, projects, scientific activities). 
The system architecture was implemented using modern technologies that ensure reliability, performance, and flexibility. The core technology stack is based on JavaScript as both the client-side and server-side programming language, which keeps the codebase consistent and eases application development. The server part was developed using Node.js, which makes it possible to create high-performance, scalable server applications capable of processing requests in real time.</p><p>To optimize the interaction between the client and the server, GraphQL is used, which lets the client request only the data it needs, reducing the load on the network and server resources and improving system performance when building complex queries.</p><p>The MongoDB database acts as the storage layer, providing speed and flexibility when working with large volumes of unstructured data. It supports the various data types used to describe data from different scientometric information systems and scales easily with load and needs.</p><p>In addition, the Ollama platform is integrated into the system, providing a mechanism for working with various machine learning and artificial intelligence models. Thanks to these capabilities, the system can more accurately determine the relevance of publications based on the scientist's profile by computing complex relationships between data.</p><p>The system interface is built on the principles of intuitiveness and ease of use, which gives users convenient access to the main functionality without additional staff training.</p><p>Figure <ref type="figure" target="#fig_10">10</ref> shows the initial screen of the page with the authorization and authentication forms. 
As the screenshot shows, the authorization form is quite simple: the user only needs to enter the e-mail address and the password created during registration in the system (Fig. <ref type="figure" target="#fig_11">11</ref>). The registration form requires the user to fill in basic information about himself, such as surname, first name, patronymic, position, faculty, and department. During registration, the employee must also specify his identifiers in other scientometric databases to enable automated information collection. Another key field of this form is the scientist's last name and first name in Latin script, as these data are needed to search for publications in the Crossref database.</p><p>After successful authorization in the system, the user lands on the "Overview" page (Fig. <ref type="figure" target="#fig_12">12</ref>), where he can see quantitative indicators of publication activity. As the screenshot shows, the user can filter all indicators by faculty, department, and publication period.</p><p>It is also possible to create a report for a specific division by clicking the "Download report" button. This option is available only to employees with the appropriate access rights (for example, the head of the department, the dean of the faculty, or the vice-rector for scientific work).</p><p>If the user has entered the system and no data has been added yet, he will see a welcome window and a button that allows synchronization of all publication activity from other scientometric databases (Fig. <ref type="figure" target="#fig_13">13</ref>). After pressing the "Synchronization" button, a window opens with a description of the databases in which information will be searched (Fig. <ref type="figure" target="#fig_14">14</ref>). 
Also, as can be seen from the image above, the user is given brief information about the period for which his data will be collected (5 years by default) and how exactly this process will take place.</p><p>After successful synchronization, the user receives a message in his e-mail inbox informing him that his publishing activity has been successfully collected.</p><p>By going to the "Publications" section, the user can view all the publications that the system managed to find (Fig. <ref type="figure" target="#fig_15">15</ref>). If the system could not find one of the author's publications, he can add it manually by pressing the corresponding button. A form will then open where the user needs to fill in all the required fields (Fig. <ref type="figure" target="#fig_16">16</ref>). The page for viewing the dissertations defended by the user looks similar and also has an addition form (Fig. <ref type="figure" target="#fig_18">17</ref>). Our system also provides a section for viewing the list of received grants (Fig. <ref type="figure" target="#fig_19">18</ref>). In the planned events section, users can plan new events and view existing ones (Fig. <ref type="figure" target="#fig_22">20</ref>). Figure <ref type="figure" target="#fig_23">21</ref> shows the Add Activities form, which includes a variety of required fields marked with an asterisk. It contains fields such as "Name", "Type", "Faculty", "Department", "Date", "Place", "Number of participants", and "Co-organizers". There are also sections for the contact details of the organizing committee, where a postal address, phone number, and email address need to be entered. The R&amp;D funding page displays a list of funded research and development (R&amp;D) projects, including the name, manager, terms, amount of funding, type of funding, and category of each project (Fig. 
<ref type="figure" target="#fig_24">22</ref>).</p><p>It is also possible to quickly search by faculty, department, and deadline. As mentioned earlier, the system provides for the automatic creation of a scientist's profile, which can later be used for publication filtering tasks. As can be seen from the figure above, the user can not only view his profile but also edit the necessary information.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="10.">Conclusion</head><p>This work emphasizes the need to automate the collection, processing, and analysis of publication activity in the modern scientific environment. The growing volume of scientific information complicates manual control and analysis of data, especially in large academic groups. The system described in this paper not only automates the collection of information from scientometric databases such as Scopus and Web of Science, but also builds a scientist's profile that includes information about scientific interests, publications, grants, and participation in scientific events. This makes the management of scientific activities more efficient and objective.</p><p>An important part of the work is the use of modern algorithms for automatic text analysis, such as TF-IDF, Named Entity Recognition (NER), and text vectorization, which support keyword extraction and the identification of researchers' scientific interests. The application of deep language models such as BERT and GPT, together with the Ollama platform for localized processing of large volumes of data, achieves high accuracy in text analysis by taking into account the semantic context and the relationships between terms.</p><p>In addition, the work builds a mathematical model for filtering irrelevant publications based on the scientist's profile. The model separates the author's genuine works from spurious matches, thereby significantly increasing the accuracy of scientometric indicators.</p><p>The work also proposes, for the first time, a vector representation of scientific interests and publications combined with cosine similarity. This approach supports an objective assessment of scientific contributions, reducing the risk of inaccuracies caused by coincidental matches of surnames or errors in databases.</p><p>Another important component of this work is the integration of the Ollama platform into the system, which allows language models to be used for accurate identification of scientific interests, as well as for automatic categorization and clustering of scientific materials. This greatly facilitates the preparation of reports for scientific institutions, making it possible to quickly obtain aggregated data on the activities of the university, its faculties, and its departments.</p></div>
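The profile-versus-publication comparison described above can be illustrated with a minimal sketch (hypothetical, not the system's actual code): both texts are mapped to term-frequency vectors over a shared vocabulary, and cosine similarity between the vectors scores how well a candidate publication matches the scientist's interests.

```python
# Hypothetical sketch of profile-based publication filtering via cosine
# similarity of term-frequency vectors; the real system uses richer
# vectorizations (TF-IDF, Doc2Vec) as described in the paper.
import math
from collections import Counter

def vectorize(text, vocab):
    # Count word occurrences and project onto a fixed vocabulary order.
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (|a| * |b|); 0.0 for a zero vector.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)

# Illustrative data only: an interest profile and one candidate publication.
profile = "interval analysis identification neural networks"
pub = "identification of interval models using neural networks"
vocab = sorted(set(profile.lower().split()) | set(pub.lower().split()))
score = cosine_similarity(vectorize(profile, vocab), vectorize(pub, vocab))
```

A publication would then be kept or discarded by comparing `score` against a chosen relevance threshold.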
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Declaration on Generative AI</head><p>During the preparation of this work, the authors used ChatGPT and Grammarly to check grammar and spelling, paraphrase, and reword the text. These tools helped identify and correct grammatical errors, typos, and other writing mistakes, improving the clarity and professionalism of the text. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the publication's content.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Implementation of the statistical method of selecting keywords from the text using the TF-IDF metric.</figDesc><graphic coords="2,78.95,451.49,442.72,263.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Implementation of the NER method for searching keywords.</figDesc><graphic coords="3,77.58,235.85,445.32,213.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Implementation of the BART method.</figDesc><graphic coords="4,79.48,62.35,441.61,163.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: The result of text categorization using the BART method.</figDesc><graphic coords="4,193.73,254.99,213.20,200.73" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Implementation of the Doc2Vec vectorization method.</figDesc><graphic coords="4,79.08,553.98,442.50,194.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: Calculation of cosine similarity of two documents.</figDesc><graphic coords="5,152.55,350.96,309.04,76.85" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Code listing for parsing metadata.</figDesc><graphic coords="6,76.83,338.23,447.00,369.69" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: Scheme of formation of a scientist's profile.</figDesc><graphic coords="7,136.07,580.21,328.20,176.98" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Figure 10 :</head><label>10</label><figDesc>Figure 10: Authorization form.</figDesc><graphic coords="10,137.95,62.35,324.74,170.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_11"><head>Figure 11 :</head><label>11</label><figDesc>Figure 11: Registration form in the system.</figDesc><graphic coords="10,114.32,318.83,371.84,302.75" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_12"><head>Figure 12 :</head><label>12</label><figDesc>Figure 12: Viewing the number of publications in each of the author's categories.</figDesc><graphic coords="11,78.33,62.35,443.90,113.99" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_13"><head>Figure 13 :</head><label>13</label><figDesc>Figure 13: The user's welcome page in the system.</figDesc><graphic coords="11,76.65,344.34,447.33,116.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_14"><head>Figure 14 :</head><label>14</label><figDesc>Figure 14: Synchronization confirmation window. As the image above shows, the user is also given brief information about the period for which his data will be collected (5 years by default) and how the process will proceed. After successful synchronization, the user receives an e-mail message informing him that his publication activity has been collected successfully.</figDesc><graphic coords="11,99.13,536.97,402.35,140.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_15"><head>Figure 15 :</head><label>15</label><figDesc>Figure 15: Publications display page.</figDesc><graphic coords="12,76.55,104.94,447.55,220.25" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_16"><head>Figure 16 :</head><label>16</label><figDesc>Figure 16: Form for adding a new publication.</figDesc><graphic coords="12,79.08,414.87,442.36,221.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_18"><head>Figure 17 :</head><label>17</label><figDesc>Figure 17: Dissertation viewing page.</figDesc><graphic coords="13,127.82,62.35,344.95,167.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_19"><head>Figure 18 :</head><label>18</label><figDesc>Figure 18: Grants View Page.</figDesc><graphic coords="13,125.57,293.99,349.03,166.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_20"><head>Figure 19</head><label>19</label><figDesc>Figure19shows the form for adding a new grant.</figDesc><graphic coords="13,106.48,522.33,387.70,194.35" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_21"><head>Figure 19 :</head><label>19</label><figDesc>Figure 19: Grant addition form.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_22"><head>Figure 20 :</head><label>20</label><figDesc>Figure 20: View and add events page.</figDesc><graphic coords="14,88.08,62.35,424.38,210.65" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_23"><head>Figure 21 :</head><label>21</label><figDesc>Figure 21: Event planning form.</figDesc><graphic coords="14,78.33,390.02,443.36,222.15" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_24"><head>Figure 22 :</head><label>22</label><figDesc>Figure 22: R&amp;D funding review page.</figDesc><graphic coords="15,76.55,62.35,447.65,220.65" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_25"><head>Figure 23</head><label>23</label><figDesc>Figure23shows the scientist's profile view page.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_26"><head>Figure 23 :</head><label>23</label><figDesc>Figure 23: Scientist profile page.</figDesc><graphic coords="15,76.55,344.84,447.55,189.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Comparison table of major language models in Ollama</figDesc><table><row><cell>Model</cell><cell>Number of parameters</cell><cell>Size (GB)</cell><cell>Advantages</cell><cell>Disadvantages</cell></row><row><cell>Llama 3.2</cell><cell>1.2B</cell><cell>~2</cell><cell>High productivity in general NLP tasks; support for many languages.</cell><cell>Requires significant computing resources for efficient operation.</cell></row><row><cell>Mistral</cell><cell>1.3B</cell><cell>~2.5</cell><cell></cell><cell></cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Tracing the influence of large language models across the most impactful scientific works</title>
		<author>
			<persName><forename type="first">D.-M</forename><surname>Petroșanu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pîrjan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tăbușcă</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics12244957</idno>
		<ptr target="https://doi.org/10.3390/electronics12244957" />
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">24</biblScope>
			<biblScope unit="page">4957</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Deep semisupervised learning-based network anomaly detection in heterogeneous information systems</title>
		<author>
			<persName><forename type="first">N</forename><surname>Lutsiv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Maksymyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Beshley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Lavriv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Andrushchak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Vokorokos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gazda</surname></persName>
		</author>
		<idno type="DOI">10.32604/cmc.2022.018773</idno>
		<ptr target="https://doi.org/10.32604/cmc.2022.018773" />
	</analytic>
	<monogr>
		<title level="j">Comput., Mater. &amp; Contin</title>
		<imprint>
			<biblScope unit="volume">70</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="413" to="431" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Intelligent nodes for distributed sensor network</title>
		<author>
			<persName><forename type="first">A</forename><surname>Sachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kochan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Turchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Tymchyshyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Vasylkiv</surname></persName>
		</author>
		<idno type="DOI">10.1109/IMTC.1999.776072</idno>
		<ptr target="https://doi.org/10.1109/IMTC.1999.776072" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th IEEE Instrumentation and Measurement Technology Conference (Cat. No.99CH36309)</title>
				<meeting>the 16th IEEE Instrumentation and Measurement Technology Conference (Cat. No.99CH36309)<address><addrLine>Venice, Italy</addrLine></address></meeting>
		<imprint>
			<date type="published" when="1999">1999</date>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="1479" to="1484" />
		</imprint>
	</monogr>
	<note>IMTC/99</note>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Analysis of the developed quantitative method for automatic attribution of scientific and technical text content written in Ukrainian</title>
		<author>
			<persName><forename type="first">V</forename><surname>Lytvyn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Vysotska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pukach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Nytrebych</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Demkiv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Senyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Malanchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kovalchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Huzyk</surname></persName>
		</author>
		<idno type="DOI">10.15587/1729-4061.2018.149596</idno>
		<ptr target="https://doi.org/10.15587/1729-4061.2018.149596" />
	</analytic>
	<monogr>
		<title level="j">Eastern-European J. Enterp. Technol</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="19" to="31" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A methodology for machine-learning content analysis to define the key labels in the titles of online customer reviews with the rating evaluation</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Zaki</forename><surname>Ahmed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">Rodríguez</forename><surname>Díaz</surname></persName>
		</author>
		<idno type="DOI">10.3390/su14159183</idno>
		<ptr target="https://doi.org/10.3390/su14159183" />
	</analytic>
	<monogr>
		<title level="j">Sustainability</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">9183</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Research on the TF-IDF algorithm combined with semantics for automatic extraction of keywords from network news texts</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="DOI">10.1515/jisys-2023-0300</idno>
		<ptr target="https://doi.org/10.1515/jisys-2023-0300" />
	</analytic>
	<monogr>
		<title level="j">J. Intell. Syst</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page">1</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Named entity recognition (NER) and relation extraction in scientific publications</title>
		<author>
			<persName><forename type="first">A</forename><surname>Singh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Garg</surname></persName>
		</author>
		<idno type="DOI">10.35940/ijrte.b7846.0712223</idno>
		<ptr target="https://doi.org/10.35940/ijrte.b7846.0712223" />
	</analytic>
	<monogr>
		<title level="j">Int. J. Recent Technol. Eng. (IJRTE)</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="110" to="113" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">LNLF-BERT: transformer for long document classification with multiple attention levels</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">M</forename><surname>Pham</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">C</forename><surname>The</surname></persName>
		</author>
		<idno type="DOI">10.1109/access.2024.3492102</idno>
		<ptr target="https://doi.org/10.1109/access.2024.3492102" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Sentiment classification: review of text vectorization methods: bag of words, tf-idf, word2vec and doc2vec</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">D</forename><surname>Abubakar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Umar</surname></persName>
		</author>
		<idno type="DOI">10.56471/slujst.v4i.266</idno>
		<ptr target="https://doi.org/10.56471/slujst.v4i.266" />
	</analytic>
	<monogr>
		<title level="j">SLU J. Sci. Technol</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="27" to="33" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Implementation of generative AI using metaverse-based LLM</title>
		<author>
			<persName><forename type="first">H.-S</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H.-S</forename><surname>Shim</surname></persName>
		</author>
		<idno type="DOI">10.29279/jitr.2024.29.2.123</idno>
		<ptr target="https://doi.org/10.29279/jitr.2024.29.2.123" />
	</analytic>
	<monogr>
		<title level="j">Korea Ind. Technol. Converg. Soc</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="123" to="132" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gruen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Maldoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Messing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Sanderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zimmer</surname></persName>
		</author>
		<idno type="DOI">10.48550/arXiv.2410.23432</idno>
		<ptr target="https://doi.org/10.48550/arXiv.2410.23432" />
		<title level="m">Web scraping for research: legal, ethical, institutional, and scientific considerations</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Forward learning of large language models by consumer devices</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">P</forename><surname>Pau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Aymone</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics13020402</idno>
		<ptr target="https://doi.org/10.3390/electronics13020402" />
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">402</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Large language models meet next-generation networking technologies: A review</title>
		<author>
			<persName><forename type="first">C.-N</forename><surname>Hang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P.-D</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Morabito</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-W</forename><surname>Tan</surname></persName>
		</author>
		<idno type="DOI">10.3390/fi16100365</idno>
		<ptr target="https://doi.org/10.3390/fi16100365" />
	</analytic>
	<monogr>
		<title level="j">Future Internet</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="issue">10</biblScope>
			<biblScope unit="page">365</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Algorithms of parallel calculations in task of tolerance ellipsoidal estimation of interval model parameters</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dyvak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Stakhiv</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pukas</surname></persName>
		</author>
		<idno type="DOI">10.2478/v10175-012-0022-9</idno>
		<ptr target="https://doi.org/10.2478/v10175-012-0022-9" />
	</analytic>
	<monogr>
		<title level="j">Bull. Pol. Acad. Sci</title>
		<imprint>
			<biblScope unit="volume">60</biblScope>
			<biblScope unit="issue">1</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Dyvak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Voytyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Porplytsya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pukas</surname></persName>
		</author>
		<idno type="DOI">10.1109/TCSET.2018.8336426</idno>
		<ptr target="https://doi.org/10.1109/TCSET.2018.8336426" />
		<title level="m">2018 14th international conference on advanced trends in radioelecrtronics, telecommunications and computer engineering</title>
				<meeting><address><addrLine>TCSET)</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1272" to="1276" />
		</imprint>
	</monogr>
	<note>Modeling the process of air pollution by harmful emissions from vehicles</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Features of structure identification the macromodels for nonstationary fields of air pollutions from vehicles</title>
		<author>
			<persName><forename type="first">N</forename><surname>Ocheretnyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Voytyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dyvak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Martsenyuk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of international conference on modern problem of radio engineering, telecommunications and computer science</title>
				<meeting>international conference on modern problem of radio engineering, telecommunications and computer science</meeting>
		<imprint>
			<date type="published" when="2012">2012</date>
			<biblScope unit="page" from="444" to="444" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Parameters identification method of interval discrete dynamic models of air pollution based on artificial bee colony algorithm</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dyvak</surname></persName>
		</author>
		<idno type="DOI">10.1109/ACIT49673.2020.9208972</idno>
		<ptr target="https://doi.org/10.1109/ACIT49673.2020.9208972" />
	</analytic>
	<monogr>
		<title level="m">2020 10th international conference on advanced computer information technologies (ACIT)</title>
				<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="130" to="135" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Selection the &quot;saturated&quot; block from interval system of linear algebraic equations for recurrent laryngeal nerve identification</title>
		<author>
			<persName><forename type="first">M</forename><surname>Dyvak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Pukas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Oliynyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Melnyk</surname></persName>
		</author>
		<idno type="DOI">10.1109/DSMP.2018.8478528</idno>
		<ptr target="https://doi.org/10.1109/DSMP.2018.8478528" />
	</analytic>
	<monogr>
		<title level="m">IEEE second international conference on data stream mining &amp; processing (DSMP)</title>
				<imprint>
			<date type="published" when="2018">2018. 2018</date>
			<biblScope unit="page" from="444" to="448" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Content analysis using specific natural language processing methods for big data</title>
		<author>
			<persName><forename type="first">M</forename><surname>Pirnau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Botezatu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Priescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hosszu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tabusca</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Coculescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Oncioiu</surname></persName>
		</author>
		<idno type="DOI">10.3390/electronics13030584</idno>
		<ptr target="https://doi.org/10.3390/electronics13030584" />
	</analytic>
	<monogr>
		<title level="j">Electronics</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page">584</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Illustration of a software-aided content analysis methodology applied to educational research</title>
		<author>
			<persName><forename type="first">M</forename><surname>Gkevrou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Stamovlasis</surname></persName>
		</author>
		<idno type="DOI">10.3390/educsci12050328</idno>
		<ptr target="https://doi.org/10.3390/educsci12050328" />
	</analytic>
	<monogr>
		<title level="j">Educ. Sci</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page">328</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Content analysis of three-dimensional model technologies and applications for construction: current trends and future directions</title>
		<author>
			<persName><forename type="first">N</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Tran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Sturgill</surname></persName>
		</author>
		<idno type="DOI">10.3390/s24123838</idno>
		<ptr target="https://doi.org/10.3390/s24123838" />
	</analytic>
	<monogr>
		<title level="j">Sensors</title>
		<imprint>
			<biblScope unit="volume">24</biblScope>
			<biblScope unit="page">3838</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
