=Paper= {{Paper |id=Vol-3745/paper2 |storemode=property |title=Automated Identification of Emerging Technologies: Open Data Approach |pdfUrl=https://ceur-ws.org/Vol-3745/paper2.pdf |volume=Vol-3745 |authors=Ljiljana Dolamic,Julian Jang-Jaccard,Alain Mermoud,Vincent Lenders |dblpUrl=https://dblp.org/rec/conf/eeke/DolamicJML24 }} ==Automated Identification of Emerging Technologies: Open Data Approach== https://ceur-ws.org/Vol-3745/paper2.pdf
                         Automated Identification of Emerging Technologies: Open Data
                         Approach
                         Ljiljana Dolamic1,†, Julian Jang-Jaccard1,*,†, Alain Mermoud1,† and Vincent Lenders1,†
                         1 Cyber-Defence Campus, armasuisse Science and Technology, Thun, Switzerland


                                            Abstract
                                            Identifying emerging technologies and forecasting their trends is pivotal for stakeholders and decision-makers across academia, industry,
                                            and government agencies. The current strategies employed to track technology trends often rely on proprietary closed datasets and on the
                                            insights of human domain experts. These approaches are expensive, manual, and time-consuming.
                                            In this study, we introduce an automated method for identifying emerging trends through a quantitative approach that utilizes extensive
                                            publicly available data, including patents, publications, and Wikipedia Pageview statistics. Our method proposes four criteria – novelty,
                                            growth, impact, and coherence – to automatically score technologies, based on a mathematical foundation. This approach enables the
                                            monitoring of tech trends across various sectors in an automated manner, without the need for domain experts. The results obtained
                                            through rigorous evaluation, benchmarked against similar reports from leading market research firms, illustrate a low recall rate paired
                                            with high precision, affirming the reliability of our proposed method. Furthermore, our method identifies emerging technologies not
                                            present in similar market reports, highlighting its unique capabilities.

                                             Keywords
                                             technology monitoring, emerging technologies, attributes of emergence, scientometrics, open source data, machine learning, informetrics,
                                             natural language processing



                         Joint Workshop of the 5th Extraction and Evaluation of Knowledge Entities from Scientific Documents and
                         the 4th AI + Informetrics (EEKE-AII2024), April 23-24, 2024, Changchun, China and Online
                         * Corresponding author.
                         † These authors contributed equally.
                         Email: ljiljana.dolamic@ar.admin.ch (L. Dolamic); julian.jang-jaccard@ar.admin.ch (J. Jang-Jaccard);
                         alain.mermoud@ar.admin.ch (A. Mermoud); vincent.lenders@ar.admin.ch (V. Lenders)
                         ORCID: 0000-0002-0656-5315 (L. Dolamic); 0000-0002-1002-057X (J. Jang-Jaccard);
                         0000-0001-6471-772X (A. Mermoud); 0000-0002-2289-3722 (V. Lenders)
                         © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License
                         Attribution 4.0 International (CC BY 4.0).
                         CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073


                         1. Introduction

                         Understanding emerging technologies is crucial for various entities, including industry, academia, and
                         government agencies. It can shape strategic decisions, improve competitive positions, and create
                         opportunities for technology strategies. Owing to these considerations, there is a substantial need for
                         identifying emerging technologies, prompting widespread media coverage on the topic and leading market
                         research firms like Gartner and Forrester to offer services promising deeper insights.
                            Despite the common and widespread use of the term ‘emerging technologies’, there is no single standard
                         agreement on what constitutes the term. This lack of a clear definition makes it challenging to develop a
                         scientifically sound methodology to identify emerging technologies. Gartner’s renowned Hype Cycle for
                         Emerging Technologies, while intuitive, cannot serve as an underlying model and has faced criticism in the
                         literature for being considered unscientific, inconsistent, generic, and subjective [1]. Other market
                         research firms, such as Forrester and IHS Markit, also produce annual reports on emerging technologies,
                         yet the methodology for identifying these technologies remains unclear.
                            Research in the area of identifying emerging technologies primarily relies on qualitative methods,
                         expert systems, and survey-based approaches. For quantitative methods, researchers have utilized open
                         datasets and S-curve models to identify technology emergence [2, 3, 4, 5]. S-curve models, based on
                         logistic or Gompertz growth concepts, provide a solid mathematical foundation. However, most studies focus
                         on specific predetermined sets of technologies, making it challenging to devise a general method for
                         identifying emerging technologies [6].
                            In this paper, we introduce a novel approach for identifying emerging technologies based on their
                         coverage in publicly available data sources, including patents, publications, and Wikipedia Pageview
                         statistics. Unlike previous studies, we have not preselected any specific set of technologies. Our method
                         is transparent, does not require expert input, and gives reproducible results for any technology.
                            The remainder of this paper is organized as follows: Section 2 provides a survey of existing research.
                         In Section 3, we offer a description of the data used. Section 4 outlines the proposed methodology. We
                         present the evaluation results in Section 5. The limitations of our proposed method are discussed in
                         Section 6. Finally, Section 7 concludes the paper with future work.


                         2. Related Work

                         Definitions for the term ‘emerging technologies’ in the literature often overlap but are based on distinct
                         characteristics. For example, some authors (e.g., [7, 8, 9, 10, 11]) emphasize the potential impact of the
                         technology on the economy or society, covering both evolutionary change and disruptive innovations.
                         Others, like Boon [12], prioritize uncertainty about a technology’s future evolution. Some researchers
                         combine both potential and uncertainty aspects [13, 14], while others underline novelty and growth [15].
                            The myriad of characteristics chosen to define emerging technologies has given rise to diverse
                         scientometric approaches for measurement [16, 17], lacking a standardized definition of the underlying
                         concept of emergence. A comprehensive analysis by Rotolo, Hicks, and Martin [18] explores existing research
                         on the definition of emerging technologies, aggregating comparable approaches. They identify five main
                         characteristics (radical novelty, relatively fast growth, coherence, prominent impact, and uncertainty)
                         that commonly appear across the studied research. We adopt this definition as a foundational framework for
our study.                                                              patents granted by the USPTO since 2013. We utilize a subset
   Predicting emerging technologies often relies on pub-                of around 6.6 million patent records for our study.
licly available datasets, commonly leveraging patents such                 Publications from arXiv2 : We employ arXiv as a pri-
as those from the United States Patent and Trademark Of-                mary publication source, taking advantage of its free distri-
fice (USPTO), Global Patent Index (GPI), and Thomson                    bution model for open-access scholarly articles. The reposi-
Innovation. Numerous publications advocate for the use                  tory hosts over 2.4 million publications spanning computer
of bibliometric methods to extract data and identify emerg-             science and diverse scientific disciplines since 1993. Figure
ing technologies, followed by deploying growth models for               2 displays the number of submissions to arXiv since August
prediction. In the work of Daim et al. [19], bibliometric               1991. Our study focuses on a subset of approximately 1.4
methods, US patent analysis, and S-curves were employed                 million arXiv publications.
for forecasting technologies such as fuel cells, food safety,
and optical storage. Similarly, Ranaei et al. [3] used ex-
pert interviews to fit data acquired by text-mining patents
into growth curve models for predicting hybrid cars and
fuel cells. Text-mining on patents and fitting to S-curves
were also proposed in [20], and Bengisu et al. [21] found
correlations between patent and publication data extracted
by scientometric methods for 20 technologies, deploying
S-curves for forecasting. S-Curve models for predicting
emerging technologies were also proposed by [2, 22].
   In recent times, artificial intelligence has regained signif-
icant attention, leading to the use of machine learning to
model and predict emerging technologies. Kyebambe and
Hwang [23, 24] employed supervised learning on citation
graphs from USPTO data to automatically label and forecast
emerging technologies. Similarly, Zhou [25] applied super-
                                                                        Figure 2: Number of arXiv submissions since 1991 (Source from
vised deep learning on worldwide patent data, with training             [27])
sets labeled based on Gartner’s Hype Cycle.

                                                                           Wikipedia Pageview Statistics 3 : In addition, we incor-
3. Data                                                                 porate Wikipedia Pageview statistics, which indicate the
                                                                        number of views of a Wikipedia article within a specified
We primarily use three different datasets: patent data from             time frame. This offers insight into real-time public inter-
USPTO, publication data from arXiv, and statistical data                est and engagement, serving as a dynamic and accessible
from Wikipedia Pageviews.                                               indicator of emerging trends and technologies. Figure 3
                                                                        illustrates an example of monthly pageview statistics for
                                                                        the keyword ’deep learning’.
                                                                           Leveraging the Wikipedia API, we retrieved the monthly
                                                                        views for 50,954 technology-related articles.
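The monthly pageview retrieval can be sketched as follows. This is a minimal illustration, assuming the public Wikimedia Pageviews REST endpoint; the paper does not state which endpoint or client was used. The helper names (`pageviews_url`, `monthly_views`, `fetch_monthly_views`), the User-Agent string, and the sample numbers are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Public Wikimedia REST endpoint for per-article pageviews (assumed data source).
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article, start, end, project="en.wikipedia"):
    """Build a monthly pageviews URL; start and end are YYYYMMDD strings."""
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/all-access/user/{title}/monthly/{start}/{end}"


def monthly_views(payload):
    """Convert a decoded API response into a {'YYYY-MM': views} mapping."""
    return {
        item["timestamp"][:4] + "-" + item["timestamp"][4:6]: item["views"]
        for item in payload.get("items", [])
    }


def fetch_monthly_views(article, start, end):
    """Live network call (not executed here); the User-Agent is a placeholder."""
    request = Request(
        pageviews_url(article, start, end),
        headers={"User-Agent": "emerging-tech-sketch/0.1 (example)"},
    )
    with urlopen(request) as response:
        return monthly_views(json.load(response))


# Hand-made payload in the shape the API returns; the view counts are invented.
sample = {"items": [
    {"article": "Deep_learning", "timestamp": "2023010100", "views": 120},
    {"article": "Deep_learning", "timestamp": "2023020100", "views": 95},
]}
series = monthly_views(sample)
url = pageviews_url("Deep learning", "20230101", "20240131")
```

Keeping URL construction and response parsing separate from the network call makes the time-series conversion easy to check without contacting the service.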




Figure 1: Top 200 locations by patent count for granted patents         Figure 3: Number of Pageviews of the topic ’deep learning’
during 2013 - 2023 (Source from [26])                                   during Jan 2023 - Jan 2024 (Source from [28])


   Patents from PatentsView1 : Patent information pro-
vides valuable insights into the latest innovations, trends,            4. Methodology
and competitive landscapes within various industries. We
utilize PatentsView to acquire patent information from the              In this section, we outline our methodology, and Figure 4
USPTO for granted patents since 1976. As of December 5,                 offers a comprehensive overview of the entire process.
2023, there are over 8 million records of granted patents                  The proposed method is initiated by classifying each
available for free download for further analysis. Figure 1              Wikipedia article as either technology-related or not, em-
provides a glimpse of the top 200 locations worldwide for               2
                                                                            https://arxiv.org/
1                                                                       3
    https://patentsview.org/                                                https://en.wikipedia.org/wiki/Wikipedia:Pageview_statistics




             Figure 4: Overview of the Proposed Methodology



ploying a binary classification approach termed as technol-             To address this issue, we devised a two-step method-
ogy classification.                                                     ology named ’technology classification,’ which involves
   Once this classification is established, we extract abstracts        the process of selecting relevant technology articles from
from USPTO and scholarly arXiv publications. These ab-                  Wikipedia.
stracts undergo annotation using the DBPedia tool 4 , align-
ing the text with Wikipedia articles. This annotation process              Step 1: Cleaning and Selecting Relevant Categories
aims to link the abstract content to relevant Wikipedia en-             Each Wikipedia article is linked to categories, forming a
tries. To reduce noise, we eliminate annotations occurring              complex graph with parent-child relationships. The edges
fewer than 5 times and those not aligned with the technol-              between categories are loosely defined as "is related to,"
ogy classification.                                                     often connecting different Wikipedia articles from non-
   The resulting filtered annotations, all within the technol-          technology areas. This correlation appears to limit the re-
ogy classification, serve as the basis for constructing time            liability of extracting only technology articles using these
series. The count of mentions for each technology 𝑡 ∈ 𝑇 per             graph-based relationships.
year is summed across each data source 𝑑 ∈ 𝐷, reflecting                   To address this, we first clean up the directed categories
the increasing occurrences of patents and publications over             graph by removing hidden categories, admin and user pages.
time. Mathematically, this can be represented as:                       Furthermore, we apply regular expression filters to eliminate
                                                                        categories not related to technologies, such as companies,
   Total Count(t) = \sum_{d \in D} count(t, d)                          people names, brands, currencies, and countries.
                                                                           Additionally, we utilize Wikipedia’s Main Topic Classifi-
                                                                        cations (MTC), encompassing categories like Technology,
   where count(𝑡, 𝑑) is the count of mentions for technology
                                                                        Business, Arts, Health, etc. Subsequently, we calculate the
𝑡 in data source 𝑑. We then compute relative counts in
                                                                        shortest path from each category in the filtered graph to
relation to the total number of technology mentions per
                                                                        each of the 28 MTC and retain the articles with the smallest
year, represented as:
                                                                        distance to Technology, Science, or Engineering concepts.
                                                                        This resulted in 7,876 technology classification candidates,
                                                                        still containing some categories that may not belong to tech-
Relative Count(t) = \frac{Total Count(t)}{Total Technology Mentions per Year}     nology. By having a human domain expert manually go
                                                                        through the 7,876 technology classification candidates, we
   Furthermore, monthly Wikipedia Pageviews are obtained                ultimately create a list of 1,356 technology categories.
for all technologies and transformed into time series. These               Succinctly, this process can be written as the following
time series, along with Wikipedia categories, contribute to             pseudocode in Algorithm 1.
the computation of four scores—Novelty, Growth, Impact,
and Coherence—each derived from the definitions provided                   Step 2: Technology Classification using SVM
by [18]. Finally, we aggregate and normalize these four                 The overall process of machine learning-based training to
scores to generate an emergence score for each technology.              obtain the final technology classification is detailed in Algo-
                                                                        rithm 2.
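The counting step of the pipeline (Total Count and Relative Count) can be sketched as below. This is an illustration under stated assumptions: "Total Technology Mentions per Year" is read as the sum of Total Count over all technologies in that year, and the tuple layout, function names, and sample figures are hypothetical.

```python
from collections import defaultdict


def total_counts(mentions):
    """Sum mentions over data sources.

    mentions: iterable of (technology, source, year, count) tuples.
    Returns {(technology, year): count summed over all sources}.
    """
    totals = defaultdict(int)
    for technology, _source, year, count in mentions:
        totals[(technology, year)] += count
    return dict(totals)


def relative_counts(totals):
    """Divide each total by all technology mentions in the same year (assumed reading)."""
    per_year = defaultdict(int)
    for (_technology, year), count in totals.items():
        per_year[year] += count
    return {key: count / per_year[key[1]] for key, count in totals.items()}


# Invented sample annotations for two technologies and two data sources.
mentions = [
    ("deep learning", "patents", 2022, 30),
    ("deep learning", "arxiv", 2022, 50),
    ("blockchain", "patents", 2022, 15),
    ("blockchain", "arxiv", 2022, 5),
]
totals = total_counts(mentions)
shares = relative_counts(totals)
```

Normalizing by the yearly total keeps a technology's series comparable across years in which the overall volume of patents and publications differs.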
4.1. Technology Classification                                             To create an input dataset for the Support Vector Ma-
                                                                        chine (SVM), which serves as our classifier, we extract ab-
The output of annotated abstracts from patents and pub-                 stracts from Wikipedia articles identified within the tech-
lications contains noise, as each annotation refers to a                nology categories established in Step 1. The abstracts from
Wikipedia article, not necessarily related to technology.               all Wikipedia pages directly linked to a technology cate-
                                                                        gory are concatenated, stemmed, and then subjected to TF-
4
    https://www.dbpedia.org/                                            IDF-based weighting. This process generates a weighted




Algorithm 1 Cleaning and Selecting Relevant Categories                particular technology has a significant portion of references
 1: procedure CleanUpDirectedGraph                                    occurring in the last few years, it receives a high novelty
 2:      Remove hidden categories, admin and user pages               score. To implement this, we considered the time span of the
    from the directed categories graph                                last 10 years and calculated the percentage of annotations
 3:      Apply regular expression filters to eliminate irrele-        for each year. Linearly decreasing weights ranging from
    vant categories (e.g., companies, people names, brands,           10 to 1 were assigned, respectively, thereby giving higher
    currencies, and countries)                                        weight to more recent years. Technologies for which the
 4: end procedure                                                     majority of annotations occurred more than 10 years ago
 5: procedure UtilizeMainTopicClassifications                         are considered not meeting the novelty criterion and are
 6:      Use Main Topic Classifications (MTC) encompassing            consequently discarded.
    categories like Technology, Business, Arts, Health, etc.             To express this more mathematically, we first define the
 7:      Calculate the shortest path for each category in the         yearly time series 𝑋𝑡,𝑑 using Eq. 1:
    filtered graph to MTC
 8: end procedure                                                                      X_{t,d} = \{ X_{t,d,y} : y \in Y \}                 (1)
 9: procedure FilterByDistanceToMTC
                                                                        where:
10:      Retain articles with the smallest distance to Tech-
    nology, Science, or Engineering concepts within MTC                    • 𝑋𝑡,𝑑,𝑦 is the number of times technology 𝑡 is refer-
11: end procedure                                                            enced in dataset 𝑑 during year 𝑦.
                                                                           • 𝑦 ∈ 𝑌 denotes the year within the specified range.
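The steps of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the exclusion pattern is hypothetical (the paper does not list its regular expressions), the graph is a toy parent-to-children mapping, and distance is measured as the number of edges from a main topic down to a category.

```python
import re
from collections import deque

# Hypothetical exclusion pattern; the actual filters are not given in the paper.
EXCLUDE = re.compile(r"companies|people|brands|currencies|countries", re.IGNORECASE)


def clean_graph(children):
    """Drop every category whose name matches the exclusion pattern."""
    def keep(name):
        return EXCLUDE.search(name) is None
    return {
        parent: [child for child in kids if keep(child)]
        for parent, kids in children.items()
        if keep(parent)
    }


def distances_from(root, children):
    """Breadth-first search: edge count from root to each reachable category."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in dist:
                dist[child] = dist[node] + 1
                queue.append(child)
    return dist


def closest_topics(category, roots, children):
    """Return the main topics that are at minimum distance from the category."""
    found = {}
    for root in roots:
        d = distances_from(root, children).get(category)
        if d is not None:
            found[root] = d
    if not found:
        return set()
    best = min(found.values())
    return {root for root, d in found.items() if d == best}


# Toy category graph, invented for illustration.
toy = {
    "Technology": ["Computing", "Technology companies"],
    "Computing": ["Artificial intelligence"],
    "Arts": ["Digital art"],
    "Digital art": ["Computer art"],
    "Computer art": ["Artificial intelligence"],
}
graph = clean_graph(toy)
nearest = closest_topics("Artificial intelligence", ["Technology", "Arts"], graph)
is_candidate = bool(nearest & {"Technology", "Science", "Engineering"})
```

In this sketch a category is kept as a candidate when its nearest main topic is Technology, Science, or Engineering; the manual expert review described in the text would follow.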

bag-of-words for each technology category. Subsequently,                 Thus, the total number of occurrences of a technology
feature reduction is applied to form usable feature vectors.          𝑡 ∈ 𝑇 in a dataset 𝑑 ∈ 𝐷 over all years 𝑦 ∈ 𝑌 is given by
It is worth noting that optimal results were observed us-             Eq. 2:
ing mutual information-based feature reduction, targeting
a vector length of 1000. Distances to each MTC topic are                  Total(t, d) = \sum_{y \in Y} X_{t,d,y}               (2)
appended to this vector, producing the final feature vectors
as input features.                                                      where:
   To address the imbalance in class distribution caused by
our small training set of 1,356 positive samples, we employ                • Total(t,d) denotes the total count of mentions or oc-
oversampling techniques, using Borderline-SMOTE [29], to                     currences of technology (𝑡) in dataset (𝑑).
increase the size of the input samples. The list of technolo-              • 𝑋𝑡,𝑑,𝑦 is the number of times technology 𝑡 is refer-
gies identified through SVM training is considered the final                 enced in dataset 𝑑 during year 𝑦.
list pertaining to technology.                                             • \sum_{y \in Y} signifies the summation over all years (𝑦)
   This final list is subsequently used to filter annotations                within the specified range 𝑌.
from patents and publications.
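The TF-IDF weighting used to build the per-category bag-of-words can be illustrated with a small standard-library sketch. It covers only the weighting step: stemming, mutual-information feature reduction, Borderline-SMOTE oversampling, and the SVM itself are omitted. The TF-IDF variant (term frequency divided by document length, natural-log inverse document frequency) is an assumption, as are the function names and sample texts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Weight terms per category.

    docs: {category: concatenated abstract text}.
    Returns {category: {term: tf * idf}} with tf = count / document length
    and idf = ln(N / document frequency). This variant is an assumption.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter(term for counter in counts.values() for term in counter)
    weights = {}
    for name, counter in counts.items():
        length = sum(counter.values())
        weights[name] = {
            term: (n / length) * math.log(n_docs / doc_freq[term])
            for term, n in counter.items()
        }
    return weights


# Invented stand-ins for concatenated Wikipedia abstracts of two categories.
docs = {
    "Machine learning": "neural network training data neural model",
    "Cryptography": "encryption key data cipher",
}
weights = tfidf(docs)
```

Terms that occur in every category, such as "data" here, receive zero weight, so the resulting vectors emphasize vocabulary that distinguishes one category from another.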
                                                                        The novelty score Novelty(𝑡) of a technology 𝑡 ∈ 𝑇 is
                                                                      then expressed mathematically as Eq. 3:
Algorithm 2 Technology Classification using SVM
 1: procedure CreateDataset
 2:     Extract abstracts from Wikipedia articles in identi-
    fied technology categories                                        Novelty(t) = \sum_{d \in D} \sum_{y \in Y} \left( \frac{X_{t,d,y}}{Total(t,d)} \times 100 \times w_y \right)   (3)
 3:     Concatenate and stem abstracts, apply TF-IDF-
    based weighting                                                     where:
 4:     Perform feature reduction for usable feature vectors
 5:     Append distances to each MTC topic to create final                 • Novelty(t) represents the novelty score for technology
    feature vectors                                                          (𝑡).
 6: end procedure                                                          • 𝑋𝑡,𝑑,𝑦 is the number of times technology 𝑡 is men-
 7: procedure HandleClassImbalance                                           tioned in dataset 𝑑 during year 𝑦.
 8:     Employ Borderline-SMOTE for oversampling                           • Total(t,d) represents the total occurrences of tech-
 9: end procedure                                                            nology (𝑡) in dataset (𝑑).
10: procedure FinalizeTechnologyList                                       • 𝑤𝑦 is a weight assigned to each year based on Eq. 4.
11:     Use SVM training outcome as the final list of tech-                • \sum_{d \in D} \sum_{y \in Y} denotes double summation over all
    nologies                                                                 datasets (𝐷) and years (𝑌).
12: end procedure
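The novelty score of Eq. 3, using the linear year weights of Eq. 4 (w_y = y + 1 - min(Y)), can be sketched as follows. The data layout, function names, and sample counts are hypothetical, and skipping datasets with no mentions is an assumption made to avoid division by zero.

```python
def year_weights(years):
    """Eq. 4: w_y = y + 1 - min(Y), so the earliest year gets weight 1."""
    first = min(years)
    return {y: y + 1 - first for y in years}


def novelty(series_by_dataset, years):
    """Eq. 3 for one technology.

    series_by_dataset: {dataset: {year: mentions}}.
    """
    weights = year_weights(years)
    score = 0.0
    for series in series_by_dataset.values():
        total = sum(series.get(y, 0) for y in years)  # Eq. 2
        if total == 0:
            continue  # assumption: a dataset without mentions contributes nothing
        for y in years:
            score += series.get(y, 0) / total * 100 * weights[y]
    return score


# Ten-year window with invented counts: one recent and one older technology.
years = range(2014, 2024)
recent = {"patents": {2022: 5, 2023: 15}}
older = {"patents": {2014: 15, 2015: 5}}
recent_score = novelty(recent, years)
older_score = novelty(older, years)
```

With these numbers the technology whose mentions fall in 2022 and 2023 scores 975, against 125 for the one whose mentions fall in 2014 and 2015, which shows how the weights favor recent activity.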
                                                                         Eq. 4 computes the weight for each year based
                                                                      on its relative position within the given range. The weight
                                                                      increases linearly with the year’s distance from the earliest
4.2. Emergence Score                                                  year, so more recent years receive a higher weight:
Novelty Score: Novelty in emerging technologies signi-
fies their distinctive newness, pioneering concepts, break-
through advancements, and creative problem-solving, dis-                        w_y = y + 1 - \min_{y' \in Y} y'                   (4)
tinguishing them from existing solutions and suggesting
                                                                        where:
transformative potential [15, 18].
   In our study, we define novelty for a technology based                  • 𝑦 denotes the specific year for which the weight is
on increased mentions in recent years. For instance, if a                    calculated.




         • \min_{y' \in Y} y' signifies the minimum value among all                    where:
           years in the defined range 𝑌.
                                                                                         • Slope(𝑡, 𝑑) denotes the slope of the growth curve
                                                                                           for technology (𝑡) in dataset (𝑑).
   Growth Score: Emerging technologies exhibit relatively
                                                                                         • min(Slope(𝑇, 𝑑)) represents the minimum slope
fast growth rates compared to non-emerging technologies
                                                                                           value among all technologies in dataset (𝑑).
[18]. The growth rate of a technology, assessed through
                                                                                         • max(Slope(𝑇, 𝑑)) represents the maximum slope
growth curves in patents and publications, has been studied
                                                                                           value among all technologies in dataset (𝑑).
extensively [30, 31, 32]. Using the concept of growth curves,
we employ a two-step approach to compute the growth                                    This normalization process facilitates comparative analy-
score of a technology.                                                              sis across different technologies and datasets.
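The slope normalization can be illustrated with a short sketch. It assumes the conventional min-max form, (Slope(t, d) - min) / (max - min), which the minimum and maximum terms listed above suggest; the exact equation is not reproduced in this part of the text. The function name, the handling of identical slopes, and the sample values are hypothetical.

```python
def normalize_slopes(slopes):
    """Min-max scale the slopes of all technologies in one dataset to [0, 1].

    slopes: {technology: slope of its best-fitting growth curve}.
    """
    lowest = min(slopes.values())
    highest = max(slopes.values())
    if highest == lowest:
        # Assumption: with no spread between technologies, use 0 for all.
        return {technology: 0.0 for technology in slopes}
    return {
        technology: (slope - lowest) / (highest - lowest)
        for technology, slope in slopes.items()
    }


# Invented slopes for three technologies in a single dataset.
slopes = {"deep learning": 4.0, "blockchain": 1.0, "fax machine": -2.0}
normalized = normalize_slopes(slopes)
```

Scaling within each dataset puts patent and publication slopes on a common range before they are combined with the model score.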
   In Step 1, we apply regression techniques to fit the num-                           The technology’s final growth score is then computed by
ber of yearly technology mentions to four different curve                           integrating both the model score, which is determined based
models: Linear, Quadratic, Gaussian, and Exponential 5 . We                         on the best-fitting growth curve model, and the slope score,
select the model with the highest R-squared (𝑅2 ) measure                           reflecting the rate of change in the technology’s mentions
[33] and compute the slope of the curve based on the regres-                        over time, using Eq. 7.
sion coefficients. It is important to note that we assume the
positive or negative sign of the slope determines whether                                         ∑︁
the trend is increasing or decreasing. Subsequently, based                          Growth(t) =         (𝑀 𝑜𝑑𝑒𝑙_𝑠𝑐𝑜𝑟𝑒(𝑡, 𝑑)+𝑁 𝑜𝑟𝑚_𝑠𝑙𝑜𝑝𝑒(𝑡, 𝑑))
on the best-fitting model and the slope, we assign the tech-                                      𝑑∈𝐷
nology to one of the classes defined in Table 1 to compute                                                                                    (7)
the model_score.                                                                      where:
                                                                                         • 𝑀 𝑜𝑑𝑒𝑙_𝑠𝑐𝑜𝑟𝑒(𝑡, 𝑑) denotes the model_score for the
Table 1                                                                                    specified technology (𝑡) in the given dataset (𝑑).
Curve models and growth scores
                                                                                         • 𝑁 𝑜𝑟𝑚_𝑠𝑐𝑜𝑟𝑒(𝑡, 𝑑) denotes the normalized slope for
                   curve model                   model_score                               the specified technology (𝑡) in the given dataset (𝑑).
             Exponent increase/decrease            +/- 1.00                              •
                                                                                           ∑︀
                                                                                              𝑑∈𝐷 indicates the summation across all datasets
             Quadratic increase/decrease           +/- 0.75                                (𝐷) for the specified technology.
             Gaussian increase/decrease            +/- 0.05
              Linear increase/decrease             +/- 0.25
                    Nothing fits                     0.00                              Impact Score: Wikipedia Pageviews represent the num-
                                                                                    ber of times a particular article has been accessed on the
   In Step 2, the slope of the technology growth curve                              Wikipedia website, providing insights into the level of pub-
Slope(𝑡, 𝑑) is calculated by taking the difference between                          lic interest and engagement with specific topics or content.
the absolute counts of the last and the first year and divid-                       Utilizing this information, we leverage Wikipedia Pageview
ing it by the total number of years, as depicted in Eq. 5.                          statistics to compute the impact score of a technology. We
This equation quantifies the rate of change in technology                           use a monthly views to gather more data points. After ex-
mentions over time for a specific technology (𝑡) within a                           tracting the monthly views, denoted as (𝑤), we apply a
dataset (𝑑).                                                                        3-month moving average filter to smooth the time series.
                                                                                    This filter calculates the average of each data point along
                        Count(𝑡, 𝑑, 𝑌final ) − Count(𝑡, 𝑑, 𝑌begin )                 with the two preceding and two succeeding months, effec-
      𝑆𝑙𝑜𝑝𝑒(𝑡, 𝑑) =
                                     𝑌final − 𝑌begin                                tively reducing noise and revealing underlying trends - see
                                                                        (5)         Eq. 8.
      where:
                                                                                                    𝑤𝑖−2 + 𝑤𝑖−1 + 𝑤𝑖 + 𝑤𝑖+1 + 𝑤𝑖+2
         • 𝑌𝑓 𝑖𝑛𝑎𝑙 represents the final year for which the counts                         𝑀 𝐴𝑖 =                                              (8)
                                                                                                                  5
           are considered.
         • 𝑌𝑏𝑒𝑔𝑖𝑛 represents the initial year for which the                            The smoothed data (𝑀 𝐴𝑖 ) then replaces (𝑑) in the two-
           counts are considered.                                                   step approach used for the growth score. We classify the
         • 𝐶𝑜𝑢𝑛𝑡(𝑡, 𝑑, 𝑌𝑓 𝑖𝑛𝑎𝑙 ) denotes the absolute count of                      trends into the same five classes (as seen in Table 1).
           mentions of the technology (𝑡) in the dataset (𝑑)
           during the final year.
                                                                                    Impact(t) = 𝑀 𝑜𝑑𝑒𝑙_𝑠𝑐𝑜𝑟𝑒(𝑡, 𝑀 𝐴𝑖 )+𝑁 𝑜𝑟𝑚_𝑠𝑙𝑜𝑝𝑒(𝑡, 𝑀 𝐴𝑖 )
         • 𝐶𝑜𝑢𝑛𝑡(𝑡, 𝑑, 𝑌𝑏𝑒𝑔𝑖𝑛 ) denotes the absolute count of
                                                                                                                                             (9)
           mentions of the technology (𝑡) in the dataset (𝑑)
                                                                                       Eq. 9 represents the calculation of the impact score
           during the initial year.
                                                                                    Impact(𝑡) for a technology (𝑡). It combines the model score
   Subsequently, all calculated slope values are normalized                         𝑀 𝑜𝑑𝑒𝑙_𝑠𝑐𝑜𝑟𝑒(𝑡, 𝑀 𝐴𝑖 ) and the normalized slope score
to the range [0.0;1.0] using Eq. 6, where Norm_slope(𝑡, 𝑑)                          𝑁 𝑜𝑟𝑚_𝑠𝑙𝑜𝑝𝑒(𝑡, 𝑀 𝐴𝑖 ) obtained from the 3-month mov-
represents the normalized slope.                                                    ing average (𝑀 𝐴𝑖 ) of Wikipedia Pageviews. This score
                                                                                    reflects both the growth pattern and the temporal trends in
                                 Slope(𝑡, 𝑑) − min(Slope(𝑇, 𝑑))                     Wikipedia Pageviews, providing a comprehensive assess-
𝑁 𝑜𝑟𝑚_𝑠𝑙𝑜𝑝𝑒(𝑡, 𝑑) =
                               max(Slope(𝑇, 𝑑)) − min(Slope(𝑇, 𝑑))                  ment of the technology’s impact.
                                                              (6)
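As an illustration, the slope, normalization, and scoring steps of Eqs. 5 to 8 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the study: the curve fitting of Step 1 is not reproduced (the name of the best-fitting model is passed in directly), the function names and the toy counts are hypothetical, a zero slope is treated as "increasing", and the case where all slopes are equal is mapped to 0.0. The moving average follows Eq. 8 as written, that is, a centered mean over five consecutive months.

```python
from typing import Dict, List, Optional

# Score magnitudes copied from Table 1; the sign is taken from the slope.
MODEL_SCORE = {"exponential": 1.00, "quadratic": 0.75, "gaussian": 0.05, "linear": 0.25}


def slope(counts: Dict[int, float]) -> float:
    """Eq. 5: (count in final year - count in first year) / (final year - first year)."""
    y_begin, y_final = min(counts), max(counts)
    return (counts[y_final] - counts[y_begin]) / (y_final - y_begin)


def norm_slopes(slopes: Dict[str, float]) -> Dict[str, float]:
    """Eq. 6: min-max normalization over all technologies of one dataset."""
    lo, hi = min(slopes.values()), max(slopes.values())
    if hi == lo:  # assumption: degenerate case, not covered by the paper
        return {t: 0.0 for t in slopes}
    return {t: (s - lo) / (hi - lo) for t, s in slopes.items()}


def model_score(best_model: Optional[str], s: float) -> float:
    """Table 1: signed score of the best-fitting curve; None means 'nothing fits'."""
    if best_model is None:
        return 0.0
    magnitude = MODEL_SCORE[best_model]
    return magnitude if s >= 0 else -magnitude  # assumption: zero slope counts as increase


def growth(tech: str,
           counts: Dict[str, Dict[str, Dict[int, float]]],
           best_model: Dict[str, Dict[str, Optional[str]]]) -> float:
    """Eq. 7: sum of model_score + Norm_slope over all datasets."""
    total = 0.0
    for dataset, per_tech in counts.items():
        slopes = {t: slope(c) for t, c in per_tech.items()}
        total += model_score(best_model[dataset][tech], slopes[tech])
        total += norm_slopes(slopes)[tech]
    return total


def moving_average(w: List[float]) -> List[float]:
    """Eq. 8: centered five-point mean; the two points at each end are dropped (assumption)."""
    return [sum(w[i - 2:i + 3]) / 5 for i in range(2, len(w) - 2)]


# Toy data (made up for illustration): yearly mention counts in one dataset.
counts = {"patents": {"A": {2010: 10, 2020: 110}, "B": {2010: 50, 2020: 30}}}
best_model = {"patents": {"A": "exponential", "B": "linear"}}
growth_a = growth("A", counts, best_model)
growth_b = growth("B", counts, best_model)
```

For the impact score, the same `slope`, `norm_slopes`, and `model_score` helpers would be applied to the smoothed series returned by `moving_average` instead of the yearly counts.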
5 We utilize Apache Commons SimpleRegression and OLSMultipleLinearRegression for the linear and quadratic models. The same regression tools are used with the logarithm of the data points to derive the exponential and Gaussian models, respectively.

   Coherence Score: In our study, we consider coherence as the persistence of a technology over time, as referred to by [18]. When identifying emerging technologies, we assume that the presence of a category on Wikipedia signifies
a thematic grouping that brings together related technological concepts. The coherence within such categories is established through shared characteristics, applications, and underlying principles of the technologies they encompass. This alignment allows for consistent trends to emerge within the category over time, reflecting the collective evolution of technologies. Wikipedia categorization serves as a valuable indicator of how various technologies within a category develop in tandem, providing insights into the overarching trends and advancements in related technological domains.

   To compute the coherence score, we begin by collecting all unique categories from Wikipedia, forming what we refer to as the 'Category Set'. Subsequently, we perform a mapping process, converting plural category names to their singular counterparts, and then matching them with articles sharing identical names. The coherence score is then computed using Eq. 10:

$$\mathrm{Coherence}(t) = \begin{cases} 0.5, & \text{if } t \in \text{Category Set} \\ 0, & \text{otherwise} \end{cases} \qquad (10)$$

In other words, if the technology ($t$) is part of the Category Set, the coherence score is 0.5; otherwise, it is 0. This mathematical expression reflects the coherent presence of a technology within a specific thematic category.

   Emergence Score: To calculate the emergence score, we sum the novelty, growth, impact, and coherence scores. We then normalize the result to the range [0.0;1.0], as shown in Eq. 11.

$$\mathrm{Emergence}(t) = \mathrm{Norm}\left[ n \cdot \mathrm{Novelty}(t) + g \cdot \mathrm{Growth}(t) + i \cdot \mathrm{Impact}(t) + c \cdot \mathrm{Coherence}(t) \right] \qquad (11)$$

We introduce the control variables $n$, $g$, $i$, and $c$ to empirically manage the impact of biases arising from data imbalance, aiming to achieve the highest precision.

   Technology Class and Technology Class Score: Wikipedia often contains multiple articles that closely relate to one another, such as those on Machine Learning, Deep Learning, and Artificial Neural Networks. To establish connections between these closely related technologies, we employ Wikidata properties such as 'subclass of', 'part of', 'instance of', or 'said to be the same as'. We refer to this group of related technologies as a 'Technology Class'. The Technology Class score (TCs) is the maximum emergence score among the technologies in the set of related technologies ($EC$), as shown in Eq. 12:

$$\mathrm{TCs} = \max_{t \in EC} \mathrm{Emergence}(t) \qquad (12)$$


5. Evaluation

For patents, we gathered the abstracts of 6,647,699 patents from PatentsView. From this dataset, we derived 112,199 unique annotations, of which 77,995 had more than 5 occurrences. Similarly, for publications, we collected the abstracts of 1,425,558 research papers from arXiv. Within this dataset, we identified 111,627 unique annotations with technology classification, of which 65,162 occurred more than 5 times. Our proposed technology classification method identifies 50,954 technologies from the 4,996,310 Wikipedia articles we utilized in our study.

5.1. Results

In this section, we discuss the observations obtained after applying our proposed methodology to the public dataset discussed earlier.

   Individual Scores: Table 2 displays the top 20 technologies with the highest novelty, growth, and impact scores. Notably, technologies related to Artificial Intelligence (AI)† appear among the top 20 across all scores, including Deep Learning and Convolutional Neural Network (CNN) for novelty, and Artificial Intelligence, Machine Learning, and Artificial Neural Network for impact; all except CNN correspond to categories in Wikipedia and are considered coherent.

   In the top 20 novel technologies, alongside AI-related technologies, there are notable mentions of vehicle-related technologies such as Multirotor, Autonomous Car, and Vehicle-to-everything. Nanosheet closes the novelty list, being the only technology not related to either computer science or vehicle technology. Communication ranks first in the list of the top 20 technologies according to the growth score, with Communication-related technologies like Wireless and Data Transmission being other fast-growing terms. The list also includes older technologies that receive continuous or renewed interest, such as Lidar or Rechargeable Battery. Apart from vehicle-related technologies like Unmanned Aerial Vehicle and Autonomous Car, this list is completed by the Internet of Things and Quantum Computing.

   Overall Score: Table 3 presents the overall top 20 technologies after combining the individual scores.

   Deep Learning emerges as the top technology in our methodology, with Convolutional Neural Network (CNN) also making the list as a sub-category of Deep Learning. As anticipated, Machine Learning is present, alongside the Internet of Things, both demonstrating coherence and ranking in the top 20 for impact and novelty, respectively. Cyberattack holds a high position, accompanied by various technologies related to Computer security, forming the second group in the result list. Key-Value Database, the simplest form of NoSQL databases, secures the seventh spot in the top 20 emerging technologies. Communication and Smartphone, technologies that have garnered attention for years, are also on the list. We observe the inclusion of technologies such as Autonomous Car, Knowledge Graph, and 5G in the top 20 scored technologies.

   Our findings align well with similar observations made by Zhou et al. [34] and Daim et al. [35], returning four Convergence Emerging Technologies (CET) in the top five results, with the fifth (CNN) being a sub-class of Deep Learning.

   Table 4 displays the top 20 technology classes identified from the top 100 technologies based on the emergence score. This method of presenting results enhances the visibility of other technologies, such as Virtual Assistant or Exoskeleton.

5.2. Benchmarking

To benchmark the compatibility of our proposed emergence scoring with other similar works, we compiled the union set of emerging technologies identified by leading technology analysts, including Gartner, Forrester, IHS Markit, and
Table 2
Top 20 Technologies in Novelty, Growth, and Impact scores

   Novelty                         Growth                           Impact
   Smart City                      Communication                    URL
   Deep Learning†                  Wireless                         LED Lamp
   POWER8                          Pixel                            Machine Learning†
   Vehicle To Everything           Web Server                       Artificial Neural Network†
   Data Science                    Convolutional Neural Network†    Neural Coding
   Knowledge Graph                 Data Transmission                Robot Locomotion
   Internet of Things              Mathematical Optimization        HTTP Cookie
   Return-Oriented Programming     Stator                           Blockchain
   Smartwatch                      Rechargeable Battery             Artificial Intelligence†
   Multirotor                      Radio-Frequency Identification   Computer Science
   Ransomware                      Unmanned Aerial Vehicle          Sustainable Energy
   Row Hammer                      Internet of Things               BNC Connector
   Software-Defined Networking     Quantum Computing                Electron Backscatter Diffraction
   Convolutional Neural Network†   Computer Data Storage            Slurry Pump
   Virtual Reality Headset         Object Detection                 Cryptocurrency
   High Efficiency Video Coding    Lidar                            Precision and Recall
   Cyber-Physical System           Transfer Learning†               XLR Connector
   Insider Threat                  Unsupervised Learning†           Phishing
   Autonomous Car                  HVAC                             QR Code
   Nanosheet                       Autonomous Car                   PDF



Table 3
Overall Top 20 Technologies

   Technology
   Deep Learning†
   Autonomous Car
   Internet of Things
   Convolutional Neural Network (CNN)†
   Machine Learning†
   Ransomware
   Key-Value Database
   Shard (Database Architecture)
   Cyberattack
   Knowledge Graph
   Augmented Reality
   Smartphone
   Communication
   Side-Channel Attack
   Cloud Gaming
   5G
   Data Science
   Return-Oriented Programming
   Lidar
   Push Technology

Table 4
Overall Top 20 Technology Classes

   Technology Classes
   Artificial Intelligence
   Autonomous Driving
   Internet of Things
   Computer Security
   Database
   Knowledge Graph
   Augmented, Virtual, Mixed Reality
   Connectivity
   Telecommunication
   Cloud and Virtualization
   Data Science
   Optical Instrument
   Virtual Assistant
   Exoskeleton
   Computer Vision
   Satellite Imagery
   Heterogeneous Computing
   Distributed Computing
   Medical Device
   3D Printing
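The scores behind the rankings in Tables 3 and 4 follow Eqs. 10 to 12, and a minimal Python sketch may help make them concrete. It rests on several assumptions that are not stated in the paper: Norm in Eq. 11 is taken to be a min-max normalization over all technologies (by analogy with Eq. 6), the function names are hypothetical, and the input scores are made-up toy values rather than results from the study.

```python
from typing import Dict, Iterable, Set


def coherence(tech: str, category_set: Set[str]) -> float:
    """Eq. 10: 0.5 if the technology name is in the Category Set, otherwise 0."""
    return 0.5 if tech in category_set else 0.0


def emergence(scores: Dict[str, Dict[str, float]],
              n: float = 1.0, g: float = 1.0,
              i: float = 1.0, c: float = 1.0) -> Dict[str, float]:
    """Eq. 11: weighted sum of the four scores, normalized to [0, 1].

    Assumption: Norm is a min-max normalization over all technologies.
    """
    raw = {
        t: n * s["novelty"] + g * s["growth"] + i * s["impact"] + c * s["coherence"]
        for t, s in scores.items()
    }
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:  # assumption: degenerate case, not covered by the paper
        return {t: 0.0 for t in raw}
    return {t: (v - lo) / (hi - lo) for t, v in raw.items()}


def technology_class_score(members: Iterable[str],
                           emergence_scores: Dict[str, float]) -> float:
    """Eq. 12: maximum emergence score among the technologies of one class."""
    return max(emergence_scores[t] for t in members)


# Toy values (made up for illustration only).
category_set = {"Deep Learning"}
scores = {
    "Deep Learning": {"novelty": 1.0, "growth": 1.0, "impact": 0.5,
                      "coherence": coherence("Deep Learning", category_set)},
    "Convolutional Neural Network": {"novelty": 0.5, "growth": 0.5, "impact": 0.0,
                                     "coherence": coherence("Convolutional Neural Network", category_set)},
    "Stator": {"novelty": 0.0, "growth": 0.5, "impact": 0.0,
               "coherence": coherence("Stator", category_set)},
}
em = emergence(scores)  # the 'base' setting: all control variables equal to 1
tcs = technology_class_score(["Deep Learning", "Convolutional Neural Network"], em)
```

Passing different values for `n`, `g`, `i`, and `c` corresponds to changing the control variables; taking the maximum in `technology_class_score` means a class ranks as high as its most emergent member.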


the World Economic Forum (WEF). Gartner predicted 35 technologies in its technology hype cycle, Forrester predicted 12, IHS Markit 8, and WEF 10 emerging technologies. Upon merging the overlapping technologies from these four lists, we derived a consolidated list of 36 unique technology classes, which we use as ground truth. Table 5 provides an overview of these classes.

   Notably, the majority of technologies in this table appear to belong to the Computer Science-related domain, with 72% of them being linked to it. Technologies marked with '†' are those we were unable to directly map to a Wikipedia article or category. Additionally, articles judged as non-technologies by the SVM classifier are indicated in the table with '.'

   It is worth mentioning that Wikipedia articles on Augmented, Mixed, and Virtual Reality are collectively presented, following Forrester's proposal to consider them as a single technology class.

   Table 6 illustrates the performance metrics of Average Precision (AP) and Recall (R) for the top 20 technologies (T) and Technology Classes (TC) identified in the evaluation set.

   In the 'base' run, all control variables in Eq. 11 are set to 1. Additionally, alongside the 'max_prec' parameter set, we present the average precision and recall of the Computer Science technology class (max_prec_cs). Within the top 20 technologies with the highest emergence score, only one non-technology result was observed. The average precision (AP) was 0.72 for the base run. However, all the relevant concepts from this subset relate to only 6 out of the 36 technologies mentioned before, resulting in a recall (R) of 0.16. By changing the control variables for the max_prec run, where non-Computer Science technologies do not grow and have entries in Wikipedia articles, we were able to increase
both AP (0.81) and R (0.19). In this setting, the control variables were chosen to facilitate the maximum precision (e.g., g, n, i, and c set to 1, 0.3, 0.1, and 0.3, respectively).

Table 5
Evaluation Set: Technology classes based on Gartner, Forrester, IHS Markit and WEF

   Technology Classes
   Tissue Engineering
   Unmanned Aerial Vehicle
   Smartdust
   Artificial Intelligence
   4D Printing
   Ontology (Information Science)
   Neuromorphic Engineering
   Exoskeleton
   Edge Computing
   Autonomous Driving
   Self-Healing System Technology†
   Volumetric Display
   5G
   Quantum Computing
   Platform as a Service
   Application Specific Integrated Circuits
   Autonomous Robot
   Mobile Robot
   Brain Computer Interface
   Internet of Things
   Biochip
   Digital Twin
   Nanotechnology
   Virtual Assistant
   Lithium-Silicon Battery
   Blockchain
   Augmented, Virtual, Mixed Reality
   E-textiles
   Cloud Computing
   Computer Vision
   Ubiquitous Video†
   Natural Language Generation
   Switched Fabric
   Personalized Medicine
   Cell Encapsulation
   Gene drive

Table 6
Average Precision (AP) and Recall (R) of Technologies (T) and Technology Classes (TC)

   Parameters     Classes   AP     R
   base           T         0.72   0.16
   max_prec       T         0.81   0.19
   max_prec       TC        0.72   0.28
   max_prec       CS TC     0.79   0.36
   max_prec_cs    CS TC     0.90   0.36


6. Limitations

A bias toward Computer Science is evident in the identified emerging technologies, as already noticed within the evaluation set: 70% of the technologies within the top 100 results belong to this domain. This bias complicates the exploration of trends in other domains. Taking chemistry as an example, the International Union of Pure and Applied Chemistry (IUPAC) issued a list of emerging technologies for this domain, containing, among others, 3D bioprinting and Flow chemistry. Neither of these figures in our evaluation set, but both are present in our technology result set, ranked 4,897 and 12,421, respectively. To address this bias, we split the result set as well as the evaluation set into distinct domains (CS, Nanotechnology, Medicine, etc.). This approach allowed us to navigate around the bias. The third row (CS TC) of Table 6 provides the average precision and recall when only results related to the Computer Science field are considered, as this class is predominant in our result/evaluation sets. Although this approach results in only a 10% increase in average precision, the increase in recall rises to 30%.


7. Conclusion

This paper presents an automated method for identifying emerging technologies using publicly available data. Our approach is applicable across various technology sectors without the need for human domain experts, as it relies on a clear mathematical foundation.

   We propose an emergence scoring system based on novelty, growth, impact, and coherence scores. Novelty and growth scores are computed from time series data of annotations applied to USPTO patents and arXiv publications. The impact score is derived from the Wikipedia Pageview time series, while the coherence score utilizes Wikipedia categories.

   To assess the effectiveness of our proposed methods, we compiled an evaluation set of 36 emerging technologies by amalgamating lists from prominent market research firms like Gartner and Forrester Research. The evaluation unveiled a low recall (0.16) in identifying emerging technologies.

   This research lays the groundwork for further investigations, including the development of a methodology to determine the more fine-grained stages of emergence (e.g., pre-emergence, emergence, post-emergence) for a particular technology within different timeframes.

   Our study can be enhanced by incorporating the OpenAlex concept (footnote 6), which has gained more popularity compared to the now-defunct DBpedia concepts. Additionally, we plan to employ more advanced deep learning models instead of the SVM model, as mentioned in [36, 37], specifically a combination of LSTM and Transformer [38, 39], to conduct more efficient time series analysis. This will be performed using a larger publication dataset than arXiv, such as the one available on OpenAlex (footnote 7). Additionally, since our methodology still requires a certain degree of manual intervention, such as inspecting Wikipedia categories and adjusting bias variables, we want to explore techniques that can minimize these manual components to enhance scalability and reduce potential subjectivity.


Acknowledgments

We extend our thanks to the developers at Trivo Systems (Pratiksha Jain, Himanshu Jain, and Marc Liechti) for their work on the Technology Market Monitoring 1.0 project. We appreciate their valuable contributions to shaping the initial stage of our study. We also extend our thanks to armasuisse Science and Technology for supporting the study.

6 https://docs.openalex.org/api-entities/concepts
7 https://openalex.org/

References                                                                    Measuring technological convergence in encryption
                                                                              technologies with proximity indices: A text min-
 [1] O. Dedehayir, M. Steinert, The hype cycle model: A re-                   ing and bibliometric analysis using openalex, arXiv
     view and future directions, Technological Forecasting                    preprint arXiv:2403.01601 (2024).
     and Social Change 108 (2016) 28–41.                                 [18] D. Rotolo, D. Hicks, B. R. Martin, What is an emerging
 [2] G. Intepe, T. Koc, The use of s curves in technology                     technology?, Research policy 44 (2015) 1827–1843.
     forecasting and its application on 3d tv technology,                [19] T. U. Daim, G. Rueda, H. Martin, P. Gerdsri, Forecast-
     International Journal of Industrial and Manufacturing                    ing emerging technologies: Use of bibliometrics and
     Engineering 6 (2012) 2491–2495.                                          patent analysis, Technological forecasting and social
 [3] S. Ranaei, M. Karvonen, A. Suominen, T. Kässi, Fore-                     change 73 (2006) 981–1012.
     casting emerging technologies of low emission vehicle,              [20] D. Kucharavy, E. Schenk, R. De Guio, Long-run fore-
     in: Proceedings of PICMET’14 Conference: Portland                        casting of emerging technologies with logistic models
     International Center for Management of Engineering                       and growth of knowledge, in: 19th CIRP design con-
     and Technology; Infrastructure and Service Integra-                      ference, 2009, p. 277.
     tion, IEEE, 2014, pp. 2924–2937.                                    [21] M. Bengisu, R. Nekhili, Forecasting emerging technolo-
 [4] J. W. Z. Sossa, F. P. Marro, B. A. Alzate, F. M. V. Salazar,             gies with the aid of science and technology databases,
     A. F. A. Patiño, S-curve analysis and technology life cy-                Technological Forecasting and Social Change 73 (2006)
     cle. application in series of data of articles and patents,              835–844.
     Revista ESPACIOS| Vol. 37 (Nº 07) Año 2016 (2016).                  [22] M. Nieto, F. Lopéz, F. Cruz, Performance analysis of
 [5] S. Kar, A. K. Kar, M. P. Gupta, Understanding the s-                     technology using the s curve model: the case of digital
     curve of ambidextrous behavior in learning emerging                      signal processing (dsp) technologies, Technovation 18
     digital technologies, IEEE Engineering Management                        (1998) 439–457.
     Review 49 (2021) 76–98.                                             [23] M. N. Kyebambe, G. Cheng, Y. Huang, C. He, Z. Zhang,
 [6] R. Adner, R. Kapoor, Innovation ecosystems and                           Forecasting emerging technologies: A supervised
     the pace of substitution: Re-examining technology                        learning approach through patent analysis, Technolog-
     s-curves, Strategic management journal 37 (2016) 625–                    ical Forecasting and Social Change 125 (2017) 236–244.
     648.                                                                [24] S.-Y. Hwang, D.-J. Shin, J.-J. Kim, Systematic review on
 [7] A. L. Porter, J. D. Roessner, X.-Y. Jin, N. C. Newman,                   identification and prediction of deep learning-based cy-
     Measuring national ‘emerging technology’capabilities,                    ber security technology and convergence fields, Sym-
     Science and Public Policy 29 (2002) 189–200.                             metry 14 (2022) 683.
 [8] B. R. Martin, Foresight in science and technology,                  [25] Y. Zhou, F. Dong, Z. Li, J. Du, Y. Liu, L. Zhang, Forecast-
     Technology analysis & strategic management 7 (1995)                      ing emerging technologies with deep learning and data
     139–168.                                                                 augmentation: convergence emerging technologies vs
 [9] N. Corrocher, F. Malerba, F. Montobbio, The emer-                        non-convergence emerging technologies (2017).
     gence of new technologies in the ICT field: main ac-                [26] P. USPTO, Locations that drive innovation, 2023. URL:
     tors, geographical distribution and knowledge sources,                   https://datatool.patentsview.org/, accessed: December
     Technical Report, Department of Economics, Univer-                       9, 2023.
     sity of Insubria, 2003.                                             [27] arXiv, Monthly submissions, 2024. URL: https://arxiv.
[10] M. Halaweh, Emerging technology: What is it, Journal                     org/stats/monthly_submissions, accessed: February 5,
     of technology management & innovation 8 (2013) 108–                      2024.
     115.                                                                [28] P. Analysis, Comparison of pageviews across multi-
[11] S.-C. Hung, Y.-Y. Chu, Stimulating new industries                        ple pages, 2023. URL: https://pageviews.wmcloud.org/,
     from emerging technologies: challenges for the public                    accessed: February 12, 2024.
     sector, Technovation 26 (2006) 104–110.                             [29] H. Han, W.-Y. Wang, B.-H. Mao, Borderline-smote: a
[12] W. Boon, E. Moors, Exploring emerging technologies                       new over-sampling method in imbalanced data sets
     using metaphors–a study of orphan drugs and phar-                        learning, in: International conference on intelligent
     macogenomics, Social science & medicine 66 (2008)                        computing, Springer, 2005, pp. 878–887.
     1915–1927.                                                          [30] B. Andersen, The hunt for s-shaped growth paths in
[13] S. Cozzens, S. Gatchair, J. Kang, K.-S. Kim, H. J. Lee,                  technological innovation: a patent study, Journal of
     G. Ordóñez, A. Porter, Emerging technologies: quan-                      evolutionary economics 9 (1999) 487–526.
     titative identification and measurement, Technology                 [31] M. Meyer, Patent citation analysis in a novel field
     Analysis & Strategic Management 22 (2010) 361–376.                       of technology: An exploration of nano-science and
[14] B. C. Stahl, What does the future hold? a critical view                  nano-technology, Scientometrics 51 (2001) 163–183.
     of emerging information and communication technolo-                 [32] G. S. Day, P. J. Schoemaker, Avoiding the pitfalls of
     gies and their social consequences, in: Researching the                  emerging technologies, California management re-
     Future in Information Systems: IFIP WG 8.2 Working                       view 42 (2000) 8–33.
     Conference, Turku, Finland, June 6-8, 2011. Proceed-                [33] D. S. Moore, Introduction to the Practice of Statistics,
     ings, Springer, 2011, pp. 59–76.                                         WH Freeman and company, 2009.
[15] H. Small, K. W. Boyack, R. Klavans, Identifying emerg-              [34] Y. Zhou, F. Dong, Y. Liu, Z. Li, J. Du, L. Zhang, Forecast-
     ing topics in science and technology, Research policy                    ing emerging technologies using data augmentation
     43 (2014) 1450–1467.                                                     and deep learning, Scientometrics 123 (2020) 1–29.
[16] W. Glänzel, B. Thijs, Using ‘core documents’ for detect-            [35] T. Daim, K. K. Lai, H. Yalcin, F. Alsoubie, V. Kumar,
     ing and labelling new emerging topics, Scientometrics                    Forecasting technological positioning through technol-
     91 (2012) 399–416.                                                       ogy knowledge redundancy: Patent citation analysis
[17] A. Tavazzi, D. P. David, J. Jang-Jaccard, A. Mermoud,                    of iot, cybersecurity, and blockchain, Technological




                                                                    32
     Forecasting and Social Change 161 (2020) 120329.
[36] Y. Zhang, C. Zhang, P. Mayr, A. Suominen, Y. Ding,
     An editorial of “ai+ informetrics”: Robust models for
     large-scale analytics, Information Processing and Man-
     agement (2023) 103495.
[37] W. Xu, J. Jang-Jaccard, A. Singh, Y. Wei, F. Sabrina, Im-
     proving performance of autoencoder-based network
     anomaly detection on nsl-kdd dataset, IEEE Access 9
     (2021) 140136–140146.
[38] Y. Wei, J. Jang-Jaccard, W. Xu, F. Sabrina, S. Camtepe,
     M. Boulic, Lstm-autoencoder-based anomaly detection
     for indoor air quality time-series data, IEEE Sensors
     Journal 23 (2023) 3787–3800.
[39] Y. Wei, J. Jang-Jaccard, F. Sabrina, W. Xu, S. Camtepe,
     A. Dunmore, Reconstruction-based lstm-autoencoder
     for anomaly-based ddos attack detection over
     multivariate time-series data,          arXiv preprint
     arXiv:2305.09475 (2023).



