Link Prediction Method in Graph Objects by Auto
Encoding in Graph Neural Networks
Vladyslav Shlianin1, Yuri Gordienko1 and Sergii Stirenko1
1 National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, 37 Peremohy Avenue, 03056, Kyiv,
Ukraine


Abstract
The link prediction problem is significant for a better understanding of hidden or lost connections
between objects in hierarchical structures such as networks in social, business, biological,
medical, and other domains. Recently, Graph Autoencoders (GAE) and Variational Graph Autoencoders
(VGAE), both based on deep neural networks (DNNs), have emerged as effective tools for solving
various problems. In this paper, their variations are used to solve the link prediction problem for
graph objects in the legal document context. For this purpose, a customized dataset in the form of a
hierarchical set of Ukrainian legal acts adopted by the Ukrainian parliament (Verkhovna Rada of
Ukraine) and the Ukrainian government (the Cabinet of Ministers of Ukraine) was constructed, and its
exploratory data analysis (EDA) was performed. Several GAE and VGAE models were proposed and applied
to the dataset, a comparative analysis was performed for all of the models considered, and
conclusions were drawn about possible further improvements of the proposed method for other
real-world graph data in various domains.

Keywords
Neural networks, deep learning, graph, graph neural networks, autoencoder, graph autoencoding, link
prediction


1. Introduction
The link prediction task is to predict whether a link exists between two nodes in a network.
Examples of link prediction can be found in many everyday applications, such as the search
for friendship connections between users in social networks, the estimation of potential business
connections between companies in markets, the prediction of gene-protein or protein-protein
interactions in biological networks, etc. [1, 2].
   Traditionally, the link prediction problem is solved by assuming that the more similar two nodes
in a graph are, the more likely they are to share an edge [3]. In these approaches, links are
predicted by measuring the similarity between nodes in a graph, taking into account
information about the graph topology. However, not all relations in real-world graphs are based
on similarity. For instance, in some graphs like the legal acts network (see details below),
connections are based on auxiliary references and facts about the availability of other legal acts.
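
   As a minimal sketch of the similarity-based approach described above, the following Python
snippet scores every unlinked node pair by the Jaccard similarity of the two neighbourhoods. The
choice of Jaccard similarity, the function name, and the toy graph are assumptions made for
illustration only; the paper does not prescribe a particular similarity measure.

```python
from itertools import combinations


def jaccard_scores(edges):
    """Score every currently unlinked node pair by the Jaccard similarity
    of the two nodes' neighbourhoods (edges are treated as undirected)."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbours), 2):
        if v in neighbours[u]:
            continue  # the pair is already linked, nothing to predict
        common = neighbours[u] & neighbours[v]
        union = neighbours[u] | neighbours[v]
        scores[(u, v)] = len(common) / len(union)
    return scores


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
candidate_scores = jaccard_scores(toy_edges)
```

   A higher score marks a more likely missing link. Such a baseline uses topology only, which is
exactly the limitation noted above for reference-based graphs such as the legal acts network.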

MoMLeT+DS 2022: 4th International Workshop on Modern Machine Learning Technologies and Data Science, November,
25-26, 2022, Leiden-Lviv, The Netherlands-Ukraine.
Email: vladyslav.shlianin@gmail.com (V. Shlianin); yuri.gordienko@gmail.com (Y. Gordienko);
sergii.stirenko@gmail.com (S. Stirenko)
ORCID: 0000-0003-3833-4957 (V. Shlianin); 0000-0003-2682-4668 (Y. Gordienko); 0000-0002-9395-8685 (S. Stirenko)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
   With the advancement of deep neural networks (DNNs) and graph neural networks (GNNs),
graph autoencoders (GAEs) and variational graph autoencoders (VGAEs) [4] have been proposed
to learn graph embeddings in an unsupervised way. It has been shown that these methods are
effective for link prediction tasks. It is worth noting that GAEs and VGAEs mostly rely on graph
convolutional networks (GCN) to encode nodes [5].
   Such an approach works well on both homogeneous graphs and heterogeneous graphs, also known as
knowledge graphs [6]. One notable example is a study of the knowledge
graph of Austrian judicial and legal acts [7]. However, that study relies on preexisting
models, such as Word2Vec and Doc2Vec.
   Usually, to get data about links between legal acts, Ukrainian lawyers have to visit the
official parliament portal. However, the data entry process there is not automated and, therefore,
the data is incomplete, especially for codified laws and older legal acts. The link prediction
method described in this paper aims to mitigate this problem by helping lawyers and data entry
specialists to obtain more accurate representations of links between acts with the highest possible
precision. In addition to this practical use case, the link prediction problem generally has many
real-life applications, from modelling recommendation systems to predicting user interactions in
social networks. Recently, the link prediction problem has appeared in numerous medical applications,
such as disease-gene association prediction [8] and other critically important medical
problems related to cancer diagnostics and treatment [9, 10].
   In this paper, we focus on the link prediction problem in the context of a hierarchy
of relations between Ukrainian legal acts. To solve this problem, homogeneous graphs are
constructed, different GAE and VGAE models are applied to them, and the performance of the
models is compared experimentally.
   The paper has the following structure: Section 2 (Background and Related Works) contains a
short outline of similar attempts to use various GCNs to investigate the link prediction problem,
Section 3 (Methodology) presents the dataset, the structure of the DNNs, and the metrics used,
Section 4 (Experimental) describes the results obtained, Section 5 (Discussion) analyzes the
methods used, and Section 6 (Conclusions) summarizes the work and possible further improvements.


2. Background and Related Works
Recently, GCNs have been studied to extend the capabilities of neural networks to
data represented as a graph. Designing a convolutional operator is a key issue, and existing
operators can be classified into two categories:

    • Spectral methods [12]
    • Spatial methods [11]

  In this paper, we utilize the widely used convolution operator [13], which can be regarded as
both a spectral and a spatial operator.
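
  For reference, the layer-wise propagation rule of the operator introduced in [13] is commonly
written as below. The notation is an assumption of this note rather than something defined in the
paper: A is the adjacency matrix, I_N the identity matrix, H^{(l)} the node representations at
layer l (with H^{(0)} equal to the node features), W^{(l)} a trainable weight matrix, and \sigma
a non-linearity.

```latex
H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2} \, \tilde{A} \, \tilde{D}^{-1/2} \, H^{(l)} \, W^{(l)} \right),
\qquad \tilde{A} = A + I_N,
\qquad \tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}
```

  The normalized matrix can be read spatially (each node averages over itself and its neighbours)
or spectrally (as a first-order approximation of a spectral filter), which is why the operator
fits both categories.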
  Graph convolutions are applied to graphs in the non-probabilistic GAE [4] and the
VGAE [4] architectures.
  A GAE first transforms each node into a latent representation (i.e., an embedding) via a GCN and
then aims to reconstruct some part of the input. The GAEs proposed in [4], [14], and [15] aim to
reconstruct the adjacency matrix via the decoder, while the GAEs developed in [16] attempt to
reconstruct the content.
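
   The adjacency reconstruction step can be illustrated with a small inner-product decoder, the
decoder type used in [4]. This is a sketch under stated assumptions: the embeddings are made-up
numbers and the function names are hypothetical, not taken from the authors' code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inner_product_decoder(z):
    """Return the matrix of predicted link probabilities sigmoid(z_i . z_j)."""
    return [
        [sigmoid(sum(a * b for a, b in zip(z_i, z_j))) for z_j in z]
        for z_i in z
    ]


# Hypothetical 2-dimensional embeddings for three nodes.
embeddings = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
link_probabilities = inner_product_decoder(embeddings)
```

   Nodes 0 and 1 have aligned embeddings, so their predicted link probability is above 0.5, while
node 2 points the opposite way and gets a probability below 0.5. Note that this decoder is
symmetric in i and j, so by itself it does not distinguish the direction of an edge.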
   Variational Graph Autoencoders (VGAE) follow an approach similar to GAE, with one key
difference: a VGAE embeds the input to a distribution rather than a point, and the decoder
produces an output using a variational approximation
[17]. Such an architecture allows variational autoencoders to generate new data from the original
source dataset, whereas regular autoencoders only produce output similar to the input
[18].
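
   Embedding to a distribution rather than a point is usually implemented with the
reparameterization trick plus a Kullback-Leibler regularizer towards a standard normal prior. The
sketch below assumes that the encoder outputs a mean and a log standard deviation per embedding
dimension, as in [4]; all names and numbers are illustrative.

```python
import math
import random


def reparameterize(mu, log_std, rng):
    """Sample z = mu + exp(log_std) * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_std)]


def kl_to_standard_normal(mu, log_std):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over embedding dimensions."""
    return -0.5 * sum(
        1.0 + 2.0 * s - m * m - math.exp(2.0 * s) for m, s in zip(mu, log_std)
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
mu, log_std = [0.5, -1.0], [-0.7, -0.7]  # hypothetical encoder outputs for one node
z = reparameterize(mu, log_std, rng)
kl = kl_to_standard_normal(mu, log_std)
```

   The sampled z replaces the point embedding of a plain GAE, and the KL term is added to the
reconstruction loss during training.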
   While VGAE is a framework, there are different variations and implementations of it. For
example, an attributed network embedding model using VGAE was proposed in [19] for learning
both node and attribute representations in the same space.
   There is also a variant of VGAE that was created to learn rating embeddings by considering
them for users and review texts [20].
   It is also worth noting some legal domain studies that use DNNs and GCNs
to investigate legal datasets. The Text-guided Graph Reasoning approach [21] was introduced to
combine text representation and structure knowledge. This approach solves graph completion
tasks and utilizes R-GCN and GAT networks. It is also model agnostic and can be implemented
in other GNNs. Another approach for working with legal graphs is to use a DNN hybrid model
[22], which extracts events in the knowledge map. In addition, this approach combines the
advantages of convergence and iterative DNNs for extracting events for common convergence
and bidirectional iterative DNNs. In a fuzzy DNN approach, input data is converted into a
double precision variable [23]. With such an approach, each character sequence is forcibly
transformed into integer variables, which are transformed into floating-point double precision
variables.


3. Methodology
3.1. Dataset
For this work, the hierarchical set of Ukrainian legal acts adopted by the Ukrainian parliament
(Verkhovna Rada of Ukraine) and the Ukrainian government (the Cabinet of Ministers of
Ukraine) was used to prepare the customized dataset with the structure shown in Table 1. A
representative part of the dataset can be accessed on the Kaggle open data platform [24].
  Some of the features in Table 1 have the following additional characteristics:

    • Status (one of the values): undefined (0), taking effect (2), renewed (4), effective (5).
      “undefined” indicates the missing data about the status of the current act.
    • Types: Law (1), Decree (20), Order (30), Codified Law (124), Agenda (201), Constitution
      (100), duplicated value for Constitution (216). It is possible for the act to contain more
      than one type; for instance, codified laws (124) are just laws (1) too.
    • Institutions: The set of IDs of the state institutions that passed the act.

   Also, the dataset contains some relations data between the legal acts in the hierarchical
structure that are described in Table 2.
Table 1
The structure of the customized dataset for the hierarchical set of Ukrainian legal acts
                        Feature          Type                      Example
                        Document ID      ID                        12
                        Title            ASCII text                "Constitution of Ukraine"
                        Status           Integer Enum              2
                        Types            Array of integer Enums    124|1
                        Institutions     Array of IDs              123|15|7
                        Text content     ASCII text                "This law regulates ..."


Table 2
Relation between the legal acts (in Table 1)


                                    Feature                Type       Example
                            Source document                ID         124
                            Target document                ID         305
                              Relation type           Integer Enum    2

Relation type (one of the values): Origin/root (2), Relates to (6)


   Legal acts and, more specifically, links between them are represented with a directed graph,
in which acts are nodes and relations are represented through edges.


Figure 1: Example of dataset structure.
  However, the graph is incomplete, and it contains cliques that are not connected to each
other, so the data splits into a number of separate sub-graphs.
  Here and below, the following attributes are used to evaluate the data:
    • Number of edges
    • Depth: Median and mean depth of each node in graph
    • Maximum clique size
    • Degree centrality: Some legal acts have more connections than others (e.g., the Constitution
      of Ukraine, Codified Laws)
    • Eigenvector centrality: Another metric of node centrality. Unlike regular
      degree centrality, eigenvector centrality measures a node’s importance while
      considering the importance of its neighbors.
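
  The two centrality measures can be sketched in a few lines of Python. The snippet assumes an
undirected toy graph with integer node IDs, normalizes degree by (n - 1), and computes eigenvector
centrality by power iteration on A + I (which has the same eigenvectors as A but avoids
oscillation on bipartite graphs). How the authors computed these values is not stated, so all
names and choices here are illustrative.

```python
import math


def degree_centrality(num_nodes, edges):
    """Fraction of the other nodes that each node is connected to."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [d / (num_nodes - 1) for d in degree]


def eigenvector_centrality(num_nodes, edges, iterations=100):
    """Power iteration on (A + I); the result is normalized to unit length."""
    neighbours = [[] for _ in range(num_nodes)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    x = [1.0] * num_nodes
    for _ in range(iterations):
        # Each node keeps its own score and adds the scores of its neighbours.
        nxt = [x[i] + sum(x[j] for j in neighbours[i]) for i in range(num_nodes)]
        norm = math.sqrt(sum(v * v for v in nxt))
        x = [v / norm for v in nxt]
    return x


# Hypothetical star graph: node 0 is referenced by nodes 1, 2 and 3.
star_edges = [(0, 1), (0, 2), (0, 3)]
dc = degree_centrality(4, star_edges)
ec = eigenvector_centrality(4, star_edges)
```

  In the star graph the hub has degree centrality 1 and the largest eigenvector centrality, which
mirrors the role of heavily referenced acts such as the Constitution in the dataset.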


Table 3
Dataset characteristics

                              Characteristic              Value
                              Number of graphs            427
                              Number of nodes             68841
                              Number of edges             89916
                              Mean depth                  5.95
                              Median depth                11
                              Maximal clique size         8
                              Mean degree centrality      6.7302e-05
                              Median degree centrality    5.1527e-05
                              Mean eigen centrality       0.00029
                              Median eigen centrality     2.1227e-15


3.2. Workflow
In order to construct GNNs, non-categorical strings (act titles and plain text contents) needed
to be transformed into tensor features. This was done with the sentence-transformer
all-MiniLM-L6-v2 model, based on the MiniLM model [25], which maps sentences and paragraphs to a
384-dimensional dense vector space.
   After transforming the dataset to the graph structure, it was split into three subsets: train
(Table 4), validation (Table 5), and test (Table 6). The split was done by randomly splitting the
edges of the graph in such a way that the train split does not include some edges that are
present in the validation and test splits (see explanations in Figure 2 and Figure 3).
   In the same way, the validation split does not include some edges that are present in the
test split (see explanations in Figure 3). It is also worth noting that the set of nodes and the
node features are the same in all the subsets because the model is used to predict links only.
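
   One way to obtain such nested message-passing edge sets is sketched below. The function name,
the fractions, and the use of a disjoint supervision part inside the train split are assumptions
made for illustration; the paper does not state the exact procedure or ratios that were used.

```python
import random


def split_edges(edges, val_fraction=0.2, test_fraction=0.1,
                disjoint_train_fraction=0.1, seed=0):
    """Randomly split edges so that message-passing edges grow from train to
    validation to test, while the edges to be predicted are always held out."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)

    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    n_train = len(shuffled) - n_val - n_test

    train_all = shuffled[:n_train]
    val_edges = shuffled[n_train:n_train + n_val]
    test_edges = shuffled[n_train + n_val:]

    # Part of the train edges is used only as prediction targets, never as messages.
    n_sup = int(len(train_all) * disjoint_train_fraction)
    return {
        "train": {"message": train_all[n_sup:], "labels": train_all[:n_sup]},
        "val": {"message": train_all, "labels": val_edges},
        "test": {"message": train_all + val_edges, "labels": test_edges},
    }


# Hypothetical chain of 100 directed edges (i -> i + 1).
toy_edges = [(i, i + 1) for i in range(100)]
splits = split_edges(toy_edges)
```

   With these illustrative fractions the three message-passing sets contain 63, 70, and 90 of the
100 toy edges, and in every subset the edges to be predicted never appear among the edges used
for message passing.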
   To ensure that all the sets are represented equally, they were evaluated with regard to the
same characteristics as the whole base dataset.
Figure 2: Structure of train set. The red arrows represent message edges which are included in validation
and test subsets but are not included in the train subset.


Figure 3: Structure of validation set. The red arrows represent message edges that are included in the
test subset but not in the validation subset.


3.3. Exploratory Data Analysis
Applying Exploratory Data Analysis (EDA) to the train, validation, and test
subsets shows that they have an equal number of nodes but different numbers of edges. There are
also some differences in the mean and median values of depth, degree centrality, and eigenvector
centrality (Tables 4-6). To investigate this issue, distribution charts were constructed for each of
these characteristics: depth (Figure 4), degree centrality (Figure 5), and eigenvector centrality
(Figure 6). While the mean and median values may differ, the overall distributions
are very similar across all subsets, so the validation and test subsets can be considered
representative with regard to the train subset.

Table 4
Train subset characteristics
                                Characteristic             Value
                                Number of graphs           466
                                Number of nodes            68841
                                Number of edges            56648
                                Mean depth                 4.9106
                                Median depth               11
                                Maximal clique size        6
                                Mean degree centrality     5.2208e-05
                                Median degree centrality   3.1722e-05
                                Mean eigen centrality      0.0002
                                Median eigen centrality    2.4598e-14


Table 5
Validation subset characteristics
                                Characteristic             Value
                                Number of graphs           467
                                Number of nodes            68841
                                Number of edges            62942
                                Mean depth                 4.7862
                                Median depth               10
                                Maximal clique size        6
                                Mean degree centrality     5.4953e-05
                                Median degree centrality   6.0103e-05
                                Mean eigen centrality      0.0002
                                Median eigen centrality    6.4074e-14

   It is also worth mentioning that the edges represented in Fig. 2 and Fig. 3 are used only for
message passing. This is done to exchange neighborhood information and enhance node representations.
Edge labels and edge label indices are completely isolated and are not shared between sets.
Usually, graph autoencoders use the same edges for message passing and for training in the train
and validation sets, but here we enforce additional isolation by also removing message-passing
edges from the training subset to prevent possible data leaks.

3.4. Models
Four different models were created to investigate the effect of GAE/VGAE for efficient link
prediction in the customized dataset. All models implement the GAE architecture, consisting
of an encoder and a decoder. The encoder takes the input data and transforms it into a
lower-dimensional embedding. The decoder then takes this lower-dimensional embedding and
reconstructs the original input [26]. In the GAE architecture, the loss function measures the
amount of information lost during decoding.

Table 6
Test subset characteristics
                               Characteristic             Value
                               Number of graphs           437
                               Number of nodes            68841
                               Number of edges            80925
                               Mean depth                 5.7279
                               Median depth               11
                               Maximal clique size        8
                               Mean degree centrality     6.3060e-05
                               Median degree centrality   5.3643e-05
                               Mean eigen centrality      0.0002
                               Median eigen centrality    6.4101e-15


Figure 4: Depth distribution across sets

  In this experiment, the following GCN models were used:

    • GAE with a two-layer GCN encoder (GCN), shown in Fig. 7,
    • GAE with a single-layer linear GCN encoder (LGCN), shown in Fig. 8,
    • VGAE with a two-layer GCN encoder (VGCN), shown in Fig. 9,
    • VGAE with a single-layer linear GCN encoder (VLGCN), shown in Fig. 10.

  All models implement GCN encoder and dot product decoder. In VGAE-based models,
encoders output mean and variance vectors, which are converted to z-embedding. Encoders of
regular GAE models output z-embedding directly.
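
  The difference between the two-layer and the single-layer linear encoders can be sketched as
a forward pass in plain Python. This is an illustrative reading of the model names, not the
authors' implementation: the graph is treated as undirected, the weights are fixed made-up
numbers instead of trained parameters, and all function names are hypothetical.

```python
import math


def normalized_adjacency(num_nodes, edges):
    """Dense D^-1/2 (A + I) D^-1/2 for an undirected toy graph."""
    a = [[1.0 if i == j else 0.0 for j in range(num_nodes)] for i in range(num_nodes)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def matmul(x, y):
    return [[sum(row[k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for row in x]


def relu(x):
    return [[max(0.0, v) for v in row] for row in x]


def gcn_encoder(a_hat, x, w1, w2):
    """Two-layer encoder: Z = A_hat ReLU(A_hat X W1) W2."""
    hidden = relu(matmul(a_hat, matmul(x, w1)))
    return matmul(a_hat, matmul(hidden, w2))


def linear_encoder(a_hat, x, w):
    """Single-layer linear encoder: Z = A_hat X W (no non-linearity)."""
    return matmul(a_hat, matmul(x, w))


# Hypothetical path graph 0 - 1 - 2 with one-hot node features.
a_hat = normalized_adjacency(3, [(0, 1), (1, 2)])
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w2 = [[1.0, -1.0], [0.5, 0.5]]
z_gcn = gcn_encoder(a_hat, features, w1, w2)
z_lin = linear_encoder(a_hat, features, w1)
```

  In the variational variants, two such encoder heads would produce the mean and the variance
parameters instead of the embedding itself. With two layers each node aggregates information from
its two-hop neighbourhood, whereas the single linear layer only sees direct neighbours.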
Figure 5: Degree centrality distribution across sets


Figure 6: Eigenvector centrality distribution across sets


Figure 7: GCN model schema


Figure 8: LGCN model schema


Figure 9: VGCN model schema


Figure 10: VLGCN model schema


  Training was run for 1000 epochs, after which some standard metrics (area under the
curve (AUC), precision, recall, mean squared error (MSE), R2 score, and F1) were calculated for the
training, validation, and test subsets. Each model was trained 10 times to calculate the
mean performance and standard deviation. Each training run used a randomly
generated seed, and the model was reset each time.
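
  For clarity, the classification metrics and the aggregation over repeated runs can be written
out as follows. The 0.5 decision threshold, the use of the sample standard deviation, and all
numbers are assumptions of this sketch; the paper does not state these details.

```python
import statistics


def precision_recall_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Probability that a random positive edge is scored above a random
    negative edge (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = real edge, 0 = sampled non-edge) and decoder scores.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]
precision, recall, f1 = precision_recall_f1(labels, predictions)
auc = roc_auc(labels, scores)

# Aggregating one metric over repeated runs (made-up values for three seeds).
auc_runs = [0.96, 0.97, 0.95]
auc_mean, auc_std = statistics.mean(auc_runs), statistics.stdev(auc_runs)
```

  Reporting each metric as mean ± standard deviation over the repeated runs, as in Table 7, shows
how sensitive a model is to the random seed.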


4. Experimental
As one can see in Table 7, all models demonstrate relatively high performance metrics. However,
the models perform very similarly: the difference between the AUC values of the worst and
best performing models is about 2% (0.9416 for LGCN vs. 0.9623 for GCN). As one can observe in
Figure 15, the ROC curves of all models almost overlap.

Table 7
Test metrics for researched models
     Metric            GCN                 LGCN                VGCN               VLGCN
     AUC         0.9623 ± 0.0005       0.9416 ± 0.0002     0.9585 ± 0.0015     0.9515 ± 0.0003
     MSE         0.1546 ± 0.0002     0.1636 ± 3.0397e-05   0.1625 ± 0.0005     0.1796 ± 0.0015
     R2          0.3814 ± 0.0011       0.3452 ± 0.0001     0.3499 ± 0.0021     0.2812 ± 0.0062
     Precision   0.9730 ± 0.0003       0.9631 ± 0.0001     0.9718 ± 0.0008     0.9719 ± 0.0001
     Recall      0.9962 ± 0.0004     0.9813 ± 5.4487e-05   0.9942 ± 0.0004   0.9762 ± 5.5481e-05
     F1          0.9845 ± 0.0003     0.9721 ± 6.1622e-05   0.9828 ± 0.0005   0.9740 ± 9.7375e-05
Figure 11: ROC curve for GCN model


Figure 12: ROC curve for LGCN model


Figure 13: ROC curve for VGCN model

   Such high performance may be explained by the fact that references to legal acts
usually consist of mentions of them in the text content of other acts. Thus, the model is mostly
trained to find the embeddings of legal act titles within the embeddings of legal act text content.
   It should be noted that VGCN performed worse than the traditional GCN model, even though
the VLGCN model outperformed the LGCN model in AUC, precision, and F1 (Table 7). By most
metrics, the traditional GCN has the best values among all four models.
Figure 14: ROC curve for VLGCN model


Figure 15: ROC of all of evaluated models


Figure 16: Combined model AUC, Precision, Recall and F1 metrics


   The reason for the poorer behavior of the VGAE models may be that relations between acts are
strict and direct, so any augmentation of existing data or generation of new data may actually
reduce model performance.
   Also, for this specific dataset, the number of GCN layers seems to be more important than
the differences between the GAE and VGAE architectures, which is especially clearly seen in Fig. 16,
where LGCN performs worse than GCN and VLGCN performs worse than VGCN.
Figure 17: Combined model MSE and R2 metrics


5. Discussion
Another popular solution to the link prediction task in GNNs is the LightGCN model [27]. The
LightGCN model implements a GCN that uses only neighborhood aggregation for collaborative
filtering. This is done by linearly propagating node embeddings on the interaction graph [27].
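
   The linear propagation can be sketched as follows: at each layer a node takes the
symmetrically normalized sum of its neighbours' embeddings, without weight matrices or
non-linearities, and the final embedding is the average over all layers. Applying this to a
generic undirected graph, uniform layer weights, and all names and numbers are assumptions of the
sketch.

```python
import math


def lightgcn_embeddings(num_nodes, edges, initial, num_layers=2):
    """Propagate embeddings linearly and average the outputs of all layers,
    including the initial (layer 0) embeddings."""
    neighbours = [[] for _ in range(num_nodes)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    degree = [len(n) for n in neighbours]
    dim = len(initial[0])

    layers = [initial]
    current = initial
    for _ in range(num_layers):
        # Normalized neighbour sum; isolated nodes simply receive zeros.
        current = [
            [
                sum(current[j][d] / math.sqrt(degree[i] * degree[j])
                    for j in neighbours[i])
                for d in range(dim)
            ]
            for i in range(num_nodes)
        ]
        layers.append(current)

    return [
        [sum(layer[i][d] for layer in layers) / len(layers) for d in range(dim)]
        for i in range(num_nodes)
    ]


# Hypothetical graph with a single edge and 1-dimensional embeddings.
final = lightgcn_embeddings(2, [(0, 1)], [[1.0], [3.0]], num_layers=1)
```

   Links are then scored by the inner product of the final embeddings. The only trainable
parameters in this scheme are the initial embeddings.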
   To further investigate GAE and VGAE performance, the LightGCN model was trained on the
same dataset and evaluated with the same metrics, which are shown in Table 8. The ROC
curve was also constructed, as shown in Fig. 18. Considering all LightGCN metrics, we can conclude
that the GAE/VGAE architecture is better suited for the link prediction task on this dataset.

Table 8
LightGCN and GCN metrics comparison
                                 Metric      LightGCN     GCN
                                 AUC          0.9253     0.9623
                                 MSE          0.1840     0.1546
                                 R2           0.2788     0.3814
                                 Precision    0.9405     0.9730
                                 Recall       0.9466     0.9962
                                 F1           0.9486     0.9845

Figure 18: ROC curve for LightGCN model

   The following observations were made after these experiments. All models have quite similar
performance; however, the GCN model (the one with two GCN layers) performs better than
the other models.
   Moreover, models with a single GCN layer (LGCN and VLGCN) perform observably worse than
models with two GCN layers (GCN and VGCN). The difference between GAE and VGAE based models
is insignificant; however, variational models perform slightly worse. The cause of this effect
may be the nature of the specific dataset on which these models were evaluated. This leads
to the conclusion that the number of GCN layers may be more significant in some cases than the
differences between the GAE and VGAE architectures.
   By experimenting with the configuration of GCN layers, prediction performance may be
further improved. For example, further improvements of the proposed approach may
come from the hybridization of graph, convolutional, variational, and other network
components, which was verified in our previous works [28, 29, 30] and other research [31, 32].
   Another possible research subject may include training models on a dataset with act content
split into different parts (for example, by articles) instead of training by the complete act text
content.
   Another promising research topic may be studying the behaviour of GAE/VGAE models on knowledge
graphs constructed from the legal acts dataset. Such a graph may include information about
institutions, dates of publication, and even data about parliament members.
   Also, the aforementioned models were compared with the LightGCN model, which was
trained on the same dataset. Comparing the LightGCN evaluation metrics with those of GAE/VGAE
leads to the conclusion that the latter produce more competitive results.


6. Conclusions
In the context of many practical fields, the link prediction problem is fundamental to
better understanding links between objects in hierarchical constructions such as networks.
Various graph-based DNNs, like GAE and VGAE, have become powerful tools for solving it. In this
work, some variants of graph-based DNNs were used to solve the link prediction
problem among text objects, specifically, legal acts. The corresponding customized dataset was
constructed in the form of a hierarchical set of Ukrainian legislation documents (based on the
legal acts adopted by the Ukrainian parliament and the Government of Ukraine). The exploratory
data analysis (EDA) was performed, and the structure of the training, validation, and test subsets
was analyzed. Several variants of GAE and VGAE models were proposed and applied to the dataset,
namely, the two-layer GCN encoder (GCN), the single-layer linear GCN encoder (LGCN), the
variational two-layer GCN encoder (VGCN), and the variational LGCN (VLGCN). A comparative
analysis of all considered models was performed based on several standard metrics. The conclusion
was drawn that all models have quite similar performance; however, the GCN model (the one with
two GCN layers) performs better than the other models. The comparison with the LightGCN model
leads to the conclusion that the GAE/VGAE models have higher performance metrics. The models with
a single GCN layer (LGCN and VLGCN) perform observably worse than the models with two GCN layers
(GCN and VGCN). In the context of the legal act hierarchy, prediction performance may be further
improved by considering other information about institutions, dates of publication, and even data
about parliament members. In general, further improvements of the proposed method are possible
through other hybridization variants of graph, convolutional, variational, and other network
components.


Acknowledgments
The work was partially supported by “Knowledge At the Tip of Your fingers: Clinical Knowledge
for Humanity” (KATY) project funded from the European Union’s Horizon 2020 research and
innovation program under grant agreement No. 101017453.


References
 [1] L. Getoor, C. P. Diehl, Link mining: a survey, ACM SIGKDD Explorations Newsletter 7 (2005)
     3–12.
 [2] L. Lü, T. Zhou, Link prediction in complex networks: A survey, Physica A: Statistical
     Mechanics and its Applications 390 (2011) 1150–1170.
 [3] S. Kerrache, R. Alharbi, H. Benhidour, A scalable similarity-popularity link prediction
     method, Scientific Reports 10 (2020) 6394. doi:10.1038/s41598-020-62636-1.
 [4] T. N. Kipf, M. Welling, Variational graph auto-encoders, 2016. URL: https://arxiv.org/abs/
     1611.07308. doi:10.48550/ARXIV.1611.07308.
 [5] S. Zhang, H. Tong, J. Xu, R. Maciejewski, Graph convolutional networks: a comprehensive
     review, Computational Social Networks 6 (2019). doi:10.1186/s40649-019-0069-y.
 [6] Z. Ye, Y. J. Kumar, G. O. Sing, F. Song, J. Wang, A comprehensive survey of graph neural
     networks for knowledge graphs, IEEE Access 10 (2022) 75729–75741. doi:10.1109/
     ACCESS.2022.3191784.
 [7] E. Filtz, Knowledge Graphs for Analyzing and Searching Legal Data, Ph.D. thesis, Vienna
     University of Economics and Business, 2021.
 [8] V. Singh, P. Lio, Towards probabilistic generative models harnessing graph neural networks
     for disease-gene prediction, arXiv preprint arXiv:1907.05628 (2019).
 [9] O. Alienin, O. Rokovyi, Y. Gordienko, Y. Kochura, V. Taran, S. Stirenko, Artificial intelligence
     platform for distant computer-aided detection (CADe) and computer-aided diagnosis
     (CADx) of human diseases, in: The International Conference on Artificial Intelligence and
     Logistics Engineering, Springer, 2022, pp. 91–100.
[10] Y. Yakimenko, S. Stirenko, D. Koroliouk, Y. Gordienko, F. M. Zanzotto, Implementation of
     personalized medicine by artificial intelligence platform, in: 2nd International Conference
     on Soft Computing for Security Applications, Springer, 2022.
[11] M. Niepert, M. Ahmed, K. Kutzkov, Learning convolutional neural networks for graphs, in:
     M. F. Balcan, K. Q. Weinberger (Eds.), Proceedings of The 33rd International Conference
     on Machine Learning, volume 48 of Proceedings of Machine Learning Research, PMLR,
     New York, New York, USA, 2016, pp. 2014–2023. URL: https://proceedings.mlr.press/v48/
     niepert16.html.
[12] J. Bruna, W. Zaremba, A. Szlam, Y. LeCun, Spectral networks and locally connected
     networks on graphs, 2013. URL: https://arxiv.org/abs/1312.6203. doi:10.48550/ARXIV.
     1312.6203.
[13] T. N. Kipf, M. Welling, Semi-supervised classification with graph convolutional networks,
     CoRR abs/1609.02907 (2016). URL: http://arxiv.org/abs/1609.02907. arXiv:1609.02907.
[14] C. Wang, S. Pan, R. Hu, G. Long, J. Jiang, C. Zhang, Attributed graph clustering: A deep
     attentional embedding approach, in: Proceedings of the Twenty-Eighth International Joint
     Conference on Artificial Intelligence, IJCAI-19, International Joint Conferences on Artificial
     Intelligence Organization, 2019, pp. 3670–3676. URL: https://doi.org/10.24963/ijcai.2019/509.
     doi:10.24963/ijcai.2019/509.
[15] S. Pan, R. Hu, G. Long, J. Jiang, L. Yao, C. Zhang, Adversarially regularized graph autoen-
     coder for graph embedding, 2018. URL: https://arxiv.org/abs/1802.04407. doi:10.48550/
     ARXIV.1802.04407.
[16] C. Wang, S. Pan, G. Long, X. Zhu, J. Jiang, MGAE: Marginalized graph autoencoder for
     graph clustering, in: Proceedings of the 2017 ACM on Conference on Information and
     Knowledge Management, CIKM ’17, Association for Computing Machinery, New York,
     NY, USA, 2017, p. 889–898. URL: https://doi.org/10.1145/3132847.3132967. doi:10.1145/
     3132847.3132967.
[17] I. Gatopoulos, J. M. Tomczak, Self-supervised variational auto-encoders, Entropy 23 (2021)
     747. URL: https://doi.org/10.3390%2Fe23060747. doi:10.3390/e23060747.
[18] W. Yu, G. Zeng, P. Luo, F. Zhuang, Q. He, Z. Shi, Embedding with autoencoder regular-
     ization, in: H. Blockeel, K. Kersting, S. Nijssen, F. Železný (Eds.), Machine Learning and
     Knowledge Discovery in Databases, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013,
     pp. 208–223.
[19] Z. Meng, S. Liang, H. Bao, X. Zhang, Co-embedding attributed networks, 2019, pp. 393–401.
     doi:10.1145/3289600.3291015.
[20] X. Li, J. She, Collaborative variational autoencoder for recommender systems, in: Pro-
     ceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery
     and Data Mining, KDD ’17, Association for Computing Machinery, New York, NY, USA,
     2017, p. 305–314. URL: https://doi.org/10.1145/3097983.3098077. doi:10.1145/3097983.
     3098077.
[21] L. Li, Z. Bi, H. Ye, S. Deng, H. Chen, H. Tou, Text-guided legal knowledge graph reasoning,
     in: B. Qin, Z. Jin, H. Wang, J. Pan, Y. Liu, B. An (Eds.), Knowledge Graph and Semantic
     Computing: Knowledge Graph Empowers New Infrastructure Construction, Springer
     Singapore, Singapore, 2021, pp. 27–39.
[22] L. Zhou, Event scene method of legal domain knowledge map based on neural network
     hybrid model, Applied Bionics and Biomechanics 2022 (2022) 5880595. URL: https://doi.
     org/10.1155/2022/5880595. doi:10.1155/2022/5880595.
[23] Y. Xie, Application of deep neural network algorithm in the analysis of legal precedent
     citation basis, Mobile Information Systems 2022 (2022) 3383428. URL: https://doi.org/10.
     1155/2022/3383428. doi:10.1155/2022/3383428.
[24] Ukrainian legal acts dataset, 2022. URL: https://www.kaggle.com/datasets/
     vladyslavshlianin/ukrainian-legal-acts, accessed on Sep, 24, 2022.
[25] W. Wang, F. Wei, L. Dong, H. Bao, N. Yang, M. Zhou, MiniLM: Deep self-attention distillation
     for task-agnostic compression of pre-trained transformers, in: H. Larochelle, M. Ranzato,
     R. Hadsell, M. Balcan, H. Lin (Eds.), Advances in Neural Information Processing Systems,
     volume 33, Curran Associates, Inc., 2020, pp. 5776–5788. URL: https://proceedings.neurips.
     cc/paper/2020/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
[26] W. Wang, Y. Huang, Y. Wang, L. Wang, Generalized autoencoder: A neural network
     framework for dimensionality reduction, in: Proceedings of the IEEE Conference on
     Computer Vision and Pattern Recognition (CVPR) Workshops, 2014.
[27] X. He, K. Deng, X. Wang, Y. Li, Y. Zhang, M. Wang, LightGCN: Simplifying and powering
     graph convolution network for recommendation, in: Proceedings of the 43rd International
     ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR
     ’20, Association for Computing Machinery, New York, NY, USA, 2020, p. 639–648. URL:
     https://doi.org/10.1145/3397271.3401063. doi:10.1145/3397271.3401063.
[28] Y. Gordienko, K. Kostiukevych, N. Gordienko, O. Rokovyi, O. Alienin, S. Stirenko, Deep
     learning with noise data augmentation and detrended fluctuation analysis for physical
     action classification by brain-computer interface, in: 2021 8th International Conference
     on Soft Computing & Machine Intelligence (ISCMI), IEEE, 2021, pp. 176–180.
[29] K. Kostiukevych, Y. Gordienko, N. Gordienko, O. Rokovyi, O. Alienin, S. Stirenko, Convo-
     lutional and recurrent neural networks for physical action forecasting by brain-computer
     interface, in: 11th IEEE Int. Conf. on Intelligent Data Acquisition and Advanced Computing
     Systems: Technology and Applications, IEEE, 2021.
[30] K. Kostiukevych, Y. Gordienko, N. Gordienko, O. Rokovyi, S. Stirenko, Hierarchy of hybrid
     deep neural networks for physical action classification by brain-computer interface, in:
     Modern Machine Learning Technologies and Data Science, CEUR, 2022.
[31] S. Zhang, H. Tong, J. Xu, R. Maciejewski, Graph convolutional networks: Algorithms,
     applications and open challenges, in: International Conference on Computational Social
     Networks, Springer, 2018, pp. 79–91.
[32] F. Yang, H. Zhang, S. Tao, Hybrid deep graph convolutional networks, International
     Journal of Machine Learning and Cybernetics (2022) 1–17.