<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Graph Neural Networks for Hotel Recommendation and Quality Assessment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Iveta Mrázová</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marek Behún</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Mathematics and Physics, Charles University</institution>
          ,
          <addr-line>Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Popular travel-related forums provide prodigious support to customers. We scraped 3,125,631 Tripadvisor reviews for 3,260 hotels from 2,296,247 unique authors. This data allows for an extensive exploration from the perspective of social networks. While the hotels and review authors correspond to the nodes of the corresponding network graph, the ratings represent edges. In this paper, we inspect the prospects of graph neural networks for recommender systems. The experiments conducted so far yielded promising results regarding modeling travel-related data despite disregarding textual or image-related parts of the reviews. We see the core contribution of our research in providing a proof of concept for amplifying the power of recommender systems with the principles of social networks and advanced machine learning.</p>
      </abstract>
      <kwd-group>
        <kwd>graph neural networks</kwd>
        <kwd>social network analysis</kwd>
        <kwd>link prediction</kwd>
        <kwd>node's score evolution</kwd>
        <kwd>review rating prediction</kwd>
        <kwd>hotel's star category assessment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Travel-related forums provide prodigious support to
customers. In addition to possible accommodation
mediation, these websites tender invaluable information on
the equipment and overall quality of the hotels and their
neighborhood. For example, Tripadvisor, Inc. [19]
comprises about one billion reviews on eight million
establishments. Successful recommender systems [15] would
utilize the collected information, e.g., to suggest an
alternative accommodation option.</p>
      <p>Previous research in this field has focused on text-based sentiment analysis of user opinions. Yet, the scraped data records allow for a study from the perspective of social networks (SNs). While the hotels and review authors correspond to the nodes of an SN graph, the ratings represent edges. To facilitate learning in SNs, e.g., when modeling the evolution of a network, the concept of graph neural networks (GNNs) [17] might constitute a viable approach.</p>
      <p>Figure 1: Screenshot of the top section of the Tripadvisor page for the Casablanca Hotel in New York City [19].</p>
      <p>Our ultimate objective is thus to explore the prospects of GNNs in support of a travel-related recommender system. We see the core contribution of our research in providing a proof of concept for utilizing extensive data rich in structure in a way that allows autonomous learning of their mutually intertwined relationships.</p>
      <p>The paper is organized as follows: Section 2 explains the principles of SNs, GNNs, and the visualization of high-dimensional data. Section 3 outlines the construction of the relevant SNs. Section 4 specifies the heterogeneous GNNs [9] and graph convolutional recurrent networks (GCRN) [18] we have used; the analyzed tasks refer to review rating prediction, hotel category assessment, and hotel score evolution. Conclusions summarize the results and outline future enhancements.</p>
      <sec id="sec-1-1">
        <title>2. Related Work</title>
        <p>Static SNs are defined as a tuple (V, E, X, X_e). (V, E) denotes the graph of actors and the relationships between them; X states the actor attributes and X_e the edge attributes, respectively. For the Tripadvisor data, X might reflect the star category of the hotels or the IDs of the review authors, and X_e can indicate the awarded review rating. Dynamic SNs consist of a finite sequence of static SNs.</p>
        <p>A bipartite SN (V_1, V_2, E, X_1, X_2, X_e) admits relationships only between two disjoint groups of actors, V_1 and V_2. X_1 and X_2 are the actor attributes, and X_e specifies the edge attributes. The actual role of an actor is often related to the connectivity structure of the whole SN. E.g., the so-called eigenvector centrality [<xref ref-type="bibr" rid="ref1">1</xref>] sums up the importance of the neighbors of an actor with a damping factor λ: c(v) = λ^(−1) Σ_(u∈ne(v)) c(u).</p>
        <p>Real-world SNs form massive graphs with millions of nodes and edges [<xref ref-type="bibr" rid="ref2">2</xref>] that would require full-batch training for a traditional memory setting comprising the entire graph. [17] introduced the graph neural network (GNN) model to facilitate learning on graphs. GNNs capture the mutual dependencies of graph nodes via message passing. [23] proposes a general pipeline design for GNN models and reviews state-of-the-art methods relevant to GNNs and their applications. A comprehensive survey of recent GNN models can be found in [22].</p>
        <p>A generalization of the efficient convolutional neural network (CNN)-like information processing to graphs involves the so-called graph Fourier transform and is principal to constructing graph convolutions [<xref ref-type="bibr" rid="ref3">3</xref>]. To produce networks universal to any graph structure, the ChebNet model uses Chebyshev polynomials of order K to determine the filters on graphs [5]. Graph Convolutional Networks (GCNs) [11] represent a first-order approximation to ChebNets. Yet, due to vanishing gradients, GCNs are limited to shallow models (3 or 4 layers).</p>
      </sec>
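      <p>The ChebNet filters mentioned above rest on the Chebyshev recurrence T_0(x) = 1, T_1(x) = x, T_k(x) = 2x T_(k−1)(x) − T_(k−2)(x), applied to a rescaled graph Laplacian. A minimal scalar sketch of the recurrence (our own illustration; the graph case substitutes matrix products for the scalar multiplications):</p>
      <p>
```python
def cheb_polynomials(x, K):
    """Evaluate the Chebyshev polynomials T_0..T_K at x via the
    recurrence T_k(x) = 2*x*T_{k-1}(x) - T_{k-2}(x)."""
    T = [1.0, x]
    for _ in range(2, K + 1):
        T.append(2.0 * x * T[-1] - T[-2])
    return T[:K + 1]

# For K = 2: T_0(x) = 1, T_1(x) = x, T_2(x) = 2x^2 - 1
print(cheb_polynomials(0.5, 2))  # [1.0, 0.5, -0.5]
```
      </p>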
      <sec id="sec-1-7">
        <title>To boost the scalability of GNNs, the GraphSAGE model (SAmple and aggreGatE) [8] introduces recursive node-wise sampling.</title>
        <p>Recursive neighborhoods may, on the other hand, incur an exponential memory complexity that grows with the number of GNN layers.</p>
        <p>The Graph Attention Networks (GATs) [21] further learn to pay different attention to the respective node's neighbors. The Graph Autoencoders (GAEs) [12] leverage the GCNs for encoding, with ReLU used as the activation function. Decoding may be implemented as an inner product of the node embeddings. GCRN models combine the ChebNet convolution with LSTM or GRU units to process temporal SN data [18].</p>
        <sec id="sec-1-7-1">
          <title>2.1. The UMAP Visualisation Technique</title>
          <p>To grasp how the GNNs represent the high-dimensional graph data, we will use the so-called Uniform Manifold Approximation and Projection (UMAP) visualization technique [13] that maps n originally high-dimensional data points x_1, …, x_n ∈ R^d to lower-dimensional (usually 2D) data points y_1, …, y_n ∈ R^d′ in a non-linear way. In this paragraph, X = (x_1, …, x_n) ∈ R^(n×d) represents the high-dimensional data point matrix and Y = (y_1, …, y_n) ∈ R^(n×d′) the data point matrix after dimensionality reduction.</p>
        </sec>
      </sec>
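      <p>The node-wise neighborhood aggregation at the heart of GraphSAGE-style models can be illustrated with a toy mean aggregator on scalar features (a simplified sketch of the idea, not the library implementation; the dict-based graph and the fixed 0.5/0.5 mixing are made up for the example):</p>
      <p>
```python
def mean_aggregate(features, neighbors):
    """One message-passing step: each node's new feature is the mean of
    its own feature and the average of its neighbors' features."""
    out = {}
    for node, feat in features.items():
        ne = neighbors.get(node, [])
        if ne:
            ne_mean = sum(features[u] for u in ne) / len(ne)
        else:
            ne_mean = 0.0
        out[node] = 0.5 * (feat + ne_mean)
    return out

feats = {"a": 1.0, "b": 3.0, "c": 5.0}
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(mean_aggregate(feats, adj))  # {'a': 2.0, 'b': 3.0, 'c': 4.0}
```
      </p>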
      <sec id="sec-1-8">
        <title>Neighbor similarities</title>
        <p>Neighbor similarities visualized in the low-dimensional space should match the similarities existing for the high-dimensional data points as closely as possible. Dimensionality reduction techniques adjust the positions of the low-dimensional data points through gradient descent to comply with this effort. The loss function evaluates the discrepancy between the high- and low-dimensional neighbor similarities.</p>
        <p>To visualize the data points, UMAP considers the neighbor similarities p_(j|i) only for the k nearest neighbors of x_i. For j ≠ i, we define the conditional neighbor similarity of x_j to x_i (p_(j|i) &gt; 0; 1 ≤ i, j ≤ n) as: p_(j|i) = exp(−(d(x_i, x_j) − ρ_i)/σ_i) if d(x_i, x_j) ≥ ρ_i, and p_(j|i) = 1 otherwise, with a (not necessarily Euclidean) metric d(·, ·) in the high-dimensional space and ρ_i being the distance to the nearest neighbor in this metric. The symmetrized neighbor similarities p_ij are then determined as p_ij = p_(j|i) + p_(i|j) − p_(j|i) · p_(i|j). The formula for the differentiable low-dimensional similarities q_ij has the form of: q_ij = (1 + a ‖y_i − y_j‖^(2b))^(−1), with the parameters a and b to be fitted by a non-linear least squares method.</p>
        <p>Before the gradient-descent-like adjustment of the low-dimensional data points, a weighted graph G_X = (X, E_X, P) is constructed for the high-dimensional data point matrix X with the weights p_ij. The so-called spectral embedding method initializes the low-dimensional data point matrix Y. UMAP uses the loss function ℒ = − Σ_({i,j}∈E_X) (p_ij log q_ij + (1 − p_ij) log(1 − q_ij)).</p>
        <p>For performance reasons, UMAP uses stochastic gradient descent with negative sampling to optimize ℒ. In each iteration, the algorithm updates the low-dimensional points y_i by the gradient of log q_ij and randomly selects several y_k as negative samples, updating y_i by the gradient of log(1 − q_ik).</p>
      </sec>
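      <p>The UMAP similarity and loss formulas translate directly into code; the following pure-Python sketch (our own illustration of the formulas, not the umap-learn library) computes a conditional similarity, its symmetrization, and the cross-entropy contribution of a single edge:</p>
      <p>
```python
import math

def conditional_similarity(d, rho, sigma):
    """Conditional similarity p_{j|i}: exp(-(d - rho)/sigma) when the
    distance d reaches at least rho (the distance to the nearest
    neighbor), and 1 otherwise."""
    if d >= rho:
        return math.exp(-(d - rho) / sigma)
    return 1.0

def symmetrize(p_ji, p_ij):
    """Symmetrized similarity p_ij = p_{j|i} + p_{i|j} - p_{j|i}*p_{i|j}."""
    return p_ji + p_ij - p_ji * p_ij

def loss_term(p, q):
    """Cross-entropy contribution of one edge to the UMAP loss."""
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))

print(symmetrize(0.5, 0.5))  # 0.75
```
      </p>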
    </sec>
    <sec id="sec-2">
      <title>3. Data Source</title>
      <sec id="sec-2-1">
        <title>The Tripadvisor [19] portal provides helpful information</title>
        <p>on travel destinations, such as descriptions and photos of
the hotels and their prices, availability, and booking
options. It is also a popular travel review website that ofers
millions of reviews and ratings from travelers worldwide.</p>
        <p>On the Tripadvisor platform, each served HTML document contains the definition of a JavaScript Object Notation (JSON) object with all the information displayed on the page. After running the scraping utility for several days non-stop (without triggering any Denial of Service protection), we acquired records of 3,125,631 Tripadvisor reviews for 3,260 hotels from 2,296,247 unique authors, see Table 1. Yet the scraped dataset contains only hotel reviews from several US cities. Furthermore, the Covid pandemic of the last few years seriously disturbed traveling globally. Both these facts may have introduced a specific bias into the studied data. Overall, the scraped data elucidates two intriguing observations:
• The number of submitted reviews per year peaked in 2016 and began to fall afterward.
• In 2020, the number of reviews fell even more due to the Covid pandemic but resumed growth in 2021 and 2022.</p>
      </sec>
      <sec id="sec-2-2">
        <title>After data acquisition, the next step is to transform the data</title>
        <p>Each entry is flattened and contains only the relevant information:1
• Flattening. Both hotel and review records contain normalized information: hotels list their amenities and languages, while reviews contain lists of ratings. All these fields are flattened to facilitate deep learning computations with tensors.
• Filtering. We remove redundant textual information from the records, such as names, descriptions, URLs, and addresses. Similarly, we consider just a pre-selected number of the most frequent amenities and languages the scraped hotels provide. For the actual experiment settings, refer to Section 4.</p>
        <p>1 The scraped data and the entire code we wrote for the study experiments are publicly available at https://github.com/elkablo/gnn-social-tripadvisor.</p>
      </sec>
      <sec id="sec-2-4">
        <sec id="sec-2-4-1">
          <title>3.2. Graph construction</title>
          <p>Based on the preprocessed dataset, we can build a bipartite social network S = (A, H, R, X_A, X_H, X_e), where:
• A is the set of review authors,
• H is the set of reviewed hotels,
• R ⊆ A × H is the set of reviews, i.e., author-hotel pairs associated with a rating,
• X_A and X_H are the author and hotel attribute functions that assign to each of the respective author or hotel nodes an attribute value, e.g., the author ID or the eigenvector centrality (for further examples, refer to Section 4),
• X_e : R → {1, 2, 3, 4, 5} is the edge attribute function, a function that maps author-hotel pairs to rating values: X_e(a, h) = r means that author a gave rating r to hotel h.</p>
        </sec>
      </sec>
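        <p>The bipartite social network defined above can be represented minimally as two node sets plus a rating map; a toy sketch with made-up author and hotel IDs:</p>
        <p>
```python
# A toy bipartite review network: authors, hotels, and rated edges.
authors = {"a1", "a2"}
hotels = {"h1", "h2"}
# Edge attribute function Xe: (author, hotel) -> rating in {1..5}
ratings = {("a1", "h1"): 5, ("a1", "h2"): 3, ("a2", "h1"): 4}

def degree(node, edges):
    """Number of reviews incident to an author or hotel node."""
    return sum(1 for (a, h) in edges if node in (a, h))

print(degree("a1", ratings), degree("h1", ratings))  # 2 2
```
        </p>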
      <sec id="sec-2-5">
        <title>Because of limited cluster machine memory, the</title>
        <p>dataset preprocessing utility allows specifying the minimum number of reviews, r_a and r_h, required for each author a and hotel h. Only those authors and hotels fulfilling this requirement will be kept. The resulting graph will be called the</p>
      </sec>
      <sec id="sec-2-6">
        <title>Filtered Review Graph. Figure 3 illustrates the process.</title>
        <p>Definition 1 (Filtered Review Graph).</p>
        <p>Let  = (, , , ,  , ) be a bipartite social
network of authors , hotels  and reviews , further
let , ℎ ∈ N. The filtered review graph of 
with parameters , ℎ is the maximum subnetwork
 = ( ∪ ,  ,  ,  ,  );  ⊆ ,  ⊆ ,
and  ⊆  ; ∀ ∈  : deg() ≥  and
∀ℎ ∈  : deg(ℎ) ≥ ℎ.</p>
        <p>We need to work with monopartite networks when analyzing the eigenvector centralities of the hotels and authors. In such a case, we will use the so-called bipartite network projections. We create a network of authors, where each author is connected to another if they have reviewed the same hotel. The strength of such an association grows with the number of hotels visited and reviewed by both authors. For an illustration of a projection, see Figure 4. A monopartite network for the hotels can be created in a similar way.</p>
        <p>Definition 2 (Projection to Authors and Hotels).
Let S = (A, H, R, X_A, X_H, X_e) be a bipartite SN of authors A, hotels H, and reviews R, and let r, c, n ∈ N be the minimum number of reviews, common associations, and neighbors, respectively.</p>
        <p>The projection P_A of S to the authors A with the parameters r, c, n is constructed by:
• first removing from S all author nodes a ∈ A with deg(a) &lt; r,
• then constructing the bipartite network projection S̃ = (A, E_A, X_A, X_eA) of S to A with the set of edges between the authors E_A ⊆ A × A and the author edge attribute function X_eA given by X_eA((a, a′)) = |{h ∈ H | (a, h) ∈ R ∧ (a′, h) ∈ R}|,
• then removing from S̃ all edges e ∈ E_A with X_eA(e) &lt; c (so that only edges representing at least c common associations are kept),
• then successively removing from S̃ all nodes with fewer than n neighbors,
• finally normalizing the edge attributes in S̃ by setting them to X′_eA, with X′_eA(e) = X_eA(e) / max_(e′∈E_A) X_eA(e′).</p>
        <p>The projection P_H of S to the hotels H is constructed in an analogous way (just swapping the hotels and authors in the above definition).</p>
      </sec>
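      <p>The core of Definition 2, weighting each author pair by the number of commonly reviewed hotels and normalizing by the maximum weight, can be sketched as follows (a simplified illustration that skips the degree and neighbor filtering steps; the review data are made up):</p>
      <p>
```python
from itertools import combinations

def project_to_authors(ratings):
    """Weight each author pair by the number of commonly reviewed
    hotels, then normalize by the maximum weight (cf. Definition 2)."""
    by_hotel = {}
    for (a, h) in ratings:
        by_hotel.setdefault(h, set()).add(a)
    weights = {}
    for members in by_hotel.values():
        for pair in combinations(sorted(members), 2):
            weights[pair] = weights.get(pair, 0) + 1
    m = max(weights.values(), default=1)
    return {pair: w / m for pair, w in weights.items()}

r = {("a1", "h1"): 5, ("a2", "h1"): 4, ("a1", "h2"): 3, ("a2", "h2"): 2,
     ("a3", "h2"): 5}
print(project_to_authors(r))
# {('a1', 'a2'): 1.0, ('a1', 'a3'): 0.5, ('a2', 'a3'): 0.5}
```
      </p>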
      <sec id="sec-2-7">
        <title>In the last experiment, we will aim at predicting hotel</title>
        <p>eigenvector centrality scores as they change over time.</p>
      </sec>
      <sec id="sec-2-8">
        <title>Therefore, we will work with a dynamic SN of hotels called temporal projection.</title>
        <p>Definition 3 (Temporal Projection).</p>
        <p>Let S = (A, H, R, X_A, X_H, X_e). Further, let T = {t_0, t_1, t_2, …, t_k | t_0 &lt; t_1 &lt; t_2 &lt; · · · &lt; t_k} be a partition of the original time span for the reviews available in S. Let S_i be the maximum subnetwork of S which contains only those reviews written at a time t with t_(i−1) &lt; t ≤ t_i, and where each author and hotel have at least one neighbor.</p>
        <p>The temporal projection of S to the authors A with the parameters T, r, c, n is the sequence of monopartite networks P_(T,A) = (P_(i,A)) for i = 1, …, k, where P_(i,A) is the projection of the network S_i to the authors A with the parameters r, c, n.</p>
        <p>The temporal projection of S to the hotels H, P_(T,H), is created analogously, just by using the projection to the hotels instead of the authors.</p>
      </sec>
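      <p>The temporal projection starts by bucketing the reviews into the time partition; a monthly bucketing sketch over hypothetical timestamped records:</p>
      <p>
```python
from collections import defaultdict
from datetime import date

def monthly_snapshots(reviews):
    """Group timestamped reviews into monthly buckets, one per
    partition interval of the time span."""
    buckets = defaultdict(list)
    for author, hotel, day in reviews:
        buckets[(day.year, day.month)].append((author, hotel))
    return dict(sorted(buckets.items()))

revs = [("a1", "h1", date(2016, 5, 2)), ("a2", "h1", date(2016, 5, 30)),
        ("a1", "h2", date(2016, 6, 1))]
snaps = monthly_snapshots(revs)
print(list(snaps))  # [(2016, 5), (2016, 6)]
```
      </p>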
    </sec>
    <sec id="sec-3">
      <title>4. Supporting Experiments</title>
      <p>In our experiments, we worked with a bipartite SN S = (A, H, R, X_A, X_H, X_e) and used a filtered review graph with r_a = r_h = 12. The resulting preprocessed dataset contained 76,692 reviews of 1,287 hotels from 4,404 authors. While the author features, X_A, correspond to one-hot encodings of the author IDs, the hotel feature matrix X_H used for this task contains:
• the hotel class,
• the existence of a hotel website (binary),
• the presence of the 15 most popular amenities (binary),
• the availability of the 5 most popular languages (binary),
• optionally, other information like the eigenvector centrality or the hotel-hotel links used by the projection to the hotels, also included as a weighing factor in some models.</p>
      <sec id="sec-3-1">
        <title>As the filtered review graphs contain two distinct types</title>
        <p>of nodes, we will employ the heterogeneous graph
transform [9]. See Figure 5 for an illustration of the bipartite
author-hotel-review social network and its encoding, .</p>
        <p>Definition 4 (Heterogeneous GNN).</p>
        <p>Let
• A ∈ {0, 1}^(|A|×|H|) be the adjacency matrix of the reviews written by the authors for the hotels,
• X_e ∈ R^(d_e×|R|) be the author-hotel edge feature matrix, with d_e being the dimensionality of the review embeddings,2
• W_H ∈ R^(|H|×|H|) and W_A ∈ R^(|A|×|A|) be the weighted adjacency matrices of the projections to the hotels and authors,
• H_H^(l) ∈ R^(d_H^(l)×|H|) and H_A^(l) ∈ R^(d_A^(l)×|A|) be the current hidden state matrices of the hotel and author nodes,
• d_e, d_H^(l), d_A^(l) ∈ N be the dimensionalities of the review, hotel, and author input embeddings for layer l, while d_H^(l+1), d_A^(l+1) ∈ N be the corresponding dimensionalities of the output embeddings for layer l,
• F_(A→H) : R^(d_A^(l)×|A|) × {0, 1}^(|A|×|H|) → R^(d_H^(l+1)×|H|) be a graph neural function that computes the embeddings of the hotel nodes based on the current activation states of the author nodes H_A^(l) and the adjacency matrix A,
• F_(H→A) : R^(d_H^(l)×|H|) × {0, 1}^(|A|×|H|) → R^(d_A^(l+1)×|A|) be a graph neural function that computes the embeddings of the author nodes from the current activation states of the hotel nodes H_H^(l) and the adjacency matrix A,
• F_(H→H) : R^(d_H^(l)×|H|) × R^(|H|×|H|) → R^(d_H^(l+1)×|H|) be a graph neural function that computes the embeddings of the hotel nodes from their hidden states H_H^(l) and the weighted adjacency matrix W_H,
• F_(A→A) : R^(d_A^(l)×|A|) × R^(|A|×|A|) → R^(d_A^(l+1)×|A|) be a graph neural function that computes the embeddings of the author nodes from their hidden states H_A^(l) and the weighted adjacency matrix W_A.
We can use as a graph neural function, e.g., SAGE, GAT, GCN, ChebNet, LSTM or GRU, among others.</p>
        <p>The (sum-aggregating) heterogeneous graph layer HetLayer computes the next hidden activation states of the author and hotel nodes, (H_A^(l+1), H_H^(l+1)) ∈ R^(d_A^(l+1)×|A|) × R^(d_H^(l+1)×|H|), as:
H_H^(l+1) = ReLU(F_(A→H)(H_A^(l), A, X_e)) + ReLU(F_(H→H)(H_H^(l), W_H)),
H_A^(l+1) = ReLU(F_(H→A)(H_H^(l), A, X_e)) + ReLU(F_(A→A)(H_A^(l), W_A)).</p>
        <p>2 The review feature matrix X_e is not used in the review rating prediction task, but it is used in hotel class prediction.</p>
        <p>Our command-line interface (CLI) utilities, written in Python [20], benefit from the PyTorch Geometric [6], PyTorch Geometric Temporal [16], and NetworkX [7] libraries.</p>
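        <p>The sum-aggregating HetLayer update can be illustrated on scalar embeddings; the toy sums below stand in for the graph neural functions F (our own simplification for illustration, not the PyTorch Geometric implementation):</p>
        <p>
```python
def relu(x):
    return max(0.0, x)

def het_layer(h_hotels, h_authors, reviews, hotel_edges):
    """One sum-aggregating heterogeneous step on scalar embeddings:
    new hotel state = ReLU(sum over reviewing authors)
                    + ReLU(sum over linked hotels)."""
    new_h = {}
    for h in h_hotels:
        het = sum(h_authors[a] for (a, hh) in reviews if hh == h)
        homo = sum(h_hotels[g] for (g, hh) in hotel_edges if hh == h)
        new_h[h] = relu(het) + relu(homo)
    return new_h

hotels = {"h1": 1.0, "h2": 2.0}
authors = {"a1": 0.5, "a2": -1.0}
print(het_layer(hotels, authors, [("a1", "h1"), ("a2", "h1")],
                [("h2", "h1")]))  # {'h1': 2.0, 'h2': 0.0}
```
        </p>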
        <p>We applied 10-fold cross-validation for training and testing.4 The Adam method [10] helped optimize the models, with the parameters β_1 = 0.9, β_2 = 0.999 and with a learning rate of 0.001 for 800 epochs. The mean squared error (MSE) between the true ratings x and the predicted ratings y, MSE(x, y) = mean((x_i − y_i)^2), represented the loss function. We report the results in terms of the root mean squared error RMSE(x, y) = √MSE(x, y).</p>
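        <p>The loss and the reported metric reduce to a few lines; a sketch of MSE and RMSE over rating vectors:</p>
        <p>
```python
import math

def mse(x, y):
    """Mean squared error between true ratings x and predictions y."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def rmse(x, y):
    """Root mean squared error, the reported metric."""
    return math.sqrt(mse(x, y))

print(rmse([5, 3, 4], [4, 3, 4]))
```
        </p>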
        <sec id="sec-3-1-1">
          <title>4.1. Review Rating Prediction</title>
          <p>Each review incorporates several ratings from 1 to 5 that the evaluating author awarded to the reviewed hotel. Missing rating values for the author-hotel pairs naturally raise the question of a possible review rating prediction. Formally, we will work with a bipartite SN S = (A, H, R, X_A, X_H, X_e), and our objective will be to extend the original domain of the edge attribute function X_e to the entire set of possible edges A × H, by leveraging the information given by the attribute functions X_A, X_H, and X_e.</p>
          <p>In the context of GNNs, review rating prediction epitomizes a link label prediction problem (i.e., we predict the overall hotel rating; the per-item ratings remain ignored) and belongs to the area of recommender systems.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3https://github.com/elkablo/gnn-social-tripadvisor.</title>
      </sec>
      <sec id="sec-3-3">
        <title>4Computational resources were provided by MetaCentrum NGI</title>
        <p>(https://www.metacentrum.cz/en) under the e-INFRA CZ project
(ID:90254) supported by the Ministry of Education, Youth, and
Sports of the Czech Republic.</p>
        <p>Definition 5 (Review Rating Prediction Model).
Let HetGNN be a heterogeneous graph neural network constructed for X_A, X_H, A, W_H, and W_A. Then, the review rating prediction model is a graph autoencoder network model with the encoder enc defined as: enc(X_A, X_H, A, W_H, W_A) = HetGNN(X_A, X_H, A, W_H, W_A) = (H_A, H_H); (H_A, H_H) ∈ R^(d_A×|A|) × R^(d_H×|H|). The decoder dec is defined for the hotel h ∈ H embedding h_h = (H_H)_h ∈ R^(d_H) and the author a ∈ A embedding h_a = (H_A)_a ∈ R^(d_A) as: dec(h_h, h_a) = Θ_2 ReLU(Θ_1 (h_h; h_a) + b_1) + b_2, where Θ_1 ∈ R^(γ_1×(d_H+d_A)), b_1 ∈ R^(γ_1) and Θ_2 ∈ R^(γ_2×γ_1), b_2 ∈ R^(γ_2) represent two fully connected linear layers, with γ_1, γ_2 ∈ N being the dimensionalities of the edge embeddings produced by these layers.</p>
        <p>Based on the pair of the hotel and author embeddings, the decoder defined in this way produces a review rating prediction of dimensionality γ_2.</p>
        <p>According to the above generic definition, we have built review rating prediction models with parameters of the following type and range:
• the rating prediction dimensionality γ_2 is always 1 (we predict the overall rating only),
• the number of layers L ranged through the values 1, 2, and 3 (higher values led to worse performance),
• the number of hidden channels (the dimensionality of the node embeddings) ranged through the values 4, 8, 12, and 16 (higher values yielded worse performance), and remained the same for all layers,
• the same heterogeneous graph neural function F_hetero was used for both F_(A→H) and F_(H→A); its possible variants ranged through the SAGE,</p>
      </sec>
      <sec id="sec-3-4">
        <title>GAT, and GNN* models,</title>
        <p>• the same homogeneous graph neural function F_homo was used for both F_(H→H) and F_(A→A); its possible variants ranged through the same functions as for the heterogeneous case and comprised also GCN and ChebNet with K = 2. Models without homogeneous edges were, however, tested, too.</p>
        <sec id="sec-3-4-1">
          <title>4.2. Hotel Class Prediction</title>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>In the hotel class prediction task</title>
        <p>Our objective is to estimate the hotel class attribute (i.e., the number of stars as one of 9 possible values: 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5) by considering the information from the other attributes. In the context of GNNs, we call this task a node classification problem. In Figure 8, the hotel class attribute value we want to predict is depicted by the red "?". Potential use cases for a model predicting the hotel class include, e.g., the detection of fake or misleading information about hotels submitted by their owners and the assessment of the hotel class where we lack this information. Alternatively, the model issues another score for users considering the hotel for accommodation.</p>
        <p>The methodology for the hotel class prediction experiment remains the same as for review rating prediction. Formally, we will work with the same bipartite SN S = (A, H, R, X_A, X_H, X_e) and preprocess the data accordingly, except for the hotel feature matrix X_H that does not contain the hotel class now, since this will be the target attribute. We set r_a = r_h = 12 for the review graph filtering and augment the bipartite network with author-author and hotel-hotel edges. The models for hotel class prediction have the following form:</p>
        <p>Definition 6 (Hotel Class Prediction Model).
Let HetGNN be a heterogeneous graph neural network constructed for X_A, X_H, A, W_H, W_A. Further, let X_H be the hotel feature matrix without the hotel class attribute. The hotel class prediction model is then defined as: HotelClassPred(X_A, X_H, X_e, A, W_H, W_A) = Θ H_H^(L) + b, where (H_H^(L), H_A^(L)) = HetGNN(X_A, X_H, X_e, A, W_H, W_A). Θ ∈ R^(γ×d_H) and b ∈ R^(γ) represent one fully connected linear layer with γ = 1 for the resulting node embedding.</p>
      </sec>
      <sec id="sec-3-6">
        <title>The other parameters (the number of hidden layers,</title>
        <p>hidden channels, and the homogeneous and heterogeneous graph neural functions) range over the same values as in the previous experiment. Again, the training of most models was fast (under 30 s).</p>
        <p>Table 2 (notes): GCRN-based architectures for the temporal hotel score prediction task.3 1 The times were obtained for training on an NVIDIA GeForce RTX 2080 Ti GPU. 2 The times were obtained for training on an NVIDIA A100-SXM4-40GB GPU. 3 The times are not comparable, as 3 different GPUs were used for training.</p>
        <p>In this experiment, the GNN with 12 hidden channels in two hidden layers, with SAGE used for F_hetero and ChebNet for F_homo, achieved the best results (RMSE = 0.4154). In most cases, the class predicted by the model thus differed by at most one hotel class (considering its granularity of 0.5). Table 2 shows the performance of the five best models. Regarding accuracy, the SAGE graph neural function again outperformed the other functions for the author-hotel heterogeneous links.</p>
      </sec>
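      <p>Interpreting a regression output against the 0.5-star granularity amounts to snapping the prediction to the nearest valid class; a small helper illustrating this reading of the granularity argument (our own illustration):</p>
      <p>
```python
def snap_to_class(pred):
    """Round a regression output to the nearest hotel class in
    {1.0, 1.5, ..., 5.0} (granularity 0.5), clamped to the valid range."""
    snapped = round(pred * 2) / 2
    return min(5.0, max(1.0, snapped))

print(snap_to_class(3.72), snap_to_class(0.4))  # 3.5 1.0
```
      </p>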
      <sec id="sec-3-7">
        <title>We initialized the network models with random</title>
        <p>weights, using the same hotel feature matrix X_H and the same Adam parameters as in the review rating prediction.</p>
      </sec>
      <sec id="sec-3-8">
        <title>Concerning the temporal data, we ran the experiments</title>
        <p>five times for evaluation instead of using k-fold cross-validation.</p>
        <sec id="sec-3-8-1">
          <title>4.3. Hotel Score Prediction</title>
          <p>The final experiment evaluates the chance of predicting
how a hotel’s score changes over time, where the score
corresponds to the eigenvector centrality of the hotel in
the projection to hotels. To solve the task, we create a
temporal projection of the bipartite SN to hotels  and
then aim at predicting the eigenvector centrality based
on the static hotel attribute function  and the dynamic
edge weights in the temporal projection.</p>
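          <p>The predicted score is the eigenvector centrality in the projection to hotels; it can be computed by power iteration (a pure-Python sketch on a weighted adjacency dict; the implicit self-loop is our addition to stabilize the iteration on bipartite-like graphs):</p>
          <p>
```python
def eigenvector_centrality(weights, nodes, iters=100):
    """Power iteration on a symmetric weighted adjacency mapping
    {(u, v): w}; centralities are normalized to unit maximum."""
    c = {n: 1.0 for n in nodes}
    for _ in range(iters):
        nxt = dict(c)  # implicit self-loop (A + I) keeps the same
                       # eigenvectors and avoids oscillation
        for (u, v), w in weights.items():
            nxt[u] += w * c[v]
            nxt[v] += w * c[u]
        norm = max(nxt.values())
        c = {n: x / norm for n, x in nxt.items()}
    return c

w = {("h1", "h2"): 1.0, ("h2", "h3"): 1.0}
c = eigenvector_centrality(w, ["h1", "h2", "h3"])
print(round(c["h1"], 3), round(c["h2"], 3))  # 0.707 1.0
```
          </p>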
          <p>For this task, we divided the time interval with the
scraped reviews (2003–2022) into monthly partitioning
 and then created a temporal projection to the hotels
with parameters  ,  = 20,  = 3,  = 3. From
the resulting dynamic network, we removed the initial
snapshots that contained less than 10,000 edges. In the
last several years, the centralities stabilized. Therefore,
we removed the last six years’ data from the projected
validation. We trained each model for 3000 epochs with
the MSE loss. For this task, we applied the GCRN-LSTM
and GCRN-GRU models. In addition to the recurrent
layer, the ReLU activation function and a linear
transformation were used to generate score prediction.</p>
          <p>Definition 7 (Hotel Score Prediction Model).</p>
          <p>Let GCRN be either the GCRN-LSTM or the GCRN-GRU model, X_H be the hotel feature matrix, and W^(t) be the weighted adjacency matrices of the temporal projection to the hotels for t ∈ {1, …, T}. The hotel score prediction model computes the t-th score prediction by applying ReLU and one fully connected linear layer, with γ ∈ N hidden states, to H^(t). H^(0) is initialized with 0 and adjusted by H^(t) = GCRN(X_H, W^(t), H^(t−1)).</p>
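          <p>The recurrence of Definition 7 unrolls over the snapshots; the toy cell below stands in for the GCRN cell just to show the temporal unrolling (the actual models use ChebNet convolutions inside LSTM/GRU gates [18]; the 0.5/0.5 mixing is made up for the example):</p>
          <p>
```python
def toy_cell(x, w, h_prev):
    """Stand-in recurrent cell: mixes static features x, the current
    edge weight w, and the previous hidden state h_prev."""
    return 0.5 * h_prev + 0.5 * (x * w)

def unroll(x, weights):
    """H(0) = 0; H(t) = cell(X, W(t), H(t-1)) for t = 1..T."""
    h = 0.0
    states = []
    for w in weights:
        h = toy_cell(x, w, h)
        states.append(h)
    return states

print(unroll(1.0, [1.0, 1.0, 1.0]))  # [0.5, 0.75, 0.875]
```
          </p>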
          <p>We tested networks with the order K of ChebNet’s [5] Chebyshev polynomial set to 2 and 3 and the number of hidden channels ranging through 4, 8, 12, and 16. An LSTM-based GCRN model with 16 hidden channels yielded the most accurate predictions (RMSE = 0.0020). Table 2 shows the results for the five best-performing models. Overall, LSTM-based models with higher values of K (facilitating the information flow from further distances) achieved better performance. Unfortunately, considerable time costs accompany the high accuracy (training often takes longer than 1 hour).</p>
        </sec>
      </sec>
      <sec id="sec-3-9">
        <title>5. Conclusions</title>
        <p>In this paper, we have explored the applicability of GNN models to the analysis of scraped Tripadvisor data. For our investigations, we scraped 3,125,631 Tripadvisor reviews for 3,260 hotels from 2,296,247 unique authors. The analyzed problems involved the prediction of review ratings, the assessment of the actual hotel (star) classes, and the prediction of the dynamic/temporal centrality scores of the hotels.</p>
        <p>The performed experiments yield reliable results for recommender systems even without the information on the textual or image-related parts of the reviews. The paper thus presents a proof of concept for boosting the performance of recommender systems with advanced AI techniques. Overall, a reasonably low number of wider hidden layers led to a better performance in achieving accuracy. Temporal models consumed, however, significantly more computational resources. The involved GNN models were able to extract adequate knowledge that requires non-trivial methods, e.g., UMAP, to be visualized in an easy-to-understand way.</p>
        <p>To boost their accuracy, future models might embrace attributes extended, e.g., by word embeddings of the actual review texts (like in [<xref ref-type="bibr" rid="ref4">4</xref>]) or by the information on close attractions or the quality of nearby restaurants. The provided SW could benefit from integrating a Neo4j graph database [14] as a core tool for dealing with SNs. Efficient training of deeper and temporal GNN models for dynamic SN data shall also represent a welcome addition.</p>
      </sec>
      <sec id="sec-3-10">
        <title>References</title>
        <p>[5] M. Defferrard, X. Bresson and P. Vandergheynst, “Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering”, NIPS, 2016, 9 p.
[6] M. Fey and J.E. Lenssen, “Fast Graph Representation Learning with PyTorch Geometric”, ICLR, 2019, 9 p.
[7] A. Hagberg, P. Swart and D.S. Chult, “Exploring network structure, dynamics, and function using NetworkX”, SciPy, 2008, 5 p.
[8] W.L. Hamilton, R. Ying and J. Leskovec, “Inductive Representation Learning on Large Graphs”, NIPS, 2017, 11 p.
[9] Z. Hu, Y. Dong, K. Wang and Y. Sun, “Heterogeneous Graph Transformer”, WWW, 2020, pp. 2704–2710.
[10] D.P. Kingma and J.L. Ba, “Adam: A Method for Stochastic Optimization”, ICLR, 2015, 13 p.
[11] T.N. Kipf and M. Welling, “Semi-Supervised Classification with Graph Convolutional Networks”, ICLR, 2017, 14 p.
[12] T.N. Kipf and M. Welling, “Variational Graph Auto-Encoders”, 2016, doi: 10.48550/ARXIV.1611.07308.
[13] L. McInnes, J. Healy and J. Melville, “UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction”, 2018, doi: 10.48550/ARXIV.1802.03426.
[14] Neo4j, https://neo4j.com, Accessed: 2023-08-19.
[15] F. Ricci, L. Rokach and B. Shapira (eds.), “Recommender Systems Handbook” (3rd ed.), Springer, 2022.
[16] B. Rozemberczki et al., “PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models”, CIKM, 2021, pp. 4564–4573.
[17] F. Scarselli, M. Gori, A.Ch. Tsoi, M. Hagenbuchner and G. Monfardini, “The Graph Neural Network Model”, IEEE Transactions on Neural Networks, vol. 20, no. 1, 2009, pp. 61–80.
[18] Y. Seo, M. Defferrard, P. Vandergheynst and X. Bresson, “Structured Sequence Modeling with Graph Convolutional Recurrent Networks”, ICONIP, 2018, pp. 362–373.
[19] Tripadvisor, https://www.tripadvisor.com, Accessed: 2022-12-19.
[20] G. Van Rossum and F.L. Drake, “Python 3 Reference Manual”, Scotts Valley, CA: CreateSpace, 2009.
[21] P. Veličković et al., “Graph Attention Networks”, ICLR, 2018, 12 p.
[22] Z. Wu et al., “A Comprehensive Survey on Graph Neural Networks”, IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 1, 2021, pp. 4–24.
[23] J. Zhou et al., “Graph neural networks: A review of
methods and applications”, AI Open, vol. 1, 2020,
pp. 57-81.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          , “
          <article-title>Data mining: the textbook</article-title>
          ”, Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.-L.</given-names>
            <surname>Barabási</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Pósfai</surname>
          </string-name>
          , “Network Science”, Cambridge University Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bruna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaremba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Szlam</surname>
          </string-name>
          and Y. LeCun, “
          <article-title>Spectral Networks and Locally Connected Networks on Graphs”</article-title>
          , ICLR,
          <year>2014</year>
          , 14 p.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Chitiz</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Perlmuter</surname>
          </string-name>
          , “Hotel Rating Prediction”, url: https://github.com/doviec/TripAdvisorRating-Prediction. Accessed:
          <fpage>2023</fpage>
          -07-12.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>