=Paper= {{Paper |id=Vol-3946/TGD_paper3 |storemode=property |title=Transforming Time Series into Graphs and Vice Versa with HyGraph |pdfUrl=https://ceur-ws.org/Vol-3946/TGD-3.pdf |volume=Vol-3946 |authors=Mouna Ammar,Shubhangi Agarwal,Angela Bonifati,Erhard Rahm }} ==Transforming Time Series into Graphs and Vice Versa with HyGraph== https://ceur-ws.org/Vol-3946/TGD-3.pdf
                         Transforming Time Series into Graphs and Back with HyGraph
                         Mouna Ammar1,*, Shubhangi Agarwal2, Angela Bonifati3 and Erhard Rahm1
                         1 Leipzig University and ScaDS.AI, Leipzig, Germany
                         2 Lyon 1 University, LIRIS, Lyon, France
                         3 Lyon 1 University, LIRIS and IUF, Lyon, France


                                        Abstract
                                        Existing graph data management systems still provide limited support for evolving and temporal data. In addition, time-series data often
                                        reside outside graph engines, hindering unified analysis. HyGraph is a new hybrid approach to manage and analyze both temporal
                                        graph and time series data in a unified manner. In particular, it supports rich transformations between graph and time-series data. We
                                        discuss two novel operators on HyGraph to illustrate such transformations, a time-series-based graph operator and a graph-based
                                        time-series operator. The first ingests time-series data and produces a new graph (or a subgraph) that captures relationships among time
                                        series based on correlation values. The second operator, in contrast, generates a time series based on the evolution of temporal graph
                                        metrics, such as aggregated edges or changes in node degree. The transformation operators allow the augmentation of derived values to
                                        the hybrid structure for self-enrichment. We also outline open challenges of dynamic transformations within the hybrid model.

                                        Keywords
                                        hygraph, hybrid graph, property graph, temporal graph, time-series, multi-model



1. Introduction

Graphs are a powerful tool for modeling interconnected real-world data, widely used in domains such as social networks, knowledge graphs, and urban mobility. Many of these applications inherently involve temporal dynamics, where graph elements evolve over time. For instance, sensor networks continuously generate time-series data [1, 2], and ride-sharing platforms track vehicle metrics over time [3, 4]. Existing graph database systems are often limited in their ability to natively manage and analyze such evolving temporal data. By contrast, time series representations fall short of preserving interactions between entities. Time series databases (TSDBs) are designed to efficiently store and analyze temporal data but are not optimized for capturing complex graph structures. They primarily focus on sequential data retrieval and aggregation [5], lacking native support for graph traversal, multi-hop, or relationship-based analytics. Further, high-dimensional time series data challenges traditional mining techniques, motivating graph-based representations as a powerful tool for analysis and visualization [6]. As a result, time-series data in graph applications is often stored separately, either in side systems or as attributes in graph databases, leading to inefficient data management.

HyGraph aims at addressing these limitations with a unified model that seamlessly integrates property graphs with time-series data. HyGraph directly represents time-dependent attributes and supports new transformation operators for evolving graph analytics. A broader discussion of HyGraph's vision and related work can be found in [7], where we also outline the motivation behind the approach and its high-level goals. In contrast, this paper provides a detailed exploration of the HyGraph data model and transformation operations, like extracting time series from graph elements and computing similarity-based transformations.

We start by formally introducing the HyGraph model in Section 2, followed by its UML and system architecture in Section 3. We present the two key transformation operators in Section 4 and illustrate the applicability of our model through a micro-mobility use case (Section 5). Broader implications and future directions are discussed in Section 6.

Published in the Proceedings of the Workshops of the EDBT/ICDT 2025 Joint Conference (March 25-28, 2025), Barcelona, Spain
* Corresponding author.
Email: ammar@informatik.uni-leipzig.de (M. Ammar); shubhangi.agarwal@liris.cnrs.fr (S. Agarwal); angela.bonifati@univ-lyon1.fr (A. Bonifati); rahm@informatik.uni-leipzig.de (E. Rahm)
ORCID: 0009-0005-8959-3643 (M. Ammar); 0009-0004-4405-4833 (S. Agarwal); 0000-0002-9582-869X (A. Bonifati); 0000-0002-2665-1114 (E. Rahm)
Copyright © 2025 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

2. HyGraph Data Model

Analyzing data that combines graph structures and time series offers deeper insights than separate analyses. For instance, in micro-mobility applications, tracking how usage patterns evolve alongside spatial station layouts can predict demand and uncover efficient vehicle distribution strategies. Although there have been efforts to unify graph and time-series data, they often rely on graph representations for both, limiting the depth of time series analysis and relegating time series to a secondary role, primarily representing property evolution [8, 9]. Such approaches essentially extend the property graph model rather than creating a truly unified model. As a result, time-series capabilities are limited in terms of analysis and querying, and there remains a dearth of operators and algorithms that fully leverage both data types in tandem. Although some domain-specific machine learning models combine those data types [10, 11, 12, 13], a general-purpose approach is lacking.

Through HyGraph we aim to provide a unified system that handles the complexities of integrating graph and time series data, offering flexible functionalities. The core of HyGraph is a novel data model designed with equal emphasis on graph and time series data, enabling the development of hybrid operators, algorithms, and data mining techniques specifically tailored for this combined data structure. This model, detailed below, lays the foundation for a flexible approach to analyzing graph and time series data in unison.

Let $\mathcal{K}$ be the set of property keys, $\mathcal{N}$ the set of property values, $\mathcal{L}$ the set of labels and $\mathcal{T}$ the set of timestamps.

Definition 1. Temporal Property Graph (TPG). We reference the property graph model defined in [14] and extend it by adding a validity period for each element. A TPG $\mathcal{G}$ can be represented by a tuple $\mathcal{G} = (V_{pg}, E_{pg}, \rho, \varphi, \lambda, \eta)$, where:
• $V_{pg}$ and $E_{pg}$: sets of vertices and edges, respectively.
• $\rho : E_{pg} \to V_{pg} \times V_{pg}$ maps an edge to its source and target vertices.
• $\varphi : (V_{pg} \cup E_{pg}) \times \mathcal{K} \to \mathcal{N}$ is a property function mapping each graph element and property key $k \in \mathcal{K}$ to a property value in $\mathcal{N}$.
• $\lambda : (V_{pg} \cup E_{pg}) \to \mathcal{L}$ associates each graph element with a unique label from the set of labels $\mathcal{L}$.
• $\eta : (V_{pg} \cup E_{pg}) \to \mathcal{T} \times \mathcal{T}$ retrieves the start and the end timestamps between which the graph element is valid. Let $t_{start}, t_{end} \in \mathcal{T}$ represent the two timestamps; then $t_{start} \prec t_{end}$, where the symbol $\prec$ specifies ordering ($t_{end}$ is initialized to $\max(\mathcal{T})$).
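As an illustration of Definition 1, the following minimal Python sketch models a TPG whose elements carry a validity period. The names Element, TemporalPropertyGraph, and snapshot, as well as the half-open interval convention, are assumptions made for this example and are not taken from the HyGraph package; the station names and years are illustrative values only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple

T_MAX = float("inf")  # stands in for max(T), the default end timestamp


@dataclass
class Element:
    """A vertex or edge with one label, properties, and a validity period."""
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)
    t_start: float = 0.0
    t_end: float = T_MAX

    def __post_init__(self) -> None:
        # Definition 1 requires t_start to precede t_end.
        if not self.t_start < self.t_end:
            raise ValueError("t_start must precede t_end")

    def valid_at(self, t: float) -> bool:
        # Assumption: validity is the half-open interval [t_start, t_end).
        return self.t_start <= t < self.t_end


@dataclass
class TemporalPropertyGraph:
    vertices: Dict[str, Element] = field(default_factory=dict)
    edges: Dict[str, Element] = field(default_factory=dict)
    # rho maps an edge id to its (source, target) vertex ids.
    rho: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def add_vertex(self, vid: str, element: Element) -> None:
        self.vertices[vid] = element

    def add_edge(self, eid: str, source: str, target: str,
                 element: Element) -> None:
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must exist")
        self.edges[eid] = element
        self.rho[eid] = (source, target)

    def snapshot(self, t: float) -> Tuple[Set[str], Set[str]]:
        """Return the ids of the vertices and edges that are valid at time t."""
        vs = {v for v, el in self.vertices.items() if el.valid_at(t)}
        es = {e for e, el in self.edges.items()
              if el.valid_at(t) and set(self.rho[e]) <= vs}
        return vs, es


# Illustrative data (hypothetical validity periods).
g = TemporalPropertyGraph()
g.add_vertex("StationA", Element("Station", {"name": "Christ Hospital"}, t_start=2000))
g.add_vertex("StationB", Element("Station", t_start=2005))
g.add_edge("trip1", "StationA", "StationB", Element("Trip", t_start=2006, t_end=2025))
print(g.snapshot(2010))
```

Keeping the validity period on each element, rather than on the graph as a whole, lets a snapshot at any timestamp be derived by filtering, which mirrors the role of $\eta$ in the definition.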
Definition 2. Time series. A time series $ts$ (univariate or multivariate) is an ordered set of tuples $ts = \{(t_1, y_1), (t_2, y_2), \ldots, (t_n, y_n) \mid n \in \mathbb{N}\}$, with timestamp $t_i \in \mathcal{T}$, such that $t_i \prec t_j$ if $i < j$, and $y_i$ represents a tuple of real values $y_i = (val_{i1}, val_{i2}, \ldots, val_{ik})$.

Definition 3. Dynamic Subgraph. Let $s \in S$ be a subgraph where $S$ represents a set of subgraphs. The function $\psi : S \times \mathcal{T} \to \mathcal{P}(V_{pg}) \times \mathcal{P}(E_{pg})$ maps a subgraph at a time $t \in \mathcal{T}$ to a set of constituent vertices and edges from $V_{pg}$ and $E_{pg}$, respectively, while $\mathcal{P}(\cdot)$ denotes the power set.

Further, two subgraphs may overlap at any point in time $t \in \mathcal{T}$. The overlap between two subgraphs $s_1, s_2 \in S$ can then be captured as the set of vertices and edges common to both subgraphs at $t$, i.e., $\alpha(s_1, s_2, t) = \{(\pi_v(s_1, t) \cap \pi_v(s_2, t)), (\pi_e(s_1, t) \cap \pi_e(s_2, t))\}$. Here, $\pi_v : S \times \mathcal{T} \to \mathcal{P}(V_{pg})$ and $\pi_e : S \times \mathcal{T} \to \mathcal{P}(E_{pg})$ are projection functions that retrieve the set of constituent vertices and edges, respectively, for a subgraph at any given time.

We extend the property-graph model to incorporate time-series data, such that any vertex, edge, or subgraph can hold time-series properties. Formally, we expand the scope of the sets of property keys and values to include both static and dynamic ones, thus embedding time series as a natural property.

Definition 4. Property. The property of a graph element is represented by a key-value pair, where the key and value belong to the set of keys $\mathcal{K}$ and values $\mathcal{N}$, respectively. The map function $\varphi : (V_{pg} \cup E_{pg} \cup S) \times \mathcal{K} \to \mathcal{N}$ maps a vertex, edge or a subgraph and a property key to a property value in $\mathcal{N}$, where $\mathcal{N} = \mathcal{N}_{\Sigma} \cup \mathcal{N}_{TS}$ with $\mathcal{N}_{\Sigma} \cap \mathcal{N}_{TS} = \emptyset$. The set $\mathcal{N}_{\Sigma}$ is the set of all possible static property values and the set $\mathcal{N}_{TS}$ contains the dynamic property values, i.e., time series. Dynamic properties are further classified into two categories:

• Regular Properties. These store external data associated with the object, representing attributes that evolve based on external updates or observations.
• Meta-Properties. These store internal data derived from the graph itself, such as the evolution of graph metrics (e.g., node degree, centrality measures) or aggregated properties over edges (e.g., traffic volume between nodes). These meta-properties provide insights into the graph's internal structure and dynamic behavior.

Now that we have established the fundamental definitions, we formally introduce the HyGraph model, detailing its structural components and integration of property graphs and time series data models.

Definition 5. HyGraph Model. The HyGraph model is denoted by a tuple,

$HG = (V, E, S, TS, \rho, \varphi, \delta, \psi, \lambda, \eta)$

where $V$ is the set of vertices, $E$ the set of edges, $S$ the set of logical subgraphs and $TS$ the set of time series.

• $V$: A union set of property graph vertices ($V_{pg}$) and time series vertices ($V_{ts}$), i.e., $V = V_{pg} \cup V_{ts}$.
• $E$: Similar to $V$, it is defined as a union set of property graph and time series edges, i.e., $E = E_{pg} \cup E_{ts}$.
• The function $\delta : (V_{ts} \cup E_{ts}) \to TS$ maps a time-series vertex or edge to a multivariate time series in $TS$.

All the mapping functions are adapted to include both property graph and time series graph objects.

• $\rho : E \to V \times V$ maps edges to source and target vertices.
• $\varphi : (V \cup E \cup S) \times \mathcal{K} \to \mathcal{N}$. The map function is modified to include a subgraph, which is treated as a logical graph object and can have associated properties.
• The subgraph mapping function is adapted to allow a subgraph to have edges and vertices of both types, property graph and time series, as $\psi : S \times \mathcal{T} \to \mathcal{P}(V) \times \mathcal{P}(E)$.
• The label function $\lambda : (V \cup E \cup S) \to \mathcal{P}(\mathcal{L})$ associates an entity with labels from the set of labels $\mathcal{L}$.
• Finally, the function $\eta : (V \cup E \cup S) \to \mathcal{T} \times \mathcal{T}$ retrieves the start and the end timestamps between which a graph element is valid.

This new extension ensures that time series are treated as structured entities that can be queried, connected to other time series, and analyzed within a TPG framework.

Figure 1: Example HyGraph snapshot of bike-sharing data

Figure 1 shows an example HyGraph created from a snapshot of a bike-sharing system (NYC "CitiBike" [15]). In this representation, stations are modeled as property graph vertices ($V_{pg}$), while trips between stations are represented as property graph edges ($E_{pg}$). The edges, shown in green in Figure 1, encode the trips connecting two station nodes. The AvailBikeSim edge set is derived to represent the similarity in bike availability patterns between station nodes. An AvailBikeSim edge is generated when the computed time
series similarity between the corresponding dynamic properties (num_bikes_available and num_ebikes_available) of two stations meets or exceeds a predefined threshold. The stored values represent the evolution of the similarity score (e.g., 0.67, 0.79), computed using a time series similarity method [16] (e.g., Pearson correlation). The edges of this edge set are modeled as time series edges ($E_{ts}$) and are depicted in blue in Figure 1. Each Station node has:

1. An id, e.g., StationA;
2. A validity interval, e.g., $\eta(\text{StationA}) = \langle 2000, \infty \rangle$;
3. Static properties, e.g., $\varphi(\text{StationA}, \text{name})$ = Christ Hospital, $\varphi(\text{StationA}, \text{capacity}) = 2$; and
4. Dynamic properties (time series): each time series property is represented as an object with a list of timestamps and the associated data values for the variable at each timestamp. For instance, for StationA the dynamic attribute num_bikes_available has a variable timestamps = ["21:00", "22:00"], which stores a list of timestamps, and the associated variable named bike_avail holds data = [[1], [2]]. Note that this is a regular dynamic property, as its value changes due to external factors, like a trip being undertaken.

[Figure 2 (UML class diagram): classes GraphElement, Node, Edge, Subgraph, PGNode, TSNode, PGEdge, TSEdge, Timeseries, MetadataTimeseries, StaticProperty, DynamicProperty, HyGraph, and HyGraphQuerying.]
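The threshold rule for AvailBikeSim edges described above can be sketched as follows. This is a minimal illustration, assuming aligned series of equal length and Pearson correlation as the similarity method; the function names pearson and similarity_edges, the threshold of 0.7, and the sample readings are hypothetical and do not come from the HyGraph implementation. It computes a single score per station pair, whereas the edges in Figure 1 store how that score evolves over time.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two aligned value lists of equal length."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must be aligned and have at least two points")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


def similarity_edges(series: Dict[str, List[float]],
                     threshold: float) -> List[Tuple[str, str, float]]:
    """Return (u, v, score) for each station pair whose score meets the threshold."""
    edges = []
    for u, v in combinations(sorted(series), 2):
        score = pearson(series[u], series[v])
        if score >= threshold:
            edges.append((u, v, score))
    return edges


# Hypothetical num_bikes_available readings (not taken from the paper).
bikes = {
    "StationA": [1, 2, 3, 4],
    "StationB": [2, 4, 6, 8],
    "StationC": [4, 3, 2, 1],
}
edges = similarity_edges(bikes, threshold=0.7)
for u, v, score in edges:
    print(u, v, round(score, 2))
```

Applying the same function over successive time windows would yield a series of scores per station pair, which is the kind of value a time series edge holds.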
Each edge in Trips consists of:

1. A label, e.g., 2;
2. A validity interval, e.g., $\eta(2) = \langle 2001, 2025 \rangle$;
3. Dynamic properties (regular), e.g., member_rides, casual_rides, etc.

Each time series edge AvailBikeSim consists of:

1. A label, e.g., 8;
2. A validity interval, e.g., $\eta(8) = \langle 2010, 2025 \rangle$;
3. A dynamic attribute represented as an object with a list of timestamps and associated data values for the variable at each timestamp. For instance, edge 8 represents the similarity score evolution between StationB and StationC through a list of timestamps ["2010", "2011"], and the associated variable name is bike_av_sim, which holds data [[0.67], [0.70]].

Figure 1 also shows two non-overlapping subgraphs, Subgraph #1 and Subgraph #2, depicting partial views of the network at time $t_1$. Subgraph #1 includes StationA and StationE and their connecting edges, and Subgraph #2 features StationB, StationC and StationD and their connecting edges.

Since the subgraphs are dynamic and can evolve with time, Figure 1 only depicts a snapshot at $t_1$. The logical subgraphs aggregate vertices that share the same AvailBikeSim patterns. For instance, StationB remains part of Subgraph #2 as long as its similarity score stays high; once the score decreases, it can be removed or moved to another subgraph.

3. HyGraph Architecture

We aim to seamlessly integrate time series and graph components in a single system that allows combined querying and transformations over these components. At the moment, the HyGraph system is being developed as a Python package with everything handled in memory, using NetworkX [17] and Xarray [18] for graph and time series in-memory storage. While scalability to larger storage engines is desirable, the current focus is on proof-of-concept, emphasizing a uniform data storage and querying model. We discuss the conceptual architecture of the HyGraph model through a UML diagram, followed by its system architecture with some core functionalities.

3.1. UML Architecture

In Figure 2, a UML diagram describes the HyGraph system at a conceptual level. It illustrates the main classes and their relationships. We capture HyGraph as a set of interrelated classes reflecting property-graph and time-series functionalities. At the root, an abstract class GraphElement defines fundamental attributes (such as id and label). Three main classes inherit from GraphElement: Node, Edge, and Subgraph. Each of these classes plays a core structural role. Within this architecture, the class Node is inherited by classes PGNode (representing standard property-graph nodes) and TSNode (representing time-series nodes), while the class Edge is similarly inherited by classes PGEdge and TSEdge.

Figure 2: HyGraph UML conceptual diagram

The Timeseries class defines a multivariate time series by maintaining five attributes: id; data, which captures the values of the multivariate time series at different points in time; variables, which holds information about each dimension of the multivariate time series; timestamps, which holds an ordered list of timestamps for all recorded entries in data; and metadata. The metadata attribute, which is an instance of a separate class MetadataTimeseries, is optional and facilitates additional descriptive attributes.

A top-level HyGraph class aggregates all these components, ensuring that property-graph elements (PGNode,
PGEdge) and time-series elements (TSNode, TSEdge) coexist in one coherent framework. Node, Edge, and Subgraph each have two timestamp attributes, start_time and end_time, which represent their validity interval. The TSNode and TSEdge classes additionally hold an instance of the Timeseries class as an attribute to store the time series. The classes StaticProperty and DynamicProperty represent the two types of properties. Specifically, an object of class GraphElement can hold zero or more property instances. The variables static_properties and dynamic_properties in class GraphElement hold instances of the classes StaticProperty and DynamicProperty, respectively (Figure 2). The class StaticProperty captures properties with static values (represented by $\mathcal{N}_{\Sigma}$); it simply stores the key and its corresponding value as attributes. The class DynamicProperty, in contrast, corresponds to a property whose values evolve (represented by $\mathcal{N}_{TS}$). Thus, in addition to the key, it references the ID of the related time series instance. The GraphElement class manages these properties, exposing the logic to read, insert, delete, or modify them.
   The dynamic nature of subgraphs is captured via the attribute membership in Node and Edge, implemented as an instance of the Timeseries class. Concretely, an object of either class can accumulate multiple membership updates over time, one per subgraph change. As a result, every inclusion or exclusion is appended to membership, effectively reflecting how the subgraph’s composition evolves. By tying membership changes to a time-series structure, we maintain a complete history of when a node or edge was valid in each subgraph. This allows subgraphs to evolve as entities join, leave, and transform with updates to the HyGraph object.
   The HyGraphQuerying class provides a hybrid pattern-matching mechanism, allowing users to define queries that simultaneously reference graph and time-series patterns. The class supports key concepts like node- or edge-based matching, groupings, and aggregations, reminiscent of Cypher-style clauses.

3.2. System Architecture and Functionalities
Figure 3 shows a high-level architecture of the HyGraph system, illustrating interactions between different layers. It stores and processes both graph and time-series elements in memory.

Figure 3: HyGraph System Architecture

   We leverage two specialized processors: NetworkX as the graph processor for structural operations (e.g., shortest paths and subgraph extraction), and Xarray as the time-series processor, suitable for multivariate data storage and computations. We extend the pre-defined object models provided by these libraries to implement our custom-defined classes (described in Section 3.1). An object-mapping strategy then bridges these customized classes with the underlying library objects, enabling both data persistence and data retrieval. High-level algorithms, such as hybrid pattern matching or graph similarity, are developed on top of this system to enable analyses of the hybrid data. In the implementation of the HyGraph system, we adopt a modular architecture with six principal modules, orchestrated by the main HyGraph module. The GraphOperator module handles all graph-centric logic, in addition to integrating graph and time series operations, ensuring that HyGraph supports standalone property graph or time series features. This module implements the classes GraphElement, Node, Edge, Subgraph, StaticProperty, and DynamicProperty. In parallel, the TimeseriesOperator module manages time series functionalities, implementing classes like Timeseries and MetadataTimeseries for storing and manipulating temporal data. To facilitate the import and export of data, the system is equipped with the HyGraphFileLoader module, which streamlines ETL (Extract-Transform-Load) tasks. We also define a module GraphConstruct, which facilitates the (re-)construction of a graph purely from correlations between time series. This enables transformations that spawn graph structures based on temporal similarity. The module HyGraphQuery combines all types of querying within HyGraph. At the moment it already includes the HybridPatternMatching class; as future work, it will also include other classes such as SubgraphMatching, to enable searching for patterns corresponding to a whole subgraph. All these modules interoperate under the central HyGraph module, which coordinates their interactions and maintains the global system state.
   The HyGraph system’s functions are grouped into three principal interfaces, as shown in Table 1. The interface ModelToHyGraph gathers data from an external model (graph-only or time-series-only) and ingests it into HyGraph. Here, GraphOperator handles graph injection, adding nodes, edges, and their properties. In parallel, TimeseriesOperator manages time-series injection, creating or updating dynamic properties and elements. This interface also includes the graph similarity generation function as part of the GraphConstruct module, which is explained in Section 4.1. Note that Model refers to graph or time-series data models. The second interface, HyGraphToHyGraph, comprises the core operations and algorithms that transform one HyGraph into another, such as hybrid pattern matching, dynamic subgraph creation, and HyGraph clustering. GraphConstruct can also generate another HyGraph instance from the existing one. Finally, the third interface, HyGraphToModel, focuses on extracting or exporting data back into an external format or distinct model. GraphOperator can provide standalone graph operators such as neighbor retrieval and shortest path, while TimeseriesOperator handles isolated time-series operations such as correlation and feature extraction.

Table 1
HyGraph modules and associated functionalities provided for different interfaces

Module               ModelToHyGraph                HyGraphToHyGraph                HyGraphToModel
GraphOperator        Graph injection               Subgraph creation, Clustering   Standalone graph operators
TimeseriesOperator   Time series injection         –                               Standalone time series operators
HyGraphQuery         –                             Hybrid pattern matching         Data extraction and retrieval
GraphConstruct       Graph similarity generation   Graph similarity generation     –

4. HyGraph Transformation Operators
Data transformations may convert one representation into another or produce a new instance in the same representation (augmented, summarized, or updated). These transformations can be static (one-off) or dynamic (continuously adapting as data changes). Within HyGraph, we distinguish two primary types of transformations. The first transformation type is from time series to graph. It involves analyzing correlations and other temporal relationships among time series to generate new graph entities (nodes, edges, or subgraphs) that reflect these relationships. The second transformation type transforms a graph into a time series. By examining evolving graph metrics (like node in-degree or edge traffic), it constructs new time series that capture these structural changes over time. Moreover, HyGraph supports the continuous execution of these transformations. If the graph is updated, the transformed time series can be updated simultaneously, and vice versa.
   In the following subsections, we illustrate two transformations: (i) a time-series-based similarity graph operator (Section 4.1), and (ii) an extraction operator to extract time series from evolving graph metrics (Section 4.2). These examples demonstrate the support for flexible and bidirectional transformations that unify structural and temporal data in a coherent ecosystem of HyGraph.

4.1. Time series based Graph Similarity
Several existing methods already construct graphs from time-series data for tasks like clustering or anomaly detection [19, 20, 21, 22]. They typically compute pairwise similarities or distances among time series and generate a static graph whose edges represent these similarities. In contrast, HyGraph provides a comparable time-series similarity-to-graph mechanism but also integrates the newly created HyGraph within the unified hybrid system. This implies that the resulting HyGraph can also maintain static and dynamic properties on edges and be used for further processing by hybrid operators, like pattern matching.
   We define our time-series-based graph similarity operator formally as a function $\mathit{GraphSim}(TS, \mathit{methods}, \theta) \rightarrow H'$, where $TS$ is a set of time series, $\mathit{methods}$ is a set of similarity strategies (correlation, shape similarity, etc.), and $\theta \in [0, 1]$ is a similarity threshold. The output $H'$ is a new HyGraph instance generated by analyzing time series nodes. Specifically, let $\{v_1, \ldots, v_n\} \subseteq V_{ts}$ represent time series nodes; then, for each candidate edge, represented as a vertex pair $(v_i, v_j)$, we compute the similarity score of their time series as $\mathit{Sim}_{ts}(v_i, v_j)$. If $\mathit{Sim}_{ts}(v_i, v_j) \geq \theta$, an edge is created. The similarity score is stored as a static or dynamic edge property. If the user only requests a single, fixed value, a PGEdge is created with a static property. However, if the evolution of similarity over time is of interest, it is more appropriate to store it as a time series in an instance of TSEdge.
   The objective of the operator is either to create a HyGraph from scratch when only time-series data is provided, or to further analyze time series in the existing HyGraph instance by applying graph operators to time series. In HyGraph terminology, this is a ModelToHyGraph transformation, where one or more time series (either ingested from an external source or extracted from the current HyGraph) are analyzed to produce a HyGraph reflecting their interrelationships.

4.2. From Graph Topology to Time series
Prior approaches have examined how graph metrics evolve, either by implementing algorithms that track each new change in the graph structure and update the results of the graph operator [23], or by creating time series to analyze patterns [24, 25]. However, most solutions stop after generating standalone time series data and do not further link it with the graph. In HyGraph, we can generate time series by analyzing the evolution of graph topology and, optionally, embed them back into the HyGraph, either as dynamic properties of existing graph elements or as dedicated elements (instances of TSNode or TSEdge). This allows transformation of the HyGraph through augmentation of the derived data and enables further transformation operations. We define the extraction operator as:

$\mathit{ExtractTS}(H, \mathcal{F}, \mathit{metric}, \tau, \mathit{freq}) \rightarrow \{ts_1, \ldots, ts_m\} \cup H'$

where,
• $H$ is a HyGraph instance,
• $\mathcal{F}$ is a filter (or set of filters) that specifies which vertices/edges or subgraphs to evaluate (e.g., node filter based on labels),
• $\mathit{metric}$ specifies the graph property over which the time series is to be generated, e.g., degree centrality, clustering coefficient, etc.,
• $\tau = [t_{start}, t_{end})$ specifies a time range, with $t_{start}, t_{end} \in \mathcal{T}$,
• $\mathit{freq}$ indicates the sampling frequency (e.g., daily, weekly).

We enumerate discrete time steps at $\mathit{freq}$ in the time range $\tau$ as $\{t_1, t_2, \ldots\}$. At each time step $t_i$, we take a snapshot of the HyGraph instance. The value for $\mathit{metric}$ is then computed for each snapshot and assembled into a time series. The time series thus generated reflects the evolution of $\mathit{metric}$ across the selected nodes/edges over discrete time intervals. The time series can be processed for further analysis, injected back into the HyGraph instance as dynamic properties of the graph elements to produce an updated instance, or simply returned as a set of time series.

5. Use case: Micro-mobility
Micro-mobility has emerged as a cornerstone of sustainable urban transportation. Yet, one of its persistent operational hurdles is rebalancing, i.e., ensuring that vehicles such as bicycles or e-bikes are appropriately distributed across docking stations to meet fluctuating demand. Studies focusing on bike-sharing systems emphasize that neglected rebalancing can lead to chronic station shortages or overflows, hindering overall service reliability and increasing user dissatisfaction [26, 27, 28].
   To address the rebalancing challenge, we propose a multi-step pipeline that leverages HyGraph’s transformation operators, GraphSim and ExtractTS (described in Section 4). Our core objective is to determine, for each station, which other station(s) serve as ideal rebalancing partners, i.e., whenever one station experiences a surplus, the other experiences a deficit, while simultaneously accounting for neighbor connectivity and distance. We base our analysis on the dataset provided by [15]. It unifies graph and time-series data by representing a bike station as a vertex with a static property for the parking capacity of the station and dynamic properties such as the number of available bikes. Each edge represents trips between two stations and carries a time-series property tracking daily active trips, member rides vs. casual rides, and total trips.
   Using the ExtractTS operator, we first extract two dynamic properties for each station node:

1. For a station $v$ at time $t$, the imbalance is defined as the difference between the number of trips that end at station $v$ (i.e., bikes arriving) until time $t$ and the number of rides starting from station $v$ (i.e., bikes departing) until time $t$. The imbalance value of a station is captured at different timestamps and stored as a time series property, imbalance_ts.

2. For a station $v$, we also compute its connectivity score to quantify how strongly it is connected to its neighbors. The connectivity score for $v$ is defined as the ratio of the weighted sum of its edges to the degree of $v$ at any time $t$. Like the imbalance of a station, the connectivity score is stored as a dynamic property of the vertex, connectivity_ts.

   In the next step, a similarity graph is constructed using the module GraphConstruct, where nodes represent stations and edges represent the similarity of their time series property imbalance_ts. To capture the complementary behavior of two stations, i.e., pairing a surplus station with a deficit station, we compute a negative correlation between $\mathit{imbalance}(v, t)$ for station $v$ and $\mathit{imbalance}(u, t)$ for station $u$. The function augments the HyGraph instance with new TSNode objects, created to represent the imbalance_ts of each station, and new PGEdge objects representing the similarity between the newly created TSNode objects. The negative correlation between the imbalance time series of two stations $u$ and $v$, $\mathit{neg}_{imb}(u, v)$, quantifies how complementary the two stations are.
   After building the similarity graph, for every edge connecting stations $u$ and $v$, we compute a composite score that represents the weight of the edge. This weight is a combination of the following: a similarity score based on a distance decay function [29], the distance between the two stations, the negative correlation score $\mathit{neg}_{imb}(u, v)$, and the average value of connectivity_ts between the two stations. This composite score reflects both the temporal complementarity of imbalance and the practical factors of connectivity and distance. Once the similarity graph is fully augmented, we apply a maximum weighted matching algorithm [30] to select a set of non-overlapping edges and return the set of station pairs $(u, v)$ that maximize the total composite score.
   For each matched pair, the average imbalance difference is computed to suggest the direction and the number of bikes that should be transferred from one station to another.

6. Future Research
HyGraph represents an initial step toward integrating property graphs with time series, addressing key challenges in maintaining and querying dynamic data. By unifying these two paradigms, HyGraph enables seamless temporal graph transformations, but its implementation also presents several complexities.
   One major challenge lies in efficiently updating and querying time-series data associated with graph nodes and edges. Indexing strategies in traditional graph databases are not inherently designed to accommodate time-series data efficiently, leading to potential scalability bottlenecks. A new system that integrates indexing techniques tailored for both graph structures and time-series storage would ensure efficient querying and seamless data evolution. Maintaining indexing structures that accommodate both topological changes in the graph and temporal variations in time-series data requires novel optimization techniques.
   Additionally, the lack of a standardized query language for seamlessly integrating time-series operations with graph traversal necessitates the design of new operators and query execution strategies. Existing graph query languages do not natively support analytic operations commonly found in time-series databases, such as temporal aggregations, windowed computations, and similarity searches based on sequence patterns or shape-based matching. Future research could explore the development of a unified query language that incorporates time-aware traversal semantics and transformation operators to enable efficient interaction between graph topology and temporal dynamics.
   The fast-evolving nature of time-series data necessitates low-latency updates and retrieval, making it essential to scale HyGraph for real-time applications. Addressing this challenge requires investigating efficient data streaming architectures, for example by designing caching mechanisms for frequently queried data and hybrid storage layouts optimized for high-throughput ingestion and query concurrency.

7. Conclusion
This paper introduced the UML and system architecture of HyGraph [7], illustrating a unified approach for integrating property graphs and time-series data. We introduced two novel transformation operators: (i) a time-series-based graph operator, which derives graphs based on correlations among time series, and (ii) a graph-based time-series operator, which extracts time-series representations from evolving graph metrics.
   Our micro-mobility case study further demonstrated the practical applicability of HyGraph and the transformation operators for augmented analysis in real-world settings. By establishing a foundation for hybrid graph-time-series analytics, HyGraph paves the way for a plethora of research opportunities in graph data management, temporal reasoning, and dynamic query processing.
   Despite its advantages, several challenges remain, and future research should explore scalable indexing and query optimization techniques for hybrid queries.

References
 [1] S. Bhandari, N. Bergmann, R. Jurdak, B. Kusy, Time series data analysis of wireless sensor network measurements of temperature, Sensors 17 (2017) 1221.
 [2] R. Krishnamurthi, A. Kumar, D. Gopinathan, A. Nayyar, B. Qureshi, An overview of iot sensor data processing, fusion, and analysis techniques, Sensors 20 (2020) 6076.
 [3] H. Zhang, J. Chen, W. Li, X. Song, R. Shibasaki, Mobile phone gps data in urban ride-sharing: An assessment method for emission reduction potential, Applied Energy 269 (2020) 115038.
 [4] L. Belkessa, M. Ameli, M. Ramezani, M. Zargayouna, Multi-channel spatio-temporal graph convolutional networks for accurate micromobility demand prediction integrating public transport data, in: Proceedings of the 2nd ACM SIGSPATIAL Workshop on Sustainable Urban Mobility, 2024, pp. 5–13.
 [5] A. Bader, O. Kopp, M. Falkenthal, Survey and comparison of open source time series databases, in: Datenbanksysteme für Business, Technologie und Web (BTW 2017) - Workshopband, Gesellschaft für Informatik e.V., Bonn, 2017, pp. 249–268.
 [6] K. Mishra, S. Basu, U. Maulik, Graft: A graph based time series data mining framework, Eng. Appl. Artif. Intell. 110 (2022).
 [7] M. Ammar, C. Rost, R. Tommasini, S. Agarwal, A. Bonifati, P. Selmer, E. Kharlamov, E. Rahm, Towards hybrid graphs: Unifying property graphs and time series, 28th International Conference on Extending Database Technology (2025).
 [8] E. Bollen, R. Hendrix, B. Kuijpers, Managing data of sensor-equipped transportation networks using graph databases, Geoscientific Instrumentation, Methods and Data Systems Discussions 2024 (2024) 1–30. URL: https://gi.copernicus.org/preprints/gi-2024-3/. doi:10.5194/gi-2024-3.
 [9] B. Steer, F. Cuadrado, R. Clegg, Raphtory: Streaming analysis of distributed temporal graphs, Future Generation Computer Systems 102 (2020) 453–464.
[10] J. Chen, X. Wang, X. Xu, Gc-lstm: Graph convolution embedded lstm for dynamic network link prediction, Applied Intelligence (2022) 1–16.
[11] S. Bloemheuvel, J. van den Hoogen, D. Jozinović, A. Michelini, M. Atzmueller, Graph neural networks for multivariate time series regression with application to seismic data, International Journal of Data Science and Analytics 16 (2023) 317–332.
[12] S. Gocheva-Ilieva, H. Kulina, A. Yordanova, Stacking machine learning models using factor analysis to predict the output laser power, in: 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), IEEE, 2022.
[13] Z. Wang, H. Ren, R. Lu, L. Huang, Stacking based lightgbm-catboost-randomforest algorithm and its application in big data modeling, in: 2022 4th International Conference on Data-driven Optimization of Complex Systems (DOCS), IEEE, 2022, pp. 1–6.
[14] R. Angles, The property graph database model, in: D. Olteanu, B. Poblete (Eds.), Proceedings of the 12th Alberto Mendelzon International Workshop on Foundations of Data Management, Cali, Colombia, May 21-25, 2018, volume 2100 of CEUR Workshop Proceedings, CEUR-WS.org, 2018.
[15] Lyft Bikes & Scooters, C. Urbainsky, New york city bike sharing network: Time-series enhanced nodes and edges dataset, 2024. URL: https://doi.org/10.5281/zenodo.13846868. doi:10.5281/zenodo.13846868, accessed: 2024-09-27.
[16] A. Kianimajd, M. G. Ruano, P. Carvalho, J. Henriques, T. Rocha, S. Paredes, A. E. Ruano, Comparison of different methods of measuring similarity in physiologic time series, IFAC-PapersOnLine 50 (2017) 11005–11010.
[17] NetworkX, Networkx: Network analysis in python, 2024. URL: https://networkx.org/.
[18] Xarray, Xarray: Dealing with multidimensional arrays in python, 2024. URL: https://docs.xarray.dev/en/stable.
[19] D. Tiano, A. Bonifati, R. Ng, Featts: Feature-based time series clustering, in: G. Li, Z. Li, S. Idreos, D. Srivastava (Eds.), SIGMOD ’21: International Conference on Management of Data, Virtual Event, China, June 20-25, 2021, ACM, 2021, pp. 2784–2788.
[20] P. Li, S. F. Boubrahimi, S. M. Hamdi, Graph-based clustering for time series data, in: 2021 IEEE International Conference on Big Data (Big Data), IEEE, 2021, pp. 4464–4467.
[21] L. N. Ferreira, L. Zhao, Time series clustering via community detection in networks, Information Sciences 326 (2016) 227–242.
[22] K. F. Eteffa, S. Ansong, C. Li, M. Sheng, Y. Zhang, C. Xing, An experimental study of time series based patient similarity with graphs, in: Web Information Systems and Applications: 17th International Conference, WISA 2020, Guangzhou, China, September 23–25, 2020, Proceedings 17, Springer, 2020.
[23] D. Eppstein, Z. Galil, G. F. Italiano, Dynamic graph algorithms, Algorithms and theory of computation handbook 1 (1999) 9–1.
[24] C. Aggarwal, K. Subbian, Evolutionary network analysis: A survey, ACM Computing Surveys (CSUR) 47 (2014) 1–36.
[25] C. Rost, K. Gomez, P. Christen, E. Rahm, Evolution of degree metrics in large temporal graphs, in: Datenbanksysteme für Business, Technologie und Web (BTW 2023), volume P-331 of LNI, Gesellschaft für Informatik e.V., 2023, pp. 485–507. URL: https://doi.org/10.18420/BTW2023-23. doi:10.18420/BTW2023-23.
[26] K. Wang, X. Yan, Z. Zhu, X. M. Chen, Understanding bike-sharing usage patterns of members and casual users: A case study in new york city, Travel Behaviour and Society 36 (2024) 100793.
[27] Y.-T. Hsu, L. Kang, Y.-H. Wu, User behavior of bikesharing systems under demand–supply imbalance, Transportation Research Record 2587 (2016) 117–124.
[28] F. Chiariotti, C. Pielli, A. Zanella, M. Zorzi, A dynamic approach to rebalancing bike-sharing systems, Sensors 18 (2018) 512.
[29] M. Halas, P. Klapka, Spatial influence of regional centres of slovakia: analysis based on the distance-decay function, Rendiconti Lincei 26 (2015) 169–185.
[30] B. Wu, L. Li, Solving maximum weighted matching on large graphs with deep reinforcement learning, Information Sciences 614 (2022) 400–415.