Context-Free Path Querying In Terms of Linear Algebra
                                                                    Rustam Azimov
                                                            supervised by Semyon Grigorev
                                                                Saint Petersburg State University
                                                                      St. Petersburg, Russia
                                                                   st013567@student.spbu.ru

ABSTRACT                                                                               to return the example of such path for all vertex pairs, and all-path
Context-Free Path Querying (CFPQ) is an important problem with                         query semantics that requires to return all such paths for all vertex
applications in many areas, for example, graph databases, bioin-                       pairs.
formatics, static analysis, etc. Historically, regular languages are                      A number of CFPQ algorithms based on parsing techniques were
used as constraints for navigational path queries. However, in some                    proposed: (G)LL and (G)LR-based algorithms by Ciro M. Medeiros
important cases, regular languages are not expressive enough and                       et al. [3], Fred C. Santos et al. [19], Semyon Grigorev et al. [13],
context-free languages are used instead. Many algorithms for CFPQ                      and Ekaterina Verbitskaia et al. [4]; CYK-based algorithm by Xi-
were proposed but recently showed that the state-of-the-art CFPQ                       aowang Zhang et al. [15]; combinators-based approach to CFPQ
algorithms are still not performant enough for practical use. One                      by Ekaterina Verbitskaia et al. [5]. Yet recent research by Jochem
promising way to achieve high-performance solutions for graph                          Kuijpers et al. [8] shows that existing solutions are not applicable
querying problems is to reduce them to linear algebra operations                       for real-world graph analysis because of significant running time
(such as matrix multiplication). The active utilization of these oper-                 and memory consumption.
ations in the process of context-free path query evaluation makes it                      Inspired by Valiant’s [20] matrix-based algorithm for context-
possible to efficiently apply a wide class of optimizations and com-                   free language recognition, we explore the applicability of linear
puting techniques, such as GPGPU (General-Purpose computing                            algebra methods to the CFPQ problem. The linear algebra methods
on Graphics Processing Units), parallel computation, sparse matrix                     is widely used for various problems of finding paths in graphs, but
representation, distributed-memory computation, etc. In this Ph.D.                     the CFPQ problem poses additional challenges originating from
work, we aim at: (i) studying the applicability of linear algebra                      query-specific information that needs to be captured. Valiant pro-
methods to the CFPQ problem, (ii) at devising the algorithms for                       posed a parsing algorithm, which computes a recognition table
context-free path query evaluation formulated in terms of linear                       by computing matrix transitive closure. These algorithms take a
algebra operations, and (iii) at achieving high-performance imple-                     string at the input and decide whether this string is generated from
mentations of the devised algorithms using parallel computations.                      the input context-free grammar. Valiant’s algorithm has essentially
                                                                                       the same complexity as Boolean matrix multiplication. This is a
                                                                                       promising way to achieve high-performance solutions for graph
                                                                                       querying problems. The active utilization of these operations in the
1    INTRODUCTION                                                                      process of context-free path query evaluation makes it possible to
Formal language-constrained path querying [2] is a graph analysis                      efficiently apply a wide class of optimizations and computing tech-
problem in which formal languages are used as constraints for nav-                     niques, such as GPGPU (General-Purpose computing on Graphics
igational path queries. In this problem, a path in an edge-labeled                     Processing Units), parallel computation, sparse matrix representa-
graph is viewed as a word constructed by the concatenation of                          tion, distributed-memory computation, etc.
edge labels. The formal languages are used to constrain the paths                         In this Ph.D. work, we make the following contributions.
of interest: a query should find only paths labeled by words from
the language. The most popular class of constraints used as naviga-
tional queries in graph databases are the regular ones. However, in
some important cases, regular languages are not expressive enough                          (1) We provide an approach to solving the CFPQ problem using
and context-free languages are used instead. The context-free path                             linear algebra operations. The provided approach allows
querying (CFPQ), can be used in many areas, for example, RDF anal-                             us to use a wide class of optimizations of linear algebra
ysis [15], static code analysis [9, 14], biological data analysis [12],                        operations for efficient analysis of large graphs.
graph segmentation [6].                                                                    (2) Using provided approach, we devise the CFPQ algorithms
   CFPQ has been studied a lot since the problem was first stated by                           based on linear algebra operations for relational, single-
Mihalis Yannakakis in 1990 [21]. Jelle Hellings investigates various                           path, and all-path query semantics.
aspects of CFPQ in [16–18] and formulates three possible querying                          (3) We provide the implementations of the devised algorithms
semantics: relational that requires to find all vertex pairs reachable                         for context-free path query evaluation using different op-
by some path of interest, single-path query semantics also requires                            timizations and computing techniques. Our preliminary
                                                                                               results demonstrate that our best CPU and GPU-based
Proceedings of the VLDB 2021 PhD Workshop, August 16th, 2021. Copenhagen, Den-                 implementations that utilize sparse matrix representation
mark. Copyright (C) 2021 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).                                     and parallel computation outperform the state-of-the-art
                                                                                               context-free path querying solutions.
                           Reducing the CFPQ problem to the linear algebra
                           operations
                                                                                                          Implementation
    Data                                                                        Analysis                                                                        Result interpretation
                                 Construction of adjacency matrices of a
                                                                                                                                Using the
                                  graph with sets of nonterminals of the            Inheritance of                             appropriate                              Extracting
                                     input CF-grammar as elements                     theoretical                             libraries that                        information about
            Input graph                                                            properties by the                         support linear                        the required paths
                                                                                     constructed                                 algebraic                            from matrices
                                                                                       algorithm                                operations                          depending on the
                                   Using the apparatus of linear algebra                                                                                             query semantics
                                     (matrix theory, ring theory, etc.)


    Query
                                                                                                              Using of suitable                Efficient
                                                                                                             formats for storing           parallelization on
                                                                                                             sparse adjacency                CPU or GPU
                                Construction of a           Traversing the
      Input Context-Free                                                                                       matrices (CSR,
                                semiring over the         input graph using
           grammar                                                                                              CSC, COO
                                     sets of               linear algebraic
                                                                                                                  formats)
                               nonterminals of the           operations on
                               input CF-grammar           adjacency matrix


                                                                                                                                Converting
      Query semantics            Operations must              Using matrix                                                 adjacency matrices
                                  comply with the            multiplication,                                                 to matrices with
                                 application of the         tensor product,                                                 information about
                                production rules of       transitive closure,                                               the desired paths
                                     the input                    etc.
                                   CF-grammar


               Figure 1: An overview of our proposed approach to solving the CFPQ problem using linear algebra operations


2      APPROACH                                                                                 of a certain algebraic structure, namely, in the form of a semiring
In this section, we describe our approach to solving the CFPQ prob-                             with non-associative multiplication. This approach takes advantage
lem using linear algebra operations. The idea of using a sparse adja-                           of the fact that in the production rules of the CF-grammars there
cency matrix as a graph representation in graph analysis problems                               are two operations — concatenation and union, which are trans-
is well-known. Recently, became very popular the GraphBLAS [7]                                  formed into the product and the sum of a semiring. The elements
API specification that defines standard building blocks for graph                               of the adjacency matrices must be sets of nonterminals since each
algorithms in the language of linear algebra. Various libraries that                            of them describe the set of words corresponding to the paths of
implement it provide data structures and functions to compute lin-                              interest. Further, the resulting semiring is used to define matrix
ear transformations and other linear algebra operations on sparse                               multiplication (or other linear algebra operation), which simulates
matrices. Using these libraries or other efficient libraries for such                           the step of the input graph traversing. For a complete traversal of
operations is a good recipe for making a high-performance CFPQ                                  the input graph, a transitive closure of adjacency matrices can be
solution if we can reduce the CFPQ problem to linear algebra oper-                              defined, which allows us to obtain the information about all paths
ations. Although such reduction was found for a number of graph                                 corresponding to the query. The type of information retrieved de-
algorithms, there are many graph algorithms for which it has not                                pends on the query semantics, which must be taken into account
been done. To the best of our knowledge, the reduction of CFPQ                                  when constructing the adjacency matrices and semirings.
problem to linear algebra operation is an open question.
                                                                                                   Analysis. To analyze the theoretical properties of the constructed
   An overview of our proposed approach is shown in Figure 1. The
                                                                                                algorithm, the existing theoretical results of linear algebra, graph
purpose of this approach is to solving CFPQ problem using linear
                                                                                                theory and the theory of formal languages can be used. The proper-
algebra operations. In order to do this, it is necessary to devise the
                                                                                                ties of the entire CFPQ algorithm largely depend on the properties
CFPQ algorithms that have the input in the form of a directed edge-
                                                                                                of linear algebra operations since it evaluates the context-free path
labeled graph as a data, a context-free grammar as a query, and the
                                                                                                queries by offloading the most intensive computations into calls to
query semantics that determines the type of requested information
                                                                                                procedures for these operations. Therefore, the most effective for
about paths in the graph. Further, the approach can be divided into
                                                                                                the algorithm will be optimizations that use the existing results of
the following stages.
                                                                                                linear algebra to efficiently compute such operations, for example,
                                                                                                sparse matrix and vector operations.
   Reducing the CFPQ problem to the linear algebra operations. The
query is formulated by a context-free grammar 𝐺 which is a tuple                                   Implementation. From a practical point of view, the algorithms
(𝑁 , Σ, 𝑃, 𝑆), where 𝑁 is a finite set of nonterminals; Σ is a finite set                       built in this approach are easy to implement, since the most time
of terminals, 𝑁 ∩ Σ = ∅; 𝑃 is a finite set of productions of the form                           consuming is the implementation of the necessary linear algebra
𝐴 → 𝛼, where 𝐴 ∈ 𝑁 , 𝛼 ∈ (𝑁 ∪ Σ) ∗ ; and 𝑆 is the start nonterminal.                            operations, which have already been implemented in many effi-
To formulate the obtained problem in terms of linear algebra, the in-                           cient libraries that support linear algebra operations. For example,
put graphs are considered in the form of adjacency matrices, and the                            can be used the SuiteSparse:GraphBLAS [1] library — is an imple-
input context-free grammar (CF-grammar) is encoded in the form                                  mentation of the GraphBLAS API. In such libraries, various matrix
                                                                                           2
optimizations are used, which will significantly speed up the com-
putation of the context-free queries for large graphs. One of such
optimizations is the use of sparse matrix formats (CSR, CSC, COO),
the use of which gives a significant performance gain since real
data is often sparse. In addition, many linear algebra operations
can be efficiently computed in parallel, for example on a GPU. As a
result, the CFPQ implementation will allow to obtain matrices that
will store information about the desired paths in the graph. The
type of stored information is determined by the query semantics,
for example, it can be an answer to the question of the existence of
                                                                              Figure 2: Execution time in seconds of the path extraction
paths of a certain form or their enumeration.
                                                                              algorithm depending on the path length for 𝑔𝑒𝑜𝑠𝑝𝑒𝑐𝑖𝑒𝑠
   Result interpretation. The final step is to interpret the query re-
sult. Depending on the query semantics, it is possible to extract                For single-path and all-path query semantics, we store additional
certain information about the paths between the vertices of in-               information in matrices to be able to restore found paths. For single-
terest from the resulting matrix. In addition, in the case when the           path query semantics, we store the intermediate vertex 𝑘 in the
query semantics involves the enumeration of paths between certain             element 𝑇𝑖,𝑗 only if there is a path from 𝑖 to 𝑘 corresponding to the
vertices in the graph, it also becomes necessary to implement an              nonterminal 𝐵, there is a path from 𝑘 to 𝑗 corresponding to the
algorithm for constructing these paths from the resulting matrices.           nonterminal 𝐶, and there is a rule (𝐴 → 𝐵𝐶) ∈ 𝑃. For all-path query
                                                                              semantics, we store the sets of the intermediate vertices, since we
3    ALGORITHMS                                                               must store the information about all paths between each vertex
In this section, we briefly describe our devised CFPQ algorithms              pair. In that case, we cannot reduce computations to operations
based on the linear algebra operations. For all our algorithms we             on Boolean matrices and we use custom matrix multiplication for
formally prove the correctness, termination, and time complexity              matrices with more complex elements (tuples or arrays of integers).
bounds. Our algorithms can be divided into two groups.                        There are still libraries that support linear algebra operations for
                                                                              our algorithms for single-path and all-path query semantics, for
   Matrix-based algorithms. The algorithms in the first group utilize         example, the GraphBLAS implementations that support the creation
the Boolean matrix multiplication for relational query semantics              of custom semirings for the matrix operations.
and operate with more complex matrices for single-path and all-                  Kronecker product-based algorithm. On the contrary, the algo-
path query semantics. Also, these algorithms, like many existing              rithm in the second group is based on the Kronecker product opera-
ones, require the CF-grammar transformation to some normal form               tion and does not require the transformation of the input grammar.
that allows us to encode one step of the input graph traversing into          The transformation leads to at least a quadratic blow-up in grammar
exactly one matrix multiplication since in this form we have only             size, thus by avoiding the transformation, this algorithm achieves
two nonterminals in the right-hand side of productions rules.                 better time complexity in terms of the grammar size. While regu-
   We define a binary operation ( · ) for arbitrary subsets 𝑁 1, 𝑁 2 of       lar languages can be expressed as a Finite-State Machine (FSM), a
𝑁 with respect to a CF-grammar 𝐺 = (𝑁 , Σ, 𝑃) as                              CF-grammar can be expressed as a Recursive State Machine (RSM)
                                                                              in a similar fashion. In these algorithms, we use RSM to represent
    𝑁 1 · 𝑁 2 = {𝐴 | ∃𝐵 ∈ 𝑁 1, ∃𝐶 ∈ 𝑁 2 such that (𝐴 → 𝐵𝐶) ∈ 𝑃 }.             the context-free path query and evaluate this query using the Kro-
                                                                              necker product of the corresponding adjacency matrices of the
   Using this binary operation as subset multiplication, and union            input graph and the RSM for the input CF-grammar. Although the
as an addition, we can define a matrix multiplication, 𝑎 × 𝑏 = 𝑐,             Kronecker-based algorithm for constructing the matrices with in-
where 𝑎 and 𝑏 are matrices of a suitable size, that have subsets of 𝑁         formation about required paths (or so-called index) is the same for
                      Ð
as elements, as 𝑐𝑖,𝑗 = 𝑛𝑘=1 𝑎𝑖,𝑘 · 𝑏𝑘,𝑗 . Also, we use the element-wise       all three query semantics, the paths extraction algorithm for each
union operation on matrices 𝑎 and 𝑏 with the same size: 𝑎 ∪ 𝑏 = 𝑐,            semantics is different.
where 𝑐𝑖,𝑗 = 𝑎𝑖,𝑗 ∪ 𝑏𝑖,𝑗 . Finally, we define the transitive closure
                                (1)    (2)                (1)
of a square matrix 𝑎 as 𝑎 + = 𝑎 + ∪ 𝑎 + ∪ · · · , where 𝑎 + = 𝑎 and           4    PRELIMINARY RESULTS
  (𝑖)  Ð        ( 𝑗)  (𝑖−𝑗)                                                   We evaluate implementations of devised algorithms on real-world
𝑎 + = 𝑖−1 𝑗=1 𝑎 + × 𝑎 +     , 𝑖 ≥ 2.
    We can evaluate the context-free path queries with relational             RDFs. We provide results only for our CPU and GPU-based imple-
semantics by computing this transitive closure of an adjacency ma-            mentations of matrix-based algorithms for relational and single-
trix 𝑇 of input labeled graph with sets of nonterminals as elements           path query semantics because of the page limit. We use Redis-
where 𝐴 ∈ 𝑇𝑖,𝑗 only if there is 𝑥 ∈ Σ such that there is edge from            Graph [11] graph database as storage and measure the full time of
vertex 𝑖 to 𝑗 labeled by 𝑥 and (𝐴 → 𝑥) ∈ 𝑃. However, described                query execution including all overhead on data preparation. This
operations can be computed using several Boolean matrix multipli-             way we can estimate the applicability of the matrix-based algorithm
cations and additions if we encode the matrix 𝑇 by the |𝑁 | Boolean           to real-world problems. The source code is available on GitHub1
matrices (one Boolean matrix for each nonterminal how it is done              1 Sources of matrix-based CFPQ algorithm for the RedisGraph database: https://github.
in the work of Valiant [20]).                                                 com/YaccConstructor/RedisGraph. Access date: 27.02.2021.
                                                                          3
           Table 1: Index creation time for RDFs (time is measured in seconds and memory is measured in megabytes)

                                                            Relational semantics index              Single-path semantics index
        Name                  V            E            RG_CPUrel             RG_GPUrel         RG_CPUpath            RG_GPUpath
                                                    Time (s) Mem (MB) Time (s) Mem (MB) Time (s) Mem (MB) Time (s) Mem (MB)
        go-hierarchy 45 007 1 960 436                 0.091         16.3    0.108      121.2   0.976         92.0    0.336      125.0
        enzyme       48 815 219 390                   0.018           5.9   0.018         4.0  0.029           8.1   0.043         6.0
        eclass_514en 239 111 1 047 454                0.067         13.8    0.166       16.0   0.195         31.2    0.496       26.0
        go           272 770 1 068 622                0.604         28.8    0.365       30.2   1.286         75.7    0.739       45.4
        geospecies   450 609 2 311 461                7.146       16 934    0.856      5 274  15.134       35 803    1.935      5 282


   The results of the CFPQ evaluation are presented in Table 1.                            [3] Ciro M. Medeiros et al. 2018. Efficient Evaluation of Context-free Path Queries
As we can see, our matrix-based algorithm for relational query                                 for Graph Databases. In Proceedings of the 33rd Annual ACM Symposium on
                                                                                               Applied Computing (Pau, France) (SAC ’18). ACM, New York, NY, USA, 1230–
semantics implemented for RedisGraph is more than 1000 times                                   1237. https://doi.org/10.1145/3167132.3167265
faster than the one based on annotated grammar implemented for                             [4] Ekaterina Verbitskaia et al. 2016. Relaxed Parsing of Regular Approximations
                                                                                               of String-Embedded Languages. In Perspectives of System Informatics, Manuel
Neo4j [8] and uses more than 4 times less memory. We can conclude                              Mazzara and Andrei Voronkov (Eds.). Springer International Publishing, Cham,
that the matrix-based algorithm is more performant than the state-                             291–302.
of-the-art CFPQ algorithms for query evaluation under a relational                         [5] Ekaterina Verbitskaia et al. 2018. Parser Combinators for Context-free Path
                                                                                               Querying. In Proceedings of the 9th ACM SIGPLAN International Symposium
semantics for real-world data processing.                                                      on Scala (St. Louis, MO, USA) (Scala 2018). ACM, New York, NY, USA, 13–23.
   We can conclude, that the cost of computing matrices for single-                            https://doi.org/10.1145/3241653.3241655
path query semantics is not high. On average, it is about 2 times                          [6] H. Miao et al. 2019. Understanding Data Science Lifecycle Provenance via Graph
                                                                                               Segmentation and Summarization. In 2019 IEEE 35th International Conference on
slower than for the relational query semantics. The additional run-                            Data Engineering (ICDE). 1710–1713.
ning time of the path extraction is presented in Figure 2 (boxplots                        [7] J. Kepner et al. 2016. Mathematical foundations of the GraphBLAS. In 2016
                                                                                               IEEE High Performance Extreme Computing Conference (HPEC). 1–9. https:
are standard, outliers are omitted). As we can see, this time is small                         //doi.org/10.1109/HPEC.2016.7761646
and linear in the length of the path.                                                      [8] Jochem Kuijpers et al. 2019. An Experimental Study of Context-Free Path Query
   Finally, we conclude that the matrix-based algorithm paired with                            Evaluation Methods. In Proceedings of the 31st International Conference on Scien-
                                                                                               tific and Statistical Database Management (Santa Cruz, CA, USA) (SSDBM ’19).
a suitable database and employing appropriate libraries for linear                             ACM, New York, NY, USA, 121–132. https://doi.org/10.1145/3335783.3335791
algebra operations is a promising way to make CFPQ with different                          [9] Jakob Rehof et al. 2001. Type-Base Flow Analysis: From Polymorphic Subtyping
query semantics applicable for real-world data analysis.                                       to CFL-Reachability. SIGPLAN Not. 36, 3 (Jan. 2001), 54–66. https://doi.org/10.
                                                                                               1145/373243.360208
                                                                                          [10] Jyothi Vedurada et al. 2019. Batch Alias Analysis. In Proceedings of the 34th
5    CONCLUSION AND NEXT STEPS                                                                 IEEE/ACM International Conference on Automated Software Engineering (San
                                                                                               Diego, California) (ASE ’19). IEEE Press, 936–948. https://doi.org/10.1109/ASE.
In this Ph.D. work, we provide an approach to solving the CFPQ                                 2019.00091
problem using linear algebra operations. Using provided approach                          [11] P. Cailliau et al. 2019. RedisGraph GraphBLAS Enabled Graph Database. In
                                                                                               2019 IEEE International Parallel and Distributed Processing Symposium Workshops
we devise the CFPQ algorithms for all three query semantics and                                (IPDPSW). 285–286. https://doi.org/10.1109/IPDPSW.2019.00054
implement them using appropriate libraries for linear algebra op-                         [12] Petteri Sevon et al. 2008. Subgraph Queries by Context-free Grammars. Journal
erations. Preliminary results show that our CPU and GPU-based                                  of Integrative Bioinformatics 5, 2 (2008), 157 – 172. https://doi.org/10.1515/jib-
                                                                                               2008-100
implementations that utilize sparse matrix representation and par-                        [13] Semyon Grigorev et al. 2017. Context-free Path Querying with Structural Repre-
allel computation outperform the state-of-the-art solutions.                                   sentation of Result. In Proceedings of the 13th Central & Eastern European Software
                                                                                               Engineering Conference in Russia (St. Petersburg, Russia) (CEE-SECR ’17). ACM,
   As a next step, we plan to provide a full comparison of our linear                          New York, NY, USA, Article 10, 7 pages. https://doi.org/10.1145/3166094.3166104
algebra-based implementations with all state-of-the-art solutions                         [14] Xin Zheng et al. 2008. Demand-driven Alias Analysis for C. In Proceedings of the
for CFPQ on the same benchmark and experimental setup. More-                                   35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming
                                                                                               Languages (San Francisco, California, USA) (POPL ’08). ACM, New York, NY,
over, our implementations are prototypes and we plan to provide                                USA, 197–208. https://doi.org/10.1145/1328438.1328464
full integration of CFPQ to RedisGraph. Also, we plan to improve                          [15] Xiaowang Zhang et al. 2016. Context-Free Path Queries on RDF Graphs. In The
the dataset by including more real-world cases with larger graphs,                             Semantic Web – ISWC 2016. Springer International Publishing, Cham, 632–648.
                                                                                          [16] Jelle Hellings. 2014. Conjunctive context-free path queries. In Proceedings of
for example, from the area of static code analysis [10, 14].                                   ICDT’14. 119–130.
                                                                                          [17] Jelle Hellings. 2015. Path Results for Context-free Grammar Queries on Graphs.
                                                                                               CoRR abs/1502.02242 (2015). arXiv:1502.02242 http://arxiv.org/abs/1502.02242
ACKNOWLEDGMENTS                                                                           [18] Jelle Hellings. 2015. Querying for Paths in Graphs using Context-Free Path
This work was funded by RFBR, project number 19-37-90101, and                                  Queries. arXiv preprint arXiv:1502.02242 (2015).
                                                                                          [19] Fred C. et al. Santos. 2018. A Bottom-Up Algorithm for Answering Context-Free
grant from JetBrains Research.                                                                 Path Queries in Graph Databases. In Web Engineering, Tommi Mikkonen, Ralf
                                                                                               Klamma, and Juan Hernández (Eds.). Springer International Publishing, Cham,
                                                                                               225–233.
REFERENCES                                                                                [20] Leslie G Valiant. 1975. General context-free recognition in less than cubic time.
 [1] Timothy A. Davis. 2019. Algorithm 1000: SuiteSparse: GraphBLAS: Graph Algo-               Journal of computer and system sciences 10, 2 (1975), 308–315.
     rithms in the Language of Sparse Linear Algebra. ACM Trans. Math. Softw. 45, 4       [21] Mihalis Yannakakis. 1990. Graph-theoretic Methods in Database Theory. In
     (2019), 44:1–44:25. https://doi.org/10.1145/3322125                                       Proceedings of the Ninth ACM SIGACT-SIGMOD-SIGART Symposium on Principles
 [2] C. Barrett et al. 2000. Formal-Language-Constrained Path Problems. SIAM                   of Database Systems (Nashville, Tennessee, USA) (PODS ’90). ACM, New York,
     J. Comput. 30, 3 (2000), 809–837. https://doi.org/10.1137/S0097539798337716               NY, USA, 230–242. https://doi.org/10.1145/298514.298576
     arXiv:https://doi.org/10.1137/S0097539798337716
                                                                                      4