Knowledge-Augmented Induction of Complex
      Networks on Supply–Demand–Material Data

              Dan Hudson1,*, Leonid Schwenke1,*, Stefan Bloemheuvel2,
           Arnab Ghosh Chowdhury1, Nils Schut3, and Martin Atzmueller1
 1 Osnabrück University, Semantic Information Systems Group, Osnabrück, Germany
 2 Tilburg University (TiU), Jheronimus Academy of Data Science (JADS),
   Tilburg (TiU), ’s-Hertogenbosch (JADS), The Netherlands
 3 Polymer Science Park, Zwolle, The Netherlands


          Abstract. We describe a method for complex network induction in a
          knowledge-augmented data-driven approach. For this, we match items
          in a database according to their attributes, using knowledge of sub-
          contexts within the problem domain to improve the specificity and rele-
          vance of matches; this relates specifically to the challenge of supply chain
          modelling for the recycled plastics industry, using heterogeneous supply-
          demand-material data. In our approach, knowledge of sub-contexts comes
          from a mixture of data-driven inference and input from experts, and
          is crucial in determining how best to match items to one another. We
          store domain-specific knowledge in the form of patterns that describe
          subgroups of our data, a ‘case base’ for use in case retrieval, and also
          explicit rules provided by experts. We present a system prototype, de-
          scribe the conceptual modelling approach, and discuss preliminary out-
          puts demonstrating the proposed modelling method. An effective supply
          chain model can be used to support the recycled plastics industry and
          expand the uptake of recyclate.


 1      Introduction
 Supply chains [10, 29] can be defined as all stages involved in producing and
 delivering a product from supplier to customer – historically considered as a
 series of steps [16]. However, recent studies have used network theory to model
 supply chains as complex networks [24, 30]. This requires explicit information
 on the supply chain elements, which is not always available, a common gap
 which our system aims to address in order to then model supply chain data as
 a complex network – in a knowledge-augmented approach.
     In the context of the Di-Plast project [19], we focus on utilising industrial
 supply–demand–material data from the recycled plastics industry on suppliers,
 buyers and products with specific material specifications. However, supplier–
 product and buyer–product information is only provided in heterogeneous form,
 which needs to be aligned and matched, leading to resource-induced complex
 networks. This requires a knowledge-augmented network induction approach.
  * Both authors contributed equally to this research.


Copyright © 2021 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2      D. Hudson, L. Schwenke, et al.

    In a data-driven approach, we start with supplier–product and buyer–product
specifications, i. e., complex user–item relationships, where the item part con-
tains complex product and/or material specifications. The central problem we
address is the matching [11, 38] of the respective entities, i. e., the complex ma-
terial specifications of a product, in order to form the complex network/graph
structure. This is difficult, due to the complex alignment and the specification
of constraints which have to be fulfilled during the matching. However, this can
be supported in a data-driven way – mining important parameters – while also
including background knowledge of domain experts in order to guide the pro-
cess. Then, with a human-in-the-loop approach, important (complex) constraints
can be captured, domain knowledge e. g., on the importance of features can be
provided, and the mined relations and induced networks can be inspected.
    Therefore, we propose a knowledge-augmented, context-based and data-driven
approach. We use background knowledge, subgroup discovery [4] as well as tech-
niques from case-based reasoning, specifically focusing on case-retrieval. This
takes the form of a matching and similarity ranking method for creating the
respective edges between buyer and seller nodes in our complex network repre-
sentation. Essentially, we match them according to their product specifications,
utilising the knowledge-augmented matching method. Two special challenges in
our industrial context are that many properties can be unknown and often no
perfect match exists for a buyer specification and, further, that often the detailed
target context is not known. This highlights the need to find the hidden context
and use a knowledge augmented similarity approach to find possible alternatives.
    Similar products between suppliers and buyers are matched and used for
relationship modelling, leading to our desired complex network abstraction. The
complex network itself is facilitated by a tripartite graph representation and/or
analysing the respective bipartite projections, as discussed below. For this, we
adapt analysis principles of complex networks and folksonomies [20, 15, 40].
Our contributions are summarised as follows:
 1. We present a framework for inducing complex networks in order to model
    heterogeneous supply-demand-material data. The resulting complex networks
    (or graphs) can then be analysed for a variety of purposes, such as identify-
    ing important suppliers in the network, identifying gaps in the supply chain,
    and developing profiles of the materials required in different industries. For
    this, we propose a knowledge-augmented data-driven approach for creating
    the graph structure, i. e., the links between the respective tripartition of
    supplier–product–buyer nodes.
 2. This is supplemented by a first view on our prototypical system implemen-
    tation in the context of the recycled plastics industry, for which we discuss
    a preliminary evaluation of our outputs with a domain expert.
   The rest of the paper is structured as follows: Section 2 summarises related
work. After that, Section 3 describes our proposed approach. Section 4 presents
and discusses preliminary results. Finally, Section 5 concludes with a summary
and interesting directions for future work.
                     Knowledge-Augmented Induction of Complex Networks              3

2    Related Work
Below, we discuss related work on matching and context-aware retrieval. We
then briefly summarise case-based reasoning (CBR) [35, 1], an analogical reasoning
technique used in various machine learning methods. Next, we introduce
subgroup discovery, a technique to find interesting subsets of data points.

Matching and Context-Aware Retrieval. The problem of matching items is
closely related to that of search and recommendation in user–item interactions,
where users are interested in obtaining relevant items for their queries. This has
been approached in numerous ways e. g., [20, 21, 18, 38, 39]. Matches can then be
analysed and processed using graph-based [20, 21] approaches, which we focus
on, or using deep learning techniques [38]. Furthermore, we focus on different
contexts and sub-contexts of the user, which we use for entity matching, as well
as relationship modelling in our complex network. This is called context-aware
retrieval [13], also relating to context aware queries [8]. In the context of recycled
plastics, one big challenge is that producers of certain products want to find alter-
natives for their non-recycled plastics. Thus, they have a working specification,
but no exact recycled alternative exists (buyer context). By considering similar
specifications in the same application area from other buyers (hidden context)
we want to identify important target attributes for adapting our results.

Case-Based Retrieval. We apply an adapted case-based retrieval [28, 2, 12]
method for ranking and retrieving products that match to a particular query,
while in our application, the outcomes of each case are not known. [34] imple-
mented a CBR ranking system for products over a user query which can provide
suggestions on how the user could change the query to find noteworthy products
based on the selections of other users. In our case, the decision process of a buyer
is rather complex, taking different constraints into account. Therefore, we need
to have an awareness of the buyer’s hidden context to understand the applied
complex data. How to evaluate such a complex context was researched by [17],
on which basis we decided to do an evaluation by experts. A further challenge is
the high proportion of missing data. For this reason and based on the research
from [26], we try to handle missing values also in a context-based manner.

Subgroup Discovery. Subgroup discovery [4] aims at identifying subgroups of
data instances that are interesting with respect to a certain target concept, e. g.,
having a high chance of some interesting attribute being present. A subgroup
can be represented by a pattern which specifies rules for membership in the
subgroup, typically in the form of feature–value pairs, which must hold true
for a data instance in order for it to be included in the subgroup. This means
that it is a data-driven process that discovers explicit and interpretable rules to
associate the target concept to attributes found in the data instances. Subgroup
discovery has been applied to, e. g., analyse medical knowledge [32], industrial
data [22] and social media [3]. In our work, subgroup discovery is used to identify
sub-contexts (hidden contexts) of buyer specifications, where certain attributes
may become more important than others in the matching process.

3     Method
Our method targets two objectives: First, to create a ranking of product spec-
ification matches, and second, to create a hypergraph using this. To do so, we
match buyer property specifications Q (also called queries) in the form of at-
tributes to the product attributes A of different sellers. In many cases, no exact
match can be found; however, similar products could satisfy the buyer’s needs.
Normally an expert would analyse the needs and make suggestions based on this.
To support an increasing demand for recyclate, an automatic approach is desired.
Our method includes a case-based approach to understand which deviations will
be acceptable, based on the queries from previous buyers who looked for recy-
cled materials in the same application context, i. e., we let the buyer’s application
context inform the matching process.
    Our method is presented diagrammatically in Figure 1 (A). A complex net-
work abstraction is the end goal of our work, which we describe in subsection
3.1. To begin the process, we describe an approach to automatically extracting
information to place into the Supplier-Product and Buyer-Product databases,
in subsection 3.2. Next, in subsections 3.3 and 3.4, we describe a method for
matching product specifications, leading to the relationship modelling that then
forms the basis of our final network abstraction. In subsection 3.3, we explain
how subgroup discovery can infer a relevant context when matching product
specifications, leading to even greater specificity in how different attributes are
weighted in the matching process described in subsection 3.4. The remaining
steps required to match product specifications and to create the edges for our
network model are described in 3.4. This includes a step to transform the fea-
ture space according to the context of the buyer, thereby focusing the matching
procedure on the most relevant attributes of the products. Concluding with sub-
section 3.5, we explain how our method supports interpretation, explanation and
adaptation of the matching process.

3.1   Complex Network Model


[Figure 1: panel (A) shows the conceptual network modeling process with the
stages Preparation, Cleaning, Aggregation & Matching, Relationship Modeling and
Network Abstraction; panels (B) and (C) show graphs over Suppliers, Products
and Buyers.]

Fig. 1. Conceptual Graph Modeling Process (A), Example Tripartite Hypergraph (B)
and projected Hypergraph (C).


   To model the supply chain, we consider networks modelled as graphs G_S, G_B, G,
with the bipartite supplier and buyer graphs G_S = (V_S, V_P, E_S), G_B = (V_B, V_P, E_B),
where V_S and V_B indicate suppliers and buyers and V_P indicates products (with
heterogeneous textual/numeric/nominal material specifications from a set A =
{a_1, a_2, ..., a_n}, attributed to a node), E_S ⊆ V_S × V_P, E_B ⊆ V_B × V_P, and finally
the induced tripartite hypergraph G = (V_S, V_B, V_P, E), where E ⊆ V_S × V_B × V_P
captures ternary supply–demand relationships between suppliers and buyers for
specific products. We thus aim at a tripartite hypergraph between suppliers,
products, and buyers. Figure 1, part (B), shows an example graph. This graph
can then also be projected onto bipartite graphs, as shown in part (C), similar to
procedures in folksonomies [20], such that corresponding methods like search,
recommendation and ranking can also be directly applied on the graph. This augments
the capabilities of our system to adapt based on accumulated knowledge.
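
A minimal sketch of this model, assuming that each hyperedge in E is stored as a
(supplier, buyer, product) triple. The node names and the helper `project` are
hypothetical and only illustrate how a bipartite projection could be derived
from the ternary relation; they are not taken from the prototype.

```python
from collections import defaultdict

# Hypothetical hyperedges (supplier, buyer, product), i.e. elements of E.
E = {
    ("s1", "b1", "m1"),
    ("s1", "b2", "m2"),
    ("s2", "b2", "m2"),
    ("s3", "b3", "m4"),
}


def project(hyperedges, i, j):
    """Project ternary edges onto positions i and j.

    Positions: 0 = supplier, 1 = buyer, 2 = product.
    Returns a dict mapping each node pair to the number of
    hyperedges that support it (usable as an edge weight).
    """
    weights = defaultdict(int)
    for edge in hyperedges:
        weights[(edge[i], edge[j])] += 1
    return dict(weights)


supplier_buyer = project(E, 0, 1)
buyer_product = project(E, 1, 2)
print(supplier_buyer)
print(buyer_product)
```

Keeping the count of supporting hyperedges as a weight is one possible design
choice; it preserves some of the information that is otherwise lost when the
third node type is projected away.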

3.2     Sources of Data
The data we use to construct our model of the recycled plastics supply chain
is taken from our ‘Matrix Tool’, created in collaboration between Osnabrück
University and industrial partner Polymer Science Park (PSP). This contains a
database of plastic recyclate suppliers, specifications of the recyclate products
they offer, plus potential buyers and the specifications of the products they make
from recyclate.
     For obtaining supplier–product–buyer data, the Matrix tool relies on data
extracted from PDF data sheets containing product specifications. This
specifically targets the cold-start problem, i. e., when no (or little) user
information is available. On the PDF data sheets we apply document layout
analysis, to identify and to analyse the physical and logical document structure
to extract relevant information [27, 37], and we also extract information from
other heterogeneous data sources. Figure 2 illustrates the extraction process.
Supplier-Product and Buyer-Product data are stored in an appropriate database,
and become the basis for inferring Supplier-Product-Buyer edges in our network
abstraction.

[Figure 2: PDF documents are converted to images and passed to document layout
analysis, which identifies elements such as product title, product subtitle,
product description and tables; data is extracted with its hierarchical
relationship (document, title, description, table, row, cell) and stored in an
integrated database.]

Fig. 2. Overview of data extraction and data integration pipeline (adapted from [33]).

3.3     Subgroup Discovery
Our method aims to match buyers and products in a context-aware manner.
When matching a buyer specification, we use subgroup discovery to find a relevant
context, specified through other similar buyer specifications in the database.
Subgroup discovery [4] aims at identifying subgroups of data instances (in our
case buyer specifications) that are interesting with respect to a certain tar-
get concept, e. g., regarding a specific processing technology such as injection
moulding. Using a binary target concept such as this, we are interested in large
subgroups with a high share of instances for which the target concept is true,
e. g., being very predictive of a specific production process.
    Subgroups are described by a symbolic pattern which is typically given by a
conjunction of feature–value pairs in the case of nominal features and selections
on intervals in the case of numeric features. In our work, this is performed on the
set of product attributes, so that we obtain conjunctions of constraints on the
attributes. An example in our application context is given by ‘MFI Minimum’
< 1.0 AND ‘Elongation at Break’ = NA AND ‘MFI Maximum’ < 1.5. Here,
MFI indicates a specific property of a material. A pattern can thus also be
interpreted as the body of a rule. The rule head then depends on the target
concept. In a top-k setting, a subgroup discovery algorithm returns the top-k
subgroups according to a selectable interestingness measure, cf. [4]. For a
binary target concept, the standard interestingness measures combine, e. g.,
the size of the subgroup described by the pattern (its support) and the share
of the target concept in the subgroup (its confidence).
    By finding subgroups that, for example, have a high probability of being
appropriate for a certain manufacturing process, we can identify contexts
consisting of highly relevant product specifications. These sub-contexts can
then be used to normalise the data in a way that is more relevant to what the
buyer is looking for, leading to better matches, as described in subsection 3.4.
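
To make the interplay of support and confidence concrete, the following sketch
evaluates one pattern on a handful of made-up buyer specifications. The data,
the attribute keys and the helper names are hypothetical. Support is computed
here as a relative size, and weighted relative accuracy (WRAcc) is used as one
commonly used interestingness measure; the measure actually chosen in the
prototype may differ.

```python
# Hypothetical buyer specifications; None stands for NA (value not given).
specs = [
    {"mfi_min": 0.3, "mfi_max": 0.5, "elongation": None, "target": True},
    {"mfi_min": 0.7, "mfi_max": 1.2, "elongation": None, "target": True},
    {"mfi_min": 0.5, "mfi_max": 1.4, "elongation": 300.0, "target": False},
    {"mfi_min": 4.0, "mfi_max": 10.0, "elongation": 75.0, "target": False},
    {"mfi_min": 0.2, "mfi_max": 0.9, "elongation": None, "target": False},
]

# 'MFI Minimum' < 1.0 AND 'Elongation at Break' = NA AND 'MFI Maximum' < 1.5
pattern = {
    "mfi_min": lambda v: v is not None and v < 1.0,
    "elongation": lambda v: v is None,
    "mfi_max": lambda v: v is not None and v < 1.5,
}


def covers(pattern, row):
    """True if the row satisfies every selector of the pattern."""
    return all(test(row[attr]) for attr, test in pattern.items())


def quality(pattern, rows):
    """Return (relative support, confidence, WRAcc) for a binary target."""
    subgroup = [r for r in rows if covers(pattern, r)]
    n, total = len(subgroup), len(rows)
    support = n / total
    confidence = sum(r["target"] for r in subgroup) / n if n else 0.0
    baseline = sum(r["target"] for r in rows) / total
    wracc = support * (confidence - baseline)
    return support, confidence, wracc


support, confidence, wracc = quality(pattern, specs)
print(support, confidence, wracc)
```

A positive WRAcc indicates that the target concept is more frequent inside the
subgroup than in the data overall, scaled by how large the subgroup is.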

3.4   Creation of Tripartite Supplier–Product–Buyer Edges
An edge in our tripartite hypergraph model represents a match between a “buyer
query” Q ⊆ A of a buyer node in V_B, specified via a set of attribute constraints
(our query on material attributes from the set of parameter attributes A), and the
corresponding attributes of a product node in V_P offered by a supplier node in V_S,
as defined above. We aim to build a similarity ranking to indicate matches between
queries and product attributes for constructing our graph model. To create edges,
we can then take the top-n elements in the ranking, or just the top-1 match for a
simple graph.
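
The edge creation step can be sketched as follows, assuming the ranking is
available as (similarity, supplier, product) tuples for one buyer query. The
function name `edges_from_ranking` and the example values are hypothetical.

```python
def edges_from_ranking(buyer, ranking, n=1):
    """Turn the n best-ranked products of a buyer query into hyperedges.

    `ranking` holds (similarity, supplier, product) tuples; a higher
    similarity is better. Returns (supplier, buyer, product) triples.
    """
    best = sorted(ranking, key=lambda r: r[0], reverse=True)[:n]
    return [(supplier, buyer, product) for _, supplier, product in best]


# Hypothetical ranking for buyer b1.
ranking = [
    (0.9967, "s2", "m3"),
    (0.9989, "s1", "m1"),
    (0.9974, "s1", "m2"),
]

top1 = edges_from_ranking("b1", ranking, n=1)
top2 = edges_from_ranking("b1", ranking, n=2)
print(top1)
print(top2)
```

With n=1 every query contributes at most one edge; larger n yields several
hyperedges per buyer query.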
    We suggest a buyer context aware ranking approach to compare multiple
related products to one query Q. A context is described by C ⊆ A \ Q, which
consists of several similar buyer specifications. At first, an initial limitation of the
context is provided by the buyer, but via subgroup discovery we next want to
find the most suitable hidden context C_h. An example of a context could be the
attribute product=pipes, where a more specific unknown hidden context C_h exists,
e. g., underground pipes, which we need to discover via subgroup discovery.
Inside subgroup C_h, different degrees of variance for the individual attributes can
be observed. We argue that a smaller variance of an attribute inside a subgroup
means that this attribute is especially important for the context C_h. Consequently,
an attribute with a lower variance is more important than one with a higher
variance. This further highlights the importance of finding the correct
sub-context via subgroup discovery. Compared to traditional methods, we decide
the importance of features according to previous buyer specifications rather
than product specifications, and achieve this by discovering a detailed (hidden)
context for the matching on a case-by-case basis (see subsection 3.3).
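
The variance argument can be illustrated with a small sketch. Because raw
variances of attributes with different units are not directly comparable, the
sketch divides the variance inside the context by the variance over all buyer
specifications; this ratio is our own assumption for illustration, as are the
data and the function name `variance_ratio`.

```python
from statistics import pvariance


def variance_ratio(context_rows, all_rows, attr):
    """Variance of `attr` inside the context divided by its overall variance.

    Missing values (None) are ignored. Assumes the overall variance is
    non-zero, i.e. the attribute is not constant in the data.
    """
    inside = [r[attr] for r in context_rows if r[attr] is not None]
    overall = [r[attr] for r in all_rows if r[attr] is not None]
    return pvariance(inside) / pvariance(overall)


# Hypothetical buyer specifications.
all_specs = [
    {"density": 0.95, "mfi_min": 0.3},
    {"density": 0.96, "mfi_min": 0.5},
    {"density": 0.91, "mfi_min": 0.4},
    {"density": 0.90, "mfi_min": 12.0},
    {"density": 0.97, "mfi_min": 30.0},
]
context = all_specs[:3]  # hypothetical hidden context

ratios = {a: variance_ratio(context, all_specs, a) for a in ("density", "mfi_min")}
ranked = sorted(ratios, key=ratios.get)  # most important attribute first
print(ranked)
```

In this made-up example the MFI values are far more tightly clustered inside
the context than in the data overall, so MFI would be treated as the more
important attribute for this context.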
    After finding the hidden context, we normalise the data (product, query) by
transforming the attribute value space into a Gaussian form based on C_h to
achieve a normal distribution and stabilised variances. In this way, we
implicitly include the weight/importance of an attribute in the corresponding
normalised value space.

[Figure 3: two plots of density (y-axis) against mfi_minimum (x-axis), before
(top) and after (bottom) the transformation.]

Fig. 3. Example distribution of the attribute Density and the attribute MFI
Minimum before (top) and after (bottom) the Gaussian transformation, showing the
shifted distances based on the variance.

    This is similar to other work on CBR problems that has used a weighting
based on Gaussian distributions [25, 31]. An example of this transformation for
the attributes MFI (melt flow index) and density is shown in Figure 3,
demonstrating the resulting normalised distribution. We now compute the
Euclidean distances between a query and potentially matching products within
this transformed space. This normalisation provides a data-driven adaptation of
the matching procedure to the context at hand. Hence, an adapted query can now
be better assessed inside the normalised feature space. Additionally, our method
adapts according to knowledge that has been captured from experts. To return to
the example of product=pipes, when constructing piping, there is a need for
sufficient rigidity to prevent the pipe from bending. In this instance, expert
knowledge may inform the matching procedure by adding the constraint that an
e-modulus higher than stated in the query is acceptable while a lower e-modulus
is not. We capture this type of knowledge as complex constraints for the
relevant context. Using domain knowledge also makes it possible to define
partial matches for non-numeric attributes, e. g., HDPE is a sub-material of PE,
which is handled as a match rather than being penalised with an error of 1 in
our distance calculations. Alternatively, non-numeric attributes can be included
as constraints in the definition of a context C, thus impacting the data subset
which normalises the attribute space.
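
The following sketch shows how a context-based normalisation changes distances.
The exact Gaussian transformation is not specified in the text, so plain
standardisation with the standard deviation of the context is used here as a
stand-in. Skipping missing values is likewise an assumption, and all data and
function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def fit_context(context_rows, attrs):
    """Per-attribute mean and standard deviation estimated from the context."""
    stats = {}
    for a in attrs:
        values = [r[a] for r in context_rows if r.get(a) is not None]
        stats[a] = (mean(values), pstdev(values))
    return stats


def distance(query, product, stats):
    """Euclidean distance in the standardised space.

    The context mean cancels in the difference of two standardised values,
    so only the standard deviation enters. Assumption: attributes that are
    missing on either side, or constant in the context, add no distance.
    """
    total = 0.0
    for a, (_, sd) in stats.items():
        q, p = query.get(a), product.get(a)
        if q is None or p is None or sd == 0:
            continue
        total += ((q - p) / sd) ** 2
    return sqrt(total)


# Hypothetical context: density varies little, e-modulus varies a lot.
context = [
    {"density": 0.95, "e_modulus": 800.0},
    {"density": 0.96, "e_modulus": 1000.0},
    {"density": 0.97, "e_modulus": 900.0},
]
stats = fit_context(context, ["density", "e_modulus"])

query = {"density": 0.96, "e_modulus": 900.0}
product_a = {"density": 0.95, "e_modulus": 900.0}  # small density deviation
product_b = {"density": 0.96, "e_modulus": 850.0}  # larger e-modulus deviation

d_a = distance(query, product_a, stats)
d_b = distance(query, product_b, stats)
print(d_a, d_b)
```

In this example product_b ends up closer to the query than product_a, although
its raw deviation is numerically much larger, because density varies very
little inside the context and therefore receives more weight.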

3.5     Interpretation, Explanation, Adaptation
The steps described so far provide a matching between suppliers, products and
buyers, that then can provide tripartite edges in a hypergraph. Using this graph
structure, as well as the case information contained in the ranking we can pro-
vide an explanation on why a product is ranked as it is. We can perform this on
different levels: First, we can show related cases. Second, we can visualise the
discovered subgroups for a given context C and show how each attribute space
is transformed and why. In this space the distance can be visualised and easily
interpreted by a human. Further we can give examples based on older searches
which illustrate the deviation in the given context to explain the base idea of
the weights. Finally, we can also apply techniques of CBR for case-based
explanation [14, 23]. In particular, we can adapt cases from the top-n matches
of a query using the adaptation step in the CBR cycle [1], such that we can,
e. g., merge cases into prototypical cases, both to summarise cases and to
provide explainable candidate cases to the user, cf. [9, 5].

4     First Results
In this section we present the results from subgroup discovery, evaluate a pre-
liminary example of rankings according to the knowledge from a domain expert
and showcase a portion of the induced hypergraph structure.

4.1     Subgroup discovery
As stated in sections 3.3 and 3.4, we use subgroup discovery as a way to detect
possible unknown sub-contexts within a larger context, such as sub-contexts of the
automobile industry. As outlined above, subgroup discovery operates by first
specifying a variable of interest (the ‘target variable’), and then applying a dis-
covery algorithm that identifies sets of membership criteria, also known as ‘pat-
terns’, such that the membership criteria identify a collection of instances with
some atypical average value, as determined by a quality function. The discovered
sub-contexts identify a selection of data points that are closely related and are
particularly relevant for the query at hand. Using the VIKAMINE system [6],
subgroups were identified within the different markets in the business domain.
Below are examples taken from the construction market, where each subgroup
is represented by a pattern of membership criteria:
    – ‘Elongation at Break’ = NA AND ‘OIT’ = NA AND ‘MFI Maximum’ < 1.5
    – ‘MFI Maximum’ < 1.5
    – ‘MFI Minimum’ < 1.0 AND ‘Elongation at Break’ = NA AND ‘MFI Maximum’
      < 1.5
    – ‘MFI Minimum’ < 1.0 AND ‘Elongation at Break’ = NA AND ‘OIT’ = NA
    – ‘MFI Minimum’ < 1.0 AND ‘MFI Maximum’ < 1.5

    These patterns indicate that there is a sub-context of the construction market
in which elongation at break and OIT (oxidative induction time) are not relevant,
and in which MFI values should be low.

Table 1. Example: Top-5 of the ranking for a search query (first row) in the context of
building constructions. The Processing values are Dutch: extrusie (extrusion),
spuitgieten (injection moulding). For the abbreviation overview see:
https://www.professionalplastics.com/ACRONYMS?XLT_TO=en

Similarity  Material  Processing   MFI-Min  MFI-Max  Density  E Modulus  TYS   TYE   OIT
Query       HDPE      extrusie     0.3      0.5      0.96     900        25    11    10
0.998948    PE        extrusie     1.5      10.0     0.95     850.0      23.0  75.0  NA
0.998948    PE        spuitgieten  1.5      10.0     0.95     850.0      23.0  75.0  NA
0.997435    HDPE      extrusie     0.7      0.7      0.97     560.0      24.0  NA    NA
0.996710    HDPE      extrusie     1.3      1.3      0.96     950.0      28.0  NA    NA
0.996667    HDPE      extrusie     1.5      1.5      0.95     900.0      25.0  NA    NA


4.2   Example: Ranking-Induced Graph


Subgroup discovery helped us to find smaller clusters in our given context. For
enabling a knowledge-augmented approach, we also included some initial expert
knowledge for the selection of important features. In further steps and with
more data we want to develop a semi-automatic selection of the important
product features.

[Figure 4: graph with supplier nodes s1 to s3, buyer nodes b1 to b3 and
material nodes m1 to m4.]

Fig. 4. Tripartite HyperGraph example from our dataset where blue nodes are
materials (m), green are buyers (b) and red are suppliers (s).

    Table 1 depicts an example ranking for a given query. The rows indicate
different product specifications. According to inspection by a domain expert,
the produced top rows are relevant; in particular, rows 1-5 all include relevant
specifications. However, rows 3-5 are slightly more relevant than the others,
since for some parameters the provided values should only deviate from the
query in one direction. So, ultimately the ranking needs to be reordered using
some additional domain constraints.
    This is an example of the domain knowledge that needs to be incorporated
into our knowledge augmented approach. It is important to note, however, that
the domain knowledge required to match products to buyers is quite complex,
and further work is needed to capture and exploit this knowledge. In our ap-
proach, we can either incorporate this using domain knowledge, or by enriching
the graph using multi-edges for capturing a larger set of matched options. Fi-
nally, Figure 4 shows an example visualisation of the hypergraph created from
a subset of the real-world data.
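
How such domain constraints could act on a ranking is sketched below, using the
two kinds of expert knowledge mentioned in Section 3.4: a one-sided bound on
e-modulus and the HDPE/PE sub-material relation. The hierarchy, the product
records and all function names are hypothetical, and moving violating products
to the end of the ranking is only one of several possible policies.

```python
# Hypothetical material hierarchy: child -> parent (HDPE is a sub-material of PE).
PARENT = {"HDPE": "PE", "LDPE": "PE"}


def ancestors(material):
    """The material itself followed by all of its ancestors."""
    chain = [material]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def material_matches(query_material, product_material):
    """True if the materials are equal or one is an ancestor of the other."""
    return (product_material in ancestors(query_material)
            or query_material in ancestors(product_material))


def satisfies_lower_bound(query, product, attr):
    """One-sided constraint: the product value must not fall below the query."""
    q, p = query.get(attr), product.get(attr)
    if q is None or p is None:
        return True  # assumption: unknown values are not rejected
    return p >= q


def rerank(query, ranking, attr="e_modulus"):
    """Stable reordering: products meeting the constraint move ahead of the rest."""
    return sorted(ranking, key=lambda prod: not satisfies_lower_bound(query, prod, attr))


query = {"material": "HDPE", "e_modulus": 900.0}
ranking = [
    {"id": "p1", "e_modulus": 850.0},
    {"id": "p2", "e_modulus": 950.0},
    {"id": "p3", "e_modulus": None},
    {"id": "p4", "e_modulus": 900.0},
]
reordered = [p["id"] for p in rerank(query, ranking)]
print(reordered)
```

Because Python's sort is stable, products that satisfy the constraint keep their
original similarity order among themselves.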

5    Conclusions and Future Work
In this work, we focused on utilising complex industrial supply–demand–material
data for context-based search and ranking, which we also implemented in a sys-
tem prototype. We presented a framework for knowledge-augmented induction
of complex networks for modeling complex relationships in the context of het-
erogeneous data. Our hypergraph modelling approach can be generally applied
on supply chains with Supplier-Product-Buyer relations. Our proposed match-
ing process is suited to domains where previous buyer requests are informative
about (hidden) buyer contexts which in turn are informative about the impor-
tance and availability of attributes used for matching. In the application domain
of recycled plastics, our proposed knowledge-augmented data driven approach
showed first promising results according to the assessment of domain experts.
    Future steps include further domain specific fine-tuning of the matching, in-
corporating data about the selection step of real users to enable a fine-grained
application of CBR approaches, in particular taking the adaptation step of
CBR into account. This also concerns the further formalisation and inclusion
of domain knowledge into the proposed framework, in order to enable a re-
fined human-centred knowledge-based approach using specific constraints of real
users. In addition, we intend to investigate further refinements of the hypergraph
model, as well as augment the hypergraph model further, leading to knowledge
graph structures, such that both knowledge modeling as well as application can
be integrated into the same structural representation, e. g., [7, 36].
    Last but not least, we aim to analyse the industrial data with multiple com-
plex network methods. For example, (1) link prediction could be performed on
the supply-demand-material graph to infer new edges, and (2) community de-
tection can help with identifying new subgroups in the data. In addition, (3)
global graph metrics such as density and the average degree of the nodes in the
hypergraph could provide further insights into the characteristics of the data.
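
As a small illustration of point (3), the sketch below computes density and
average degree on a bipartite projection rather than on the hypergraph itself,
since definitions for hypergraphs vary. The edge list and the function name
`bipartite_metrics` are hypothetical, and nodes without any edge are not
counted.

```python
def bipartite_metrics(edges):
    """Density and average degree of a bipartite graph given as (left, right) pairs.

    Density is the number of edges divided by the number of possible
    left-right pairs; only nodes that occur in at least one edge are counted.
    """
    edges = set(edges)
    left = {u for u, _ in edges}
    right = {v for _, v in edges}
    density = len(edges) / (len(left) * len(right))
    avg_degree = 2 * len(edges) / (len(left) + len(right))
    return density, avg_degree


# Hypothetical supplier-product projection.
supplier_product = [("s1", "m1"), ("s1", "m2"), ("s2", "m2"), ("s3", "m4")]
density, avg_degree = bipartite_metrics(supplier_product)
print(density, avg_degree)
```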


Acknowledgements
This work has been supported by Interreg NWE, project Di-Plast - Digital Cir-
cular Economy for the Plastics Industry (NWE729).


References
 1. Aamodt, A., Plaza, E.: Case-Based Reasoning: Foundational Issues, Methodologi-
    cal Variations, and System Approaches. AI Communications 7(1), 39 – 59 (1994)
 2. Arnold, C.W., El-Saden, S.M., Bui, A.A., Taira, R.: Clinical case-based retrieval
    using latent topic analysis. In: AMIA annual symposium proceedings. vol. 2010,
    p. 26. American Medical Informatics Association (2010)
 3. Atzmueller, M.: Mining social media: key players, sentiments, and communities.
    Wiley Interdisciplinary Reviews: DMKD 2(5), 411–419 (2012)
 4. Atzmueller, M.: Subgroup Discovery. WIREs Data Mining and Knowledge Discov-
    ery 5(1), 35–49 (2015). https://doi.org/10.1002/widm.1144
 5. Atzmueller, M., Baumeister, J., Puppe, F.: Evaluation of two Strategies for Case-
    Based Diagnosis handling Multiple Faults. In: Proc. 2nd Conf. Professional Knowl-
    edge Management (WM2003). Luzern, Switzerland (2003)
 6. Atzmueller, M., Lemmerich, F.: VIKAMINE - Open-Source Subgroup Discovery,
    Pattern Mining, and Analytics. In: Proc. ECML/PKDD. Springer, Berlin, Ger-
    many (2012)
 7. Atzmueller, M., Sternberg, E.: Mixed-Initiative Feature Engineering Using Knowl-
    edge Graphs. In: Proc. 9th International Conference on Knowledge Capture (K-
    CAP). ACM Press, New York, NY, USA (2017)
 8. Bai, J., Nie, J.Y., Cao, G., Bouchard, H.: Using query contexts in information
    retrieval. In: Proc. annual international ACM SIGIR conference on Research and
    development in information retrieval. pp. 15–22. ACM, New York, NY, USA (2007)
 9. Baumeister, J., Atzmueller, M., Puppe, F.: Inductive Learning for Case-Based
    Diagnosis with Multiple Faults. In: Advances in Case-Based Reasoning. LNAI,
    vol. 2416, pp. 28–42. Springer, Berlin, Germany (2002)
10. Beylot, A., Villeneuve, J.: Assessing the national economic importance of metals:
    An input–output approach to the case of copper in France. Resources Policy 44,
    161–165 (2015)
11. Blanco, R., Cambazoglu, B.B., Mika, P., Torzec, N.: Entity recommendations in
    web search. In: International Semantic Web Conference. pp. 33–48. Springer (2013)
12. Brown, M.G.: An underlying memory model to support case retrieval. In: Proc.
    EWCBR. pp. 132–143. Springer, Heidelberg, Germany (1993)
13. Brown, P.J., Jones, G.J.: Context-aware retrieval: Exploring a new environment
    for information retrieval and information filtering. Personal and Ubiquitous Com-
    puting 5(4), 253–263 (2001)
14. Caro-Martinez, M., Recio-Garcia, J.A., Jimenez-Diaz, G.: An algorithm indepen-
    dent case-based explanation approach for recommender systems using interaction
    graphs. In: Proc. ICCBR. pp. 17–32. Springer (2019)
15. Cattuto, C., Schmitz, C., Baldassarri, A., Servedio, V.D., Loreto, V., Hotho, A.,
    Grahl, M., Stumme, G.: Network Properties of Folksonomies. AI Communications
    20(4), 245–262 (2007)
16. Crandall, R.E., Crandall, W.R., Chen, C.C.: Principles of supply chain manage-
    ment. CRC Press (2014)
17. Gu, M., Aamodt, A.: Evaluating CBR systems using different data sources: A case
    study. In: Proc. ECCBR. pp. 121–135. Springer (2006)
18. Hinz, O., Eckert, J.: The impact of search and recommendation systems on sales
    in electronic commerce. Business & Information Systems Engineering 2(2), 67–77
    (2010)
19. van den Hoogen, J., Bloemheuvel, S., Atzmueller, M.: The Di-Plast Data Science
    Toolkit – Enabling a Smart Data-Driven Digital Circular Economy for the Plastics
    Industry. In: Proc. DBDBD. JADS, ’s-Hertogenbosch, The Netherlands (2019)
20. Hotho, A., Jäschke, R., Schmitz, C., Stumme, G.: Information Retrieval in Folk-
    sonomies: Search and Ranking. In: Proc. ESWC. pp. 411–426. Springer, Heidelberg,
    Germany (2006)
21. Jäschke, R., Marinho, L., Hotho, A., Schmidt-Thieme, L., Stumme, G.: Tag rec-
    ommendations in folksonomies. In: Proc. ECML/PKDD. pp. 506–514. Springer,
    Heidelberg, Germany (2007)
22. Jin, N., Flach, P., Wilcox, T., Sellman, R., Thumim, J., Knobbe, A.: Subgroup
    discovery in smart electricity meter data. IEEE Transactions on Industrial Infor-
    matics 10(2), 1327–1336 (2014)
23. Jorro-Aragoneses, J.L., Caro-Martínez, M., Díaz-Agudo, B., Recio-García, J.A.: A
    user-centric evaluation to generate case-based explanations using formal concept
    analysis. In: Proc. ICCBR. pp. 195–210. Springer (2020)
24. Kim, Y., Choi, T.Y., Yan, T., Dooley, K.: Structural investigation of supply net-
    works: A social network analysis approach. Journal of Operations Management
    29(3), 194–211 (2011)
25. Li, H., Sun, J.: Gaussian case-based reasoning for business failure prediction with
    empirical data in China. Information Sciences 179(1-2), 89–108 (2009)
26. Löw, N., Hesser, J., Blessing, M.: Multiple retrieval case-based reasoning for in-
    complete datasets. Journal of biomedical informatics 92, 103127 (2019)
27. Marinai, S.: Learning algorithms for document layout analysis. In: Handbook of
    Statistics, vol. 31, pp. 400–419. Elsevier (2013)
28. Montani, S., Portinale, L., Leonardi, G., Bellazzi, R., Bellazzi, R.: Case-based
    retrieval to support the treatment of end stage renal failure patients. Artificial
    Intelligence in Medicine 37(1), 31–42 (2006)
29. Moran, D., McBain, D., Kanemoto, K., Lenzen, M., Geschke, A.: Global supply
    chains of coltan: a hybrid life cycle assessment study using a social indicator.
    Journal of Industrial Ecology 19(3), 357–365 (2015)
30. Nuss, P., Chen, W.Q., Ohno, H., Graedel, T.: Structural investigation of aluminum
    in the US economy using network analysis. Environmental science & technology
    50(7), 4091–4101 (2016)
31. Park, Y.J., Kim, B.C., Chun, S.H.: New knowledge extraction technique using
    probability for case-based reasoning: application to medical diagnosis. Expert sys-
    tems 23(1), 2–20 (2006)
32. Puppe, F., Atzmueller, M., Buscher, G., Huettig, M., Lührs, H., Buscher, H.P.:
    Application and evaluation of a medical knowledge system in sonography (sono-
    consult). In: Proc. ECAI. pp. 683–687 (2008)
33. Rausch, J., Martinez, O., Bissig, F., Zhang, C., Feuerriegel, S.: Docparser: Hierar-
    chical document structure parsing from renderings. In: 35th AAAI Conference on
    Artificial Intelligence (AAAI-21)(virtual) (2021)
34. Ricci, F., Venturini, A., Cavada, D., Mirzadeh, N., Blaas, D., Nones, M.: Product
    recommendation with interactive query management and twofold similarity. In:
    Proc. ICCBR. pp. 479–493. Springer, Heidelberg, Germany (2003)
35. Sànchez-Marrè, M.: Principles of case-based reasoning
36. Sternberg, E., Atzmueller, M.: Knowledge-Based Mining of Exceptional Patterns
    in Logistics Data: Approaches and Experiences in an Industry 4.0 Context. In:
    Proc. ISMIS. LNCS, Springer, Berlin, Germany (2018, accepted)
37. Subramani, N., Matton, A., Greaves, M., Lam, A.: A survey of deep learning
    approaches for OCR and document understanding. arXiv preprint arXiv:2011.13534
    (2020)
38. Xu, J., He, X., Li, H.: Deep learning for matching in search and recommendation.
    In: The 41st International ACM SIGIR Conference on Research & Development
    in Information Retrieval. pp. 1365–1368 (2018)
39. Zamani, H., Croft, W.B.: Learning a joint search and recommendation model from
    user-item interactions. In: Proc. WSDM ’20. pp. 717–725 (2020)
40. Zhao, J., Zhang, Q., Sun, Q., Huo, H., Xiao, Y., Gong, M.: Folkrank++: An
    optimization of folkrank tag recommendation algorithm integrating user and item
    information. KSII Transactions on Internet and Information Systems 15(1), 1–19
    (2021)