Learning semantic rules for intelligent transport scheduling in hospitals

Pieter Bonte, Femke Ongenae, and Filip De Turck

IBCN research group, INTEC department, Ghent University - iMinds
Pieter.Bonte@intec.ugent.be

Abstract. The financial pressure on the health care system forces many hospitals to balance their budget while struggling to maintain quality. The growing ICT infrastructure in hospitals allows various workflows to be optimized, which offers opportunities for cost reduction. This work-in-progress paper details how patient and equipment transports can be optimized by learning semantic rules that help avoid future delays in transport time. Since these delays can have multiple causes, semantic clustering is used to divide the data into manageable training sets.

1 Introduction

Due to the continuing financial pressure on the health care system in Flanders, many hospitals are struggling to balance their budget while maintaining quality. The increasing amount of ICT infrastructure in hospitals enables cost reduction by optimizing various workflows. In the AORTA project¹, cost is reduced by optimizing the transport logistics of patients and equipment through the use of smart devices, self-learning models and dynamic scheduling that enables flexible task assignments. The introduction of smart wearables and devices allows transports to be tracked and personnel to be notified conveniently. This paper presents how semantic rules, in the form of OWL axioms, can be learned from historical data to avoid future delays in transport time. These learned axioms provide accurate data to a dynamic transport scheduler, improving its scheduling accuracy. For example, the system could learn that certain transports during the visiting hours on Friday are often late and that more time should be reserved for those transports during that period. Since transport delays can have multiple causes, semantic clustering is performed to divide the data into more manageable training sets.
The increasing amount of integrated ICT infrastructure in hospitals allows all facets of these transports to be captured for thorough analysis. Learning accurate rules requires a complete overview of the various activities in the hospital. Since this data originates from various heterogeneous sources, ontologies are utilized, as they have proven their strength in data integration [1]. Incorporating the domain knowledge modeled in the ontology allows more accurate rules to be learned. Furthermore, learning semantic rules makes it possible to understand and validate the learned results.

¹ http://www.iminds.be/en/projects/2015/03/10/aorta

2 Related Work

2.1 Learning Rules

Learning rules from semantic data can be accomplished through various methods. The most prevalent are association rule mining [8] and Inductive Logic Programming (ILP) [4]. ILP is able to learn rules as OWL axioms and fully exploits the semantics describing the data. Incorporating this domain knowledge makes the method more accurate. Statistical relational learning is an extension of ILP that incorporates probabilistic data and can handle observations that may be missing, partially observed, or noisy [2]. However, since our data is neither noisy nor possibly missing, ILP was used in this research.

DL-Learner [5] is an ILP framework for supervised learning in description logics and OWL. Its Class Expression Learning for Ontology Engineering (CELOE) algorithm [6] is a promising learning algorithm. It is a class expression learning algorithm for supervised machine learning that follows a generate-and-test methodology: class expressions are generated and tested against the background knowledge to evaluate their relevance. Furthermore, no explicit features need to be defined, since the algorithm uses the structure of the ontology to select its features.

2.2 Semantic Similarity

Clustering algorithms use a distance measure to have a notion of how similar two data points are.
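Whichever measure is used, the clustering algorithm itself only needs pairwise distances derived from it. A minimal sketch, assuming individuals are represented as sets of (property, value) pairs and using a deliberately naive Jaccard overlap as a placeholder similarity (not the measure proposed in this paper):

```python
# Sketch: deriving a clustering distance from a pluggable semantic similarity.
# The similarity here is a naive placeholder (Jaccard overlap of
# (property, value) pairs), not the PB measure proposed in this paper.

def naive_similarity(a, b):
    """Jaccard overlap of the (property, value) pairs of two individuals."""
    pairs_a, pairs_b = set(a), set(b)
    if not pairs_a and not pairs_b:
        return 1.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)

def semantic_distance(a, b, similarity=naive_similarity):
    """Clustering distance derived from a similarity in [0, 1]."""
    return 1.0 - similarity(a, b)

# Two late transports sharing one of three distinct (property, value) pairs.
t1 = {("executedAtDay", "Friday"), ("executedBy", "NurseA")}
t2 = {("executedAtDay", "Friday"), ("executedBy", "NurseB")}
print(semantic_distance(t1, t2))  # 1 - 1/3
```

Any similarity taking values in [0, 1], including the PB measure introduced later, can be swapped in for the placeholder.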
Traditional distance measures, such as the Euclidean measure, are not applicable to semantic data. Therefore, a semantic similarity measure is used to calculate the semantic distance (1 − semantic_similarity). A semantic similarity measure defines a degree of closeness or separation of target objects [7]. Various semantic similarity measures exist; e.g., the Linked Data Semantic Distance [10] uses the graph information in the RDF resources, but cannot deal with literal values in the RDF data set. The closest to our approach is the Maedche and Zacharias (MZ) similarity measure [9], because it fully exploits the ontology structure. The MZ similarity differentiates three dimensions when comparing two semantic entities: (i) the taxonomy similarity, (ii) the relation similarity and (iii) the attribute similarity. However, MZ does not take into account that some relations between instances hold more information than others.

3 Data set and Ontology

Real data sets were received from two hospitals, describing all transports and related information over a timespan of several months. A tailored ontology has been created to model all transport information. It describes the transports, the hospital layout, the patients, the personnel and their relations. Based on the characteristics of the received data, a data set was generated to conduct our experiments on. For example, about 25% of the scheduled transports do not arrive on time. The relevant use cases, such as those described in Section 5, were provided by the hospitals as well. An elaborate description and example of the ontology and the generated data set can be found on http://users.intec.ugent.be/pieter.bonte/aorta/ontology/.

Fig. 1.
The architecture of the designed AORTA platform

4 Architecture

Figure 1 visualizes the conceptual architecture of the dynamic transport planning and execution platform that is currently being built within the AORTA project. The following components can be discerned:

– Message Bus: enables the communication between the various components.
– Notification Manager: allows the planning of new transports to be communicated and allocated to the staff.
– Wearable Devices: interacts with multiple wearable devices, enabling personalized tasks and transports to be communicated to the personnel.
– Location Updates: captures the location updates of the executed transports and the positioning of personnel.
– Log-in Information: captures where personnel are logged in and on which wearable device they are reachable.
– Context Layer: captures all context changes from the Message Bus and constructs a view on what is happening in the hospital. All knowledge is stored in a triplestore, using an ontology for easy integration of the data arriving from various sources and incorporation of background knowledge.
– Dynamic Scheduler: incorporates sophisticated scheduling algorithms to plan the transports in a dynamic fashion. The algorithms receive their input data from the Context Layer, e.g., the current location of the staff or the average walking speed in particular hallways.
– Rule Learner: analyzes the data in the Context Layer and learns why certain transports were late. The learned rules are added to the Context Layer, enabling this knowledge to be taken into account when planning new transports. This allows the Context Layer to get a better grasp on what is going on in the hospital and provides the Dynamic Scheduler with more accurate input data.

The remainder of this paper focuses on the Rule Learner component.

5 Learning relevant rules

The goal of the rule learner is to learn why certain transports were delayed and to use this information to optimize future transport scheduling.
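At a high level, this creates a feedback loop: the Context Layer gathers observations, the Rule Learner distills them into rules, and the Dynamic Scheduler consults those rules when planning. A minimal sketch of that loop, in which simple Python predicates stand in for OWL axioms and all class, method and rule names are illustrative placeholders, not the actual AORTA implementation:

```python
# Schematic feedback loop between Context Layer, Rule Learner and
# Dynamic Scheduler. All names are illustrative placeholders.

class ContextLayer:
    """Holds observations and the rules learned from them."""
    def __init__(self, late_transports):
        self.late_transports = late_transports
        self.learned_rules = []  # predicates standing in for OWL axioms

class RuleLearner:
    """Stands in for semantic clustering + CELOE on the late transports."""
    def learn(self, context):
        # Toy 'rule': the day on which most observed late transports happened.
        days = [t["day"] for t in context.late_transports]
        frequent = max(set(days), key=days.count)
        context.learned_rules.append(lambda t, d=frequent: t["day"] == d)

class DynamicScheduler:
    """Reserves extra time for transports matched by a learned rule."""
    def estimate_minutes(self, transport, context, base=15, margin=10):
        if any(rule(transport) for rule in context.learned_rules):
            return base + margin
        return base

context = ContextLayer([{"day": "Friday"}, {"day": "Friday"}, {"day": "Monday"}])
RuleLearner().learn(context)
scheduler = DynamicScheduler()
print(scheduler.estimate_minutes({"day": "Friday"}, context))   # 25
print(scheduler.estimate_minutes({"day": "Tuesday"}, context))  # 15
```

In the actual platform, the learned artifacts are OWL axioms stored in the triplestore rather than in-memory predicates; the sketch only shows the direction of the data flow.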
Examples of why a transport might be late, with possible learned rules, include:

– Popular visiting hours: transports taking place during the visiting hours on Friday or during the weekends have a considerable chance of delay.

  LateTransport1 ≡ executedAtDay some ({Friday})

– Busy passages: transports that use busy passages, such as a very busy elevator, are often late; such passages should thus be avoided at busy moments, or more time should be planned when taking that route.

  LateTransport2 ≡ hasRoute some (hasSegment some ({BusyElevator}))

– Personalized transport times: not all staff members are able to perform transports in the same amount of time.

  LateTransport3 ≡ executedBy some ({SlowNurse})

The CELOE algorithm from DL-Learner was utilized to discover hidden axioms, such as those mentioned above, from the knowledge base. Once rules are found with a reasonable accuracy, they are added to the Context Layer, so that they can be taken into consideration when scheduling future transports.

5.1 Obstacles

Two obstacles occurred when learning expressions with the CELOE algorithm.

First, there are various possible explanations why a set of transports was late. The CELOE algorithm should thus produce a class expression containing multiple unions to cover all possibilities. However, CELOE by design gives priority to shorter expressions and thus has difficulties capturing all the causes, even after fine-tuning its parameters. To resolve this matter, we performed semantic clustering on the set of late transports, so that each resulting cluster captures one of the possible causes of the delay. If a detected cluster is sufficiently large and represents a subset of the data, it is fed to CELOE separately to find a suitable explanation. These explanations are merged afterwards. The clustering of semantic data is further discussed in Section 5.2.

Second, learning causes such as the one in the first example (visiting hours on Friday are busy) requires a notion of time.
However, days of the week or interesting periods, e.g., vacations, are difficult for the CELOE algorithm to derive from the dateTimeStamp data type. Therefore, we extended our semantic model with various concepts and individuals to incorporate more knowledge about the time points at which the transport was planned and eventually executed. For example, the day of the week, the period of the day (morning, noon, etc.) and the shifts of the staff are modeled.

5.2 Semantic Clustering

Calculating the semantic distance: In contrast to the MZ similarity measure, we propose a semantic similarity measure that takes into account that some relations between certain individuals might hold more information than others. Comparable to Term Frequency-Inverse Document Frequency (TF-IDF), which reflects how important a word is to a document in a collection, some relations from one individual to another are more important for identifying the similarity between two instances in a semantic knowledge base. Since clustering takes place on the subset of transports that were late, relations to more frequently occurring individuals might hold more information. For example, referring back to the first example, executedOnDay relationships to Friday occur more frequently in the data set of late transports than relationships to other days. As such, executedOnDay relationships to Friday should get more weight. Formally, this means that when two individuals have the same relation to a third individual (of type C) which is more frequently referred to than other individuals of type C, extra weight is added to the similarity of this relation. The similarity is measured as:

PBSim(i, j) = (taxSim(i, j) + relSim(i, j) + attSim(i, j)) / 3

The calculation of the taxonomy similarity (taxSim) and the attribute similarity (attSim) is similar to that proposed by the MZ measure. The relational similarity is further elaborated upon below. To explain this similarity, some additional functions and terminology are introduced.
D = the document describing all classes (C), relations (Rel) and individuals (Inds)

linkFreq(x) = |{(s, p, x) | (s, p, x) ∈ D, p ∈ Rel, s, x ∈ Inds}|
typeFreq(x) = |{(s, p, y) | (s, p, y) ∈ D, p ∈ Rel, s, x, y ∈ Inds, C(x) = C(y)}|
distinctTypeFreq(x) = |{y | x, y ∈ Inds, C(x) = C(y)}|
R(x) = {p | (x, p, y) ∈ D, p ∈ Rel, x, y ∈ Inds}
ER(x, q) = {y | (x, q, y) ∈ D, q ∈ Rel, x, y ∈ Inds}

The relational similarity itself is defined as relSim(i, j) = min(1, rSim(i, j)), with

rSim(i, j) = ∑_{r ∈ R(i) ∩ R(j)} ∑_{e ∈ ER(i, r)} w(e, j, r)

where

w(e, j, r) = linkFreq(e) ∗ distinctTypeFreq(e) / typeFreq(e)   if e ∈ ER(j, r)
w(e, j, r) = PBSim(e, ER(j, r))                                otherwise

This gives more weight to relations to individuals that occur more often than others.

Clustering semantic data: To cluster the data into more manageable subsets, the K-Means clustering algorithm is utilized. It calculates centroids to determine to which cluster the various data points belong. In the original algorithm, the centroid is an averaged vector of all the data points in that cluster. Since our data points are not vectorized, this is not possible. Therefore, a mean individual is calculated for each cluster and used as centroid, similar to the mean schema proposed by Fleischhacker et al. [3]. The type of the centroid is based on the most frequently occurring type among the individuals in the cluster; when no consensus can be reached, the class hierarchy is taken into account. The most frequently occurring attributes are selected, while the specific values of the literals are averaged over the occurring values. When selecting the relations, those occurring with more than half of the individuals in the cluster are added.

6 Results

A data set was generated reflecting the characteristics of a real hospital setting. Various reasons why transports are late are integrated in the data over various clusters. Since DL-Learner could only discover one of these reasons, with an accuracy of 80.46% and an F-measure of 41.77%, clustering was performed.
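The intuition behind the frequency-weighted relational similarity used for this clustering can be sketched with plain triples. The weighting below is a simplified, TF-IDF-like stand-in for the full rSim definition (the type-frequency terms are omitted), and all data is illustrative:

```python
# Simplified stand-in for the frequency-weighted relational similarity:
# a shared (predicate, object) pair contributes more when the object is
# referred to more often in the (late-transport) data set, echoing TF-IDF.
# The full rSim definition additionally uses type frequencies, omitted here.

triples = [
    ("t1", "executedAtDay", "Friday"),
    ("t2", "executedAtDay", "Friday"),
    ("t3", "executedAtDay", "Friday"),
    ("t4", "executedAtDay", "Monday"),
]

def link_freq(obj, triples):
    """Number of triples in which obj appears as the object (linkFreq)."""
    return sum(1 for _, _, o in triples if o == obj)

def weighted_rel_sim(i, j, triples):
    """Weight each shared (predicate, object) pair by the object's link frequency."""
    edges = lambda s: {(p, o) for subj, p, o in triples if subj == s}
    shared = edges(i) & edges(j)
    score = sum(link_freq(o, triples) / len(triples) for _, o in shared)
    return min(1.0, score)

print(weighted_rel_sim("t1", "t2", triples))  # shared Friday: 3/4 = 0.75
print(weighted_rel_sim("t1", "t4", triples))  # nothing shared: 0.0
```

Because Friday is the most frequently referenced object among the late transports, a pair of transports sharing it scores high, whereas a pair sharing nothing scores zero.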
The K-Means algorithm is utilized to compare the MZ similarity measure and our own PB similarity. Table 1 shows the clustering precision and recall, and the accuracy (DL-Acc) and F-measure (DL-F) retrieved from DL-Learner on the clusters. The precision is calculated as TP / (TP + FP) and the recall as TP / (TP + FN), with the true positives (TP), false positives (FP) and false negatives (FN) defined as follows:

TP = number of transports correctly assigned to cluster_i.
FP = number of transports incorrectly assigned to cluster_i.
FN = number of transports incorrectly assigned to a cluster_j instead of cluster_i.

Table 1. The results utilizing K-Means for the two similarity measures.

                      PB Similarity                   MZ Similarity
          Precision  Recall  DL-Acc  DL-F    Precision  Recall  DL-Acc  DL-F
K-Means      0.95     0.94   99.94%  99.96%     0.43     0.45   36.7%   36.7%

Note that the results are averages over the various clusters. The combination of K-Means and our own similarity measure allows DL-Learner to learn the expected rules with acceptable accuracy, which was not the case before clustering the data. Using the MZ similarity measure, clustering often fails to identify the correct clusters, resulting in unmanageable subsets of data for DL-Learner to perform its rule learning on.

7 Conclusion & Future Work

Scheduling transports can be optimized by learning from past delays. The possibly high number of different reasons for these delays can be handled by performing semantic clustering on the data set, producing more manageable data sets for learning algorithms such as DL-Learner's CELOE. In future work, we will investigate how the learned rules can be incorporated to influence the assignment of transports. Furthermore, a statistical relational learning solution will be researched, to be able to handle possibly missing and noisy data.

Acknowledgment: This research was partly funded by the AORTA project, co-funded by the IWT, iMinds, Xperthis, Televic Healthcare, AZMM and ZNA.

References
1. Bergamaschi, S., et al.: Semantic integration of heterogeneous information sources. Data & Knowledge Engineering 36, 215–249 (2001)
2. De Raedt, L., Kersting, K.: Statistical relational learning. In: Encyclopedia of Machine Learning, pp. 916–924. Springer (2011)
3. Fleischhacker, D., et al.: Mining RDF data for property axioms. In: On the Move to Meaningful Internet Systems: OTM 2012, pp. 718–735. Springer (2012)
4. Konstantopoulos, S., et al.: Formulating description logic learning as an inductive logic programming task. In: FUZZ-IEEE, pp. 1–7 (2010)
5. Lehmann, J.: DL-Learner: Learning concepts in description logics. J. Mach. Learn. Res. 10, 2639–2642 (2009)
6. Lehmann, J., et al.: Class expression learning for ontology engineering. Web Semantics: Science, Services and Agents on the World Wide Web 9(1), 71–81 (2011)
7. Lin, C., et al.: Sherlock: A semi-automatic framework for quiz generation using a hybrid semantic similarity measure. Cognitive Computation 7(6), 667–679 (2015)
8. Nebot, V., Berlanga, R.: Finding association rules in semantic web data. Knowledge-Based Systems 25(1), 51–62 (2012)
9. Ng, R.T., et al.: Clustering ontology-based metadata in the semantic web. Principles of Data Mining and Knowledge Discovery 1704, 262–270 (1999)
10. Passant, A.: Measuring semantic distance on linking data and using it for resources recommendations. In: AAAI Spring Symposium: Linked Data Meets Artificial Intelligence, vol. 77, p. 123 (2010)