Segmenting Sequences Semantically Using Petri Net Transducers for the Translation from Sequential Data to Non-Sequential Models

Markus Huber1,2(B) and Matthias Wolff1
1 Brandenburg University of Technology Cottbus-Senftenberg, Germany
2 InnoTec21 GmbH, Leipzig, Germany
markus.huber@b-tu.de

Abstract. In previous work we presented an extension and generalisation of finite state transducers (FSTs) to so-called Petri net transducers (PNTs). These are applicable to any form of transforming sequential input signals into non-sequential output structures – which can be used to represent the semantics of the input – by realising a weighted relation between partial languages, i.e. assigning one weight per related input-output pair. This paper extends the framework by an additional weighted output where every node of the structure carries a weight. Moreover, the output structure induces segments on all intermediate transducers and thus on the input, giving a more detailed view on how the single weight emerged during the transformation. We extend the definitions of PNTs, show the resulting changes for the operation of language composition, and discuss the impact on other PNT algorithms. The theory is accompanied by an example of translating a keyboard input stream into a structural representation of commands to be executed.

Keywords: Petri net transducer · Weighted labelled partial order · Concurrent semiring · Natural language understanding · Cognitive systems

1 Introduction

In [16] we introduced a special form of Petri nets for the weighted translation of partial languages. We showed how Petri net transducers (PNTs) are a generalisation of finite state transducers (FSTs) [2] and discussed several operations, including language composition, where a cascade of transducers translates from a source language into a target language by means of intermediate languages.
In [23,14,15] we proposed the use of PNTs in the field of semantic dialogue modelling for the translation of utterances into meanings. We first give a brief overview of the area of research within which our current work is developed.

Cognitive systems (including speech dialogue systems) mainly consist of three parts allowing them to interact with their environments: perceptor, behaviour control, and actuator [7]. Of these, perceptor and actuator each comprise a hierarchy of transforming units and bidirectional communication between the particular levels (cf. [21]). The responsibility of the perceptor hierarchy is translating input signals into semantic representations where the relevant parts of the input are related to semantic categories relevant to the system. While a low-level signal – e.g. the name Wolfgang Amadeus Mozart when spoken by a person – is sequential, its semantics is, in general, non-sequential: although there is an order relation between Wolfgang and Amadeus, there is no order between those and Mozart.

We use the concept of feature-values-relations (FVRs) [10,9] as semantic representations. In figure 1 the semantic structure for the name Wolfgang Amadeus Mozart can be seen, where the semantic categories are depicted as ellipses and the parts from the signal as rectangles. The order relation between Wolfgang and Amadeus is retained by the order of their semantic categories. Although the category of middle name exists, examples such as Pippilotta Viktualia Rullgardina Krusmynta Efraimsdotter Långstrump or Oscar Zoroaster Phadrig Isaac Norman Henkle Emmannuel Ambroise Diggs should motivate the need for a more general approach to repetition. Note that although the example obeys a tree-like form, FVRs are neither limited to trees nor need to be single rooted. Which semantic categories and which parts of an input signal are of relevance for a concrete system depends on its scope of action.
Therefore semantics is clearly not a function of (only) the input signal and especially not a function of its syntax. By assigning weights to the nodes of an FVR it is possible to represent the confidence of each single part of the semantics, as opposed to an overall weight.

Examples of non-sequential semantic representations are concept relational trees [1], whose structure is constrained to binary trees, and abstract meaning representation [17] and feature structures [18], which can have a more general structure. All of them are single rooted, do not provide a convenient way to preserve order information of repeated parts, and are computed from syntactical analysis. Furthermore, none of them carry weights on individual nodes, and only the last two have an overall weight assigned. [20] uses relational trees where connections between entities relevant for the system are represented. The authors state that ‘in most cases a relation link is analogous to a syntactic dependency link’, and again these tree-like structures are single rooted, do not cover the preservation of order information, and have only a single weight assigned.

We use labelled partial orders (LPOs) to represent FVRs, and PNTs to represent sets of LPOs and sets of pairs of LPOs. The transducer operation of language composition allows building hierarchical bidirectional translation systems, and since PNTs are a generalisation of FSTs we can utilise results from existing research in the field of speech technology [8,25] to build on. The resulting transducer cascade allows bottom-up processing of input signals to output decoded meaning, and top-down processing using prior knowledge and expectations. Systems as described in [20] all have a gap between syntax and semantics since they first translate input signals into syntactic representations and do their semantic analysis afterwards. Our approach needs no premature decisions because we directly translate from the signal level into semantic representations. Also the priming of the perceptor hierarchy by simply appending another transducer which represents expected semantics is (to our knowledge) a unique feature.

Figure 1. Semantic representation of ‘Wolfgang Amadeus Mozart’.

Figure 2. Segmented input sequence and weighted output structure

The semantic structure – the output of the perceptor – serves as input to the behaviour control, which computes the reaction of the cognitive system. These reactions also include interactions with the environment to request additional information for an incomplete structure. New input is incorporated into the already collected information. In [5] several operations on FVRs are described which admit an algebraic structure on the set of FVRs. In [20,19] a similar approach is described where relational trees are used to represent the information state of the system and some operations are described using so-called tree regular expressions. However, there is a complete lack of algebraic investigation.

The computed output of the behaviour control is again a semantic structure which gets translated by the actuator hierarchy into an output signal. According to system theory the actuator should use the same formalism as the perceptor (cf. [8]). Since PNTs provide the operation of inversion – by swapping the input and output label of each transition – they constitute a good choice. In contrast to PNTs, tree transducers [12] and DAG transducers [18] are both derivation-tree oriented and in general not closed under language composition and inversion. Neither allows semantic priming in general, but both can be used for generation with some reservations.

Speech dialogue systems must operate under real-time conditions.
There is a period of a few hundred milliseconds for analysing the input, computing the reaction, and generating the output. Systems such as the Unified Approach to Signal Synthesis and Recognition (UASR) [8,24] have shown that FSTs can fulfil these requirements. We are confident that PNTs are also capable of meeting these demands and are currently developing an embedded cognitive user interface for intuitive interaction with arbitrary electronic devices [3], where we have already applied first results.

In previous work we also proposed a variant of PNTs capable of handling output LPOs of arbitrary width (rhPNTs [11]), which are needed to produce general structures as in figure 1. For the work presented here we do not use that extension, but we will describe another enhancement of PNTs which is capable of segmenting the input (sensor signal, speech or some text) into semantic units and assigning meaningful weights – e.g. likelihoods or probabilities – to these segments. Figure 2 shows a simple example of an input text and the decoded semantics. The scenario is that a user types the command cp PNTs ATAES/ on the keyboard, trying to copy a file named PNTs to a folder named ATAED. The input – a sequence of character events – is translated into a semantic structure for executable commands, including correction of the destination of the copy-command. So the system executes the user’s intention, not the command they typed.

The paper is organised as follows: In section 2 we recall basic definitions and introduce in section 3 the used translation formalism, always demonstrating the concepts on our example. Thereby subsections 3.1 and 3.2 reuse the definitions from [16] and subsection 3.2 extends the definitions from [11]. In section 4 we introduce the new notion of segments of weighted LPOs and extend Petri net transducers to make use of them.
Afterwards, we introduce segmenting Petri net transducers fusing the concepts given so far, and finally give a brief conclusion and outlook on future work in section 5.

2 Mathematical Preliminaries

By N0 we denote the set of non-negative integers, by N the set of positive integers. The set of all multisets over a set X is the set N0^X of all functions f : X → N0. Addition + on multisets is defined by (m + m′)(x) = m(x) + m′(x). The relation ≤ between multisets is defined through m ≤ m′ :⇔ ∃m″ (m + m″ = m′). We write x ∈ m if m(x) > 0. A set A ⊆ X is identified with the multiset m satisfying (m(x) = 1 ⇔ x ∈ A) ∧ (m(x) = 0 ⇔ x ∉ A). A multiset m satisfying m(a) > 0 for exactly one element a we call a singleton multiset and denote it by m(a)a.

Given a binary relation R ⊆ X × Y over the sets X, Y and sets A ⊆ X and B ⊆ Y, we denote the image of A by R(A) = {y ∈ Y | ∃x ∈ A : (x, y) ∈ R} and the preimage of B by R⁻¹(B) = {x ∈ X | ∃b ∈ B : (x, b) ∈ R}. We denote by Dom(R) = R⁻¹(Y) the domain of R and call R(X) the image of R. For X′ ⊆ X and Y′ ⊆ Y the restriction of R onto X′ × Y′ is denoted by R|X′×Y′. Given a binary relation R ⊆ X × Y and a binary relation S ⊆ Y × Z for sets X, Y, Z, their composition is defined by R ∘ S = {(x, z) | ∃y ∈ Y ((x, y) ∈ R ∧ (y, z) ∈ S)} ⊆ X × Z. For a binary relation R ⊆ X × X over a set X, we denote R¹ = R and Rⁿ = R ∘ Rⁿ⁻¹ for n ≥ 2. The symbol R⁺ denotes the transitive closure ⋃_{n∈N} Rⁿ of R.

Let A be a finite set called an alphabet. A word over A is a finite sequence of symbols from A. For a word w its length |w| is the number of its symbols. The symbol ε denotes the empty word satisfying |ε| = 0 and is the neutral element w.r.t. concatenation of words: wε = εw = w. By A* we denote the set of all words over A, including the empty word. A step over A is a multiset over A. A step sequence over A is an element of (N0^A)*.
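The multiset operations above can be sketched as follows (our own illustration, not part of the paper's formalism), representing multisets over X as Python Counters:

```python
from collections import Counter

# Multisets over a set X as Counters: m(x) is the multiplicity of x.
m1 = Counter({"a": 2, "b": 1})
m2 = Counter({"a": 1})

# Addition: (m + m')(x) = m(x) + m'(x).
m_sum = m1 + m2

# The order m <= m' holds iff some m'' satisfies m + m'' = m',
# i.e. iff m(x) <= m'(x) for every x.
def multiset_leq(m, mp):
    return all(m[x] <= mp[x] for x in m)

# Membership: x in m iff m(x) > 0.
assert "a" in m1 and m1["a"] == 2
assert multiset_leq(m2, m1)
assert m_sum["a"] == 3
```

Counter already implements the pointwise addition used above; only the ≤ relation needs to be spelled out.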
A directed graph is a pair G = (V, →), where V is a finite set of nodes and → ⊆ V × V is a binary relation over V, called the set of edges. For a node v ∈ V its preset is the set •v = →⁻¹({v}) and its postset the set v• = →({v}). A path is a sequence of (not necessarily distinct) nodes v1 . . . vn (n > 1) such that vi → vi+1 for i = 1, . . . , n − 1. A path v1 . . . vn is a cycle if v1 = vn. A directed graph is called acyclic if it has no cycles. For an acyclic directed graph G = (V, →) its maximal nodes form the set Max(G) = {v ∈ V | v• = ∅}, its minimal nodes the set Min(G) = {v ∈ V | •v = ∅}. An acyclic directed graph (V, →′) is an extension of an acyclic directed graph (V, →) if → ⊆ →′.

An irreflexive partial order over a set V is a binary relation < ⊆ V × V which is irreflexive (∀v ∈ V : v ≮ v) and transitive (< = <⁺). A reflexive partial order over a set V is a binary relation ≤ ⊆ V × V which is reflexive (∀v ∈ V : v ≤ v), transitive (≤ = ≤⁺) and antisymmetric (∀v, w ∈ V : v ≤ w ∧ w ≤ v =⇒ v = w). We identify a finite partial order ∼ over V with the directed graph (V, ∼). Given a partial order po = (V, ∼) we call two nodes v, v′ ∈ V independent if v ≁ v′ and v′ ≁ v. By co∼ ⊆ V × V we denote the set of all pairs of independent nodes of V.

A monoid is a triple (S, ∗, n) where S is a set, ∗ is a binary closed operation on S (∀a, b ∈ S : a ∗ b ∈ S), and n is the neutral element w.r.t. ∗ (∀a ∈ S : a ∗ n = a = n ∗ a). The operation is often written by juxtaposition (ab = a ∗ b). If ∗ is idempotent (∀a ∈ S : aa = a) the monoid is called idempotent. If ∗ is commutative (∀a, b ∈ S : ab = ba) the monoid is called commutative. If there exists an absorbing element 0 ∈ S (∀a ∈ S : a0 = 0 = 0a) the monoid is called a monoid with zero. A monoid is called ordered if it is equipped with a reflexive partial order ≤ such that the operation ∗ is monotone (∀a, b ∈ S : a ≤ b =⇒ (∀c ∈ S : c ∗ a ≤ c ∗ b ∧ a ∗ c ≤ b ∗ c)).
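The relation and partial-order notions above can be sketched in a few lines (an illustration under our own encoding of relations as sets of pairs): the transitive closure R⁺ arises from iterated composition, and two nodes are independent if they are ordered in neither direction.

```python
# A finite binary relation as a set of edges; the transitive closure R+
# is computed by repeatedly composing the relation with itself.
def transitive_closure(edges):
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Order for 'Wolfgang Amadeus Mozart': Wolfgang < Amadeus, Mozart unordered.
lt = transitive_closure({("Wolfgang", "Amadeus")})

def independent(v, w, order):
    # v and w are independent iff neither v < w nor w < v holds.
    return (v, w) not in order and (w, v) not in order

assert independent("Wolfgang", "Mozart", lt)
assert not independent("Wolfgang", "Amadeus", lt)
```

The example mirrors figure 1: Mozart is independent of both given names, while Wolfgang and Amadeus are ordered.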
For a given monoid (S, ∗, n) the operation ∗ defines a binary relation on S via a ≤∗ b :⇔ ab = b. If this relation is a reflexive partial order, the monoid is called naturally ordered. An idempotent and commutative monoid (S, ∗, n) is naturally ordered. Moreover, if S is equipped with the reflexive partial order ≤∗, then ∀a, b ∈ S : ab = sup{a, b}, where the supremum is taken w.r.t. ≤∗.

3 Input and Output

We use LPOs extended by weights from an algebraic structure called a bisemiring to represent the input and output structures as depicted in figure 2, and use PNTs by means of the non-sequential semantics of Petri nets for the weighted translation of LPOs.

3.1 Weighting of Structures

Except for definition 2 this subsection equates the state from [16]. A labelled partial order (LPO) over a set X is a 3-tuple (V, <, l), where (V, <) is an irreflexive partial order and l : V → X is a labelling function on V. In most cases, we only consider LPOs up to isomorphism, i.e. only the labelling of nodes is of interest, but not their names. Two LPOs (V, <, l) and (V′, <′, l′) are isomorphic if there is a bijective renaming function I : V → V′ satisfying l(v) = l′(I(v)) and v < w ⇔ I(v) <′ I(w). In figures, we only show the labels of nodes, and normally omit transitive arrows.

If an LPO lpo is of the form ({v}, ∅, l), then it is called a singleton LPO. A step-wise linear LPO is an LPO (V, <, l) where the relation co< is transitive. The maximal sets of independent nodes are called steps. The steps of a step-wise linear LPO are linearly ordered. Thus, step-wise linear LPOs can be identified with step sequences. A step-linearisation of an LPO lpo is a step-wise linear LPO lpo′ which is an extension of lpo. The set of series-parallel LPOs (sp-LPOs) is the smallest set of LPOs containing all singleton LPOs (over a set X) and being closed under sequential and parallel product.
For two LPOs lpo1 = (V1, <1, l1) and lpo2 = (V2, <2, l2), where V1 ∩ V2 = ∅ is assumed, their sequential product is defined by lpo1 ; lpo2 = (V1 ∪ V2, <1 ∪ <2 ∪ V1 × V2, l1 ∪ l2) and their parallel product is defined by lpo1 ∥ lpo2 = (V1 ∪ V2, <1 ∪ <2, l1 ∪ l2). For an LPO lpo we denote by SP(lpo) the set of all series-parallel extensions of lpo and by SPmin(lpo) the set of all minimal series-parallel extensions of lpo in SP(lpo). If lpo is an extension of lpo′, we write lpo ≤ lpo′.

A semiring is a quintuple S = (S, ⊕, ⊗, 0, 1), where (S, ⊕, 0) is a commutative monoid, (S, ⊗, 1) is a monoid with zero where 0 is the absorbing element, and ⊗ (the S-multiplication) distributes over ⊕ (the S-addition) from both sides. If ⊗ is commutative, then the semiring is called commutative. A bisemiring is a six-tuple S = (S, ⊕, ⊗, ⊙, 0, 1), where (S, ⊕, ⊗, 0, 1) is a semiring and (S, ⊕, ⊙, 0, 1) is a commutative semiring. The binary operation ⊙ on the set S is called the S-parallel multiplication. If ⊕ is idempotent, the bisemiring is called idempotent. A concurrent semiring is an idempotent bisemiring (S, ⊕, ⊗, ⊙, 0, 1) satisfying

∀a, b, c, d ∈ S : (a ⊙ b) ⊗ (c ⊙ d) ≤⊕ (a ⊗ c) ⊙ (b ⊗ d). (CS)

A weighted LPO (wLPO) over an alphabet A and a bisemiring S = (S, ⊕, ⊗, ⊙, 0, 1) is a quadruple (V, <, l, ν) such that (V, <, l) is an LPO over A and ν : V → S is an additional weight function. We use all notions introduced for LPOs also for wLPOs. The weight of sp-wLPOs is defined in the obvious way: the total weight is computed from the weights of the nodes by applying ⊗ to the sequential product and ⊙ to the parallel product of sub-wLPOs. This is well-defined, since the set of sp-wLPOs as well as the sub-structure (S, ⊗, ⊙) of a bisemiring (S, ⊕, ⊗, ⊙, 0, 1) form an sp-algebra, admitting an sp-algebra homomorphism from the set of sp-wLPOs into the bisemiring. For an sp-wLPO wlpo its weight is denoted by ω(wlpo).
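This homomorphic weight computation can be sketched recursively. The sketch below assumes a particular reading of the structure in figure 2 (CMD above cp, with the SRC/FILE/PNTs branch parallel to the DST/DIR/ATAED branch); in the concurrent semiring ([0, 1], max, ·, ·, 0, 1) used in the example, ⊗ and ⊙ are both the ordinary product.

```python
# sp-terms as nested tuples: ("seq", t1, t2), ("par", t1, t2), or a leaf weight.
# In the concurrent semiring ([0,1], max, ·, ·, 0, 1) the sequential (⊗)
# and the parallel (⊙) multiplication coincide with the ordinary product.
def weight(term):
    if isinstance(term, (int, float)):
        return term
    op, left, right = term
    w1, w2 = weight(left), weight(right)
    return w1 * w2  # ⊗ and ⊙ coincide in this semiring

# Assumed sp-structure of figure 2: CMD ; cp ; ((SRC;FILE;PNTs) ∥ (DST;DIR;ATAED))
src = ("seq", 1.00, ("seq", 0.50, 0.63))   # SRC ; FILE ; PNTs
dst = ("seq", 1.00, ("seq", 1.00, 0.01))   # DST ; DIR ; ATAED
run = ("seq", 1.00, ("seq", 0.81, ("par", src, dst)))
print(weight(run))
```

In a semiring where ⊗ and ⊙ differ, the `return` line would dispatch on `op`; here the single product suffices.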
The weight for general wLPOs is computed as the sum of the weights of all its series-parallel extensions. Condition (CS) ensures that less restrictive wLPOs yield bigger weights, so in the case of using concurrent semirings it suffices to consider only minimal series-parallel extensions.

Definition 1 (sp-Weight of wLPOs). Let wlpo = (V, <, l, ν) be a wLPO over a concurrent semiring. Then its sp-weight is defined by ω_sp(wlpo) = ⊕_{wlpo′ ∈ SPmin(wlpo)} ω(wlpo′).

Figure 3 shows on the left the semantic structure from our example. The weights are drawn from the concurrent semiring ([0, 1], max, ·, ·, 0, 1) – a trivial extension of the Viterbi semiring. The sp-weight is easy to compute since the structure is an sp-wLPO.

For wLPOs over the same bisemiring having isomorphic underlying LPOs we provide a new sum operation. The result is again a wLPO with an isomorphic underlying LPO, but all nodes carry the sum of the weights of their corresponding nodes.

Definition 2 (Sum of wLPOs). Let wlpo = (V, <, l, ν) and wlpo′ = (V′, <′, l′, ν′) be two wLPOs over the bisemiring S = (S, ⊕, ⊗, ⊙, 0, 1) having isomorphic underlying LPOs and let I : V → V′ be the corresponding bijective renaming function. Then their sum is defined by wlpo ⊕ wlpo′ = (V, <, l, ν ⊕ (ν′ ∘ I)), where the sum of the weight functions is declared pointwise.

Figure 3. Semantic structure as wLPO and sum of wLPOs

Figure 3 also shows a second wLPO, representing the same semantics, which gets summed up with the first one. The new weights equal the maximum of the original weights.

3.2 Translation of Structures

We use the concept of transducers to translate between LPOs by augmenting weighted transitions with input and output symbols.
A run is represented as a wLPO over the transitions, which can then be projected onto the input resp. output symbols. This subsection equates the states from [16,11] and extends the notions from [11] with weights. Additionally, definitions 5, 7 and 8 provide formalisations of some concepts from [16,11].

A place/transition Petri net (PT-net) is a 4-tuple N = (P, T, F, W), where P is a finite set of places, T is a finite set of transitions disjoint from P, F ⊆ (P × T) ∪ (T × P) is the flow relation and W : (P × T) ∪ (T × P) → N0 is a flow weight function satisfying W(x, y) > 0 ⇔ (x, y) ∈ F. A marking of a PT-net assigns to each place p ∈ P a number m(p) ∈ N0 of tokens, i.e. a marking is a multiset over P representing a distributed state. A marked PT-net is a PT-net N = (P, T, F, W) together with an initial marking m0. For (transition) steps τ over T we introduce the two multisets of places •τ(p) = Σ_{t∈T} τ(t)W(p, t) and τ•(p) = Σ_{t∈T} τ(t)W(t, p). A transition step τ can occur in m if m ≥ •τ. If τ occurs in m, the resulting marking m′ is defined by m′ = m − •τ + τ•. We write m —τ→ m′ to denote that τ can occur in m and that its occurrence leads to m′. A step execution in m is a finite step sequence τ1 . . . τn over T such that there are markings m1, . . . , mn with m —τ1→ m1 —τ2→ . . . —τn→ mn.

We use LPOs over T to represent single non-sequential runs of PT-nets, i.e. the labels of an LPO represent transition occurrences. For a marked PT-net N = (P, T, F, W, m0) an LPO lpo = (V, <, l) over T is an LPO-run if each step-linearisation of lpo is a step execution of N in m0. If an LPO-run lpo = (V, <, l) occurs in a marking m, the resulting marking m′ is defined by m′ = m − Σ_{v∈V} •l(v) + Σ_{v∈V} l(v)•. We denote the occurrence of an LPO-run lpo by m —lpo→ m′.

A Petri Net Transducer is a PT-net where each transition is augmented with an input and an output symbol and additionally carries a weight drawn from a bisemiring.

Definition 3 (Petri Net Transducer).
A Petri Net Transducer (PNT) over a bisemiring S = (S, ⊕, ⊗, ⊙, 0, 1) is defined as a tuple N = (P, T, F, W, pI, pF, Σ, σ, ∆, δ, ω), where

– (P, T, F, W, m0) with m0 = pI is a marked PT-net (called the underlying PT-net), pI ∈ P is the source place satisfying •pI = ∅ and pF ∈ P is the sink place satisfying pF• = ∅,
– Σ is a set of input symbols and σ : T → Σ ∪ {ε} is the input mapping,
– ∆ is a set of output symbols and δ : T → ∆ ∪ {ε} is the output mapping and
– ω : T → S is the weight function.

A wLPO wlpo = (V, <, l, ν) over T is a wLPO-run of N

– if the underlying LPO lpo = (V, <, l) is an LPO-run of N with pI —lpo→ pF and
– if ν(v) = ω(l(v)) holds for all v ∈ V.

We denote by wLPO(N) the set of all wLPO-runs of N. A PNT can be used to translate a partial language into another partial language, relating so-called input words to so-called output words. Input and output words are defined as LPOs (V, <, l) with a labelling function l : V → A ∪ {ε} for some input or output alphabet A. Such LPOs we call ε-LPOs. For each ε-LPO (V, <, l) we construct the corresponding ε-free LPO (W, <|W×W, l|W) with W = V \ l⁻¹(ε) by deleting ε-labelled nodes together with their adjacent edges. Since partial orders are transitive, this does not change the order between the remaining nodes.

Definition 4 (Input and Output Labels of Runs). Let N = (P, T, F, W, pI, pF, Σ, σ, ∆, δ, ω) be a PNT and let wlpo = (V, <, l, ν) ∈ wLPO(N). The input label of wlpo is the LPO σ(wlpo) corresponding to the ε-LPO (V, <, σ ∘ l). The output label of wlpo is the LPO δ(wlpo) corresponding to the ε-LPO (V, <, δ ∘ l). For LPOs u over Σ and v over ∆, we denote by wLPO(N, u) the subset of all wLPOs wlpo from wLPO(N) with input label σ(wlpo) = u, and by wLPO(N, u, v) the subset of all wLPOs from wLPO(N, u) with output label δ(wlpo) = v.

The input language of a PNT is the set of all input labels of its weighted LPO-runs. Its elements are called input words.
Output language and output words are defined analogously. A PNT assigns weights to all pairs of LPOs u over Σ and v over ∆ based on the weights of its wLPO-runs (cf. [16,13]). Concerning the semantics, only the input-output behaviour of PNTs is relevant. Since transitions may also have empty input and/or empty output, there are always (infinitely) many PNTs having the same semantics. For practical application, such PNTs are equivalent (cf. [16]).

For our example the PNT Nw generating the output word cp PNTs ATAED/ is shown in the upper part of figure 4 on the following page. Note that the symbols themselves are sequences of symbols and that the typo of the user was corrected. This was possible since the system had already done several translations of the original input sequence. First the sequence was translated into a set of sequences where every input symbol created alternatives according to the keyboard layout. If you type an S on a QWERTZ keyboard it could be that your intention was to type a D, since the two keys are adjacent. Figure 5 on the next page shows a PNT which realises this correction for the letter S. The next step was translating from the many sequences of characters into sequences of known words. The system could filter out many sequences since it knows only about some commands, files and folders. Since there was no folder named ATAES but there exists a folder ATAED, the shown sequence of words was generated. The PNT No in the lower part translates between the two words depicted below it. Since there is a file PNTs and also a folder with the same name, the weight for FILE is also less than 1.

Note that the input and output languages of a PNT N are extension closed, since wLPO(N) is extension closed. This allows No from figure 4 to also accept the two step-linearisations cp PNTs ATAED/ and cp ATAED/ PNTs as input. Indeed these would be translated into step-linearisations of the depicted output word with an additional edge between PNTs and ATAED.
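The character-level correction step can be sketched as follows. This is an illustration only: the neighbour set matches figure 5, but the weighting scheme (a fixed probability for the typed key and a uniform share of the remainder for its neighbours) is our assumption, not taken from the paper.

```python
# Weighted alternatives for a typed key, in the spirit of figure 5:
# the typed key keeps a high weight, adjacent keys share the remainder.
# Neighbour sets and the 0.88 mass are illustrative assumptions.
NEIGHBOURS = {"S": ["A", "W", "E", "D", "X", "Y"]}  # QWERTZ layout

def alternatives(key, p_hit=0.88):
    alts = [(key, p_hit)]
    rest = NEIGHBOURS.get(key, [])
    if rest:
        p = (1.0 - p_hit) / len(rest)
        alts += [(k, p) for k in rest]
    return alts

for sym, w in alternatives("S"):
    print(f"S:{sym}/{w:.2f}")
```

With six neighbours, the remainder 0.12 splits into 0.02 per neighbour, reproducing the arc weights S:S/0.88 and S:D/0.02 etc. shown in figure 5.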
We now introduce the central transducer composition operation of language composition. It allows putting several transducers in a row to translate from the input of the first one to the output of the last one, utilising the intermediate ones.

Figure 4. Two PNTs with input and output words

We consider a fixed concurrent semiring S = (S, ⊕, ⊗, ⊙, 0, 1) and a PNT Na over S with input alphabet Σa and output alphabet ∆a and a PNT Nb over S with input alphabet Σb = ∆a and output alphabet ∆b. We assume Na and Nb to be disjoint. We define a binary relation $ ⊆ Ta × Tb on the Cartesian product of the transitions from Na and Nb by ta $ tb :⇔ δa(ta) = σb(tb) and say that ta and tb build a composable pair. Any transitions ta ∈ Ta with empty output (δa(ta) = ε) and tb ∈ Tb with empty input (σb(tb) = ε) are called non-invasive. Any invasive transition which is not part of a composable pair is said to be futile. The construction corresponds to the parallel product of Na and Nb and merging each composable pair of transitions (ta, tb) to a new transition with input symbol σa(ta) and output symbol δb(tb), weight ωa(ta) ⊙ ωb(tb) and connections •ta + •tb and ta• + tb•. Moreover, we keep all non-invasive transitions of Na and Nb unchanged, and omit all other transitions of Na or Nb, i.e. all futile transitions. Figure 6 on the following page shows the result of the language composition of Nw and No from figure 4. The nodes filled with dots establish the parallel product, the greyed transitions are the result of merging transitions from composable pairs, and all white transitions are non-invasive ones from No. The greyed nodes also represent Nw inside the result.
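The merging step of this construction can be sketched as follows. This is our own simplification: transitions are reduced to (input, output, weight) triples, the net structure (places and arcs) is omitted, and ⊙ is the product of the example semiring.

```python
# Merging composable pairs (ta, tb) with δa(ta) = σb(tb): the new transition
# keeps ta's input and tb's output; its weight is ωa(ta) ⊙ ωb(tb), here the
# ordinary product. "" stands for ε.
def compose_transitions(Ta, Tb):
    merged = [(sa, db, wa * wb)
              for (sa, da, wa) in Ta
              for (sb, db, wb) in Tb
              if da == sb and da != ""]   # composable pair; ε-ε is non-invasive
    non_invasive = [t for t in Ta if t[1] == ""] + \
                   [t for t in Tb if t[0] == ""]
    return merged, non_invasive

# Nw emits words with ε input; No maps words to themselves and emits categories.
Ta = [("", "cp", 0.81), ("", "PNTs", 0.63)]
Tb = [("cp", "cp", 1.0), ("PNTs", "PNTs", 1.0), ("", "CMD", 1.0)]
merged, kept = compose_transitions(Ta, Tb)
assert ("", "cp", 0.81) in merged       # as in figure 6: ε:cp/0.81
assert ("", "CMD", 1.0) in kept         # non-invasive transition of No
```

Invasive transitions that find no partner (the futile ones) simply appear in neither of the two result lists, matching their omission in the construction.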
To provide a formal definition of language composition we set for i ∈ {a, b}

– for the construction of the parallel product of Na and Nb
  • Pk = {p◦I, p◦F} as the new source and sink places,
  • Tk = {t◦I, t◦F} as the splitting and joining transitions,
  • Fk = {(p◦I, t◦I), (t◦I, pIa), (t◦I, pIb), (pFa, t◦F), (pFb, t◦F), (t◦F, p◦F)} as the corresponding flow relation,
– for the merged transitions
  • T^m = $ using the composable pairs as new transitions,
  • πi : T^m → Ti as the projections onto the first resp. second component of a composable pair, where for any set X we also write πi to mean (idX, πi) or (πi, idX),
  • Fi^m = {(p, t) ∈ Pi × T^m | πi(p, t) ∈ Fi} ∪ {(t, p) ∈ T^m × Pi | πi(t, p) ∈ Fi} as the merged flow relation part from Ni,
  • Wi^m = Σ_{(p,t)∈Pi×T^m} Wi(πi(p, t)) πi(p, t) + Σ_{(t,p)∈T^m×Pi} Wi(πi(t, p)) πi(t, p) as the merged flow weight function part from Ni,
  • ω^m : T^m → S as the merged weight function with ω^m(t) = ωa(πa(t)) ⊙ ωb(πb(t)),
– for the keeping of non-invasive transitions
  • Ti^n as the non-invasive transitions from Ni,
  • Fi^n = Fi|(Pi×Ti^n)∪(Ti^n×Pi) as the non-invasive flow relation of Ni.

Figure 5. A PNT suggesting alternatives for a typed S

Figure 6. Language composition of Nw and No with output word

Definition 5 (Language Composition). Let Na = (Pa, Ta, Fa, Wa, pI, pF, Σa, σa, ∆a, δa, ωa) and Nb = (Pb, Tb, Fb, Wb, pI, pF, Σb, σb, ∆b, δb, ωb) with ∆a = Σb be two PNTs over the same concurrent semiring S = (S, ⊕, ⊗, ⊙, 0, 1).
Then, using the notations from above, the language composition Na ◦ Nb is the PNT N = (P, T, F, W, pI, pF, Σ, σ, ∆, δ, ω) over S, where

– P = Pk ∪ Pa ∪ Pb and T = Tk ∪ T^m ∪ Ta^n ∪ Tb^n,
– F = Fk ∪ Fa^m ∪ Fb^m ∪ Fa^n ∪ Fb^n,
– W|Fk ≡ 1, W|Fa^m ≡ Wa^m, W|Fb^m ≡ Wb^m, W|Fa^n ≡ Wa, W|Fb^n ≡ Wb,
– pI = p◦I and pF = p◦F,
– Σ = Σa and ∆ = ∆b,
– σ|Tk ≡ ε, σ|T^m ≡ σa ∘ πa, σ|Ta^n ≡ σa, σ|Tb^n ≡ ε,
– δ|Tk ≡ ε, δ|T^m ≡ δb ∘ πb, δ|Ta^n ≡ ε, δ|Tb^n ≡ δb,
– ω|Tk ≡ 1, ω|T^m ≡ ω^m, ω|Ta^n ≡ ωa and ω|Tb^n ≡ ωb.

Since input and output languages of PNTs are extension closed, it is possible that the operation of language composition propagates dependencies from one PNT to the other. Consider two PNTs Na and Nb as in the above definition, an output word u′ of Na, an input word u of Nb with u′ ≤ u, and the output word v into which Nb translates u. Then Nb also accepts u′ as input word and translates it into an output word v′ ≤ v. This effect can also be seen on the output word in figure 6, where the language composition took over the order between PNTs and ATAED/ from Nw whereas the corresponding transitions from No were unordered.

The problem is discussed in more detail in [11], where solutions based on separating input and output processing are proposed. All those solutions have in common that the processing of weights, i.e. the merging of transitions, no longer occurs where the output is produced. To overcome this we use another approach from [11], namely Hierarchical Petri Net Transducers. We extend them by augmenting the transitions with weights drawn from a bisemiring and introduce the additional concept of shared transitions. A Hierarchical Petri Net Transducer is a PNT together with a refinement operation to substitute transitions by other PNTs. This induces sets of transitions which can be used to eliminate some edges from the input and output labels of runs by restricting the order relation.
With this post processing transitions can be isolated from one another solving the problem of propagating unwanted order during language composition operation. Definition 6 (Hierarchical Petri Net Transducer). For a given PNT N0 = (P0 , T0 , F0 , W0 , pI , pF , Σ0 , σ0 , ∆0 , δ0 , ω0 ) over a bisemiring S = (S, ⊕, ⊗, , 0, 1) a hierarchical PNT (hPNT) over S is a 6-tuple H = (N0 , N , ρ, Fc ,Wc , Ts ), where – N0 is the initial PNT, – N = {N1 , . . . , Nk } is a family of refinement PNTs over S whereat N0 , N1 , . . . , Nk are pairwise disjoint, – ρ : T0 → {1, . . . , k} is a partial refinement function which is injective (ρ(t) = ρ(t 0 ) =⇒ t = t 0 ) and associates transitions from the initial PNT with PNTs from N . A transition t ∈ T0 , if t ∈ Dom(ρ) holds, is refined by the PNT Nρ(t) . Any transition t ∈ T0 for which t 6∈ Dom(ρ) holds is called simple. Any simple transition t must have empty input and output σ0 (t) = ε = δ0 (t) as well as neutral weight ω0 (t) = 1. S S – Fc ⊆ (P0 × ki=1 Ti ) ∪ ( ki=1 Ti × P0 ) is the crossing flow relation which allows to connect transitions from refinement PNTs to places from the initial PNT, – Wc is the corresponding crossing flow weight function analogous to PT-nets and – Ts ⊆ T0 is the set of shared transitions which allow to relax the isolation of transi- tions. The crossing components are initially empty and only needed for holding information when computing the language composition of a PNT and an hPNT where the result is again an hPNT (see definition 8). The same holds for the set of shared transitions. In figure 7 on the next page one can see the hPNT Ho where the transitions t1 , t2 , and t3 are refined by the PNTs N1 resp. N2 resp. N3 . The refinement operation is defined in an obvious way. Any non-simple transition is substituted with two new ε-transitions with neutral weight 1. 
One is called the down-going refinement transition and inherits only the incoming arcs, and the other is called the up-going refinement transition and inherits only the outgoing arcs. The down-going refinement transition is connected to the source place of the refinement PNT and the up-going refinement transition to its sink place. For the formal procedure of refining we refer to [11].

Figure 7. hPNT H_o, refinement PNTs N_i for transitions t_i (i = 1, 2, 3)

The result of refining is a PNT N called the interface of H, and the set of runs of N is by definition the set of runs of H. We use all other notations introduced for PNTs as well as definition 4 also for hPNTs by means of their interfaces. The hPNT H_o in figure 7 has the same input and output words as the PNT N_o from figure 4 since the interface of H_o is equivalent to N_o. But hierarchical PNTs have additional input and output labels where the order relation is restricted. Dependencies between different refinement PNTs are excluded except for shared transitions. Dependencies between the initial PNT and refinement PNTs other than by the refinement itself and through shared transitions are excluded as well. We focus on the output labels. Figure 8 shows an output word of H_o where the nodes are grouped to visualise the allowed connections. The narrower lines represent the refinement paths.

We use the notations from [11] where •T and T• are the sets of down-going resp. up-going refinement transitions and set T_r = •T ∪ T• to be the set of all refinement transitions. We use K = ρ(T_0) as the image of ρ and K_0 = K ∪ {0}. Then we have r = ⋃_{i∈K_0} l⁻¹(T_i ∪ T_r ∪ T_s) × l⁻¹(T_i ∪ T_r ∪ T_s) as the set of all allowed relations.

Definition 7 (Relaxed Output Label of Runs). Let H = (N_0, N, ρ, F_c, W_c, T_s) be an hPNT and let wlpo = (V, <, l, ν) ∈ wLPO(H).
The relaxed output label of wlpo is the LPO δ_r(wlpo) corresponding to the ε-LPO (V, (<|_r)⁺, δ ∘ l). For LPOs u over Σ and v over Δ, we denote by wLPO_{O,r}(H, u, v) the subset of all wLPOs wlpo from wLPO(H, u) with relaxed output label δ_r(wlpo) = v.

We consider a fixed concurrent semiring S and a PNT N_a over S with input alphabet Σ_a and output alphabet Δ_a, and an hPNT H over S with its interface N_b, input alphabet Σ_b = Δ_a, and output alphabet Δ_b. We assume N_a and N_b to be disjoint.

Figure 8. Output word of H_o

Using the notations introduced before definition 5 as well as the notations from above of definition 7, we additionally set

– the right partial language composition N_a |∘ N_b to be the PNT N = (P, T, F, W, p_I, p_F, Σ, σ, Δ, δ, ω) with
• P = P_b and T = T^m ∪ T_b^n,
• F = F_b^m ∪ F_b^n,
• W|_{F_b^m} ≡ W_b^m, W|_{F_b^n} ≡ W_b,
• p_I = p_{I_b} and p_F = p_{F_b},
• Σ = Σ_a and Δ = Δ_b,
• σ|_{T^m} ≡ σ_a ∘ π_a, σ|_{T_b^n} ≡ ε,
• δ|_{T^m} ≡ δ_b ∘ π_b, δ|_{T_b^n} ≡ δ_b,
• ω|_{T^m} ≡ ω^m, ω|_{T_b^n} ≡ ω_b and
– the remaining flow relation F(N_a |∘ N_b) = F_a^m as the non-invasive flow relation of N_a.

Definition 8 (Hierarchical Language Composition). Let N_a = (P_a, T_a, F_a, W_a, p_I, p_F, Σ_a, σ_a, Δ_a, δ_a, ω_a) be a PNT over the concurrent semiring S = (S, ⊕, ⊗, ⊙, 0, 1), H be an hPNT over S, and N_b be the interface of H with Δ_a = Σ_b. Then, using the notations from above, the hierarchical language composition N_a ∘ H is the hPNT H′ = (N_0′, N′, ρ′, F_c′, W_c′, T_s′) where

– N_0′ = N_a ∘ N_0,
– N′ = {N_1′, ..., N_k′} where N_i′ = N_a |∘ N_i for every i = 1, ..., k,
– ρ′ = ρ,
– F_c′ = F_c ∪ ⋃_{i=1,...,k} F(N_a |∘ N_i),
– W_c′|_{F_c} ≡ W_c, W_c′|_{F(N_a |∘ N_i)} ≡ W_a|_{F(N_a |∘ N_i)} for every i = 1, ..., k and
– T_s′ = T_s ∪ T_a^n.

The hierarchical language composition corresponds to the language composition of the PNT and the interface of the hPNT. Technically, the above definition only states where to put the new places and transitions. Consider now the language composition N_w ∘ H_o.
The output word of the result would be the same as for N_w ∘ N_o depicted in figure 6, but the relaxed output word would be the one from figure 8, where the order between PNTs and ATAED is restricted away.

4 Adding and Spreading the Weight

The introduced framework allows for weighted relations between input and output, but neither input nor output is itself a weighted object. After processing there is no way to see which part of a run had what impact on the result. Since we want a more detailed view of the weight, we want to relate an input to a weighted output. Therefore we propose an additional definition of weighted output labels of runs for PNTs where the weight function is recomputed during the deletion of ε-labelled nodes. For this recomputation a wLPO is segmented by its labels. This also influences the definition of equivalence for PNTs and results in technical restrictions for PNT-algorithms. Language composition of several PNTs can propagate the segments from one end to the other. The application to hPNTs leads to Segmenting Petri Net Transducers. These new ideas are introduced in two steps in the next subsections.

4.1 Segments of Weighted Structures

A wLPO is segmented by means of its labels. Every segment contains only one non-ε-labelled node, but ε-labelled nodes can be part of more than one segment. The weight of a single segment is computed as its sp-weight.
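The idea of an sp-weight can be pictured as evaluating a series-parallel term over the concurrent semiring (S, ⊕, ⊗, ⊙, 0, 1), combining node weights with the sequential multiplication along series composition and with the parallel multiplication along parallel composition. The following Python sketch is purely illustrative and not from the paper: the term encoding (`Leaf`, `Seq`, `Par`) and the default choice of ordinary multiplication for both operations are our assumptions.

```python
# Illustrative sketch (our encoding, not from the paper): evaluating a
# series-parallel term of weights. The parameters `seq` and `par` stand in
# for the sequential and parallel multiplications of a concurrent semiring.

from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:           # a single weighted node
    weight: float

@dataclass
class Seq:            # series composition of two sub-terms
    left: 'Term'
    right: 'Term'

@dataclass
class Par:            # parallel composition of two sub-terms
    left: 'Term'
    right: 'Term'

Term = Union[Leaf, Seq, Par]

def sp_weight(t: Term, seq=lambda a, b: a * b, par=lambda a, b: a * b):
    """Fold a series-parallel term bottom-up with the two multiplications."""
    if isinstance(t, Leaf):
        return t.weight
    op = seq if isinstance(t, Seq) else par
    return op(sp_weight(t.left, seq, par), sp_weight(t.right, seq, par))
```

For example, with both operations taken as ordinary multiplication, the term `Seq(Leaf(0.5), Par(Leaf(1.0), Leaf(0.8)))` evaluates to 0.4; passing different `seq` and `par` operations models other semirings.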
For a given ε-wLPO wlpo = (V, <, l, ν) and for any subset V′ ⊆ V of nodes we set

– V′_{≠ε} = {v ∈ V′ | l(v) ≠ ε} to be the set of all non-ε-labelled nodes from V′,
– V^ε_{V′} = V′ ∪ {v ∈ V \ V_{≠ε} | ∃v′ ∈ V′ : v < v′} to be the set of all nodes from V′ and all their ε-labelled predecessors,
– V_{V′} = V′ ∪ {v ∈ V | ∃v′ ∈ V′ : v < v′} to be the set of all nodes from V′ and all their predecessors,
– po_{V′} = (V′, <|_{V′×V′}) to be the corresponding partial order on V′,
– Min(V′) = Min(po_{V′}) to be the set of all minimal nodes from V′ and
– Max(V′) = Max(po_{V′}) to be the set of all maximal nodes from V′.

Definition 9 (Segments of ε-wLPOs). Let wlpo = (V, <, l, ν) be an ε-wLPO. Let N be a set. Then, using the notations from above, we compute the segments seg(wlpo) of wlpo with the following procedure:

1. Initialise N with all nodes from V.
2. Let w = (N, <|_{N×N}, l|_N, ν|_N) be the current ε-wLPO.
3. For every minimal non-ε-labelled node v ∈ Min(N_{≠ε}) put N^ε_{{v}} into seg(wlpo).
4. Set N to be N \ N_{Min(N_{≠ε})}.
5. If N_{≠ε} ≠ ∅ then continue with step 2.
6. Put N into seg(wlpo).

For a non-ε-labelled node v ∈ V_{≠ε} we say that the wLPO seg(v) = (S_v, <|_{S_v×S_v}, l|_{S_v}, ν|_{S_v}), where S_v is the one set from seg(wlpo) with Max(S_v) = {v}, is the segment of v. The set from step 6 is called the ε-segment of wlpo.

For an ε-wLPO (V, <, l, ν) we construct the corresponding ε-free wLPO (V_s,
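The segmentation procedure of Definition 9 can be sketched in a few lines of Python. This is an illustrative encoding, not from the paper: nodes are arbitrary hashable values, the strict partial order is assumed to be given as a transitive set of pairs (u, v) meaning u < v, and the empty string stands in for the label ε.

```python
# Illustrative sketch of the segmentation procedure of Definition 9.
# `nodes`: the node set V; `order`: transitive set of pairs (u, v) with u < v;
# `label`: dict mapping each node to its label, '' standing for eps.

def segments(nodes, order, label):
    segs = []
    remaining = set(nodes)
    while True:
        non_eps = {v for v in remaining if label[v] != ''}
        if not non_eps:
            break  # step 5 failed: no non-eps-labelled nodes left
        # step 3: minimal non-eps-labelled nodes of the current sub-order
        minimal = {v for v in non_eps
                   if not any((u, v) in order for u in non_eps)}
        for v in minimal:
            # the segment of v: v together with its eps-labelled predecessors
            segs.append({v} | {u for u in remaining
                               if label[u] == '' and (u, v) in order})
        # step 4: remove the minimal nodes together with all their predecessors
        drop = set(minimal)
        for v in minimal:
            drop |= {u for u in remaining if (u, v) in order}
        remaining -= drop
    segs.append(remaining)  # step 6: the (possibly empty) eps-segment
    return segs
```

For the order 1 < 2, 1 < 3 with l(1) = ε, l(2) = a, l(3) = b, both computed segments {1, 2} and {1, 3} contain the shared ε-labelled node 1, matching the remark above that ε-labelled nodes can belong to more than one segment; the ε-segment is empty here.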