-

Approach to Picture Automata

Yvo Ad Meeres

Yvo.AdMeeres@mailbox.org 1 2

František Mráz

0 1 0 Charles University, Department of Software and Teacher Training , Malostranské nám. 25, 118 00 Prague 1 , Czech Republic 1 ITAT'25: Information Technologies - Applications and Theory 2 University of Bremen, Department of Theoretical Computer Science , Bibliothekstr. 5, 283 59 Bremen , Germany

2025

A directed acyclic graph (DAG) can represent a two-dimensional string or picture. We propose recognizing picture languages using DAG automata by encoding 2D inputs into DAGs. An encoding can be input-agnostic (based on input size only) or input-driven (depending on symbols). Three distinct input-agnostic encodings characterize classes of picture languages accepted by returning finite automata, boustrophedon automata, and online tessellation automata. Encoding a string as a simple directed path limits recognition to regular languages. However, input-driven encodings allow DAG automata to recognize some context-sensitive string languages and outperform online tessellation automata in two dimensions.

directed acyclic graph DAG automaton picture language regular language

CEUR Workshop

ISSN1613-0073 # # # # # # # # # #

# # # # # # # # # # # # # # # # # #

# (c) Visualizing the arrows defined by ▦ as colored borders lets the DAG look like .̂ # # # # # # # # # # # # # # # # # # # (a) ▦() ̂ encodes the picture ∈ {○}3,3 as a DAG with the left upper corner as root.

(b) The picture DAG is drawn less traditionally but more like a rasterized picture.

Input-driven encodings enhance DAG automata’s power, enabling recognition of some contextsensitive string languages and a proper super-class of REC in two dimensions.

DAG automata have many appealing properties. Blum and Drewes [ 4 ] proved that emptiness and ifniteness are decidable for DAG automata, and the class of DAGs accepted by DAG automata is closed under union and intersection. These properties carry over to the case of accepting encodings of picture languages. Deterministic DAG automata can be reduced and minimized, and their equivalence is decidable in polynomial time [ 4 ]. Equivalence for deterministic DAG automata with respect to picture language recognition is still an open problem, as two deterministic DAG automata accepting the same encodings of pictures need not be equivalent, because they can accept diferent sets of DAGs that do not encode pictures. Ad Meeres [ 5 ] provides a framework to port known eficient finite state automata algorithms and properties from string to DAG languages. By applying these techniques, we could obtain classes of picture languages that can be eficiently recognized by DAG automata.

DAG automata have many strong properties [ 4 ]: emptiness and finiteness are decidable, and accepted DAG classes are closed under union and intersection, including for picture language encodings. We can minimize deterministic DAG automata, and their equivalence is polynomial-time decidable, though equivalence for deterministic DAG automata with respect to picture language recognition is still open. A recent framework by Ad Meeres [ 5 ] enables adapting eficient finite automata algorithms from strings to DAG languages.

The paper is structured as follows. After establishing the basic notation, Sect. 2 provides an overview of common automata accepting two-dimensional inputs. Sect. 2.2 presents a model of a DAG automaton suitable for recognizing pictures due to its limited capabilities. Sect. 3 introduces various encodings of pictures into DAGs. Sect. 4 compares DAG automata first to classical string automata to show the capabilities of DAG automata in the 1D case and then to the picture automata for the 2D case. The concluding Sect. 5 summarizes the obtained results and indicates open problems for future research. The majority of the proofs have been placed in the Appendix.

1. Notation

For the positive integers ℕ with ℕ0 = ℕ ∪ {0}, let [] = {1, … , } and [] 0 = {0, … , } where ∈ ℕ . The cardinality of a set equals || . A finite set of symbols forms an alphabet. Concatenated symbols yield a string. The concatenation of two strings and is written as ⋅ or as their juxtaposition . For a string = 1 2 … … ∈ ∗ with ∈ [] , symbol ∈ occurs at position where || denotes the length of it, represents the empty string, || gives the number of occurrences of the symbol in and [] denotes the smallest subset ⊆ such that ∈ ∗. All finite strings over form the set ∗, with any subset of ∗ called a (string) language, and ⊂ ∗ comprising all strings over of length . A doubly ranked alphabet Σ is defined as an alphabet in which each symbol is assigned a rank through a function ∶ Σ → ℕ 20.

An automaton model is a computational model that defines how a specific instance of , an automaton , recognizes a language () . Two automata models and ′ are considered equivalent, denoted by ≡ ′, if they recognize the same class of languages. Throughout the paper, in an automaton definition, the following symbols do not require further definition when they appear: Σ for an input alphabet excluding the border signal # ∉ Σ, for a finite set of states, 0 ∈ for an initial state, ⊆ for a set of final states. A finite state automaton ( FSA), in its nondeterministic variant (NFA) (or its deterministic one (DFA)) is defined as a tuple = (, Σ, , 0, ) , where ⊆ × Σ × is the transition relation (or a function in the case of DFA). The reflexive transitive closure ∗ ⊆ × Σ ∗ × is defined by: (i) for all ∈ , (, , ) ∈ ∗, and (ii) if (, , ′) ∈ and ( ′, , ″) ∈ ∗ for ∈ Σ , ∈ Σ ∗, then (, , ″) ∈ ∗. If ( 0, , ′) ∈ ∗ and ′ ∈ , accepts ∈ Σ ∗. The automaton recognizes the language () which is the set of all strings that accepts. 2. Computational Models for Two Dimensions Both rectangular arrays of symbols and other binary, i.e. 2-ary, relations can be regarded as two-dimensional structures. For both notions of two-dimensionality, automata have been introduced in the literature. Picture automata process two-dimensional matrices over an alphabet (see Sect. 2.1), while DAG automata digest finite multisets of an alphabet with a binary relation defined on them (see Sect. 2.2).

2.1. Picture Automata

Inspired from natural languages, a one-dimensional string is called a word. Inspired from image processing, a two-dimensional string or a matrix is called a picture. Both terms are standard in automata theory. A picture automaton recognizes a picture language.

We consider the three picture automaton models returning and boustrophedon finite automaton (RFA and BFA, Def. 2.2) and two-dimensional online tessellation automaton (2DOTA and 2OTA, Def. 2.5). We consolidate these fundamentally diferent picture automata under a single DAG-automaton model, providing a complete characterization for each individual automaton while simultaneously unifying them within a common framework, thereby capturing the full expressive power of each characterized picture automaton.

Notably, when restricted to one-row input pictures (i.e. in the one-dimensional case), each of these picture automaton models recognizes exactly the regular string languages.

Definition 2.1 ((Boundary) Picture). A two-dimensional string over a finite alphabet Σ is called a picture. The dimensions of a picture with rows and columns are given by the ordered pair (, ) . The symbol Λ denotes the unique empty picture with dimensions (0, 0). Σ, represents the set of all pictures with dimensions (, ) over Σ. The notation , refers to the symbol from Σ at the position (, ) of the picture , where ∈ [] denotes the row counted from top to bottom and ∈ [] the column counted from left to right: = 1,1 ⋮ ,1 … ⋱ … , 1, ⋮ .

Any subset of the set of all pictures over Σ, denoted Σ∗,∗, is called a picture language. Given a picture ∈ Σ ,

, its boundary picture ̂ is the picture itself surrounded by the border symbol #, thus a picture over Σ ∪ {#}with dimensions ( + 2, + 2) respectively. Let ,̂ where ∈ {0, + 1} = , , for all ∈ [] or ∈ {0, + 1} .

where the rows and columns range over [ + 1] 0 and [ + 1] 0, and ∈ [] , and ,̂ = #, for all ∈ [ + 1] 0 and ∈ [ + 1] 0 0̂ ,0 1̂ ,0 ⋮ ,0̂ +̂1,0 1,1 ̂ over Σ ∪ {#}. Afterward, the picture is accepted if the corresponding FSA accepts this serialized picture .̂ Both models scan their input picture horizontally line by line, according to their respective scanning strategy,but they difer in their scanning directions. An RFA proceeds uniformly from left to right, this corresponds to the reading of Western texts where the eye has to ‘jump’ from the end of a line to the beginning of the next one. A BFA, on the other hand, proceeds like an ox plowing a field, alternating its direction.

Definition 2.2 (RFA, BFA). Both a (deterministic) returning finite automaton (RFA) and a (deterministic) boustrophedon finite automaton (BFA) are given by a 6-tuple = (, Σ, , (Σ ∪ {#}) × is the transition relation (transition function ∶ × (Σ ∪ {#}) → ). 0, , #) , where ⊆ ×

Both automata models are equipped with a scanning strategy, which defines a total order on the elements

of a two-dimensional string. Formally, for a two-dimensional string holds ( 1, 1) ≺ ( 2, 2) if and only if, in case of an • RFA ∶ ( 1 + 1 < 2 + 2) ∨ ( 1 = 2 ∧ 1 < 2) and in case of a • BFA ∶ ( 1 + 1 < 2 + 2) ∨ (( 1 + 1 mod 2 = 0 ∧ 1 > 2) ∨ ( 1 + 2 mod 2 = 1 ∧ 1 < 2)). With = (, Σ ∪ {#}, ,

0, ) being an NFA (DFA), the language accepted by is given as () = { ∈ Σ ∣ for all ∈ [ + 1] 0 and ∈ [ + 1]

0 1 2… +1 … (+2)(+2) ∈ () with 1̂ 1 ≺ 2̂ 2 where = 1̂ 1 and +1 = 2̂ 2 } using the respective total order ≺ as specified above.

We presented an alternative, yet equivalent, definition of [ 7 ], which serves as a characterization emphasizing both the treatment of the boundary symbols as well as the scanning strategy. It is easy to see that our modified definition is correct since it does not change the expressive power of the automata models: every state computed within the top or bottom boundary could be computed within the first, resp. last row of the picture as well and two consecutive boundary symbols an automaton with expressiveness since it can do the same in one step while reading the squeezed ## do not equip single #, see below.

Here, we defined scanning strategies on a boundary picture with all four sides explicitly represented. The original definition, in contrast, delimits pictures only on the left and right and even then, it does not reference all those boundary symbols. Using boundary pictures with the full boundary, on all four sides, uniformly across all automaton models supports a coherent perspective and facilitates their systematic comparison in Sect. 4. By limiting the boundary symbols to acting as a ‘fence’ in the serialized string, we mimic the behavior of the original definition easily: () = { ∈ Σ ∣ for all ∈ [ + 1]

and ∈ [ + 1] 1 2… +1 … ⋅(+1)−1 ∈ () with 1̂ 1 ≺ 2̂ 2 where = 1̂ 1 and +1 = 2̂ 2 } Interpreting the strict order ≺ above as a partial order ⪯ by regarding consecutive positions labeled with # as equal, like (, + 1) = ( + 1, 0) , allows us to take canonical representatives of the resulting equivalence classes. We take the boundary symbols of the right border for both models as canonical representatives. In this way, in above language-defining formula, no longer ranges over zero, excluding the left boundary. E.g., in case of an RFA, for every pair (, + 1) and ( + 1, 0) , the canonical representative is (, + 1) where ∈ [ + 1] 0. While the corresponding formalism may seem intricate at first glance, the underlying idea is entirely natural and aligns closely with the structure imposed by the scanning strategy, shown in Sect. 4.2.1. This partial order ⪯ mimics the original definition of RFA and BFA [ 7 ] where all vertices labeled # are present, but, only the canonical representative is referenced in a run which would be (, + 1) for an RFA.

By making the scanning strategy explicit, our characterization reveals its efect on the family of languages recognized by the models. As a result of the explicit scanning strategy, Sect. 4.2.2 demonstrates that parametrizing it enables the recognition of a wider range of picture languages. Theorem 2.3 (RFA ≡ BFA [ 7 ]). RFA and BFA recognize the same picture language family. Theorem 2.4 (det. RFA ≡ nondet. RFA, det. BFA ≡ nondet. BFA [7, Lemma 14]). RFA and BFA have the same expressive power in their deterministic and in their nondeterministic variant.

The computation of an online tessellation automaton on a picture ∈ Σ ∗,∗ consists of associating a state , ∈ to each position (,̂ ) depending only on the two states at the positions above and left from (,̂ ) .

Definition 2.5 (2OTA, 2DOTA [ 1 ]). A nondeterministic (deterministic) two-dimensional online tessellation automaton referred to as 2OTA (2DOTA), is defined by the five-tuple = (Σ, , 0, , ) where ∶ × × Σ → 2 ( ∶ × × Σ → ) is a transition relation (function). A run of on an input picture ∈ Σ ∗,∗ is a function assigning a state ∈ to every position of its boundary picture (,̂ ) . The state of a run at position (,̂ ) is referenced as ( (,̂ )) = , . A run of on a picture assigns the initial state 0 to all positions in the first column and the first row of its boundary picture .̂ That is, 0, = ,0 = 0, for all ∈ [ + 1] 0 and ∈ [ + 1] 0. If is deterministic, assigns a state , to each position (, ) , for ∈ [] and ∈ [] with , = ( −1, , ,−1 , , ); if is nondeterministic, guesses all states such that , ∈ ( −1, , ,−1 , , ). It accepts , if there exists a run such that , ∈ .

Here, we are interested in DAG automata. Blum and Drewes [ 4 ] showed that DAG automata have many desirable properties. The class of languages accepted by DAG automata is closed under union and intersection, but not under complementation. Emptiness and finiteness for languages accepted by DAG automata are decidable in polynomial time. Deterministic DAG automata can be minimized and tested for equivalence in polynomial time.

A deterministic DAG automaton (DDA) for strings encoded as strings DAGs corresponds exactly to an NFA that can only guess that it is at the end of the string. A DAG automaton is aware of the termination of the input earlier than a string automaton because it can infer that a symbol is the last one when it detects no outgoing edges, unlike a string automaton, which only realizes the word has ended after the last symbol has already been consumed. Thus, apart from this nondeterministic guessing of the end of a string, a DDA acts on a string DAG like a DFA on a string. However, we will add start and end markers to simplify the proof. Then, a one-dimensional DDA works precisely the same way as a DFA.

Theorem 2.6 ([ 7, 1 ]). The picture automata RFA, BFA, 2DOTA, and 2OTA run on one-dimensional pictures ∈ Σ ,1 , thus on strings, recognize exactly the regular string languages.

2.2. Graph Automata

A graph is a set with a 2-ary relation on it. Consequently, we introduce DAG-automata for parsing two-dimensional strings. We limit our focus to DAGs due to the inherent orientation of binary tuples, and the acyclic restriction ofers algorithmic benefits. We choose the most limited model of a DAG automaton defined in the literature [ 4 ] to ensure that the model is not overly powerful. It operates on multigraphs with ordered edges.

The source is referenced by src() and the target by tar () 1. By in, out ∶ → Definition 2.7 (DAG). A directed acyclic graph over Σ, abbreviated as DAG, is a tuple ( , , ℓ , in, out ) with Σ, , and being disjoint finite sets, alphabet of vertex labels, the sets of vertices and edges, respectively. The vertices are labeled by ℓ ∶ → Σ . An edge ∈ joins two distinct vertices , ∈ . to each vertex ∈ tar () = ⇔ ∈ [

its incoming and outgoing edges such that src() = ⇔ ∈ [ in()] . These edges are ordered as specified by the strings in() and out () . For the ∗, we assign out ()] and graphical representation of a graph, each vertex is drawn as a circle. The ingoing edges are placed along the upper half of the circle and the outgoing edges along the lower half, arranged from left to right according to the strings in() and out () .2 For every sequence 1 2 ⋯ of edges 1, … , ∈ in a DAG with ∈ ℕ , it holds that 0 ≠ in {src( ), tar ( )} = { −1 , } for all ∈ [] , meaning that the DAG is acyclic. The empty graph ∅ is the graph with = ∅ . A vertex is called a root if in() = and a leaf if out () = .

A DAG = ({ 0, … , }, { 1, … , }, ℓ , in, out ) encodes a string 0 1 … as a string DAG if ℓ( ) = for ∈ [] 0, in( ) = for ∈ []

and out ( −1 ) = for ∈ [] .

Definition 2.8 (DAG Automaton). A DAG automaton is a triple = (, Σ, ) , where is a finite set of rules. A state ∈ may also be called an edge label. A rule ∈ is either of the form ( where ∈ Σ, or of the form ∅3 and the head and the tail are elements of ∗. A run of on a ) , DAG = ( , , ℓ , in, out ) is a mapping ∶ → , extended to strings ∶ component-wise to every edge ∈ , such that, for every vertex ∈ , (( in()) ∗ → ∗ by applying ℓ (v) ( out ()) ) ∈ . accepts a nonempty DAG if such a run exists, and accepts the empty DAG ∅ if ∅ is in . The DAG language recognized by is () = { ∣

accepts and is connected}.

The DAG automaton is called top-down deterministic if for every fixed combination of ∈ Σ , ∈ and ∈ ℕ 0 there exists at most one ∈ such that ( ) ∈ .4 A rule cycle5 is a ring of rules (a sequence where the first rule is adjacent to the last) where in each pair of adjacent rules a state ∈ connected to the same state , once in a head and once in a tail. We abbreviate a top-down deterministic

DAG automaton as DDA and a nondeterministic one as NDA.

∗, is

For a DAG automaton , let onerow() denote the set of string DAGs accepted by the DAG automaton , and ℒonerow() = { onerow() ∣

is a DAG automaton} is the class of string languages that encoded as strings DAGs are accepted by DAG automata.

Theorem 2.9. ℒonerow() = , where

is the class of regular languages. 3. Encoding Strings and Pictures as DAGs How to encode a picture as a graph? The obvious idea is to represent all positions in a picture as vertices of a DAG. We call this a picture DAG. However, this gives us an arbitrary, even possibly disconnected DAG.

Definition 3.1 (Picture DAG). A picture DAG = ( , , ℓ, in, out ) encodes a two-dimensional string with positions (, ) and , ∈ ℕ

by interpreting the positions within as the uniquely referenced vertices 1Note that can contain multiple edges connecting the same pair of nodes. Such distinct edges with same source and same target can even cross since the sources and targets are ordered. 2Hence, the ingoing edges appear clockwise and the outgoing edges counterclockwise around the circle. 3Adding the empty graph to the rule set allows consistency with the string languages but difers from the definition of DAG automata in [ 4 ]. 4Intuitively, this means that for a vertex with fixed vertex label, fixed degree, and known ingoing edge labels only one labeling of the outgoing edges is allowed by . In such a case, an input DAG permits at most one run, and this run can be constructed deterministically from the roots downward (the leaves upward). 5See [ 5 ] for a more formal definition. of with = {(, ) ∣ ∈ [] and ∈ []} , labeled as ℓ((, )) = , . The set of edges is arbitrary (i.e., neither in, out nor , is specified) unless specified by a picture-to-DAG encoding. The set of all picture DAGs over an alphabet Σ given by the context is denoted as = { ∣ encodes a picture from ∈ Σ ∗,∗}; the set of the boundary pictures is =̂ { ∣ encodes a picture , ̂ where ∈ Σ ∗,∗}.

This definition implies that a picture DAG encodes a picture with dimensions (, ) as a DAG with the set of vertices = {(, ) ∣ ∈ [] and ∈ []} , and its boundary picture ̂ through = { (,̂ ) ∣ ∈ [ + 1] 0 and ∈ [ + 1] 0}.

A DAG encoding specifies the set of edges for a picture DAG. One possible DAG encoding of a picture is illustrated in Figure 1. A DAG encoding determines the dependencies the automata can use while processing a picture.

Definition 3.2 (Picture-to-DAG Encoding). A picture-to-DAG encoding specifies the edges for a picture DAG. It is a function ⬚ ∶ (Σ ∪ {#})∗,∗ ∪ ∪ →̂ ∪ ̂ that maps its input to a picture DAG = ( , , ℓ , in, out) by specifying the set of edges and the mappings in and out for . The argument of the function ⬚ can be a (boundary) picture (the resulting DAG will have nodes corresponding to its positions) or a picture DAG (its original set of edges will be replaced with a new set of edges).

The picture-to-DAG encoding specifies the edges either uniquely, resulting in one specific DAG , or partially, which allows for several distinct DAGs, but the encoding maps the picture (DAG) to an arbitrary but fixed DAG in this case. The symbol ⬚ denotes the abstract DAG encoding without edges specified. It stands for a specific DAG encoding defined below.

Note, this definition means that the picture-to-DAG encoding does not afect the presence of a boundary. A picture is mapped to a picture DAG ∈ without boundary, whereas a boundary picture ̂ is mapped to a picture DAG ∈̂ ̂ encoding additionally the boundary. The same holds for picture DAG input: the encoding does preserve the membership in or .̂

Picture automata use diferent scanning strategies, like row-by-row or diagonal-by-diagonal. These scanning strategies correspond roughly to the ‘wiring’ of the vertices in a DAG. However, not exactly, since scanning strategies impose a total order on the pixels of a picture, whereas a DAG imposes only a partial order on its vertices. All of the following picture-to-DAG encodings specify the sets of edges that depend only on the dimensions of input pictures; therefore, we call them input-agnostic. Definition 3.3 (Input-agnostic picture-to-DAG encodings). The input-agnostic DAG encoding maps its input to a unique DAG = ( , , ℓ , in, out). In case the input is a picture DAG ′ = ( , ′, ℓ, in′, out′), the edges ′, as well as their ordering and wiring specified by in′ and out′ are replaced by the DAG encoding. This allows for reencoding with another DAG encoding. The following sets of input-agnostic edges depend only on the dimensions (, ) of the picture to be encoded, but not on its symbols.

For ∈ [] , ∈ [] , 0 ∈ [] 0, 0 ∈ [] 0, , ̂ ∈̂ [ + 1] 0 and , ̂ ∈̂ [ + 1] 0, we define a set of edges. • ↓̂⬚ = { ∣ src() = ( ̂ 0, 0), tar () = ( ̂ 0 + 1, 0)} are downward along the left border. • ⬚̂↓ = { ∣ src() = ( ̂ 0, + 1), tar () = ( ̂ 0 + 1, + 1) } are downward the right border. • ⇊̂ = { ∣ src() = ( ̂ 0, ),̂ tar () = ( ̂ 0 + 1, ) ̂ } are the vertical edges oriented downwards. • ⇉̂ = { ∣ src() = ( ̂ , ̂ 0), tar () = ( ̂ , ̂ 0 + 1)} are horizontal oriented to the right. • r̂fa↙ = { ∣ src() = ( ̂ 0, + 1), tar () = ( ̂ 0 + 1, 0)} are the ‘return’ edges for the RFA. • The horizontal edges oriented alternatingly to the left and to the right for the BFA are ⇄̂ = { ∣ src() = ( ̂ , ̂ 0), tar () = ( ̂ , ̂ 0 + 1), is odd }

∪ { ∣ tar () = ( ̂ , ̂ 0), src() = ( ̂ , ̂ 0 + 1), is even } ↓̂bfa↓ = { ∣ src() = ( ̂ 0, 0), tar () = ( ̂ 0 + 1, 0), is odd }

∪ { ∣ src() = ( ̂ 0, + 1), tar () = ( ̂ 0 + 1, + 1), is even} are vertical BFA edges. • ↘̂↘ = { ∣ src() = ( ̂ 0, 0), tar () = ( ̂ 0 + 1, 0 + 1)} are the diagonals to the south east. Now we define several picture-to-DAG encodings by specifying their set of edges. We assume the corresponding in and out functions are defined in a natural way such that the edges do not cross. Furthermore, we define all = { ∣ src() = , tar () = , ≠ for all , ∈ } as the set of all possible edges for a picture, in order to get rid of the edges connected to the boundary in case the picture-to-DAG encoding maps a picture without a boundary to a DAG.

• ↓⇉ defines = ↓̂⬚ ∪ ⇉̂ ∩ all joining the leftmost vertices and all vertices in each row. • ⇉↓ defines = ⇉̂ ∪ ⬚̂↓ ∩ all joining the rightmost vertices and all vertices in each row. • ▤ defines = ↓̂⬚ ∪ ⇉̂ ∪ ⬚̂↓ ∩ all joining the left and the right border and the rows. • ⥂ defines = ⇉̂ ∪ r̂fa↙ ∩ all as the RFA scanning strategy. • ↩↪ defines = ⇄̂ ∪ ↓̂bfa↓ ∩ all as the BFA scanning strategy. • ▦ defines = ⇉̂ ∪ ⇊̂ ∩ all as a vertex pointing to two of its neighbours: below and right. • ▧ defines = ↘̂↘ ∩ all as every vertex points to one neighbour, the one right below. Instead of using the total order of a scanning strategy or partial orders to define input-agnostic edges, we introduce edges depending on the input symbol at the picture’s positions, called input-driven. (ℓ()) = ((ℓ()) edges: Definition 3.4 (Input-Driven picture-to-DAG encoding). Let ′ = ( , ′, ℓ, in′, out′) be a picture DAG over Σ, either encoding the input picture or ,̂ with ′ = ∅ and in′ = out′ = , or given as input itself. Then, the input-driven DAG encoding maps ′ to an arbitrary but fixed DAG = ( , ∪ , ℓ , in, out) over the doubly ranked alphabet (Σ, r). The edge set denotes the set of input-driven edges. The ranks 1, (ℓ()) 2) of the labels’ vertices, with the vertices ∈ , determine the input-driven • in() = in′() ⋅ • out() = out′() ⋅ with ∈ ∗ and || = () with ∈ ∗ and || = () 1 and all edges in are pairwise distinct and 2 and all edges in are pairwise distinct.

The input-driven picture-to-DAG encoding is undefined if the above construction does not yield a DAG. 6

Each edge in lacks explicit designation of its source and target vertices, allowing for all combinations of joining the vertices according to the input-driven in- and outgoing specifications, as long as the combination yields a valid DAG. As such, this DAG encoding accommodates a variety of wiring combinations. follows:

A swap operation is a fundamental operation on DAGs [ 4 ]. Let = ( , , ℓ, in, out) be a DAG. Two edges 0, 1 ∈ are independent if there is no directed path from tar ( ) to src( 1− ) for ∈ {0, 1} . Then, the edge swap of 0 and 1 yields the DAG ′ = ( , , ℓ, ℎ ∘ in, )

, where ℎ is a bijection defined as ℎ() = { 1− otherwise.

if = for some ∈ {0, 1},

A DAG automaton cannot be restricted to picture DAGs by its rule set. The rules allow accepting DAGs that do not represent pictures and are obtained from DAGs encoding pictures by the swap operation. For using DAG automata as a device recognizing picture languages, it is necessary to always have an intersection with the picture languages Σ∗,∗.

Definition 3.5 (Equivalence ≡⬚). Let be a DAG automaton and ′ be an automaton accepting a picture language () ⬚, denoted as ≡ . The automata and ′ are picture equivalent with respect to a picture-to-DAG encoding ⬚ ′, if () ∩ ⬚(Σ ∗,∗) = ⬚(( ′)). If is a family of DAG automata and ′ is a family of picture automata, the families are equivalent with respect to ⬚, denoted as ≡ DAG language , it holds ∈ ℒ() ∩ ⬚(Σ ∗,∗) ⇔ ∈ ⬚(ℒ( ′)).

⬚ ′ if, for each

4. Comparison

This section provides a comparative analysis of string automata, designed for one-dimensional input (see Sect. 4.1), in relation to their capabilities vis-à-vis DAG automata, and picture automata, which operate in two dimensions (c.f. Sect. 4.2). added input derived edges do not introduce a cycle. 6It is thus defined if and only if the in- and outgoing edges are balanced: ∑∈ (ℓ()) 1 = ∑∈ (ℓ()) 2 and the newly 4.1. 1D: Comparison of String Automata with DAG Automata Just as DAG automata recognize the regular string languages if the strings are encoded as string DAGs (Theorem 2.9), so do picture automata RFA, BFA, 2OTA, and 2DOTA when operating in one dimension only (Theorem 2.6). String DAG (Def. 2.7) joins vertices corresponding to the letters of a string in a simple path. But obviously, a node in a graph can have more than one incoming edge and one outgoing edge.

String DAGs and picture-to-DAG encodings from Def. 3.3 define edges connecting nodes representing symbols in a string or in a picture in an input-agnostic fashion. However, based on Def. 3.4, we can also add edges connecting symbols in an input-driven way. Since we use DAG automata for one- and two-dimensional strings, extra edges need to be added.

Edges are added either in an input-agnostic fashion [ 9 ] or an input-driven way [ 10 ], meaning that the edges depend either only on the dimensions of the input or also on the input itself. DAG encodings as defined in Def. 3.3 are input-agnostic.

Adding to a string DAG input-agnostic edges computed by FSAs does not seem to add power to DAG automata. An input-agnostic edge corresponds to the length of the substring 1 2 … with src() = 1 and tar () =

, because if an FSA is input-agnostic, it sees a unary language where the strings can only difer in their length. But a

DDA on string DAGs accepts the same strings as a FSA.

Every additional input-agnostic edge requires an FSA for a unary language. Since the DAG automata recognize graphs of bounded degree, only a constant number of input-agnostic edges are allowed, which in turn means that in each vertex, only a constant number of FSAs can start the computation of the spanning length of the input-agnostic edge. A vertex with both in and outgoing input-driven edges would use the final state for the unary FSA as starting state for the new unary FSA. This indicates that an FSA can, by power set construction, keep track of all FSAs computing the input-agnostic edges.

Thus, only a device more powerful than a FSA can exceed the regular languages by computing input-agnostic edges for the string. One example would be to add edges spanning vertices where is a power of two, a square, or a prime, requiring a stack, a linear bounded tape, or an infinite tape, respectively. A DAG automaton processing strings encoded with input-agnostic edges computed by machines more powerful than a mere FSA exceeds thus the power of the regular languages. Because input-agnostic edges require this trick of preprocessing with a more powerful device, we take a look at input-driven edges in the hope of avoiding devices more powerful than FSAs.

Input-driven edges do not require edges to be computed by a more powerful device than an FSA. () = (0, 1) , () = (1, 0) , in case of 1, and () = (1, 1) , in case of 2, and () = (1, 0) . 2 = { Nevertheless, the power of DAG automata recognizing strings encoded with input-driven edges exceeds the class of regular languages. Such automata can accept, e.g., languages 1 = { ∣ ∈ ℕ} and ∣ ∈ ℕ} when input-driven edges are added so that the double rank of the symbols is Theorem 4.1. DAG automata can recognize both certain context-free as well as context-sensitive languages if strings in the string languages are encoded as a string DAG with additional input-driven edges determined by a ranked alphabet. 4.2. 2D: Comparison of Picture Automata with DAG Automata The picture automata RFA and BFA are compared to DAG automata in Sect. 4.2.1, and based on that comparison, these two models are modified in Sect. 4.2.2. This modification in turn leads to comparing online tessellation automata to DAG automata in Sect. 4.2.3. 4.2.1. Comparison of RFA and BFA to DAG automata Returning Finite Automata (RFA) and Boustrophedon Automata (BFA) are the most limited picture automata in literature and recognize the same family of picture languages (Theorem 2.3).

How can a DAG automaton model these two picture automata? Scanning strategies of RFA and BFA are total orders on the positions of the picture. Both scanning strategies are drawn as oriented graphs in Figure 2a. For the RFA the big jump back from right to left was viewed critically. Therefore, the uses the alternating directions. Let us look at an example of how DAG automata process pictures. Example 4.2. Let ≣ be the picture language of horizontal stripes where adjacent rows are of diferent colors. Formally, over a binary alphabet Σ = {○, ●}, this would be ⎧ ⎪ ⎨ ⎪ ⎩ ⋮ ≣ = | | | {, } = {○, ●} and ∈ ℕ ⎫ ⎪ ⎬ ⎪ ⎭ .

Both RFA and BFA scan the input horizontally, to recognize horizontal lines. Also, they can recognize stripes, as well as the above-defined stripe language ≣, illustrated in Fig. 2a. And so can the DAG automaton, which simulates those two models by applying the encodings ⥂ and ↩. This can be accomplished ↪ deterministically, since the scanning strategy is a serialization. The serialization of a picture yields a string, or, in the case of a DAG automaton, a string DAG, and apart from the scanning strategy, both an RFA and a BFA act like an FSA for which the nondeterminism adds no power, see Theorem 2.4.

Abandoning the total order, we now consider encodings based on partial orders. We examine the encodings

↓⇉() ̂ and ⇉↓() ̂ , which provide structured representations of an input picture under relaxed ordering. We begin with ↓⇉() ̂ . Consider a DAG automaton for the language ≣ which uses the states ◓ and ◒ for encoding the alternation between two row colors in the edge labels. A rule cycle for the vertical edges ensures the alternating colors: … ◓ … # … ◒ … … ◒ … # … ◓ … A DAG automaton ({◓, ◒, ●, ○, #}, {●, ○}, ∪ ⬓ ∪ ≣ ∪ #) for the encoding ↓⇉ shown in Fig. 2b could label the edge path on the left border with the states ◒ and ◓ by a rule cycle alternating the rules ⬓ = {(◒ # ◓○), (◓ # ◒●)}. In this way, it can alternate between the colors on the left border and accept the lines in case they comply with the pattern memorized with the help of the states, vertical edge labels, at the left border. Additionally, it needs the rules ≣ = {(● ● ●), (○ ○ ○), (● ●

), (○ selves, and the rules # = {(◒ # #), (◓ # #), (# ○ # rules specified so far are top-down deterministic, but lack a specification for a root. The rule set of a nondeterministic DAG automaton may contain both rules in due to its determinism, a DDA is restricted to choosing one of the two root rules in allows the automaton to recognize ≣. In contrast, a DDA lacks the ability to guess the initial color. Since, for its rule set. It cannot include both of them. But this hardcodes the color. Consequently, a deterministic DAG automaton using encoding ↓⇉ can only recognize the stripe language starting with a fixed color. This is due to the left border being blind to the colors of the stripes when processing the picture top-down deterministically. The stripes depend on the left border but not vice versa. A

DDA cannot recognize ≣ encoded as ↓⇉.

However, with the right border vertically connected as in Fig. 2c, the vertical edges depend on the colors of the lines in a top-down deterministic run. Consequently, a DDA can recognize all pictures ⇉↓() ̂ ∈ rather than only those pictures with a fixed initial color. The rules (●◒ # ◓) and (○◓ #

≣ ◒) assign the edge labels deterministically downward, thereby propagating the color information. )} for accepting the colored lines them#), (# = {( # # )} for ’̂s lower border. The ◓#), ( # ◒#)}, which

Let us now formalize and generalize the observations that we have made on ≣.

DDAs run on pictures encoded by the scanning strategy of RFA and BFA, ⥂ and ↩↪, respectively. Theorem 4.3 (DDA ≡⥂ RFA, DDA ≡ ↩↪ BFA). RFAs and BFAs recognize the same picture languages as

Contrary to the total order on the pictures’ pixels (resulting in ordinary string languages), when we use DAG encodings inducing a proper partial order (proper meaning no total order), nondeterminism can equip a DAG automaton with more power. (c) … however this is possible with the right border connected, instead. For ⇉↓, all stripe DAGs are accepted, not only those with the fixed first color. The above figures illustrate with their red edges those edges where the first row’s color is determined. (a) While the total orders on the vertices imposed by the RFA and the BFA scanning strategies with the DAG encodings ↩↪ and ⥂ can detect the first row’s color, (b) the DAG encoding ↓⇉, even though permitting the same number of edges, is blind to the row colors in the input. Therefore, even though a DAG automaton can detect striped pictures with the encoding ↓⇉, it cannot recognize ≣, but only a subset of it – the set of those pictures whose first line exhibits the hard-coded color. (c) However, sees the first row’s color and can alternate the colors subsequently. encoding ↓⇉ for a picture by encoding it as ⇉↓() ̂ , thus ℒ↓⇉(DDA) ⊆ ℒ⇉↓(DDA). Lemma 4.4 (DDA

⇉↓ simulates DDA↓⇉). A deterministic DAG automaton can simulate using the DAGTheorem 4.5 (ℒ↓⇉(DDA) ⊊ ℒ⇉↓(DDA)). The class of picture languages recognized by deterministic DAG automata with input pictures encoded as ↓⇉() ̂ is a proper subclass of those encoded as ⇉↓() ̂ . Corollary 4.6. DDA for the DAG-encoding ↓⇉ is not closed under union.

The scanning strategy seems to determine, which languages an RFA or, equivalently, a BFA can recognize, which would mean that the language is partly encoded within the scanning strategy itself. This leads us to the discussion on the modification of RFAs and BFAs, which allows for greater flexibility and capability in recognizing a wider range of languages. 4.2.2. Adapted Scanning Direction for RFA and BFA As hinted in the previous section, an RFA can recognize diferent languages by altering its scanning strategy. More precisely, it need not adapt the strategy; changing direction sufices.

Example 4.7 (DAG language of diagonals). Let \ be the language where on every diagonal line oriented southeast all pixels have the same color. Fig. 3 illustrates this language with one diagonal only. \ = { ∈ Σ , ∣ , = +,+ for , + ∈ [] and , + ∈ [] } The language \ is not recognizable by an RFA [7, Lemma 47]7. rules

A DAG automaton can recognize a picture \ ∈ \ deterministically by encoding \ as ▧( \̂). The # # and # # detect the border. For all colors ∈ Σ , the rules # , and # ensure one fixed color for each diagonal. Like that, a DDA accepts a DAG ▧( \̂). By Def. 3.2, ▧( \̂) is a disconnected DAG. The DAG encoding ▧ uses the edges illustrated in green in Fig. 3a. We could add, as illustrated with black arrows in Fig. 3a, the edge sets ⬚̂↓ and edges from right to left at the 7The language in the cited proof is restricted to a specific diagonal but that does not afect the validity of the pumping argument for our language. bottom border to the DAG encoding, in order to obtain a connected DAG. But we do not need these edges for recognizing the language \. The automaton would need it however, for recognizing diagonal stripes. Recall that the border ensured the stripes in Example 4.2. Analogously, with these additional edges, a DAG automaton recognizes diagonal stripes deterministically.

An RFA scans row by row. In order for an RFA to recognize \, we could either rotate the pictures by the angle of 45∘ clockwise, or we could adapt the direction of the scanning strategy. If an RFA uses its scanning strategy in the direction southeast instead of east, thus ⦪, it will scan diagonals instead of rows. Similarly, a BFA can alternate its scanning direction between northwest and southeast instead of west and east. The total order for the diagonal scanning of an RFA of a two-dimensional string in the direction southeast is given by:

( 1 1) ≺ ( 2 2) if ( 1 + 2 < 1 + 2) ∨ (( 1 + 2 = 1 + 2) ∧ ( 1 < 2)) and similarly for a BFA.

This shows that an RFA can detect lines in the scanning direction. With a DAG encoding ▥ or a scanning direction downwards, DDA or RFA, respectively, could recognize a language with vertical lines |. With a DAG encoding ▨ or a scanning direction southwest, DDA or RFA could recognize a language with diagonal lines oriented to the southwest /. Adapting the scanning strategy to the language is kind of ‘cheating’ because we should not know which language we will parse. Otherwise, we could always encode a part of the language into the scanning strategy. This would outsource the complexity of the membership problem for a language into its automaton’s description via the scanning strategy. So we would hide a part of the language in the scanning strategy. This can be appropriate for certain use cases. However, if we do not know which angle for lines to expect in advance, we can use a grid. A grid has the advantage that it is not tailored to the language. The DAG encoding ▦ serves as one option for implementing a grid. Given a picture DAG ▦( \̂) with one diagonal of the color ● ∈ Σ, the rule for a pixel of the diagonal ● and the rule for the pixel on its right-hand side ○ form the rule cycle in Fig. 3b. This rule cycle yields the diagonal in the picture shown in Fig. 3c. It is easy to verify that the DAG encoding ▦ can detect deterministically horizontal, vertical, and diagonal lines.

The DAG encoding ▦ enables the detection of horizontal, vertical, and diagonal patterns within a single language. In contrast, an RFA or BFA must be tailored with a specific scanning direction and, as a result, can recognize only one such pattern per language. Consequently, a DAG automaton operating on pictures encoded via ▦ exhibits strictly greater expressive power than both RFA and BFA. The following section identifies the classical picture automaton model to which the picture-to-DAG encoding ▦ corresponds. 4.2.3. Comparison of online tessellation automata to DAG automata The DAG encoding ▦ provides the capability to recognize 2D patterns that an RFA cannot. This leads us to the question: What capabilities does a DAG automaton possess when processing pictures encoded via the picture-to-DAG encoding ▦? In order to answer this question, we first show that the boundary of a picture does not increase the expressiveness of a DAG automaton. This observation will ease the proof of expressiveness for the encoding ▦.

The concept of encoding a picture as a boundary picture ̂ raises the question of whether the presence of this boundary influences expressive power. The following lemma asserts that, for a picture language , padding pictures with sentinels #, when encoding pictures using the picture-to-DAG encoding ▦, does not afect whether an NDA can accept the picture language .

Lemma 4.8 (Boundary invariance of NDA). Let =̂ ( , Σ̂ ∪ {#}, ) ̂ be an NDA for recognizing boundary pictures ∈̂ ̂ encoded as ▦() ̂ with ∈ . Then there exists an equivalent NDA = (, Σ, ) recognizing the same set of pictures encoded without the boundary as ▦() , but restricted to proper pictures, meaning pictures with more than one row and column: ∀, > 1 ∀ ∈ Σ , ∶ ▦() ̂∈ ( ) ̂ ⟺ ▦() ∈ (). # # # # # # # # # # # # # # #

# (a) RFA, BFA with adjusted scanning direction and DDA can detect diagonal lines.

•

∘ (b) This rule cycle detects the diagonal for pictures encoded with ▦.

# # # # # # #

# p q p q # # # # # # #

# (c) The rule cycle in b yields this path in the picture, comprising the diagonal.

In fact, the use of the encoding ▦ corresponds directly with an online tessellation automaton: Theorem 4.9 (2OTA ≡▦ NDA). Online tessellation automata and DAG automata are equivalent with respect to the picture-to-DAG encoding ▦.

If we use input-driven DAG encoding instead of input-agnostic DAG encoding, a DAG automaton can recognize more picture languages than online tessellation automata.

Theorem 4.10. There exists a picture language that a DDA recognizes with the help of an input-driven DAG encoding, which neither an online tessellation nor a sgrafito 8[ 11, 12 ] automaton can recognize.

The DAG vertex and edge labels do not reflect the topology of the picture. We can retrieve the picture by the unique names of their vertices ∈ {(, ) for ∈ [], ∈ []} , but if we prefer a topology preserved also by the wiring of the DAG, we can combine an input-agnostic with an input-driven DAG encoding. Encoding a picture ∈ balance as ▦() ̂ with additional input-driven edges, as specified in the Proof of Theorem 4.10 above, yields proper DAGs for encoding this picture language while preserving the picture not only via the unique vertex naming but also via the topology given by the DAG encoding ▦. Note that balance includes the picture ′, where ⥂( ′̂) = . Using the rule set from the theorem above does not sufice since it would introduce cycles for ′. This can be fixed easily by adding ( b ′) and ( ′ a ) to the rules. However, these additional rules turn the DDA into an NDA since they introduce nondeterminism. Hence, the topology ▦ within the edge wiring comes at the price of losing determinism.

5. Conclusions

We have proposed a unifying framework for recognizing picture languages via DAG automata by encoding two-dimensional inputs into directed acyclic graphs (DAGs). Using diferent input-agnostic encodings of pictures into DAGs, we obtained DAG automata equivalent to two-dimensional returning ifnite automata, boustrophedon automata, and online tessellation automata.

Encoding pictures as DAGs using input-driven edges increases the power of DAG automata for recognizing both string and picture languages. Input-driven edges enable recognition of some contextsensitive string languages. In two dimensions, input-driven DAG automata accept a proper super-class of the class of recognizable languages. 8We do not provide the definition of a sgrafito automaton here, since we only refer to the result that recognized by a sgrafito automaton. balance cannot be

Deterministic DAG automata can be reduced and their equivalence is decidable in polynomial time [ 4 ]. Equivalence for deterministic DAG automata with respect to picture language recognition is still an open problem, as two deterministic DAG automata accepting the same encodings of pictures need not be equivalent, as they can accept diferent sets of DAGs that do not encode pictures.

Ad Meeres [ 5 ] provides a framework to port known eficient finite state automata algorithms and properties from strings to DAG languages. By applying the techniques, we could obtain classes of picture languages that can be eficiently recognized by finite state automata for DAG languages.

Declaration on Generative AI

The authors have not employed any Generative AI tools. [12] D. Průša, F. Mráz, F. Otto, Two-dimensional sgrafito automata, RAIRO Theor. Informatics Appl. 48 (2014) 505–539. doi:10.1051/ITA/2014023.

[1]

Giammarresi ,

Restivo , Two-dimensional languages , in: G. Rozenberg, A . Salomaa (Eds.), Handbook of Formal Languages , Volume 3 : Beyond

Words

, Springer, 1997 , pp. 215 - 267 . doi: 10 . 1007/978-3- 642 -59126- 6 _ 4 .

[2]

Kamimura , G. Slutzki, Parallel and two-way automata on directed ordered acyclic graphs , Information and Control 49 ( 1981 ) 10 - 51 . doi: 10 .1016/S0019- 9958 ( 81 ) 90438 - 1 .

[3]

Drewes ,

Knight ,

Kuhlmann , Formal Models of Graph Transformation in Natural Language Processing (Dagstuhl Seminar 15122) , Dagstuhl Reports 5 ( 2015 ) 143 - 161 . doi: 10 .4230/ DagRep.5.3.143.

[4]

Blum ,

Drewes , Language theoretic properties of regular DAG languages , Inf. Comput . 265 ( 2019 ) 57 - 76 . doi: 10 .1016/j.ic. 2017 . 07 .011.

[5]

Y. A.

Meeres , A new notion of regularity: Finite state automata accepting graphs , in: F. Manea, G. Pighizzini (Eds.), Proceedings 14th International Workshop on Non-Classical Models of Automata and Applications (NCMA 2024 ), NCMA 2024 , Göttingen, Germany, 12 -13 August 2024 , volume 407 of EPTCS , 2024 , pp. 5 - 26 . doi: 10 .4204/EPTCS.407.2.

[6]

Y. A.

Meeres ,

Mráz , A unifying approach to picture automata, 2025 . URL: https://arxiv.org/abs/ 2509.12077. arXiv: 2509 . 12077 .

[7]

Fernau ,

Paramasivan ,

M. L.

Schmid , D. G. Thomas, Simple picture processing based on ifnite automata and regular grammars , J. Comput. Syst. Sci . 95 ( 2018 ) 232 - 258 . doi: 10 .1016/J. JCSS. 2017 . 07 .011.

[8]

Průša ,

Mráz , Restarting tiling automata, Int. J. Found. Comput. Sci . 24 ( 2013 ) 863 - 878 . doi: 10 .1142/S0129054113400236.

[9]

Kuboň ,

Mráz , I. Rychtera , Learning automata using dimensional reduction , in: A. C. B. Garcia , M.

Ferro , J. C. R.

Ribón (Eds.), Advances in Artificial Intelligence - IBERAMIA 2022 - 17th Ibero-American Conference on AI , Cartagena de Indias, Colombia, November 23-25 , 2022 , Proceedings, volume 13788 of Lecture Notes in Computer Science, Springer, 2022 , pp. 41 - 52 . doi: 10 . 1007/978-3- 031 -22419- 5 _ 4 .

[10]

Kutrib ,

Malcher ,

Wendlandt , Input-driven multi-counter automata , Theor. Comput. Sci . 870 ( 2021 ) 121 - 136 . doi: 10 .1016/J.TCS. 2021 . 01 .032.

[11]

Otto ,

Mráz , Cyclically extended variants of sgrafito and restarting automata for picture languages , in: H. Bordihn , R.

Freund , B.

Nagy , G. Vaszil (Eds.), Eighth Workshop on NonClassical Models of Automata and Applications, NCMA

2016 , Debrecen, Hungary, August 29-30 , 2016 . Proceedings, volume 321 of books@ocg.at, Österreichische Computer Gesellschaft, 2016 , pp. 259 - 273 .