Nondeterministically Selecting Positive Instances of Context-Free Languages? Tomoyuki Yamakami Faculty of Engineering, University of Fukui, 3-9-1 Bunkyo, Fukui 910-8507, Japan Abstract. The notion of effectively selective languages was introduced into computational complexity theory in late 1970s and it has been dis- cussed extensively in connection to the P=?NP question. Its scope has been extended to finite automata and formal languages in the past liter- ature. By further extending the scope, this paper discusses the compu- tational complexity of selective languages whose underlying selectors are computed in various ways by nondeterministic one-way pushdown au- tomata equipped with write-once output tapes. We explore basic prop- erties of those languages and demonstrate their relationships to the ex- isting language families. We also seek a generalization of such selective languages by taking selectors from a higher functional hierarchy over multi-valued CFL functions. Keywords: pushdown automaton, selective set, advice, multi-valued function, CFL hierarchy, CFLMV hierarchy 1 Complexity of Selecting Positive Instances One-way pushdown automata are generally considered as simple-structured ma- chines with limited access to (pushdown) memory devices and one may think that various questions on their “structural” properties could be easily answered. In reality, on the contrary, there still remain numerous unsolved questions con- cerning the family CFL of context-free languages recognized by one-way nonde- terministic pushdown automata (or 1npda’s, for short) as well as its deterministic variant DCFL. As a few quick examples, it is yet unknown whether every lan- guage in co-CFL is “dissectible” by appropriately-chosen regular languages [23] and whether there is a CFL/n-“pseudorandom” language in CFL(2) [21], where CFL(k) consists of intersections of k context-free languages and CFL/n is a Karp-Lipton advice variant of CFL [16]. Viewing DCFL and CFL as reasonable analogues of P and NP, many complexity-theoretic concepts have been adapted into automata theory. Along this line of research, we intend to study the notion of “selectivity” of Selman [9] strictly within the framework of formal languages and automata theory. Given two arbitrary input instances to a pre-determined language, a selector chooses at least one of the two instances so that the chosen instance is most ? © Copyright Tomoyuki Yamakami for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). likely to be a “positive” instance of the language (namely, to belong to the language). We are particularly concerned with the efficiency of such a sector. The computational complexity of languages whose selectors are efficiently computable has been discussed in connection to the P =?NP question. The importance of selectivity also comes from the fact that each selector can induce a certain type of partial pre-order on equivalence classes over strings (see [7, 9]). The notion of P-selective languages, which was first introduced into compu- tational complexity theory in 1979 by Selman [9] as a polynomial-time analogue of Jockusch’s semi-recursive sets in recursion theory. The selective languages have been since then used as an important tool in studying various structural properties, such as completeness, self-reducibility, search-to-decision reductions, refinements, and promise problems (see [2] for references therein). Later, this no- tion was extended to NPSV- and NPMV-selective languages [2] and beyond to ΣkP SV- and ΣkP MV-selective languages [6], where two suffixes “SV” and “MV” indicate “single-valued function” and “multi-valued function”, and ΣkP is the kth level of the polynomial-time hierarchy [8]. More generally, given any function class F, F-selective languages are defined simply by the choice of appropriate selectors from F. Since the selectivity notion is heavily linked to languages and multi-valued partial functions, it is ideal to study the computational complexity of languages and functions together. Taking the opposite research direction, we look into more restrictive compu- tational models in order to witness how dramatically the complexity of select- ing positive instances changes. A precursor to our investigation is the study of “DFA-selectivity” (which was originally called fa-selectivity) of Tantau [13]. Our main interest, on the contrary, lies on the behaviors of selectors when they are computed on 1npda’s equipped with write-once? output tapes. Those machines are quite different in nature from powerful Turing machines and their behav- iors are quite restrictive because of the limited usage of their memory device. Multi-valued partial functions computed by such machines are generally called CFL-functions and they have been recently discussed in a series of works [17, 20, 18, 21]. More specifically, we intend to study selective languages whose selectors are particularly chosen from function classes, such as CFLSV and CFLMV [20] (see Section 2.2 for their definitions), which are natural analogues of the afore- mentioned function classes NPSV and NPMV, respectively. In a later section, we then turn our attention to higher complexity classes ΣkCFL SV and ΣkCFL MV, which constitute a functional version of the CFL hierarchy [19] defined in parallel to the polynomial(-time) hierarchy. 2 Basic Notions and Notations We will briefly explain existing notions and notations necessary to read through the subsequent sections. ? A tape is write-once if its tape head never moves to the left and, whenever it writes a nonempty symbol, it must move to the next blank cell. 2 2.1 Alphabets, Strings, and Languages Given a finite set A, the notation kAk denotes the number of elements in A. Let N indicate the set of all natural numbers (i.e., nonnegative integers) and set N+ = N−{0}. For two integers m and n with m ≤ n, the notation [m, n]Z denotes the integer interval {m, m + 1, m + 2, . . . , n}. In particular, we abbreviate [1, n]Z as [n] for any n ∈ N+ . We use the term “polynomials” to mean polynomials with nonnegative integer coefficients. The notation A − B for two sets A and B indicates the difference {x | x ∈ A, x 6∈ B}. Given a set A, P(A) denotes the power set of A. An alphabet is a nonempty finite set Σ of “symbols” or “letters”. A string x over Σ is a finite series of symbols chosen from Σ and its length, denoted by |x|, is the total occurrences of symbols in x. The empty string λ is a unique string of length 0. To express a pair of strings simultaneously, we use the track notation [ xy] and follow its convention, as discussed in [12]. For two symbols σ and τ , [ στ] expresses a new symbol. For two strings x = x1 x2 · · · xn and y = y1 y2 · · · yn of length n, [ xy] denotes the concatenated string [ xy11 ][ xy22 ] · · · [ xynn ]. We further expand m this notation as follows. In the case of |x| < |y|, [ xy] expresses [ x#y ], where m = |y| − |x| and # is a designated new symbol. Similarly, when |x| > |y|, the same succinct notation [ xy] expresses [ y#xm ] with m = |x| − |y|. A language over alphabet Σ is a collection of strings over Σ. Given an index k ∈ N, Σ kSis composed of all strings of length k; in particular, Σ 0 equals {λ}. Let Σ ∗ = k∈N Σ k . For any language A over Σ, its complement is Σ ∗ −A, which is also denoted by A as long as Σ is clear from the context. A function h : N → Σ ∗ is called length preserving if |h(n)| = n holds for all n ∈ N. Recall from [19, 20] the notion of \-extension. For any string x ∈ Σ ∗ , a \-extension of x is a string x̃ over Σ ∪ {\} such that x is obtained from x̃ by deleting all occurrences of \. Given a language A, the characteristic function χA for A is a unique function defined as χA (x) = 1 if x ∈ A, and χA (x) = 0 otherwise. Our basic computation models are 1-way (one-tape, one-head) nondeter- ministic finite automata (or 1nfa’s, for short) with λ-moves and 1-way (one- tape, one-head) nondeterministic pushdown automata (or 1npda’s) with λ-moves, Where a λ-move is a step at which an input-tape head stays still reading nothing. We also use their deterministic variants: 1dfa’s and 1dpda’s. Formally, a 1npda N with a write-once output tape computing f : Σ ∗ → P(Θ∗ ) has the following form: N = (Q, Σ, {|c, $}, Γ, Θ, δ, q0 , ⊥, Qacc , Qrej ), where Q is a finite set of inner states, Γ is a stack alphabet, Θ is an output alphabet, q0 (∈ Q) is the initial state, ⊥ (∈ Γ ) is the bottom marker, δ : (Q − Qhalt ) × Σ̌λ × Γ → P(Q × Γ ∗ × Θλ ), where Qhalt = Qacc ∪ Qrej , Σ̌λ = Σ ∪ {λ, c| , $}, c| and $ are respectively the left and the right endmarkers, and Θλ = Θ ∪ {λ}. Following [17, 19, 20, 18, 21], our 1npda’s are assumed to reach halting states along all computation paths in O(n) steps, where n refers to input size (see [19, 20] for reasoning). When M is in inner state q, scanning σ on the input tape, and a in the topmost stack cell, a transition (p, z, τ ) ∈ δ(q, σ, a) changes q to p, replaces a by z, and writes τ on the output tape. When σ 6= λ, the input-tape head must move to the 3 right. In contrast, when M makes a λ-move, its input-tape head reads λ and does not move. As for a 1nfa N , we use another transition function of the form δ : (Q − Qhalt ) × Σ̌λ → P(Q × Θλ ). It is important to remark that N is allowed to make λ-moves like 1npda’s. Along each computation path, a machine may produce a string on the out- put tape. We call such a string a valid output (or a legitimate output) if it is produced along an appropriate accepting computation path. Only valid outputs are considered as the machine’s true “outputs” and we automatically disregard any string produced along any rejecting computation path. This convention is particularly important for one-way machines because this makes it possible for such a one-way machine to start producing all possible strings nondeterministi- cally while reading an input string and then to invalidate any unwanted strings by simply entering rejecting states in the time of halting. We say that a machine with two heads on input and output tapes is syn- chronous if, whenever the input-tape head moves to the right, the output-tape head also moves to the right, and vice visa. To treat input instances of a given “2-ary” partial function f on a model of “1-way” machine M , we use the aforementioned track notation. For technical reason, unlike the case of Turing machines, we do not use two-tape 1nfa’s and 1npda’s to access two independent input strings x and y; instead, we split a single input tape into two tracks and write x and y on these tracks separately. More precisely, when a pair (x, y) of strings is given as an input instance to the machine M , we place x in the upper track and y in the lower track as [ xy] so that the tape head can read x and y simultaneously from left to right. More formally, if we wish to compute f on input (x, y), we start its underlying machine M whose input tape contains the string of the form [ xy]. From this formalism, we write f ([ xy]), instead of f (x, y), to describe the function f that takes an input of the form [ xy]. The notations REG and CFL stand for the family of all regular languages and that of all context-free languages, respectively. For each number k ∈ N+ , the k- conjunctive closure of CFL, denoted CFL(k), is defined recursively as CFL(1) = CFL and CFL(k + 1) = {A ∩ B | A ∈ CFL(k), B ∈ CFL} (cf. [22]). The advised language family REG/n consists of all languages L for which there exist an advice alphabet Γ , a length-preserving total function (called an advice function) x h : N → Γ ∗ , and a language A ∈ REG satisfying L = {x | [ h(|x|)] ∈ A} [12]. Similarly, by replacing REG in REG/n with CFL, we obtain CFL/n [15, 16]. Moreover, P (resp., NP) is composed of all languages recognized by one- tape deterministic Turing machines or DTMs (resp., one-tape nondeterministic Turing machines or NTMs) in polynomial time. 2.2 Multi-Valued Partial Functions and Refinement Throughout this paper, similarly to [20], the generic term “function” refers to “multi-valued partial function,” provided that single-valued functions are viewed as a special case of multi-valued functions and partial functions include total 4 functions. We are mostly interested in multi-valued partial functions mapping?? Σ ∗ to Γ ∗ for certain alphabets Σ and Γ , not necessarily limited to {0, 1}. When- ever f is single-valued, we often write f (x) = y instead of y ∈ f (x). The domain dom(f ) of f is defined to be the set {x ∈ Σ ∗ | f (x) 6= ∅}. When x 6∈ dom(x), f (x) is said to be undefined. A multi-valued partial function f : Σ ∗ → P(Γ ∗ ) is called polynomially bounded if there exists a polynomial p such that, for any two strings x ∈ Σ ∗ and y ∈ Γ ∗ , if y ∈ f (x) then |y| ≤ p(|x|) holds. We understand that all function classes dealt with in this paper are assumed to be polynomially bounded. Given two alphabets Σ and Γ , a multi-valued partial function f : Σ ∗ → P(Γ ∗ ) is called length preserving if, for any two strings x ∈ Σ ∗ and y ∈ Γ , y ∈ f (x) implies |x| = |y|. Similarly, f is length decreasing if y ∈ f (x) implies |y| < |x| for any x, y ∈ Σ ∗ . Given two multi-valued functions f and g, we say that g is a refinement of f , denoted by f vref g, if (1) dom(g) = dom(f ) and (2) for every input x ∈ dom(f ), g(x) ⊆ f (x) (as a set inclusion) holds (see [11]). When Condition (1) is replaced by (10 ) dom(f ) ⊆ dom(g), we say that g is a pseudo refinement (+) of f , denoted by f vref g. For two sets F and G of functions, F vref G (resp., (+) F vref G) if, for every function f ∈ F, there exists a function g ∈ G for which (+) (+) f vref g (resp., f vref g). Clearly, F vref G implies F vref G. When g is further single-valued, we often call f a single-valued (pseudo) refinement of f . The following four basic function classes were introduced in [17]. The function class CFLMV is composed of all multi-valued functions f , each of which maps Σ ∗ to P(Σ ∗ ) for a certain alphabet Σ and there exists a 1npda N with a 1-way read-once input tape together with a write-once output tape such that, for every input x ∈ Σ ∗ , f (x) is the set of all valid outcomes of N on the input x. CFLMVt is a natural restriction of CFLMV onto total functions. Two classes CFLSV and CFLSVt are subclasses of CFLMV and CFLMVt consisting only of single- valued functions. Another function class CFLMV(2) contains all multi-valued functions f such that there are two functions g1 and g2 in CFLMV satisfying f (x) = g1 (x) ∩ g2 (x) (set intersection) for any string x [21]. In a similar way, CFL2V(2) and CFL2Vt (2) are defined. 3 Efficiently Selective Languages and Their Basic Properties In the polynomial-time setting, multi-valued functions have been used to de- fine the notions of NPMV- and ΣkP MV-selectivity. Here, we introduce CFLMV- selectivity and its variants. ?? To describe a multi-valued function f , the expression “f : Σ ∗ → Γ ∗ ” is customarily used in the literature. This is equivalent to saying that f is a total mapping from Σ ∗ to P(Γ ∗ ). 5 3.1 Definition of Selectors and Selective Languages We begin with a general notion of selectors. Definition 1. Given a multi-valued function f : Σ ∗ × Σ ∗ → P(Σ ∗ ) (called a selector), a language A is f -selective if, for any pair x, y ∈ Σ ∗ , (a) f ([ xy]) ⊆ {x, y}, and (b) {x, y} ∩ A 6= ∅ implies both f ([ xy]) 6= ∅ and f ([ xy]) ⊆ A. As an essential difference from two-way Turing machines, when a 1npda on input [ xy] produces a string, say, s on its output tape along each computation path, the 1npda cannot scan the passed tape cells again and thus there seems no general way of knowing that the produced string s is x or y unless the 1npda is synchronous. To circumvent the lack of such a post-production checkup of the output strings, we require the following extra technical setup. When a selector handles a string of the form [ xy], we want to provide an underlying machine with [ x$ y$2 ], where $1 and $2 are designated endmarkers used for the first and the second 1 tracks. To do so, we introduce the notion of “endmarked extensions.” Given a selector f , another function g is called the endmarked extension of f if, for any distinct x, y ∈ Σ ∗ , (1) f ([ xy]) = ∅ iff g([ x$ x x$1 y$2 ]) = ∅, (2) x ∈ f ([ y ]) iff x$1 ∈ g([ y$2 ]), 1 x (3) y ∈ f ([ y ]) iff y$2 ∈ g([ x$ x x$1 y$2 ]), and (4) x ∈ f ([ x]) iff g([ x$2 ]) ∈ {x$1 , x$2 }. For 1 brevity, we call x$1 and y$2 endmarked strings. Definition 2. Given a class F of multi-valued partial functions, a language A is F-selective if there exists a multi-valued function f and its endmarked extension f˜ such that (i) A is f -selective and (ii) both f and f˜ belong to F. We write F-SEL for the collection of all F-selective languages. We stress that Definition 2 does not change the well-known selective lan- guage families, such as P-SEL and NPSV-SEL. For a further discussion on our introduction of endmarked extension, see Section 6. Unlike P-SEL, however, we do not intend to discuss DFA-SEL and DCFL-SEL throughout this paper because they are too restrictive in practice to contain even their associated underlying language families REG and DCFL, respectively. Let us make simple observations. Obviously, for any two function classes F and G, if F ⊆ G, then F-SEL ⊆ G-SEL. Next, we remark that a 1nfa can decide which of two given strings x and y in the form of [ xy] is lexicographically smaller than the other. This fact will be used in later arguments. Let ≤ denote the lexicographic order (i.e., sort firstly by length and sort secondly by a fixed alphabetical order). Lemma 3. The function f defined as f ([ xy]) = min{x, y}, where min is taken according to the lexicographic order, can be computed by a certain 1nfa. Proof. Given the function f , let us consider the following 1nfa M . On input [ xy], first “guess” either x or y (i.e., nondeterministically choose either x or y), write down the guessed string, say, s. Assuming |x| = |y|, determine whether or not s = min{x, y} and then check whether |x| = |y| by reading the entire input. 6 In the case of |x| =6 |y|, we immediately invalidate this computation path by entering a rejecting state. More precisely, we compare x and y symbol by symbol from left to right to check whether x precedes y or y precedes x according to the lexicographic order ≤. If we discover that x precedes y, then we check whether s = x and |x| = |y|. If so, we accept; otherwise, we reject. When y precedes x, we do the same by exchanging x and y. 2 A quick application of the above lemma is the existence of special selectors. We say that a selector f is symmetric if f ([ xy]) = f ([ xy ]) for any pair x, y ∈ Σ ∗ . It is possible to make certain selectors symmetric. Lemma 4. For every language A in CFLSVt -SEL, there always exists a sym- metric CFLSVt -selector f for A. This statement also holds for CFLSV-SEL, CFL2V-SEL, and CFL2Vt -SEL. Lemma 5. CFLSVt -SEL = co-(CFLSVt -SEL) and co-(CFLSV-SEL) ⊆ CFL2V-SEL. The F-selectivity is associated with a certain partial pre-ordering, which sat- isfies reflexivity and transitivity. We give a simple example of CFLSVt -selective language using the standard lexicographic ordering. Given a machine M and an input x, M (x) = 1 (resp., M (x) = 0) means that M accepts (resp., rejects) x. Example 6 (1) All regular languages are NFASVt -selective; namely, REG ⊆ NFASVt -SEL. To see this fact, for each regular language A, we take an arbitrary 1dfa M and construct its associated function f as follows. Define f ([ xy]) = x if (i) M (x) = M (y) = 1 and x ≤ y, (ii) M (x) = 1 6= M (y), or (iii) M (x) = M (y) = 0. It is not difficult to see that f is indeed a selector and in NFASVt -SEL. (2) Given a “single-valued” symmetric selector f for a language A over Σ, we write x f y if f ([ xy]) = x. This binary relation f becomes a partial pre-order on Σ ∗ and it satisfies that, for every y ∈ Σ ∗ , y ∈ A implies {x | x f y} ⊆ A, and y ∈/ A implies A ⊆ {x | x f y}. This property will be used later. 3.2 Closure Properties It is known that CFL is closed under union and inverse homomorphism but not intersection. Most F-SEL’s are naturally closed under union. By sharp con- trast, intersections of even NFASVt -selective languages possess arbitrarily high complexities. Lemma 7. Let us consider an arbitrary language A over the binary alphabet Σ = {0, 1} and assume that kA ∩ Σ n k ≤ 1 holds for all n ∈ N+ . There exist two languages L1 , L2 ∈ NFASVt -SEL satisfying A = L1 ∩ L2 . Proof. Using the standard lexicographic ordering ≤ on Σ ∗ , we define L1 = {x ∈ Σ ∗ | ∃y ∈ A ∩ Σ |x| [x ≤ y]}. By Lemma 3, the function f ([ xy]) = min{x, y}, where “min” is according to ≤, belongs to NFASVt . Thus, f is an NFASVt -selector for 7 L1 . In a similar way, we define L2 = {x ∈ Σ ∗ | ∃y ∈ A ∩ Σ |x| [y ≤ x]}. It then follows that A = L1 ∩ L2 . 2 Since we can freely choose A in Lemma 7, we instantly obtain the following consequence. This is similar to the non-closure property of the class P-SEL of all P-selective sets under intersection [5]. Let TALLY be the set of all tally languages over arbitrary alphabets Σ (i.e., languages consisting only of strings of the form ai for a certain fixed symbol a ∈ Σ). Proposition 8. NFASVt -SEL is not closed under intersection. The same holds for CFLSVt -SEL and CFLSV-SEL. Proof. Take a tally language A not in NFASVt -SEL. By Lemma 7, there are two languages L1 , L2 ∈ NFASVt -SEL for which A = L1 ∩ L2 . If NFASVt -SEL is closed under intersection, then A must be in NFASVt -SEL, a contradiction. 2 Note that CFL ∩ TALLY ⊆ NFASVt -SEL because all “unary” context- free languages are regular. However, it is not known whether or not CFL ⊆ CFLSVt -SEL. Next, we argue various closure properties of CFLSVt -SEL. The first property is under a certain weak form of many-one reduction. Given two languages A and B, we say that A is NFASVt -translatable, denoted by A ≤NFASV trans t B, if there exists a synchronous 1nfa N (called a translator) such that, on input x, N produces a single valid string, say, w on its output tape such that x ∈ A iff w ∈ B. If we further require N to satisfy the length decreasing property (i.e., the length of any string produced on the output tape must be strictly shorter than the input length), then we instead use the term of length-decreasing NFASVt - translatability. We claim the following lemma. In comparison, P-SEL is known to be closed downward under polynomial-time truth-table reductions [10]. Lemma 9. CFLSVt -SEL is closed downward under NFASVt -translations; that is, for any two languages A and B, if A ≤NFASV trans t B and B ∈ CFLSVt -SEL, then A ∈ CFLSVt -SEL. Proof. For any two languages A and B, assume that A ≤NFASV trans t B via a translator N with synchronized tape heads and that B ∈ CFLSVt -SEL via a selector f . Let Nf denote a 1npda computing f . For simplicity, we assume that A and B are languages over the same alphabet, say, Σ. We want to show that A ∈ CFLSVt -SEL. Given two strings x, y ∈ Σ ∗ , assume that N (x) produces a single valid string, say, z and similarly N (y) produces w. Next, we define g([ xy]) = x if f ([ wz ]) = z, and g([ xy]) = y if f ([ wz ]) = w. We claim that g is a selector for A. Let us consider the case where {x, y}∩A 6= ∅. If x ∈ A and y ∈ / A, then z ∈ A and w ∈ / A; thus, we obtain g([ xy]) = x ∈ A. Assume that x, y ∈ A. Since z, w ∈ A by the definition of N , we conclude that {x, y} ∩ g([ xy]) 6= ∅. Next, we show that g is in CFLSVt by constructing an appropriate 1npda M computing g. On input [ xy], M simulates both N (x) and N (y) simultaneously by reading [ xy] symbol by symbol since N ’s tape heads are synchronized, and 8 it produces [ wz ] on an “imaginary” output tape. During this process, as each symbol σ of the string [ wz ] is produced on the imaginary tape, M simulates one step (together with a series of λ-moves) of Nf on σ. It is not difficult to see that M computes g correctly. 2 As for P-selectivity, Buhrman and Torenvliet [1] demonstrated that a lan- guage A is in P iff A is polynomial-time self-reducible and P-selective. In a similar spirit, we show the following theorem concerning NFASVt -translatability. Theorem 10. Given any language A, if A is CFLSVt -selective and A is length- decreasing NFASVt -translatable to A, then A belongs to CFL ∩ co-CFL. Proof. Given any language A, let M denote a 1nfa that translates A to A. Let f be a CFLSVt -selector for A and take an appropriate 1npda Mf computing f . To show that A ∈ CFL, we define N as follows. On input x, N runs M on x. While M produces z along a certain computation path, N produces [ xz] on an imaginary tape of N . Note that x 6= z since x ∈ A iff z ∈ A. Whenever a single symbol σ is produced on the imaginary tape, N simulates one step (together with a series of λ-moves, if necessary) of Mf on σ using a stack and the imaginary tape. Since |z| < |x|, by simulating Mf , N easily notices which of x and z is a valid outcome of Mf . If f ([ xz]) = x, then N accepts; otherwise, N rejects. Next, we show that N correctly recognizes A. Assume that x ∈ A. If f ([ xz]) = z, then we conclude that z ∈ A because {x, z} ∩ A 6= ∅. However, this implies x∈ / A, a contradiction. Thus, we obtain f ([ xz]) = x; in other words, N accepts. Next, assume that x ∈ / A. If f ([ xz]) = x, then z ∈ / A holds because, otherwise, f ([ xz]) = z must hold. Since z ∈ / A, we obtain x ∈ A, a contradiction. This implies f ([ xz]) = z, and thus N rejects. Therefore, N correctly recognizes A. By Lemma 5, A is also CFLSVt -SEL. Moreover, A is also length-preserving NFASVt -translatable to A. Since the above argument for A also works for A, we thus conclude that A is in CFL. Therefore, A belongs to CFL ∩ co-CFL. 2 Lemma 11. CFLSVt -SEL is closed under inverse homomorphism. Proof. Let A denote any language over alphabet Σ in CFLSVt -SEL and let h be any homomorphism from another alphabet Γ to Σ. Our goal is to show that the language Ah−1 = {x ∈ Γ ∗ | h(x) ∈ A} belongs to CFLSVt -SEL. Take a selector f in CFLSVt for A and consider a 1npda M computing its endmarked extension f˜. Consider the following algorithm. On input [ xy], we guess z ∈ {x, y} and produce it on an output tape. At the same time, we apply h to each symbol of [ xy]. By adjusting the size of obtained strings by h symbol by symbol, we run M using an “imaginary” output tape. When M ends its computation, M produces either $1 or $2 , which indicates which of x and y is actually produced. We accept the input if the string produced by M matches the guessed string; otherwise, we reject the input. 9 4 Power and Limitation of Selective Languages We will continue exploring basic properties of F-selective languages. In Example 6(1), we have already shown that REG ⊆ NFASVt -SEL. Hereafter, we intend to establish a further connection between CFL variants and their selectivity. Proposition 12. (1) CFL ⊆ CFL2V-SEL. (2) CFL ∩ co-CFL ⊆ CFLSVt -SEL. (3) CFL(2) ⊆ CFL2V(2)-SEL. Proof. (1) For each language A ∈ CFL, take a 1npda M that recognizes A. To show that A ∈ CFL2V-SEL, it suffices to construct an appropriate selector for A. In what follows, we will describe a 1npda N equipped with a write-once output tape computing this selector. Let (x, y) be any input pair. Assume that [xy] is given to an input tape of N . On this input [ xy], N guesses (i.e., nondeterministically chooses) either x or y and runs M on this guessed string. Assume that N has guessed a string s ∈ {x, y}. While reading s, N copies it onto the output tape symbol by symbol. Along each computation path, when M halts in an accepting state or a rejecting state, N enters the same inner state. Remember that, if N enters a rejecting state along a computation path, the string s produced on the output tape is automatically invalidated. Let f ([ xy]) be the valid outcome of N on the input [ xy]. It is obvious that f is a two-valued partial function. Since f ([ xy]) ⊆ {x, y} for any pair (x, y), f belongs to CFL2V. It is easy to show that f ([ xy]) ⊆ A when {x, y} ∩ A 6= ∅. Thus, A is in CFL2V-SEL. (2) Assume that A ∈ CFL ∩ co-CFL. Recall from [21, Lemma 2.1] that A ∈ CFL ∩ co-CFL iff χA ∈ CFLSVt . By this equivalence, we obtain χA ∈ CFLSVt . Take a 1npda M computing χA . We then define another 1npda N as follows. On input [ xy], guess two strings s, t ∈ {x, y} (nondeterministically). In the case of s = x, we run M on s, and simultaneously writes t on the output tape. Along a computation path, assuming t = x, if M outputs 1, then N rejects; otherwise, N accepts. In contrast, assuming t = y, if M outputs 1, then N rejects; otherwise, N accepts. In the case of s = y, N runs M on s and writes y. If M outputs 1, then N accepts; otherwise, N rejects. Let g be a multi-valued partial function computed by N . It is easy to show that f is total and single-valued. Moreover, it is not difficult to show that g is a selector for A and in CFLSVt . (3) Let L be any language in CFL(2) and take two languages A1 , A2 ∈ CFL satisfying L = A1 ∩ A2 . By (1), there are CFL2V-selectors f1 and f2 for A1 and A2 , respectively. We define g(z) = f1 (z) ∩ f2 (z) for any z. Clearly, g belongs to CFL2V(2). Since g is a selector for L, L belongs to CFL2V(2)-SEL. 2 The family NFASVt -SEL contains REG by Example 6(1). However, it is not powerful enough to include DCFL. Proposition 13. DCFL * NFASVt -SEL. Proof. Consider the non-regular language Lab = {an bn | n ∈ N} over the binary alphabet Σ = {a, b}. Assume that Lab ∈ NFASVt -SEL and take a selector f for Lab in NFASVt . This f induces a partial pre-order f on Σ ∗ as in Example 10 6(2). Note that {x ∈ Σ n | x f an bn } ⊆ Lab and Lab ⊆ {x ∈ Σ n | i, j ∈ N, x f ai bj , i + j = n, i 6= j}. Thus, Lab equals {x ∈ Σ ∗ | n ∈ N, x f an bn }. It then follows that an bn ≤f ai bj for any i, j with i + j = n. We define co-1nfa N as follows. On input ai bj (with i + j = n), guess ak bl (with k + l = n), run N on i j i j [ aakbbl ] to see f ([ aakbbl ]) = ai bj (that is, ai bj f ak bl ). If so, we accept, or else we reject. We then conclude that x ∈ Lab iff all computation paths of N are accepting. Since N is a co-1nfa, this characterization of Lab implies that Lab is in REG, a contradiction. 2 In Proposition 12(1), we have already seen that CFL ⊆ CFL2V-SEL. We fur- ther relate CFL ⊆ CFLSV-SEL to a pseudo refinement of CFL2Vt by CFLSVt . (+) Lemma 14. If CFL ⊆ CFLSV-SEL, then CFL2Vt vref CFLSVt . Proof. Assume that CFL ⊆ CFLSV-SEL. Take any function f ∈ CFL2Vt having its range D of size exactly 2. Let D = {y1 , y2 } and let d = max{|y| : y ∈ D}. Let M denote a 1npda computing f . Define L = {[ xy] | |x| ≥ d, y ∈ f (x)}, which belongs to CFL since D is finite. Since L ∈ CFLSV-SEL by our assumption, take a CFLSV-selector g for L and let N be its underlying 1npda that computes g with a write-once output tape. We then define g̃ as ĝ(x) = g([ uv]) for u = [ yx1 ] and v = [ yx2 ]. Note that ĝ is a single-valued function. Consider the following 1npda, say, N . On input x, compute ĝ on x and produce its outcome on an imaginary tape. Instead of producing an outcome of the form [ xy] with y ∈ D, write down y on the output tape. Let h be a function computed by this machine N . Note that h is in CFLSVt since f is a total function. (+) Finally, we want to show that f vref h. Assume that f (x) 6= ∅. Consider the case where f (x) = {y1 , y2 }. Since [ yx1 ], [ yx2 ] ∈ L, ĝ(x) must output a single x string of the form [ w ] in L, where w ∈ {y1 , y2 }. By the definition of h, we obtain h(x) = w. The case of f (x) = {y} ⊆ D is similarly treated. 2 Lemma 14 helps us prove the following assertion. Proposition 15. CFL2V-SEL = CFLSV-SEL ⇒ CFL ⊆ CFLSV-SEL ⇒ CFL2Vt -SEL = CFLSVt -SEL. Proof. Assume that CFLSV-SEL = CFL2V-SEL. Since CFL ⊆ CFL2V-SEL by Proposition 12(1), it follows that CFL ⊆ CFLSV-SEL. Assuming CFL ⊆ CFLSV-SEL, we show that CFL2Vt -SEL = CFLSVt -SEL. For any language L ∈ CFL2Vt -SEL, take a CFL2Vt -selector f for L. By Lemma 14, there exists (+) a g ∈ CFLSVt satisfying f vref g. Since dom(f ) ⊆ dom(g) and g([ xy]) ⊆ x f ([ y ]) ⊆ {x, y}, we conclude that g is a selector for L as well. Hence, L belongs to CFLSVt -SEL. 2 We further discuss the limitations of CFLSVt -SEL and CFLMV(2)-SEL. It is known that DCFL * REG/n [12] and CFL(2) * CFL/n [21, Corollary 3.10]. These separation results together with Proposition 12 yields the following immediate consequence. 11 Proposition 16. (1) CFLSVt -SEL * REG/n. (2) CFL2V(2)-SEL * CFL/n. Proof. (1) Assume that CFLSVt -SEL ⊆ REG/n. By Proposition 12(2), we conclude that CFL ∩ co-CFL ⊆ REG/n. Since DCFL ⊆ CFL ∩ co-CFL, we obtain DCFL ⊆ REG/n. This contradicts the separation DCFL * REG/n [12]. (2) Similarly to (1), assume that CFL2V(2)-SEL ⊆ CFL/n. Since CFL(2) ⊆ CFL2V(2)-SEL by Proposition 12(3), we conclude that CFL(2) ⊆ CFL/n, a contradiction against CFL(2) * CFL/n [21]. 2 5 Extensions to Higher Complexity Classes over CFL Analogously to the polynomial(-time) hierarchy over P and NP, the CFL hierar- chy was introduced in [19] over DCFL and CFL by way of respectively viewing DCFL and CFL as P and NP. We quickly review the definition of this intriguing hierarchy. An oracle 1npda is a 1npda equipped with a write-once query tape by which the 1npda makes queries (adaptively) to a given oracle. Such a query tape is marked by two endmarkers {|c, $}, where $ is placed on a tape cell whose index is bounded by O(n) [19]. Every time when a query is made, the query tape is au- tomatically initialized with the endmarkers. The role of the oracle is to reset the oracle 1npda’s inner state to a “yes” state (qyes ) or a “no” state (qno ) according to the case where a query word belongs to the oracle or it does not, respectively. Given a language A, the notation CFLA T (or CFLT (A)) expresses the collection of all languages recognized by those oracle 1npda’s with an access to the oracle A. ForSa language family C, we use the notation CFLCT (or CFLT (C)) for the union A∈C CFLA T . The CFL hierarchy is {∆k CFL , ΣkCFL , ΠkCFL | k ∈ N+ } [19], CFL CFL CFL where ∆1 = DCFL, Σ1 = CFL, Πk = co-ΣkCFL , Σk+1 CFL = CFLT (ΠkCFL ), CFL CFL and ∆k+1 = DCFLT (Πk ) for each k ≥ 1. We have seen a lower bound of the complexity of CFLSVt -SEL in Proposition 12. In what follows, we show an upper bound of the complexity of CFLSVt -SEL. In comparison, we note that P-SEL ⊆ NP/lin ∩ co-NP/lin [4]. It is important to note that our advice can use an arbitrary alphabet, not necessarily limited to {0, 1} in the case of NP/lin. Proposition 17. CFLSVt -SEL ⊆ CFLT (CFL(2))/n ∩ co-CFLT (CFL(2))/n. Proof. Since co-(CFLSVt -SEL) = CFLSVt -SEL by Lemma 5, it suffices to prove that CFLSVt -SEL ⊆ CFLT (CFL(2))/n. Let L be any language with a symmetric CFLSVt -selector f . Let f˜ denote the endmarked extension of f . We consider a tournament (graph) G = (V, E) induced by f as V = L ∩ Σ n and E = {(a, b) | a 6= b, f ([ ab]) = b}. We use the following result by Hohn, Landau, and Vaughan (cited in [14] and also in [3]): given a k-tournament G = (V, E), there is a special node from which all nodes can be reached via paths of length at most 2. We use the advice alphabet {0, 1, a} and define h(n) = an if L ∩ Σ n = ∅, and wn otherwise, where wn (∈ L) is the special player (i.e., node) of the tournament 12 from which all the other players (nodes) can be reached along paths of length at u most 2. We define A = {[ w ] | u = [ xz], (w, x) ∈ E ∨ [(w, z) ∈ E ∧ (z, x) ∈ E]}. We u claim that A is in CFL(2). We define B1 as follows: on input [ w ] with u = [ xz], nondeterministically check one of the following two conditions: (i) f˜([ w$ x$1 ]) = x$2 1 and (ii) w 6= z and f˜([ w$ z$2 ]) = z$2 . Let B2 be defined similarly by replacing (ii) 1 with the following condition: (ii0 ) z 6= x and f˜([ x$ z$1 2 ]) = x$2 . It follows that B1 , B2 ∈ CFL. Since A = B1 ∩ B2 , we obtain A ∈ CFL(2). Next, we define an u u oracle 1npda N as follows: on input [ w ], guess z and query [ w ] with u = [ xz] to A. If A answers affirmatively, then N accepts; otherwise, N rejects. Since x L = {x | [ h(|x|)] ∈ A}, L belongs to CFLT (A)/n ⊆ CFLT (CFL(2))/n. The second part CFLSVt -SEL ⊆ co-CFLT (CFL(2))/n follows from the fact that CFLSVt -SEL is closed under complementation. 2 C Multi-valued functional versions of CFLA T and CFLT are denoted respec- A C tively by CFLMVT (or CFLMVT (A)) and CFLMVT (or CFLMVT (C)) [18]. The CFLMV hierarchy then consists of language families: Σ1CFL MV = CFLMV, CFL and Σk+1 MV = CFLMVT (ΠkCFL ), and ΠkCFL MV = co-ΣkCFL MV for every level k ≥ 1 [20]. Similarly, ΣkCFL SV is defined for every k ≥ 1. Here, we only remark that Lemmas 4–5, Theorem 10, and Proposition 12(1) are easily extended to an arbitrary level k ≥ 1. For instance, a generalization of Proposition 12(1) is stated as follows. Proposition 18. Let k ≥ 1. (1) ΣkCFL ⊆ ΣkCFL 2V-SEL. (2) ΠkCFL ⊆ CFL Σk+1 2V-SEL. Corollary 19. For any level k ≥ 1, if ΠkCFL * ΣkCFL 2V-SEL, then ΣkCFL 2V-SEL 6= Σk+1 CFL 2V-SEL. CFL Proof. Since ΠkCFL ⊆ Σk+1 2V-SEL holds by Proposition 18, the assumption ΠkCFL * Σk 2V-SEL implies ΣkCFL 2V-SEL 6= Σk+1 CFL CFL 2V-SEL. 2 Proposition 12(2) can be also extended into a higher level; however, its proof is more involved than that of the proposition. Theorem 20. For any level k ≥ 1, ΣkCFL ∩ ΠkCFL ⊆ ΣkCFL SVt -SEL. Corollary 21. Let k ≥ 2. If ΣkCFL = ΠkCFL , then ΣkCFL ⊆ ΣkCFL SVt -SEL. Proof of Theorem 20. We first note the following claim. Claim 22 For any index k ≥ 1. Given a language A, A ∈ ΣkCFL ∩ ΠkCFL iff χA ∈ ΣkCFL SVt . The case of k = 1 of Claim 22 was already shown as [21, Lemma 2.1] and its generalization for k ≥ 2 can be similarly proven. Let A ∈ ΣkCFL ∩ ΠkCFL . Claim 22 implies that χA is in ΣkCFL SVt . Define x f ([ y ]) = [ uv], where u = χA (x) and v = χA (y). Next, we claim the following. 13 Claim 23 Let k ≥ 2 and let p be a function in ΣkCFL SVt . The function g defined as g([ xy]) = [ p(x) ∗ CFL p(y)] for all x, y ∈ Σ belongs to Σk SVt . The proof of this claim is quite technical. One may prove it by first estab- lishing that underlying oracle 1npda’s can be replaced by oracle 1nfa’s having access to Dyck languages, as shown for ΣkCFL in [19, arXiv version]. Proof of Claim 23. This proof relies on a result of [19, arXiv version]. Let k ≥ 2 and let p ∈ ΣkCFL SVt ∩ Ff in . Recall that, in [19], for any language A, NFA,A NFA,A two language families Σm,k and Πm,k were introduced. In a similar way, we A A NFA NFA can define Σm,k SVt and Πm,k SVt as follows. Let Σm,1 NFA SVt A = (NFASVt )Am, Π NFA,A NFA Πm,k SVt A = co-Σm,k NFA SVt A , and Σm,k+1 NFA SVt A = (NFASVt )mm,k−1 for any k ≥ 1. Following an argument given in [19, arXiv version], we obtain the following claim. Claim 24 For any k ≥ 1, ΣkCFL SVt = Σm,k NFA SVt DY CK , where DY CK denotes the Dyck language. Since f ∈ ΣkCFL SVt , take an oracle 1nfa M that computes f with access NFA,DY CK to oracle B in Πm,k−1 . Consider the following machine N . On input [ xy], N simulates in parallel both M (x) and M (y) without using any stack. If M (x) produces u and M (y) produces w on their query tapes, then N produces [ w̃ ũ ] on its query tape, where ũ and w̃ are \-extensions of u and w. Moreover, N outputs [ ab], where a and b are outcomes of M on x and y, respectively. Finally, define C = {[ w̃ũ ] | u, w ∈ B}. It was shown in [19] that Π CFL = Π NFA,DY CK . k−1 m,k−1 CFL NFA,DY CK Since C ∈ Πk−1 , we obtain C ∈ Πm,k−1 . By the definition of N and C, the function defined by g([ xy]) = [ ff (x) (y)] is computed by N with the oracle C. Therefore, g belongs to ΣkCFL SVt . 2 By Claim 24, it follows that the function f belongs to ΣkCFL SVt . Next, con- sider the following oracle 1npda, say, N . On input [ xy], guess b ∈ {0, 1} and write x (resp., y) on an output tape if b = 0 (resp., b = 1). At the same time, calculate f ([ xy]) and let z be its valid outcome (along a certain accepting computation path). If b = 0 and z is either [ 00] or [ 11], then N accepts. In the case where z is [ 01] (resp., [ 10]), N accepts if b = 1 (resp., b = 0). In all other cases, N rejects. Notice that N runs with synchronized tape heads. Let h denote the function computed by N . It is not difficult to show that h is indeed a selector for A. 2 Let us consider a basic relationship between “refinement” and “selectivity.” (+) Proposition 25. For each level k ≥ 2, ΣkCFL 2V vref ΣkCFL SV implies ΣkCFL ⊆ ΣkCFL SV-SEL. The same statement holds when ΣkCFL SV is replaced by ΣkCFL SVt . (+) Proposition 26. Let k ≥ 2. If ΣkCFL = ΠkCFL , then ΣkCFL 2V vref ΣkCFL SVt . 14 Proof. It is shown in [19, arXiv version] that ΣkCFL = ΠkCFL implies ΣkCFL = CFL Σk+1 . From ΣkCFL = ΠkCFL , it follows that Σk+1 CFL CFL MV vref Σk+1 SV [20]; actu- ally, we can improve this to (*) ΣkCFL MV vref ΣkCFL SV. (+) Claim 27 For every k ≥ 2, ΣkCFL SV vref Σk+1 CFL SVt . Proof of Claim 27. Let f ∈ ΣkCFL SV. Assume that k ≥ 2. Since ΣkCFL SV = CFL CFLSVT (Πk−1 ) by the definition, take an oracle 1npda M and an oracle B in CFL Πk−1 computing f . Let C = {1y | y ∈ B} ∪ {0x | f (x) 6= ∅}. Since {1y | y ∈ B} CFL is in Πk−1 and {0x | f (x) 6= ∅} is in ΣkCFL , it follows that C belongs to ΣkCFL . Consider the following machine. On input x, guess b ∈ {0, 1}. If b = 1, then simulate M with C as an oracle. If b = 0, then query 0x to C. If C’s answer is “yes,” then reject; if the answer is “no,” then accept. Moreover, whenever M queries y, N queries 1y. Let g denote the function computed by N with C. Note that g belongs to CFL CFLSVt C ⊆ CFLSVt Σk CFL ⊆ Σk+1 SVt . Moreover, it is not difficult to show that g is a total function and that, if x ∈ dom(f ), f (x) = g(x). Therefore, g is a pseudo refinement of f . 2 (+) By Claim 27 together with (*), we obtain ΣkCFL 2V vref Σk+1 CFL SVt . Since CFL CFL CFL CFL Σk = Σk+1 , we obtain Σk SVt = Σk+1 SVt . Therefore, we can conclude (+) that ΣkCFL 2V vref ΣkCFL SVt . Notice that combining Propositions 25 and 26 also leads to Corollary 21. 6 A Closing Discussion The study of efficient selectivity was initiated in 1979 by Selman [9] using polynomial-time computable selectors and it has been since then expanded to other types of selectors. An introduction of selectivity to formal languages and automata theory was found in [13]. To further broaden the scope of study on the selectivity issues in formal languages and automata theory, we have considered in this paper one-way nondeterministic pushdown automata as an underlying machine model to compute selectors. It is unfortunate that we have left numerous selectivity issues unsettled throughout this paper. For example, it is unclear at this moment that the in- troduction of the notion of “endmarked extension” of selectors in Definition 2 is truly necessary. We hope to prove that the use of this endmarked extension is actually redundant and we can simplify Definition 2 without endmarked exten- sion. References 1. Buhrman, H., Torenvliet, L.: P-selective self-reducible sets: a new characterization of P. J. Comput. System Sci. 53, 210–217 (1996) 15 2. Hemachandra, L.A., Hoene, A., Naik, A.V., Ogiwara, M., Selman, A.L., Thierauf, T., Wang, J.: Nondeterministically selective sets. Int. J. Found. Comput. Sci. 6, 403–416 (1995) 3. Hemaspaandra, L., Nasipak, C., Parkins, K.: A note on linear-nondeterminism, linear-sized, Karp-Lipton advice for the P-selective sets. J. Universal Computer Science 4, 670–674 (1998) 4. Hemaspaandra, L., Torenvliet, L.: Optimal advice. Theor. Comput. Sci. 154, 367– 377 (1996) 5. Hemaspaandra, L.A., Jiang, Z.: P-selectivity: intersection and indices. Theor. Com- put. Sci. 145, 371–380 (1995) 6. Hemaspaandra, L.A., Naik, A.V., Ogihara, M., Selman, A.L.: Computing solutions uniquely collaspses the polynomial hierarchy. SIAM J. Comput. 25, 697–708 (1996) 7. Ko, K.: On self-reducibility and weak P-selectivity. J. Comput. System Sci. 26, 209–221 (1983) 8. Meyer, A.R., Stockmeyer, L.J.: The equivalence problem for regular expressions with squaring requires exponential space. In: Proc. of the 13th Annual IEEE Sym- posium on Switching and Automata Theory. pp. 125–129 (1972) 9. Selman, A.L.: P-selective sets, tally languages, and the behavior of polynomial time reducibilities on NP. Math. Systems Theory 13, 55–65 (1979) 10. Selman, A.L.: Analogues of semirecursive sets and effective reducibilities to the study of NP complexity. Inform. Control 52, 36–51 (1982) 11. Selman, A.L.: A taxonomy of complexity classes of functions. J. Comput. System Sci. 48, 357–381 (1994) 12. Tadaki, K., Yamakami, T., Lin, J.C.H.: Theory of one-tape linear-time Turing machines. Theor. Comput. Sci. 411, 22–43 (2010) 13. Tantau, T.: On the structural similarities of finite automata and Turing machine enumerability classes. Doctoral Dissertation, Elektrotechnik und Informatik der Technischen Universität Berlin (2003) 14. West, D.: Introduction to Graph Theory. Prentice Hall (1996) 15. Yamakami, T.: Swapping lemmas for regular and context-free languages (2008), manuscript. Available at arXiv:0808.4122, 2008. 16. Yamakami, T.: The roles of advice to one-tape linear-time Turing machines and finite automata. Int. J. Found. Comput. Sci. 21, 941–962 (2010) 17. Yamakami, T.: Immunity and pseudorandomness of context-free languages. Theor. Comput. Sci. 412, 6432–6450 (2011) 18. Yamakami, T.: Not all multi-valued partial CFL functions are refined by single- valued functions (extended abstract). In: Proc. of IFIP TCS 2014. Lecture Notes in Computer Science, vol. 8705, pp. 136–150. Springer (2014), There is an error in the main proof. 19. Yamakami, T.: Oracle pushdown automata, nondeterministic reducibilities, and the hierarchy over the family of context-free languages. In: Proc. of SOFSEM 2014. CEUR Workshop Proceedings, vol. 8327, pp. 514–525. Springer (2014), A complete and corrected version appears at arXiv:1303.1717. 20. Yamakami, T.: Structural complexity of multi-valued partial functions computed by nondeterminsitic pushdown automata (extended abstract). In: Proc. of the 15th Italian Conference of Theoretical Computer Science. CEUR Workshop Proceed- ings, vol. 1231, pp. 225–236 (2014) 21. Yamakami, T.: Pseudorandom generators against advised context-free languages. Theor. Comput. Sci. 613, 1–27 (2016) 16 22. Yamakami, T.: Intersection and union hierarchies of deterministic context-free lan- guages and pumping lemmas. In: Proc. of LATA 2020. Lecture Notes in Computer Science, vol. 12038, pp. 341–353. Springer (2020) 23. Yamakami, T., Kato, Y.: The dissecting power of regular languages. Inf. Process. Lett. 113, 116–122 (2013) 17