A Gray code for a regular language Elena Barcucci Antonio Bernini Renzo Pinzani elena.barcucci@unifi.it antonio.bernini@unifi.it renzo.pinzani@unifi.it Dipartimento di Matematica e Informatica “U. Dini”, Università degli Studi di Firenze, Viale G.B. Morgagni 65, 50134 Firenze, Italy. Abstract Given a sequence {am }m≥0 of strictly increasing positive integers such that a0 =P1, any non-negative integer N can be uniquely represented by N = i≥0 di ai , where di are non negative integers. The coefficients di form a string over a finite alphabet and all these strings form a language having different properties depending on the sequence. We investigate on the possibility of defining a Gray code over the language arising from particular choices of {am }m≥0 . We consider the sequence defined by a two termed linear recurrence with constant coefficients having some particular properties. 1 Introduction Given a class of combinatorial objects, it is a common problem to list them in some specified order. A famous kind of list is the well-known Gray code [Gra53] where two successive objects differ according to some fixed constraints. Very often successive objects in a Gray code are required to differ as little as possible, depending on the nature of the objects we are dealing with. For instance, if the objects are strings (of the same length) they form a Gray code if two successive strings differ only in a few positions or, more precisely, if their Hamming distance dH [Ham50] is bounded by some positive integer q (dH = i ≤ q). Gray codes have been constructed for several combinatorial structures (permutations, binary strings, Motzkin and Schröder words, derangements, involutions) and used in various technical applications such as circuit testing, signal encoding, data compression (see [BBPSV14, BBPSV15, BBPV15, BGPP07] and references therein). Even though the most well-known Gray codes have been studied in the context of the mentioned topics, here we only recall that recently [BBPSV14, BBPV17] Gray codes have been considered also in the framework of cross-bifix-free sets of strings (two strings are bifix-free if they can not be overlapped in any way, for details see for example [BPP12]). Moreover, in [BBBP17a], where cross-bifix-free sets of matrices are considered, the authors propose a Gray code where the matrices are listed in such a way that any two successive matrices differ in only one digit. Note that the problem of finding a Gray code for bidimensional structures (matrices) is also considered in [BBBP17b]; nevertheless the approach used in [BBBP17a] doesn’t seem to be not useful. In the present paper we consider a language of strings defined in [BR01] over a particular alphabet. These strings are the representations of integers by means of a system of numeration [Fra82] derived from certain sequences. Given a strictly increasing sequence of non-negative integers 1 = a0 < a1 < a2 < ..., it is possible to represent each non-negative integer N by recording the quotient dn obtained by dividing N by the largest member an of the sequence that is less than or equal to N , then dividing the remainder rn by an−1 and recording the Copyright c by the paper’s authors. Copying permitted for private and academic purposes. In: L. Ferrari, M. Vamvakari (eds.): Proceedings of the GASCom 2018 Workshop, Athens, Greece, 18–20 June 2018, published at http://ceur-ws.org 87 quotient dn−1 and so on until you get to d0 , since dividing by 1 leaves a remainder of 0. The various quotients form a string which is an element of the language. In [BR01] sequences are defined by a two termed linear recurrence depending on two parameters k and h, and the arising language, strictly depending on k and h, is seen as a combinatorial interpretation of the sequence ai = kai−1 + hai−2 . In this paper we aim at providing a Gray code for the language derived from particular conditions on k and h. We point out that the considered language was introduced in order to provide a general solution to a previous issue appeared in [BSS93], where the authors asked for a combinatorial interpretation of the recurrence fm+1 = 6fm − fm−1 , with f0 = 1, f1 = 7 (sequence M4423 of [SP96]). After some interesting answers (see [BBDD98, PP90, Sul98]), in their paper [BR01] the authors gave a general combinatorial interpretation for the recurrences of the form am = kam−1 + ham−2 , under some conditions on h and k which include the most frequently occurring two-termed recurrences. We also noted, in the previous paragraph, that this language is strictly linked to the system of numeration presented in [Fra82]. This is based on a very simple iterated division algorithm which is the main tool which produces the representation for any non-negative integer N , in terms of any sequence of the form 1 = u0 < u1 < u2 < . . . . Clearly, the particular choice of the sequence affects the representation of N , and in some particular cases interesting and useful representations can be obtained. In the references within [Fra82] several application of different systems of numeration can be found, ranging from compressing and partitioning large dictionaries, ranking permutations with repetitions, up to designing error-insensitive codes for data transmission. The definition of the Gray code in the present paper could be presented independently of the definition of numeration systems and without showing which are the links between the considered language and the sequence. Nevertheless, in order to provide a self-contained paper and for the sake of completeness, we prefer to present concepts and preliminaries useful for our purpose. Section 2 and Section 3 are devoted to the presentation of some main notion about numeration systems and properties of the considered recurrence relations. In Section 4 we recall the definition of the language introduced in [BR01] and finally we define a Gray code for listing its elements. 2 Preliminaries Given a sequence {am }m≥0 of positive integers such that a0 = 1 and am < am+1 for each m ∈ N, let N be any non-negative integer. Consider the largest term an of the sequence such that an ≤ N . More precisely, an = max{am | am ≤ N } (for the particular case N = 0, see below). We divide N by an obtaining N = dn an +rn . Obviously, for the remainder rn , it is clear that rn < an . If we divide rn by an−1 , we get rn = dn−1 an−1 + rn−1 , with rn−1 < an−1 . Then, iterating this procedure until the division by a0 = 1 (where of course the remainder is 0), we have: N = dn an + rn 0 ≤ rn < an , rn = dn−1 an−1 + rn−1 0 ≤ rn−1 < an−1 , rn−1 = dn−2 an−2 + rn−2 0 ≤ rn−2 < an−2 , ··· = ············ ············ ··· = ············ ············ r3 = d2 a2 + r2 0 ≤ r2 < a2 , r2 = d1 a1 + r1 0 ≤ r1 < a1 , r1 = d0 a0 . The above relations imply that: N = dn an + dn−1 an−1 + dn−2 an−2 + . . . . . . + d1 a1 + d0 a0 . (1) Expression (1) is the representation of N in the numeration system S = {a0 , a1 , a2 , . . . . . .}, and the string dn dn−1 . . . d1 d0 is associated to the number N (in what follows the term “representation” equivalently refers to 88 the expression (1) or to its associated string). The method presented here can be applied to every non-negative integer and in the case N = 0, clearly, all the coefficients di are 0 (in other words the representation of 0 is simply the string 0). Moreover, we have ri = di−1 ai−1 + di−2 ai−2 + . . . . . . + d1 a1 + d0 a0 < ai , (2) for each i ≥ 0. Pn It is possible to show [Fra82] that if N = i≥0 di ai with di ai + di−1 ai−1 + . . . + d1 a1 + d0 a0 < ai+1 (3) Pn for each i ≥ 0, then the representation N = i≥0 di ai is unique. For the sake of completeness, we recall the complete theorem: Theorem 2.1. Let 1 = a0 < al < a2 < . . . be any finite or infinite sequence of integers. Any non-negative n X integer N has precisely one representation in the system S = {a0 , a1 , a2 , ...} of the form N = di ai where the i≥0 di are non-negative integers satisfying (3). As an example, consider the well-known sequence of Pell numbers (sequence M1413 in [SP96]) pm = 1, 2, 5, 12, 29, . . . defined by p0 = 1, p1 = 2, pm = 2pm−1 + pm−2 . The representation of N = 16 is associ- ated to the string 1020. 3 The language and the alphabet The next step is the definition of the language arising from the representations of non-negative numbers. Given m > 0, we consider all the integers ` ∈ {0, 1, 2, . . . , am − 1}. According to the scheme of the previous section, the representations of the integers j with am−1 ≤ j < am is j = dm−1 am−1 + dm−2 am−2 + . . . + d0 a0 (so that the associated string is dm−1 dm−2 . . . d0 ), while, following the same scheme, the remaining integers have a representation with less than m digits. For example: the representation of am−1 − 1 = dm−2 am−2 + . . . + d0 a0 has m − 1 digits. For our purpose (the construction of a Gray code), we require that all the representations of the considered integers ` ∈ {0, 1, 2, . . . , am − 1} have m digits, so we pad the string on the left with 0’s until we have m digits: the representation of am−1 − 1 becomes am−1 − 1 = 0am−1 + dm−2 am−2 + . . . + d0 a0 (therefore, the associated string is 0dm−2 . . . d0 ). With this little adjustment, we now define the following sets: L0 = {ε} , Lm = {dm−1 . . . d0 | the string dm−1 . . . d0 is the representation of each ` < am in the numeration system {am }m≥0 }. Finally, we denote by L the language obtained by taking the union of all the sets Lm : L = m≥0 Lm . S We remark that each element of Lm has precisely m digits, so that some string dm−1 . . . d0 can admit a prefix constituted by a certain number of consecutive 0’s. Moreover, each Lm contains precisely am elements (which are the representations of each ` ∈ {0, 1, . . . , am − 1}). 89 Referring to the sequence of Pell numbers pm = {1, 2, 5, 12, 29, . . .} defined in Section 2, we have: L0 = {ε} L1 = {0, 1} L2 = {00, 01, 10, 11, 20} L3 = {000, 001, 010, 011, 020, 100, 101, 110, 111, 120, 200, 201} L4 = {0000, 0001, 0010, 0011, 0020, 0100, 0101, 0110, 0111, 0120, 0200, 0201, 1000, 1001, 1010, 1011, 1020, 1100, 1101, 1110, 1111, 1120, 1200, 1201, 2000, 2001, 2010, 2011, 2020} The strings in L2 are, respectively, the representations of the integers ` ∈ {0, 1, 2, 3, 4}, being a2 = 5. Note that L2 contains exactly 5 = a2 elements. It is not difficult to realize that the alphabet of the language L strictly depends on the sequence {am }m≥0 . Pi−1 In general it is possible to set an upper bound for the digits di . From (3), we deduce di ai < ai+1 − j=0 dj aj , so that, since the numbers are all integers: i−1 X di ai ≤ ai+1 − 1 − dj aj < ai+1 − 1 , j=0 leading to   ai+1 − 1 di ≤ . (4) ai   ai+1 − 1 Therefore, the alphabet for Lm is given by {0, 1, . . . , s} with s = max , i=0,1,...,m−1 ai and, denoting by Σ the alphabet for L , we have Σ = {0, 1, . . . , t} with   ai+1 − 1 t = max . i ai In this paper we focus our attention on sequences defined by linear recurrences of the form am = kam−1 +ham−2 , with suitable initial conditions and some restrictions on k and h. More precisely, we consider sequences of the form:  1 if m = 0  am = k if m = 1  kam−1 + ham−2 if m ≥ 2  where k ∈ N+ and h ∈ Z . Using standard techniques for solving recurrences (see for example [Aig07]) it is possible to show the the general term of the sequence is √ !m+1 √ !m+1 1 k+ k 2 + 4h 1 k− k 2 + 4h am = √ −√ 2 k + 4h 2 2 k + 4h 2 and, following [BR01], we require the condition k 2 + 4h ≥ 0 which assures that am ≥ 0 and am+1 > am for each m ≥ 0 (which are the hypothesis of Theorem 2.1). Actually, here we further restrict to a more limiting case for k and h. More precisely we consider the case k ≥ h ≥ 0, which, of course, implies k 2 + 4h ≥ 0. From [BR01] it is possible to deduce that in this case the alphabet Σ of the language L is Σ = {0, 1, . . . , k}. The language L is the set of words w ∈ Σ∗ such that 90 1. w = dr dr−1 . . . d1 d0 with di ∈ Σ; 2. if di = k, then di−1 < h, for i = 1, 2, . . . , r; 3. d0 6= k. In the above list, point 2. is due to the fact that, if N < ai+1 is an integer such that di = k, then, if also di−1 = h, it is N ≥ kai + hai−1 = ai+1 , which is in contrast with N < ai+1 . Moreover, since in the representation of N it is r1 < a1 = k, we have d0 < k. We recall [BR01] that the following unambiguous regular grammar generates L : S → T 0|T 1| · · · |T (h − 1)|Sh| · · · |S(k − 1)|, T → T 0|T 1| · · · |T (h − 1)|Sh| · · · |S(k)|. 4 A Gray code for Ln We introduce some notations (as in [BBPSV14]) in order to express the language L in an alternative recursive way. • If α is a symbol and L is a list of words, α · L is the list obtained by concatenating α to each string of L; • if i and j are symbols, then ij · L is the list obtained by concatenating i to each string of i · L (or equivalently ij · L = i · (j · L)); • if L is a list of word, L̄ is the list in the reverse order; • if L is a list of word, (L)i is L if i is even and L̄ if i is odd; • if L and M are two lists, L ◦ M is their concatenation; r • if Lj , Lj+1 , . . . , Lj+r are lists, `=0 Lj+` is the list Lj ◦ . . . ◦ Lj+r . • if L is a list of words, then f irst(L) is the first element of L and last(L) is the last element of L. With the above notation it is not difficult to realize that L can be defined as follows:    {ε} if n = 0       {0, 1, . . . , k − 1} if n = 1 Ln = (5)    0 · Ln−1 ∪ 1 · Ln−1 ∪ . . . ∪ (k − 1) · Ln−1 ∪    k0 · Ln−2 ∪ k1 · Ln−2 ∪ . . . ∪ k(h − 1) · Ln−2  if n ≥ 2 .  Note that from the recursive construction of Ln (n ≥ 2), if an element w ∈ Ln starts with k, then k is followed by a symbol different from h, according to the definition of the language L in the previous section. If w starts with a symbol different from k, then it can be concatenated to any element of Ln−1 . We propose the following definition providing a particular placement of the elements of L which reveals itself to be a Gray code. Definition 4.1. Given k ∈ N, h ∈ Z, with k ≥ h ≥ 0, we define the string list Ln over the alphabet Σ = {0, 1, 2, . . . , k}:    {ε} if n = 0     Ln = {0, 1, . . . , k − 1} if n = 1 (6)       k−1 i · (L  )k+i ◦  h−1 k+i  i=0 n−1 i=0 ki · (Ln−2 ) if n ≥ 2 . 91 We have: Theorem 4.2. The string list Ln is a Gray code with Hamming distance equal to one. Proof. We proceed by induction on n. If n = 1, then L1 is trivially seen to be a Gray code. Suppose that Li is a Gray code for i = 2, . . . , n − 1 where n ≥ 2 and consider the list Ln . The sub-lists arising from each parenthesis in the third case of (6) are Gray codes since the lists Ln−1 and Ln−2 (which are Gray codes by the inductive hypothesis) are read alternatively from left to right and vice versa (depending on the parity of k + 1). k+i Therefore, we have to k+i  check only the Hamming distance between last (k − 1) · (Ln−1 ) with i = k − 1 and f irst k0 · (Ln−2 ) with i = 0. In the case of k even we have: last (k − 1) · (Ln−1 )k+k−1 = last (k − 1) · Ln−1 = (k − 1)last(Ln−1 ) = (k − 1)f irst(Ln−1 ) ,   and f irst k0 · (Ln−2 )k+0 = f irst (k0 · Ln−2 ) = k0f irst(Ln−2 ) .  Easily, from (6), we have f irst(Ln−1 ) = 0f irst(Ln−2 ) , then the only different digit between (k − 1)f irst(Ln−1 ) and k0f irst(Ln−2 ) is the first one. So Ln is a Gray code. The proof in the case of k odd can be conducted with similar arguments. 5 Further developments In the present paper we considered the recurrence relation defined by am = kam−1 + ham−2 with k ≥ h ≥ 0, which is one of the two cases analysed in [BR01]. The other interesting case is k ≥ −h ≥ 0 which again leads to a strictly increasing sequence defining a language L 0 over an alphabet Σ0 slightly different from Σ, with different properties. We think that also in this case it is possible to define a Gray code for listing the elements of L 0 , with Hamming distance 1. It could be interesting to investigate on the possibilities to characterize the language in terms of restricted words, following the line of [BBBP17b] or [Ber17] where sets of pattern avoiding words and recurrence relations are considered. Acknowledgements This work is partially supported by the GNCS 2018 project “Proprietà combinatorie e rilevamento di pattern in strutture discrete lineari e bidimensionali”. References [Aig07] M. Aigner. Discrete Mathematics. American Mathematical Society, USA, 2007. [BBBP17a] E. Barcucci, A. Bernini, S. Bilotta and R. Pinzani. Cross-bifix-free sets in two dimensions. Theoretical Computer Science, 664:29–38, 2017. [BBBP17b] E. Barcucci, A. Bernini, S. Bilotta and R. Pinzani. Non-overlapping matrices. Theoretical Computer Science, 658:36–45, 2017. [BBDD98] E. Barcucci, S. Brunetti, A. Del Lungo and F. Del Ristoro. A combinatorial interpretation of the sequence fn+1 = 6fn − fn−1 . Discrete Mathematics, 190:235–240, 1998. [BR01] E. Barcucci and S. Rinaldi. Some linear recurrences and their combinatorial interpretation by means of regular languages. Theoretical Computer Science, 255:679–686, 2001. [Ber17] A. Bernini. Restricted binary strings and generalized Fibonacci numbers. Lecture Notes in Computer Science, 10248:32–43, 2017. 92 [BBPSV14] A. Bernini, S. Bilotta, R. Pinzani, A. Sabri and V. Vajnovszki. Prefix partitioned Gray code for particular cross-bifix-free sets. Cryptography and Communication, 6:359–369, 2014. [BBPSV15] A. Bernini, S. Bilotta, R. Pinzani, A. Sabri and V. Vajnovszki. Gray code orders for q-ary words avoiding a given factor. Acta Informatica, 52:573–592, 2015. [BBPV15] A. Bernini, S. Bilotta, R. Pinzani and V. Vajnovszki. A trace partitioned Gray code for q-ary generalized Fibonacci strings. Journal of Discrete Mathematical Sciences and Cryptography, 18:751–761, 2015. [BBPV17] A. Bernini, S. Bilotta, R. Pinzani and V. Vajnovszki. A Gray code for cross-bifix-free sets. Mathe- matical Structures in Computer Science, 27:184–196, 2017. [BGPP07] A. Bernini, E. Grazzini, E. Pergola and R. Pinzani. A general exhaustive generation algorithm for Gray structures. Acta Informatica, 44:361-376, 2007. [BPP12] S. Bilotta, E. Pergola and R. Pinzani. A new approach to cross-bifix-free sets. IEEE Transactions on Information Theory, 58:4058–4063, 2012. [BSS93] J. Bonin, L. Shapiro and R. Simion. Some q-analogues of the Schröder numbers arising from combina- torial statistics on lattice paths. Journal of Statistical Planning and Inference, 34:35–55, 1993. [Fra82] A. S. Fraenkel. Systems of numeration. American Mathematical Monthly, 92:105–114, 1982. [Gra53] F. Gray. Pulse code communication. US Patent 2(632), 058, 1953. [Ham50] R. W. Hamming. Error detecting and error correcting codes. The Bell System Technical Journal, 29:147–160, 1950. [PP90] E. Pergola and R. Pinzani. A combinatorial interpretation of the area of Schröder paths. Electronic Journal of Combinatorics, 6:R40, 1990. [SP96] N. J. A. Sloane and S. Plouffe. The Encyclopedia of Integer Sequences. Academic Press, New York, 1996. [Sul98] R. A. Sulanke. Bijective recurrences concerning Schröder paths. Electronic Journal of Combinatorics, 5:R47, 1998. 93