<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Pseudorandomness of Parry-Bertrand Automatic Sequences</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pierre Popoli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manon Stipulanti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Departement of Mathematics</institution>
          ,
          <addr-line>ULiège, Liège</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The correlation measure is a testimony of the pseudorandomness of a sequence s and provides information about the independence of some parts of s and their shifts. Combined with the well-distribution measure, a sequence possesses good pseudorandomness properties if both measures are relatively small. In combinatorics on words, the famous -automatic sequences are quite far from being pseudorandom, as they have small factor complexity on the one hand and large well-distribution and correlation measures on the other. This paper investigates the pseudorandomness of a specific family of morphic sequences, including classical -automatic sequences. In particular, we show that such sequences have large evenorder correlation measures; hence, they are not pseudorandom. We also show that even- and odd-order correlation measures behave diferently when considering some simple morphic sequences.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Combinatorics on words</kwd>
        <kwd>Correlation measures</kwd>
        <kwd>Abstract numeration systems</kwd>
        <kwd>Morphic sequences</kwd>
        <kwd>Automatic sequences</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Pseudorandom sequences are generated by algorithms and behave similarly to truly random
sequences. To study the pseudorandomness of a sequence, a wide variety of measures have
been introduced, called measures of complexity or measures of pseudorandomness. For instance,
from a purely theoretical point of view, the famous Kolmogorov complexity of a sequence gives
the length of the shortest program that generates it [1]; see also [2, 3]. An infinite sequence is
said to be algorithmically random if its Kolmogorov complexity grows like the length of the
sequence itself. It is well known that Kolmogorov complexity is not a computable function; for
example, see [4] for a formal proof and history behind.</p>
      <p>As a consequence, many other measures of complexity can be used, especially for practical
issues. The series of papers [5, 6], initiated by Mauduit and Sárközy in 1997, introduces and
studies several such measures. Among the most important ones, the well-distribution measure
and the correlation measure are considered in various contexts. Let s = (s())≥ 0 be a sequence
over the alphabet {0, 1}. For  ∈ Z and ,  ∈ N, write  (s, , , ) = ∑︀=− 0 1(− 1)s(+)
and for  ∈ N≥ 1,  = (1, . . . , ) ∈ N such that 0 ≤ 1 &lt; 2 &lt; · · ·
&lt; , write
− 1
 (s, , ) = ∑︁ (− 1)s(+1)+s(+2)· +s(+).</p>
      <p>=0
The Nth well-distribution measure of the sequence s is defined by
 (s,  ) = m,a,x | (s, , , )| = max
,, ⃒⃒⃒ =0
⃒⃒⃒ ∑−︁1(− 1)s(+)⃒⃒⃒⃒ ,
⃒⃒
where the maximum is taken over all , ,  ∈ N such that 0 ≤  ≤  + ( − 1) &lt;  , and
the  th correlation measure of order  of s is given by
⃒⃒ ∑−︁1(− 1)s(+1)+s(+2)· +s(+)⃒⃒⃒ ,
(s,  ) = ma,x | (s, , )| = ma,x ⃒⃒⃒ =0 ⃒⃒
where the maximum is taken over all vectors  = (1, . . . , ) ∈ N and all integers  such
that 0 ≤ 1 &lt; 2 &lt; · · · &lt;  and  +  ≤  .</p>
      <p>
        It is said that the sequence s possesses good properties of pseudorandomness if both its
measures  (s,  ) and (s,  ) are relatively small compared to  . Indeed, Alon et al. [7],
improving a former result of Cassaigne, Mauduit, and Sárközy [8], prove that the expected
order of  (s,  ) and (s,  ) when s is a truly random binary sequence is √ (log  )(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ).
Furthermore, it is crucial to simultaneously consider the two measures, since the smallness of
one does not necessarily imply that of the other; see [5] for more details. However, it is suficient
to show that the correlation measure of a sequence is large to tell it apart from a pseudorandom
sequence. Notice that the combination of the well-distribution and the correlation measures
gives birth to the well-distribution-correlation measure, which is however more complicated to
study.
      </p>
      <p>Usually, the methods to analyze correlation measures are borrowed from number theory;
see [9, 10] and the references therein for general results. However, in this paper, we study a
large family of sequences through the lens of their correlation measures by mixing tools from
automata theory and general numeration systems. Such a family is that of automatic sequences;
see Section 2 for precise definitions. In combinatorics on words, they are classical and have
many beautiful properties. Among others, they are known to be bad pseudorandom sequences
as explained hereafter. First, it is well known that their factor complexity, which counts the
number of diferent blocks of each length appearing in the whole sequence, is bounded by a
linear function and is, therefore, small. On the other hand, over an alphabet of size ℓ, the th
term of the factor complexity of a random sequence is of order ℓ; see [11, Chapter 10] for more
details. Second, Mérai and Winterhof [12] prove that the order-2 correlation of an automatic
sequence is large, as stated below.</p>
      <p>Theorem 1 ([12, Theorem 7]). Let  ≥ 2 be an integer and let s be a -automatic binary sequence

generated by a DFAO with  states. Then 2(s,  ) ≥ (+1) for all  ≥ ( + 1).</p>
      <p>The focus on correlation measures of automatic sequences dates back to the previous century.
In the late 1920s, Mahler [13] analyzes the sequence ((− 1)s2())≥ 0, where s2 : N → N is the
sum-of-digit function in base 2. In particular, he considers, for any non-negative integer , the
quantity
 () =
1

 (s2, , ) =
1 ∑−︁1(− 1)s2()+s2(+)
 =0
where  = (0, ), and shows that the sequence ( ())≥ 0 converges to some real number
(). It turns out that this quantity is related to the order-2 correlation measure of the
2automatic Thue–Morse sequence t = (t())≥ 0 as it is defined by t() = s2() mod 2 for
all  ≥ 0. We also refer to [14] for more historical background on t and specific values of
the quantity (). Later on, Mauduit and Sárközy [6] prove that 2(t,  ) ≥ 112  for  ≥ 5.
However, when Theorem 1 is applied to t, the lower bound is improved as 2(t,  ) ≥ 16  for
 ≥ 6; see [12] for more details. In a similar context, Baake and Coons [15] recently study
the correlations of t with specific values of the shifted vector , as well as their means with
renormalization techniques. Mazáč [16] carries out an analogous approach for the correlations
of the Rudin–Shapiro sequence, another distinguished automatic sequence. We also mention
the work of Grant, Shallit, and Stoll [17] (and papers following this trend), where correlations
over alphabets of size more than 2 are considered.</p>
      <p>It is known that automatic sequences are morphic; see [11, Chapter 6] and Section 2 for
precise definitions. In this more general framework, the factor complexity is bounded by a
quadratic function, which, in turn, again shows that they are far from being pseudorandom;
see [11, Chapter 10]. Consequently, our study is driven by the natural question to know whether
the order-2 correlation measure of any morphic sequence is large. As a first step towards
answering this question, we presently consider a subfamily of morphic sequences based on
Parry–Bertrand numeration systems (see Section 2 for precise definitions). In the theory of
numeration systems, they have gained the attention of the scientific community for the nice
framework they provide and have especially been used in [18, 19, 20, 21, 22].</p>
      <p>Our main result is Theorem 2 below (see again Section 2 for precise definitions). It
improves Theorem 1 in at least two ways. First, we consider a much larger class of sequences
as classical automatic sequences are  -automatic for some integer  . Second, we deal with
all even-order correlations, whereas only the case of order 2 is considered in [12]. Notice
that we also discuss the case of odd-order correlations for which such a theorem fails to hold;
see Section 3.1. Note that, for two maps ,  such that  takes only non-negative values, we
write  ≫  (resp.,  ≪ ) if there exists a positive constant  such that | ()| ≥ () (resp.,
| ()| ≤ ()) for all suficiently large .</p>
      <p>Theorem 2. Let  ≥ 1 be an integer,  ∈ R&gt;1 be a Parry number and  be the
corresponding canonical Parry–Bertrand numeration. For all binary  -automatic sequences s, we have
2(s,  ) ≫  , where the constant depends on  , , and the size of the automaton generating s.</p>
      <p>The present paper is organized as follows. In Section 2, we recall the necessary background
on combinatorics on words, automatic sequences, and numeration systems. Section 3 is divided
into two parts. In Section 3.1, we deal with the case of linearly recurrent sequences. In particular,
we highlight the fact that even- and odd-order correlation measures behave diferently, which
will imply that there is no hope of extending Theorem 2 to odd-order correlations. In Section 3.2,
we show with Theorem 2 that our specific morphic sequences have a large correlation measure
of even orders. Finally, Section 4 presents possible future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <sec id="sec-2-1">
        <title>2.1. On Combinatorics on Words</title>
        <p>As a general reference on words, we point out [23]. An alphabet is a finite set of elements called
letters. A word on an alphabet  is a sequence of letters from . It is either finite or infinite .
In order to distinguish them, infinite words are written in bold. The length of a finite word,
denoted between vertical bars, is the number of letters it is made of. The empty word is the only
0-length word. For all  ≥ 0, we let  denote the set of all length- words over . We let *
denote the set of finite words over , including the empty word, and  or N that of infinite
words over . The first is equipped with the concatenation of words. Let  and  be finite
alphabets. A morphism  : * → * is a map satisfying  () =  () () for all ,  ∈ * .
In particular,  () = . For an integer ℓ ≥ 2, a morphism is ℓ-uniform if it maps each letter
to a length-ℓ word. A 1-uniform morphism is called a coding. An infinite word x is morphic
if there exist a morphism  : * → * , a coding  : * → * , and a letter  ∈  such that
x = ( ()), where  () = lim→∞  ().</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. On Generalized Automatic Sequences</title>
        <p>
          Abstract numeration systems were introduced at the beginning of the century by Lecomte and
Rigo [24]; see also [25, Chapter 3] for a general presentation. Such a numeration system is
defined by a triple  = (, , &lt;) where  is an alphabet ordered by the total order &lt; and 
is an infinite regular language over , i.e., accepted by a deterministic finite automaton. We
say that  is the numeration language of . When we genealogically (i.e., first by length, then
using the dictionary order) order the words of , we obtain a one-to-one correspondence rep
between N and . Then, the -representation of the non-negative integer  is the ( + 1)st
word of , and the inverse map, called the (e)valuation map, is denoted by val .
Example 1. Consider the abstract numeration system  built on the language * * over the
ordered alphabet {, } with  &lt; . The first few words in the language are , , , , , , .
For instance, rep (
          <xref ref-type="bibr" rid="ref5">5</xref>
          ) =  and val () = 6.
        </p>
        <p>Let  = (, , &lt;) be an abstract numeration system. An infinite word x is -automatic if
there exists a deterministic finite automaton with output (DFAO)  such that, for all  ≥ 0,
the th term x() of x is given by the output (rep ()) of . See [11, 26, 27] for general
references.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Representation of Real Numbers</title>
        <p>
          We now recall several definitions and results about representations of real numbers; see, for
instance, [25, 28]. Let  ∈ R&gt;1 and let  = {0, 1, . . . , ⌈ ⌉ − 1}. Every real number  ∈ [0, 1)
can be written as a series  = ∑︀+=∞1   −  , where  ∈  for all  ≥ 1. We let  () denote
the  -expansion of  obtained greedily. In an analogous way, the  -expansion  (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) of 1 is
 (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = lim→1  (). We define the quasi-greedy  -expansion * (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) of 1 as follows. Write
 (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = (())≥ 1. If  (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) · · · ()0 with () ̸= 0, then * (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = ((
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) · · · ( −
1)(() − 1)); otherwise * (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) =  (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ). We define the  -shift  as the topological closure
of the set  = { () |  ∈ [0, 1)}. A real number  &gt; 1 is a Parry number if  (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) is
ultimately periodic.
        </p>
        <p>
          Example 2. Any integer  &gt; 1 is a Parry number. The golden ratio  is a Parry number for
which  (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = 11 and * (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = (10) (since 1 = 1/ + 1/ 2). The square  2 of the golden
ratio is also a Parry number with  2 (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = 21 = * 2 (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ).
        </p>
        <p>
          Parry’s theorem [29] notably provides a purely combinatorial condition to check if an infinite
sequence is in  by comparing this sequence to * (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) using the lexicographic order.
Proposition 1 ([28, Corollaries 2.71 and 2.72]). Let  ∈ R&gt;1 be a Parry number. Write  (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) =
(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) · · · ()(( + 1) · · · ( + )) (where ,  ≥ 0 are taken to be minimal). Then an infinite
word belongs to  if and only if it is the label of a path in the automaton  = (, 0,  , ,  ),
where  = {0, . . . , +− 1} and where the transition function  is defined as
 (− 1, ) = 0, for all 1 ≤  &lt;  +  and all 0 ≤  &lt; ();
 (− 1, ()) = , for all 1 ≤  &lt;  + ;
 (+− 1, ( + )) = .
        </p>
        <p>
          Every Parry number is canonically associated with a linear positional numeration system.
We now expose this framework. Let  = ( ())≥ 0 be an increasing sequence of integers with
 (0) = 1. Any integer  can be decomposed uniquely in a greedy way as  = ∑︀
=0   ()
with non-negative integer coeficients . The finite word rep () =  · · · 0 is called the
(greedy)  -representation of . By convention, the representation of 0 is the empty word . The
ifniteness of the digit-set  for greedy representations is implied by the boundedness of the
sequence sup≥ 0  ( + 1)/ (). A sequence  satisfying all the above conditions defines a
positional numeration systems. The numeration language is the set  = {rep () |  ≥ 0}.
For any  · · · 0 ∈ N* , we let val ( · · · 0) denote the integer ∑︀
=0   (). In addition, a
positional numeration system is linear if  satisfies a linear recurrence relation.
Example 3. The Fibonacci sequence  = ( ())≥ 0 with initial conditions  (0) = 1 and
 (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = 2 and  ( + 2) =  ( + 1) +  () for all  ≥ 0 gives rise to the famous Zeckendorf
or Fibonacci linear positional numeration system [30]. The numeration language is  =
{} ∪ 1{0, 01}* .
        </p>
        <p>
          Let  ∈ R&gt;1 be a Parry number. We define a particular linear positional numeration
system  = ( ())≥ 0 associated with  as follows. Write  (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) · · · ()(( +
1) · · · ( + )) (,  are minimal and might be zero). We define  (0) = 1,  () =
(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) ( − 1) + · · · + () (0) + 1 for all  ∈ {1, . . . ,  +  − 1} and
 () = (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) ( − 1) + · · ·
+ ( + ) ( −  − ) +  ( − )
        </p>
        <p>
          − (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) ( −  − 1) − · · · −
() ( −  − )
for all  ≥  + . We set  =  for the sake of readability.
        </p>
        <p>
          Example 4. We resume Example 2. For the golden ratio, we obtain  = 2, so  (0) = 1,
 (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = 1 ·  (0) + 1 = 2 and  () = 1 ·  ( − 1) + 1 ·  ( − 2) for all  ≥ 2. We find
back the Zeckendorf numeration system of Example 3. For its square, we have  = 1 = , so
 2 (0) = 1,  2 (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) = 2 ·  2 (0) + 1 = 3 and  2 () = 3 ·  2 ( − 1) − 1 ·  2 ( − 2) for
all  ≥ 2. Its first few elements are thus 1, 3, 8, 21, 55, the even-indexed Fibonacci numbers.
        </p>
        <p>
          The numeration system  has two useful properties. First, it possesses the Bertrand property,
i.e., for all  ∈ + ,  ∈  ⇔ 0 ∈  ; see [31] for the original version and [18] for the
recently corrected one. In that case, any word  in the set 0*  of all  -representations with
leading zeroes is the label of a path in the automaton  from Proposition 1 [18, Theorem 2].
Second, it satisfies the dominant root condition, i.e., there is some complex number  such that
 ()
lim
→+∞  
= ;
(
          <xref ref-type="bibr" rid="ref1">1</xref>
          )
see, for instance, [18, Theorem 2]. In the following, we will consider Parry–Bertrand automatic
sequences, which are defined as -automatic for the abstract numeration system built on the
language  of  for some Parry number  . For the sake of conciseness, we call them 
automatic. Notice that when  is an integer , the corresponding sequences are usually called
-automatic.
        </p>
        <p>Example 5. In the Zeckendorf numeration system  from Example 3, if we write rep () =
ℓ · · · 0 for each non-negative integer , then the sum-of-digit function s : N → N is defined
by s () = ∑︀ℓ</p>
        <p>=0 . When modded out by 2, it is  -automatic (or Fibonacci-automatic) as
the DFAO in Fig. 1 produces it. Note that it is [32, A095076].</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Main Results</title>
      <p>As stated in the introduction, the main purpose of this paper resides in Theorem 2, which
extends Theorem 1 to a larger family of morphic sequences by considering all even-order
correlations. Such a result will also imply that this family of morphic sequences cannot be
qualified as pseudorandom, such as classical automatic sequences, from the point of view of
correlations.</p>
      <sec id="sec-3-1">
        <title>3.1. Linearly Recurrent Sequences</title>
        <p>Before developing the arguments for Theorem 2, we first deal with so-called linearly recurrent
sequences. A sequence s is recurrent if any factor of s occurs infinitely often in s. It is moreover
uniformly recurrent if the distance between two consecutive occurrences of any factor  in s
is bounded by a constant that depends only on . For instance, the Thue-Morse sequence is
uniformly recurrent. We say that s is linearly recurrent if the gap between two consecutive
occurrences of any length- factor  in s is bounded by  · , where  is a constant. See [33]
for more examples.</p>
        <p>Theorem 3. Let s be a linearly recurrent sequence. For all  ≥ 1, we have 2(s,  ) ≫
 .</p>
        <p>Proof. Let  &gt; 0 be such that two consecutive occurrences of any factor  of s being at
pos=itio(n0s),(1)sa.t.i.sfie(s⌊ |+ 1− ⌋ −|&lt;1)o|f|s..LeBtyas≥ sum1bpetiosunfic,itehnetrlye leaxrigsetsan d≤ cons⌊ide+r1t⌋hesupcrhefixthat
() = ( + ) for all  ∈ {0, . . . , ⌊ + 1 ⌋ − 1} and  + ⌊ + 1 ⌋ − 1 ≤  . Therefore, for
 = (0, ) and  = ⌊ +1 ⌋, we have ∑︀=− 01(− 1)()+(+) = ∑︀− 1</p>
        <p>=0 (− 1)2() =  . This
proves that 2(s,  ) ≫  .</p>
        <p>
          The case of even-order of correlations can be treated similarly by considering 2 consecutive
occurrences of the prefix  = (0)(
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) . . . ( 21 ⌊ + 1 ⌋ − 1),  ≤ 2 ⌊ + 1 ⌋ such that () =

( + ) for all  ≥ 0 and  ∈ {0, . . . , ⌊ +1 ⌋ − 1}, and  = (0, , . . . , (2 − 1)).
Remark 1. Surprisingly, the case of odd-order correlations cannot be treated similarly. To
highlight this, we consider the case of periodic sequences, defined as the infinite repetition
 =  · · · of a (finite) pattern  and which are automatic in any numeration system. More
specifically, consider the simple periodic sequence s = (01). For all  ≥ 1 and all  ≥ 1,
we have 2+1(s,  ) = 1 since, for  = (1, . . . , ), | (s, , )| equals 1 if  is odd, 0
otherwise. Actually, this simple observation explains why the statement of our main result does
not hold for odd-order correlations.
        </p>
        <p>Note that there exist automatic sequences that are not uniformly recurrent (and
therefore not linearly recurrent). For instance, consider the 3-automatic sequence ca =
 · · · defined as the fixed point of the uniform morphism  ↦→ ,  ↦→
 (sometimes referred to as the Cantor sequence; see, for instance, [34, Section 2.4.6]) and
for which consecutive occurrences of the factor  are separated by larger and larger gaps.
Therefore, Theorem 2 covers a larger class of sequences than Theorem 3.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Proof of Theorem 2</title>
        <p>Now we attack the proof of Theorem 2 for which we need some additional preparation. First,
we count the number of words of each length in the numeration language with leading zeroes,
since this language will appear later on.</p>
        <p>Lemma 1. Let  ∈ R&gt;1 be a Parry number and consider the canonical Parry–Bertrand numeration,
 whose numeration language is  . For all  ≥ 0, the number of length- words in 0*  is
 ().</p>
        <p>Proof. First, for all  ≥ 1, the length- words in  correspond to the integers in [ ( −
1),  ()), so there are  () −  ( − 1) of them. The number of length- words in 0* 
is thus equal to
︃( − 1
∑︁  () −  ( − 1)
=1
Therefore, the lemma is proved.
)︃
+ 1 =  ().</p>
        <p>Since automata producing Parry–Bertrand automatic sequences recognize the language  ,
we build up one from that of Proposition 1.</p>
        <p>Lemma 2. Let  ∈ R&gt;1 be a Parry number. Then there exists a deterministic finite automaton
recognizing the language  .</p>
        <p>Proof. Let  = (, 0,  , ,  ) be the DFA recognizing the language 0*  from Proposition 1
with state-set  = {0, . . . , +− 1}. The automaton ′ = (′, ′0,  ,  ′, ′) defined as
follows recognizes  . We set ′ = {′0} ∪  where ′0 is a new state. We replace the initial
state 0 of  by ′0. The new transition function  ′ is defined as  ′(′0, ) =  (0, ) for all
non-zero letter , and  ′(, ) =  (, ) for all  ∈ {0, . . . ,  +  − 1} and all letter .</p>
        <p>It is well known due to Cobham that -automatic sequences are characterized by -uniform
morphisms; see [35] or [11, Theorem 6.3.2]. We have the following generalization for
automatic sequences.</p>
        <p>Theorem 4 ([26]). A word is morphic if and only if it is -automatic for some abstract numeration
system .</p>
        <p>The proof of the above theorem is constructive: given the morphisms producing the word s,
one can build an abstract numeration system  and a DFAO generating s, and vice versa. As
our proof of Theorem 2 relies on this construction, we develop it now.</p>
        <p>Fix a Parry number  ∈ R&gt;1 and consider the canonical Parry–Bertrand numeration  . For
the sake of conciseness, let  be the numeration language  and let  denote the (minimal)
alphabet over which  is defined. Order the letters of  as 0 &lt; 1 &lt; · · · &lt; . Consider
also a binary  -automatic sequence s. Let  = (, 0, ,  ,  ) be the DFA accepting  as
in Lemma 2 and let ℬ = (, 0, ,  ℬ,  :  → {0, 1}) be a DFAO generating the sequence s.</p>
        <p>We explicitly give a pair of morphisms producing s. Define the Cartesian product automaton
 =  × ℬ , which imitates the behavior of  on the first component and ℬ on the second.
Its set of states is  × , the initial state is (0, 0), and the alphabet is . Its transition
function Δ : ( × ) × * →  ×  is defined by Δ((, ), ) = ( (, ),  ℬ(, )) for all
 ∈ ,  ∈  and  ∈ * . In particular, Δ((, ), ) ∈  ×  if and only if  belongs to
. Furthermore, if Δ((0, 0), rep ()) = (, ), then s() =  (). Now let  be a symbol
not belonging to  × . Define the morphism   : ( ×  ∪ { })* → ( ×  ∪ { })* by
  ( ) =  (0, 0) and, for all  ∈  and  ∈ ,
  ((, )) = Δ((, ), 0) · · ·</p>
        <p>Δ((, ), ) = ∏︁ Δ((, ), ),
=0
where, if Δ((, ), ) is not defined for some , then it is replaced by  (note that the product
has to be understood as the concatenation). Observe that   is not necessarily uniform.</p>
        <p>Order the words in  genealogically as 0 &lt; 1 &lt; · · · . Define u =   ( ). Then, by [28,
Lemma 2.25], the shifted sequence of u is the sequence of states reached in  by the words in
 ordered genealogically, i.e., for all  ∈ N, the ( + 1)st letter of u is equal to Δ((0, 0), ).
Now define the morphism   : ( ×  ∪ { })* → {0, 1}* by   ( ) =  and, for all  ∈ 
and  ∈ ,   ((, )) =  (). Note that   is non-erasing on  × . By construction, we
ihnavEexasm=pl e 6 (buelo)w=.   (  ( )). To help the reader with the construction, we illustrate it</p>
        <p>We now give the proof of Theorem 2. In particular, one side feature of the proof is a
characterization of the distance between the repeating part in the sequence under study.
Proof of Theorem 2. Let  denote the numeration language  and let  be the (minimal)
alphabet over which  is defined. Let  = (, 0, ,  ,  ) be the DFA accepting  as
in Lemma 2, ℬ = (, 0, ,  ℬ,  :  → {0, 1}) be a DFAO generating the sequence s,  =
 × ℬ be the Cartesian product automaton, with transition function Δ : ( × ) × * →  × ,
and  be a symbol not belonging to  × . Consider the morphisms   : ( ×  ∪ { })* →
( ×  ∪ { })* and   : ( ×  ∪ { })* → {0, 1}* as defined above and write u =   ( ).</p>
        <p>Let ,  ≥ 1 where  is chosen later accordingly to  . Consider two distinct words
,  such that Δ((0, 0), ) = Δ((0, 0), ). Then, for all length- words  ∈ 0* ,
the states Δ((0, 0), ) and Δ((0, 0), ) are equal. The word   (Δ((0, 0), )) =

  (Δ((0, 0), )) is made of the sequence of such states. Indeed, for  ∈ {, }, we have
  (Δ((0, 0), )) =</p>
        <p>Δ((0, 0), ).</p>
        <p>∏︁
∈0* 
||=
Applying the morphism   then gives back the original sequence s. As   is non-erasing on
 × , |  (Δ((0, 0), ))| =  ( ) by Lemma 1.</p>
        <p>
          Order 2. Firstly, we study the case of the order-2 correlation, for the sake of simplicity and
since the case of even-order correlations will follow the same lines. Consider two distinct
words ,  ∈  such that Δ((0, 0), ) = Δ((0, 0), ). By the pigeonhole principle, we may
choose ,  among the first || · | | words of . Without loss of generality, we may assume
that  is lexicographically smaller than  and that  is non-empty (recall Lemma 2). Then
in the sequence s we have two identical blocks of length  ( ) starting at positions , 
respectively associated with the word 0 and 0 in , i.e.,  = val (0 ) for  ∈ {, }.
The situation is depicted in Fig. 2. Therefore, we have
s( + ) = s( + )
(
          <xref ref-type="bibr" rid="ref2">2</xref>
          )
for all 0 ≤  &lt;  ( ). We now make several observations. First, notice that the largest index
in Eq. (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) is smaller than  +  ( ) by assumption on , . Then, note that, for  ∈ {, },
we have  = val (0 ) and val () ≤ | | · | | by choice of .
        </p>
        <p>
          Let  ≥ 1 be suficiently large. Choose  ≥ 1 such that  + ( ) ≤  &lt;  + ( +1)
and set  = (, ). Therefore, using Eq. (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ), we have
(− 1)s(+)+s(+) =  ( ).
        </p>
        <p>
          On the other hand, we have  ( ) = Θ(  ) by Eq. (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ), which yields
        </p>
        <p>
          +  ( + 1) ≤ ⌈  ⌉+1 · | | · | | +  ( + 1) ≪  +1
for some non-zero constant  depending on  and the sequence s. Then Eqs. (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) and (
          <xref ref-type="bibr" rid="ref4">4</xref>
          ) lead to
(
          <xref ref-type="bibr" rid="ref3">3</xref>
          )
(
          <xref ref-type="bibr" rid="ref4">4</xref>
          )
2(s,  ) ≥  ( ) ≫   ≫
 +  ( + 1)

≫

by choice of  . This concludes the case of the order-2 correlation.
        </p>
        <p>u
u
s</p>
        <p>( )</p>
        <p>( )</p>
        <p>Even orders. The general case of even-order correlations can be treated similarly. Indeed,
let  ≥ 1. Consider 2 pairwise distinct words 1, . . . , 2 ∈  such that Δ((0, 0), ) =
Δ((0, 0),  ) for all ,  ∈ {1, . . . , 2}. By the pigeonhole principle, we may choose them
among the first (2 − 1) · | | · | | words of . Without loss of generality, we may assume that
they are increasingly nested lexicographically speaking (and non-empty). Then in the sequence
s we have 2 identical blocks of length  ( ) starting at positions  = val (0 ) for
 ∈ {1, . . . , 2}. Therefore, we have s( + ) = s( + ) for all 1 ≤  ≤  ≤ 2 and
0 ≤  &lt;  ( ). We now finish up the proof by following the same argument as in the
previous case.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and Future Work</title>
      <p>We can actually obtain a lower bound on correlations using the factor complexity, i.e., the map
counting the number of diferent factors of each length within a given sequence. Consider a
sequence s that is not ultimately periodic. Let s : N → N denote its factor complexity. Let , 
be integers with  ≥ 1. Then, using Morse and Hedlund’s famous result [36], one can show that,
within the prefix of length s( ) of s, there exists a length- factor that occurs at least  times.
Therefore, we obtain the lower bound 2(s, 2s( )) ≥  . In particular, for Parry–Bertrand
automatic sequences, [20, Theorem 3.4] allows to obtain the behavior emphasized in Theorem 2.</p>
      <p>However, the aforementioned lower bound does not reflect the expected behavior of  when
the studied sequence is random, or in some other simple cases. For instance, for the binary
Champernowne sequence ch = 011011100101110111 · · · , we have 2(ch,  ) &gt; 418  (see [6,
Theorem 1]), whereas ch( ) = 2 for all suficiently large  .</p>
      <p>Furthermore, Theorem 2 extends to numeration systems that are not Parry-Bertrand. In fact,
scrutinizing its proof, we need a good control on the proportion of letters erased under the
morphism   . Indeed, the proportion of non-erased letters is precisely the length of the block
that is represented in Fig. 2. So, in turn, a good knowledge on the growth rate of the numeration
numeration language, i.e., the number of words of each length within the language, is required.
In the next example, using the method presented in the proof of Theorem 2, we exhibit an
automatic sequence having a quadratic factor complexity but a linear order-2 correlation.
Example 6. Consider the linear positional numeration system  = ( ())≥ 0 defined by
 (0) = 1 and  ( + 1) = 3 () + 1 for all  ≥ 0. It has several useful properties: its
numeration language is given by the DFA in Fig. 3 and  is Bertrand but not Parry; see [20,
Example 2 and Lemma 2.5], as well as [37, page 131]. We also have  () = Θ( ) where
 = 12 (3 + √13) ≈ 3.30278. Consider the fixed point x of the morphism  ↦→ ,  ↦→ 
and the coding  :  ↦→ 0,  ↦→ 1. The sequence  (x) is produced by the DFAO in Fig. 3. It is
binary and  -automatic. In addition, its factor complexity is known to be quadratic (see [20,
Theorem 3.3]).</p>
      <p>Using the notation of Section 3.2, the morphism   is defined by  ↦→  ,
 ↦→ ,  ↦→ ,  ↦→ . Note that it is not uniform. We
tgoet t he (seq)ue=nce ,,,,,,,,. ..ofsta t·· e·s ,rewachhoesde bshyifttheindweoerddscoorfrespond=s
{, 1, 2, 3, 10, 11, 12, 13, 20, 21, 22, 23, 30, 100, 101, 102, . . .} in the product automaton 
of Fig. 3. The morphism   is defined by  ↦→ , ,  ↦→ 0, and  ↦→ 1. Applying  
to   ( ) gives the sequence  (x). To illustrate the method of the proof of Theorem 2, we can

consider  = 10 and  = 100 for which  (, ) =  =  (, ). Now we also conclude that
2( (x),  ) ≫  .</p>
      <p>As a consequence, we raise the following more general and natural question: Is any morphic
sequence not pseudorandom as far as the correlation measures is concerned, i.e., is the correlation
of any morphic sequence large or small? In the near future, we hope to revisit the method of
the proof of Theorem 2 to extend it to a larger class of morphic sequences. In particular, as a
ifrst step, we will need to find the appropriate conditions to describe this particular class. On
the other hand, as shown in [12], a lower bound on the state complexity of binary sequences in
terms of the order-2 correlation measure can be straightforwardly derived from Theorem 1. In
our setting, can a refinement of the proof of Theorem 2, involving only the number of states of
some automata, lead to similar bounds?</p>
      <sec id="sec-4-1">
        <title>Acknowledgments</title>
        <p>We thank the reviewers for providing useful remarks that improved our initial work, especially
for thinking of the alternative proof mentioned at the beginning of Section 4, which encourages
us to look for extensions of our Theorem 2 to a broader family of sequences. We also thank
France Gheeraert and Michel Rigo for insightful discussions.</p>
        <p>Pierre Popoli’s research is supported by ULiège’s Special Funds for Research, IPD-STEMA
Program. Manon Stipulanti is an FNRS Research Associate supported by the Research grant
1.C.104.24F.
[9] K. Gyarmati, On a pseudorandom property of binary sequences, The Ramanujan Journal
8 (2004) 289–302. doi:10.1007/s11139-004-0139-z.
[10] K. Gyarmati, C. Mauduit, On the correlation of binary sequences, ii, Discrete Mathematics
312 (2012) 811–818. doi:https://doi.org/10.1016/j.disc.2011.09.013.
[11] J.-P. Allouche, J. Shallit, Automatic sequences: Theory, applications, generalizations,</p>
        <p>Cambridge University Press, Cambridge, 2003.
[12] L. Mérai, A. Winterhof, On the pseudorandomness of automatic sequences, Cryptogr.</p>
        <p>Commun. 10 (2018) 1013–1022. doi:10.1007/s12095-017-0260-7.
[13] K. Mahler, The spectrum of an array and its application to the study of the translation
properties of a simple class of arithmetical functions: Part two on the translation properties
of a simple class of arithmetical functions, Journal of Mathematics and Physics 6 (1927)
158–163. doi:https://doi.org/10.1002/sapm192761158.
[14] C. Mauduit, Multiplicative properties of the Thue-Morse sequence, Period. Math. Hungar.</p>
        <p>43 (2001) 137–153. doi:10.1023/A:1015241900975.
[15] M. Baake, M. Coons, Correlations of the Thue—Morse sequence, Indagationes
Mathematicae (2023). doi:https://doi.org/10.1016/j.indag.2023.02.001.
[16] J. Mazáč, Correlation functions of the Rudin–Shapiro sequence, Indagationes Mathematicae
(2023). doi:https://doi.org/10.1016/j.indag.2023.03.003.
[17] E. Grant, J. Shallit, T. Stoll, Bounds for the discrete correlation of infinite sequences on
 symbols and generalized Rudin-Shapiro sequences, Acta Arith. 140 (2009) 345–368.
doi:10.4064/aa140-4-5.
[18] E. Charlier, C. Cisternino, M. Stipulanti, Robustness of Pisot-regular sequences, Adv. in</p>
        <p>Appl. Math. 125 (2021) Paper No. 102151, 22. doi:10.1016/j.aam.2020.102151.
[19] E. Charlier, C. Cisternino, M. Stipulanti, A full characterization of Bertrand numeration
systems, in: Developments in language theory, volume 13257 of Lecture Notes in Comput.</p>
        <p>Sci., Springer, Cham, 2022, pp. 102–114. doi:10.1007/978-3-031-05578-2\_8.
[20] A. Massuir, J. Peltomäki, M. Rigo, Automatic sequences based on Parry or Bertrand
numeration systems, Adv. in Appl. Math. 108 (2019) 11–30. doi:10.1016/j.aam.2019.
03.003.
[21] F. Point, On decidable extensions of Presburger arithmetic: from A. Bertrand numeration
systems to Pisot numbers, J. Symbolic Logic 65 (2000) 1347–1374. doi:10.2307/2586704.
[22] M. Stipulanti, Convergence of Pascal-like triangles in Parry-Bertrand numeration systems,</p>
        <p>Theoret. Comput. Sci. 758 (2019) 42–60. doi:10.1016/j.tcs.2018.08.003.
[23] M. Lothaire, Combinatorics on words, Cambridge Mathematical Library, Cambridge
University Press, Cambridge, 1997. doi:10.1017/CBO9780511566097.
[24] P. B. A. Lecomte, M. Rigo, Numeration systems on a regular language, Theory Comput.</p>
        <p>Syst. 34 (2001) 27–44. doi:10.1007/s002240010014.
[25] V. Berthé, M. Rigo (Eds.), Combinatorics, Automata, and Number Theory, volume 135
of Encyclopedia of Mathematics and its Applications, Cambridge University Press, 2010.
doi:10.1017/CBO9780511777653.
[26] M. Rigo, A. Maes, More on generalized automatic sequences, Journal of Automata,</p>
        <p>Languages, and Combinatorics 7 (2002) 351–376. doi:10.25596/jalc-2002-351.
[27] J. Shallit, A generalization of automatic sequences, Theoret. Comput. Sci. 61 (1988) 1–16.</p>
        <p>doi:10.1016/0304-3975(88)90103-X.
[28] M. Rigo, Formal languages, automata and numeration systems. 2, Networks and
Telecommunications Series, ISTE, London; John Wiley &amp; Sons, Inc., Hoboken, NJ, 2014. Applications
to recognizability and decidability.
[29] W. Parry, On the  -expansions of real numbers, Acta Math. Acad. Sci. Hungar. 11 (1960)
401–416. doi:10.1007/BF02020954.
[30] E. Zeckendorf, Représentation des nombres naturels par une somme de nombres de</p>
        <p>Fibonacci ou de nombres de Lucas, Bull. Soc. Roy. Sci. Liège 41 (1972) 179–182.
[31] A. Bertrand-Mathis, Comment écrire les nombres entiers dans une base qui n’est pas
entière, Acta Math. Hungar. 54 (1989) 237–241. doi:10.1007/BF01952053.
[32] N. J. A. Sloane, et al., The On-Line Encyclopedia of Integer Sequences, 2024. URL: https:
//oeis.org.
[33] J.-P. Allouche, J. Cassaigne, J. Shallit, L. Q. Zamboni, A taxonomy of morphic sequences,
2019. Preprint available at https://arxiv.org/pdf/1711.10807.
[34] J. Shallit, The logical approach to automatic sequences—exploring combinatorics on words
with Walnut, volume 482 of London Mathematical Society Lecture Note Series, Cambridge
University Press, Cambridge, 2023.
[35] A. Cobham, Uniform tag seqences, Mathematical Systems Theory 6 (1972) 164–192.</p>
        <p>doi:10.1007/BF01706087.
[36] M. Morse, G. A. Hedlund, Symbolic Dynamics, Amer. J. Math. 60 (1938) 815–866.
[37] M. I. Hollander, Greedy numeration systems and regularity, Theory Comput. Syst. 31
(1998) 111–133. doi:10.1007/s002240000082.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Kolmogorov</surname>
          </string-name>
          ,
          <article-title>Three approaches to the quantitative definition of information, Internat</article-title>
          .
          <source>J. Comput. Math. 2</source>
          (
          <year>1968</year>
          )
          <fpage>157</fpage>
          -
          <lpage>168</lpage>
          . doi:
          <volume>10</volume>
          .1080/00207166808803030.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G. J.</given-names>
            <surname>Chaitin</surname>
          </string-name>
          ,
          <article-title>On the length of programs for computing finite binary sequences</article-title>
          ,
          <source>J. Assoc. Comput. Mach</source>
          .
          <volume>13</volume>
          (
          <year>1966</year>
          )
          <fpage>547</fpage>
          -
          <lpage>569</lpage>
          . doi:
          <volume>10</volume>
          .1145/321356.321363.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Solomonof</surname>
          </string-name>
          ,
          <article-title>A formal theory of inductive inference</article-title>
          .
          <source>I, Information and Control</source>
          <volume>7</volume>
          (
          <year>1964</year>
          )
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Vitányi</surname>
          </string-name>
          ,
          <article-title>How incomputable is kolmogorov complexity?</article-title>
          ,
          <source>Entropy</source>
          <volume>22</volume>
          (
          <year>2020</year>
          ). doi:
          <volume>10</volume>
          . 3390/e22040408.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mauduit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sárközy</surname>
          </string-name>
          ,
          <article-title>On finite pseudorandom binary sequences. I. Measure of pseudorandomness, the Legendre symbol</article-title>
          ,
          <source>Acta Arith</source>
          .
          <volume>82</volume>
          (
          <year>1997</year>
          )
          <fpage>365</fpage>
          -
          <lpage>377</lpage>
          . doi:
          <volume>10</volume>
          .4064/ aa-82-4-
          <fpage>365</fpage>
          -377.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Mauduit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sárközy</surname>
          </string-name>
          ,
          <article-title>On finite pseudorandom binary sequences. II. The Champernowne, Rudin-Shapiro, and Thue-Morse sequences, a further construction</article-title>
          ,
          <source>J. Number Theory</source>
          <volume>73</volume>
          (
          <year>1998</year>
          )
          <fpage>256</fpage>
          -
          <lpage>276</lpage>
          . doi:
          <volume>10</volume>
          .1006/jnth.
          <year>1998</year>
          .
          <volume>2286</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Alon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kohayakawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mauduit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Moreira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Rödl</surname>
          </string-name>
          ,
          <article-title>Measures of pseudorandomness for finite sequences: typical values</article-title>
          ,
          <source>Proc. Lond. Math. Soc. (3) 95</source>
          (
          <year>2007</year>
          )
          <fpage>778</fpage>
          -
          <lpage>812</lpage>
          . doi:
          <volume>10</volume>
          .1112/plms/pdm027.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Cassaigne</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mauduit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sárközy</surname>
          </string-name>
          ,
          <article-title>On finite pseudorandom binary sequences. VII. The measures of pseudorandomness</article-title>
          ,
          <source>Acta Arith</source>
          .
          <volume>103</volume>
          (
          <year>2002</year>
          )
          <fpage>97</fpage>
          -
          <lpage>118</lpage>
          . doi:
          <volume>10</volume>
          .4064/aa103-2-1.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>