科拉科斯基序列

科拉科斯基序列
Kolakoski Sequence

原始链接: https://en.wikipedia.org/wiki/Kolakoski_sequence

科拉科斯基序列是一个迷人的无限1和2序列，以其自描述性而闻名。它由其运行长度定义——该序列详细描述了自身内部连续相同数字的长度。从1,2,2开始，该序列决定了每种数字连续出现的次数。例如，第一个项'1'表示一个'1'，下一个项'2'表示两个'2'，依此类推。在视觉上，该序列可以表示为一个螺旋，其中弧线被重复（如果该项为1）或二等分（如果该项为2）。尽管定义很简单，但该序列表现出复杂的性质。它不是周期性的，并且是“无立方体”的（不包含重复的子字符串）。虽然广泛认为1的密度为1/2，但尚未得到证明。科拉科斯基序列还与标签系统相关联，并且可以使用在时间和空间复杂度方面效率不同的算法生成，展示了其分形般的性质。

## Kolakoski 序列讨论总结一场 Hacker News 讨论围绕 Kolakoski 序列展开——一个自我描述的、由 1 和 2 组成的无限序列。该序列通过列出相同数字的连续长度，然后使用这些长度来定义序列的下一部分，依此类推生成。核心争论在于该序列是否真正特别。一位用户最初认为，通过选择任意连续长度，可以轻松构建类似的序列。然而，其他人澄清说 Kolakoski 序列的独特性在于其*确定性*；连续长度不是随机选择的，而是由序列本身决定的。讨论还涉及以 '2' 开头的变体（这只是标准序列的平移版本）、生成它的方法（包括位操作）以及它的性质，例如无立方数因子，以及它通过递归公式在伪随机数生成中的潜在用途。最终，共识是 Kolakoski 序列因是“最简单”的自我描述序列而引人注目。

Infinite sequence in mathematics

In mathematics, the Kolakoski sequence, sometimes also known as the Oldenburger–Kolakoski sequence,^[1] is an infinite sequence of symbols {1,2} that is the sequence of run lengths in its own run-length encoding.^[2] It is named after the recreational mathematician William Kolakoski (1944–97) who described it in 1965,^[3] but it was previously discussed by Rufus Oldenburger in 1939.^[1]^[4]

The initial terms of the Kolakoski sequence are:

1,2,2,1,1,2,1,2,2,1,2,2,1,1,2,1,1,2,2,1,2,1,1,2,1,2,2,1,1,... (sequence A000002 in the OEIS)

Each symbol occurs in a "run" (a sequence of equal elements) of either one or two consecutive terms, and writing down the lengths of these runs gives exactly the same sequence:

1,2,2,1,1,2,1,2,2,1,2,2,1,1,2,1,1,2,2,1,2,1,1,2,1,2,2,1,1,2,1,1,2,1,2,2,1,2,2,1,1,2,1,2,2,...

1, 2 , 2 ,1,1, 2 ,1, 2 , 2 ,1, 2 , 2 ,1,1, 2 ,1,1, 2 , 2 ,1, 2 ,1,1, 2 ,1, 2 , 2 ,1,1, 2 ,...

The description of the Kolakoski sequence is therefore reversible. If K stands for "the Kolakoski sequence", description #1 logically implies description #2 (and vice versa):

1. The terms of K are generated by the runs (i.e., run-lengths) of K

2. The runs of K are generated by the terms of K

Accordingly, one can say that each term of the Kolakoski sequence generates a run of one or two future terms. The first 1 of the sequence generates a run of "1", i.e. itself; the first 2 generates a run of "22", which includes itself; the second 2 generates a run of "11"; and so on. Each number in the sequence is the length of the next run to be generated, and the element to be generated alternates between 1 and 2:

1,2 (length of sequence l = 2; sum of terms s = 3)

1,2,2 (l = 3, s = 5)

1,2,2,1,1 (l = 5, s = 7)

1,2,2,1,1,2,1 (l = 7, s = 10)

1,2,2,1,1,2,1,2,2,1 (l = 10, s = 15)

1,2,2,1,1,2,1,2,2,1,2,2,1,1,2 (l = 15, s = 23)

As can be seen, the length of the sequence at each stage is equal to the sum of terms in the previous stage. This animation illustrates the process:

An animated gif illustrating how later terms of the Kolakoski sequence are generated by earlier terms.

These self-generating properties, which remain if the sequence is written without the initial 1, mean that the Kolakoski sequence can be described as a fractal, or mathematical object that encodes its own representation on other scales.^[1] Bertran Steinsky has created a recursive formula for the i-th term of the sequence.^[5]

Recurrence properties

[edit]

The sequence is not eventually periodic, that is, its terms do not have a general repeating pattern (cf. irrational numbers like π and √2). More generally, the sequence is cube-free, i.e., has no substring of the form $www$ with $w$ some nonempty finite string.^[6] It is not known whether every string appearing in the sequence occurs infinitely many times, nor whether the occurrence of a string implies the occurrence of its reverse string; the two facts, however, are known to be equivalent.^[7]

It seems plausible that the density of 1s in the Kolakoski {1,2}-sequence is 1/2, but this conjecture remains unproved.^[8] Václav Chvátal has proved that the upper density of 1s is less than 0.50084.^[9] Nilsson has used the same method with far greater computational power to obtain the bound 0.500080.^[10]

Although calculations of the first 3×10⁸ values of the sequence appeared to show its density converging to a value slightly different from 1/2,^[5] later calculations that extended the sequence to its first 10¹³ values show the deviation from a density of 1/2 growing smaller, as one would expect if the limiting density actually is 1/2.^[11]

Connection with tag systems

[edit]

The Kolakoski sequence can also be described as the result of a simple cyclic tag system. However, as this system is a 2-tag system rather than a 1-tag system (that is, it replaces pairs of symbols by other sequences of symbols, rather than operating on a single symbol at a time) it lies in the region of parameters for which tag systems are Turing complete, making it difficult to use this representation to reason about the sequence.^[12]

The Kolakoski sequence may be generated by an algorithm that, in the i-th iteration, reads the value x_i that has already been output as the i-th value of the sequence (or, if no such value has been output yet, sets x_i = i). Then, if i is odd, it outputs x_i copies of the number 1, while if i is even, it outputs x_i copies of the number 2. Thus, the first few steps of the algorithm are:

The first value has not yet been output, so set x₁ = 1, and output 1 copy of the number 1
The second value has not yet been output, so set x₂ = 2, and output 2 copies of the number 2
The third value x₃ was output as 2 in the second step, so output 2 copies of the number 1.
The fourth value x₄ was output as 1 in the third step, so output 1 copy of the number 2. Etc.

This algorithm takes linear time, but because it needs to refer back to earlier positions in the sequence it needs to store the whole sequence, taking linear space. An alternative algorithm that generates multiple copies of the sequence at different speeds, with each copy of the sequence using the output of the previous copy to determine what to do at each step, can be used to generate the sequence in linear time and only logarithmic space.^[11]

^ ^a ^b ^c Sloane, N. J. A. (ed.). "Sequence A000002 (Kolakoski sequence: a(n) is length of n-th run; a(1) = 1; sequence consists just of 1's and 2's)". The On-Line Encyclopedia of Integer Sequences. OEIS Foundation.
^ Pytheas Fogg, N. (2002). Berthé, Valérie; Ferenczi, Sébastien; Mauduit, Christian; Siegel, A. (eds.). Substitutions in dynamics, arithmetics and combinatorics. Lecture Notes in Mathematics. Vol. 1794. Berlin: Springer-Verlag. p. 93. ISBN 3-540-44141-7. Zbl 1014.11015.
^ Kolakoski, William (1965). "Problem 5304". American Mathematical Monthly. 72: 674. doi:10.2307/2313883. JSTOR 2313883. For a partial solution, see Üçoluk, Necdet (1966). "Self Generating Runs". American Mathematical Monthly. 73: 681–682. doi:10.2307/2314839. JSTOR 2314839.
^ Oldenburger, Rufus (1939). "Exponent trajectories in symbolic dynamics". Transactions of the American Mathematical Society. 46 (3): 453–466. doi:10.2307/1989933. JSTOR 1989933. MR 0000352.
^ ^a ^b Steinsky, Bertran (2006). "A recursive formula for the Kolakoski sequence A000002" (PDF). Journal of Integer Sequences. 9 (3). Article 06.3.7. Bibcode:2006JIntS...9...37S. MR 2240857. Zbl 1104.11012.
^ Carpi, Arturo (1994). "On repeated factors in $C^{\infty }$ -words". Information Processing Letters. 52 (6): 289–294. doi:10.1016/0020-0190(94)00162-6. ISSN 0020-0190. MR 1307746. Zbl 0938.68698.
^ Della Corte, Alessandro (2020). "Kolakoski sequence: links between recurrence, symmetry and limit density". Open Journal of Discrete Applied Mathematics. 4 (1): 29–44. arXiv:2002.08306. doi:10.30538/psrp-odam2021.0052.
^ Kimberling, Clark. "Integer Sequences and Arrays". University of Evansville. Retrieved 2016-10-13.
^ Chvátal, Vašek (December 1993). Notes on the Kolakoski Sequence. Technical Report 93-84. DIMACS.
^ Nilsson, J. "Letter Frequencies in the Kolakoski Sequence" (PDF). Acta Physics Polonica A. Retrieved 2014-04-24.
^ ^a ^b Nilsson, Johan (2012). "A space-efficient algorithm for calculating the digit distribution in the Kolakoski sequence" (PDF). Journal of Integer Sequences. 15 (6): Article 12.6.7, 13. MR 2954662.
^ van der Poorten, A. J. (1981). "Substitution automata, functional equations and "functions algebraic over a finite field"". Papers in algebra, analysis and statistics (Hobart, 1981). Contemporary Mathematics. Vol. 9. Providence, Rhode Island: American Mathematical Society. pp. 307–312. MR 0655988. See in particular p. 308.

Allouche, Jean-Paul; Shallit, Jeffrey (2003). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. p. 337. ISBN 978-0-521-82332-6. Zbl 1086.11015.
Dekking, F. M. (1997). "What Is the Long Range Order in the Kolakoski Sequence?". In Moody, R. V. (ed.). Proceedings of the NATO Advanced Study Institute, Waterloo, ON, August 21-September 1, 1995. Dordrecht, Netherlands: Kluwer. pp. 115–125.
Fedou, J. M.; Fici, G. (2010). "Some remarks on differentiable sequences and recursivity" (PDF). Journal of Integer Sequences. 13 (3). Article 10.3.2.
Keane, M. S. (1991). "Ergodic Theory and Subshifts of Finite Type". In Bedford, T.; Keane, M. (eds.). Ergodic Theory, Symbolic Dynamics and Hyperbolic Spaces. Oxford, England: Oxford University Press. pp. 35–70.
Lagarias, J. C. (1992). "Number Theory and Dynamical Systems". In Burr, S. A. (ed.). The Unreasonable Effectiveness of Number Theory. Providence, RI: American Mathematical Society. pp. 35–72. ISBN 9780821855010.
Păun, Gheorghe; Salomaa, Arto (1996). "Self-Reading Sequences". American Mathematical Monthly. 103 (2): 166–168. doi:10.2307/2975113. JSTOR 2975113. Zbl 0854.68082.
Shallit, Jeffrey (1999). "Number theory and formal languages". In Hejhal, Dennis A.; Friedman, Joel; Gutzwiller, Martin C.; Odlyzko, Andrew M. (eds.). Emerging applications of number theory. Based on the proceedings of the IMA summer program, Minneapolis, MN, USA, July 15--26, 1996. The IMA volumes in mathematics and its applications. Vol. 109. Springer-Verlag. pp. 547–570. ISBN 0-387-98824-6. Zbl 0919.00047.