Show HN:LoopMix128——快速的C语言伪随机数生成器(0.46纳秒),周期为2^128,通过BigCrush/PractRand测试
Show HN: LoopMix128 – Fast C PRNG (.46ns), 2^128 Period, BigCrush/PractRand Pass

原始链接: https://github.com/danielcota/LoopMix128

LoopMix128 是一款快速、高质量的伪随机数生成器 (PRNG),专为需要速度和统计可靠性的非加密应用而设计。它拥有 2^128 的保证周期,由独立的 64 位低位和高位计数器确保。Z3 证明器证明了其 192 位状态下的单射性。其状态被单射映射,从而能够轻松生成并行流。 LoopMix128 在统计测试中表现出色,通过了 TestU01 的 BigCrush 和 PractRand(高达 32TB)测试,且没有明显的异常。基准测试表明,它比标准库生成器快得多,并且与其他高速 PRNG(如 wyrand 和 xoroshiro128++)具有竞争力。使用不同种子进行的重复 PractRand 测试表明,与这些 PRNG 相比,它在抵抗可疑结果方面具有相当或更好的抵抗力。LoopMix128 由 Daniel Cota 开发,优先考虑速度和可靠性,以满足苛刻的应用需求。

Daniel Cota 推出了 LoopMix128,这是一个针对非密码学任务设计的快速 C 语言伪随机数生成器。其速度约为 0.37 ns/值,显著快于 xoroshiro128++ 和 PCG64,同时通过了 TestU01 BigCrush 和 PractRand (32TB) 的测试。该 PRNG 保证了 2^128 的周期,并使用 192 位状态,通过 Z3 SMT 求解器证明了其在并行流中的单射性。 MurmurHash 的作者 Aappleby 对其通过 BigCrush 测试感到惊讶,并指出状态更新函数的近线性性。他们质疑需要 2^128 周期的保证,并建议为这类应用使用更强大的生成器。 Straw 建议进行状态大小容量分析,以更好地了解其强度。其他用户将 LoopMix128 的性能与 PCG64、RomuQuad 和 RomuDuoJr 进行了比较。结果发现它比 PCG64 快,与 wyrand 相当,比 RomuDuoJr 稍慢。 作者正在根据最近的评论更新代码,并发现了一个反例,很快就会修复。
相关文章
  • 完美的随机浮点数 2025-05-07
  • (评论) 2024-05-05
  • Go 1.22 中的安全随机性 2024-05-08
  • (评论) 2025-03-26
  • (评论) 2024-05-06

  • 原文

    This repository contains LoopMix128, an extremely fast pseudo-random number generator (PRNG) with a guaranteed period of 2^128, proven injectivity, and clean passes in both BigCrush and PractRand (32TB). It is designed for non-cryptographic applications where speed and statistical quality are important.

    • High Performance: Significantly faster than standard library generators and competitive with or faster than other modern high-speed PRNGs like wyrand and xoroshiro128++.
    • Good Statistical Quality: Has passed TestU01's BigCrush suite and PractRand (up to 32TB) with zero anomalies.
    • 2^128 Period: Minimum period length of 2^128 through its 128 bit low/high counter looping.
    • Proven Injectivity: Z3 Prover proven injectivity across its 192 bit state. (z3 script) (results)
    • Parallel Streams: The injective 192 bit state facilitates parallel streams as outlined below.
    • Speed: 8.75x Java random, 21% faster than Java xoroshiro128++, 75% faster than C xoroshiro128++ (benchmark)
    • Passed 256M to 32TB PractRand with zero anomalies (results)
    • Passed BigCrush with these lowest p-values: (results)
      • 0.01 sknuth_MaxOft (N=20, n=10000000, r=0, d=100000, t=32), Sum ChiSqr
      • 0.02 sknuth_CouponCollector (N=1, n=200000000, r=27, d=8), ChiSqr
      • 0.02 svaria_WeightDistrib (N=1, n=20000000, r=28, k=256, Alpha=0, Beta=0.25), ChiSqr
    // Golden ratio fractional part * 2^64
    const uint64_t GR = 0x9e3779b97f4a7c15ULL;
    
    // Initialized to non-zero with SplitMix64 (or equivalent)
    uint64_t slow_loop, fast_loop, mix; 
    
    // Helper for rotation
    static inline uint64_t rotateLeft(const uint64_t x, int k) {
        return (x << k) | (x >> (64 - k));
    }
    
    // --- LoopMix128 ---
    uint64_t loopMix128() {
      uint64_t output = GR * (mix + fast_loop);
    
      // slow_loop acts as a looping high counter (updating once per 2^64 calls) to ensure a 2^128 period
      if ( fast_loop == 0 ) {
        slow_loop += GR;
        mix ^= slow_loop;
        }
    
      // A persistent non-linear mix that does not affect the period of fast_loop and slow_loop
      mix = rotateLeft(mix, 59) + fast_loop;
    
      // fast_loop loops over a period of 2^64
      fast_loop = rotateLeft(fast_loop, 47) + GR;
    
      return output;
      }
    

    (Note: The code above shows the core logic. See implementation files for full seeding and usage examples.)

    Thanks to the proven injectivity of the 192 bit state of LoopMix128, parallel streams can be implemented as follows:

    • Simply randomly seed fast_loop, slow_loop, and mix and take advantage of the large 2^192 injective state which makes collisions incredibly improbable.
    • Or, if absolutely critical - randomly seed fast_loop and mix for each stream and manually assign slow_loop uniquely to each stream taking into account that slow_loop normally increases by GR at the end of every 2^64 cycle (spacing slow_loop evenly across the streams).

    PractRand Testing (With Varied Seeds)

    As running BigCrush and PractRand can behave differently depending on initial seeded states, PractRand was also run multiple times from 256M to 8GB using varied initial seeds (seeding with SplitMix64). Below are the counts of total suspicious results when running PractRand 1000 times for LoopMix128 and some reference PRNGs:

    LoopMix128          0 failures, 24 suspicious
    xoroshiro256++      0 failures, 27 suspicious
    xoroshiro128++      0 failures, 28 suspicious
    wyrand              0 failures, 32 suspicious
    /dev/urandom        0 failures, 37 suspicious
    

    Created by Daniel Cota after he fell down the PRNG rabbit-hole by circumstance.

    联系我们 contact @ memedata.com