Parallel Perl – autoparallelizing interpreter with JIT

Original link: https://perl.petamem.com/gpw2026/perl-mit-ai-gpw2026.html#/4/1/1

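## WHIP: a home-built smart home system in Perl

WHIP is a home infrastructure project born of dissatisfaction with existing smart home solutions such as FHEM, in particular their dependency problems and closed nature. The goal: a robust, open-source, DIY system that prioritizes longevity and control.

At its core, STM32 microcontrollers are connected over a CAN bus (chosen over RS485/WiFi for reliability and speed) and managed by Raspberry Pi hubs. The nodes run FreeRTOS, can operate autonomously even without a hub connection, and carry a range of sensor/actuator modules. A Perl-based server orchestrates the whole system, using protocols such as Modbus and DALI.

A key innovation is **pperl**, a new Perl interpreter written in Rust that aims for performance comparable to V8 through JIT compilation, autoparallelization, and bytecode caching. It also offers Auto-FFI for seamless integration of C libraries and a daemon mode for fast response times.

The system is already deployed in two off-grid villas, demonstrating its scalability and practical viability. WHIP emphasizes a modular architecture of domain-specific hubs (energy, lighting, audio, and so on) to avoid single points of failure and improve maintainability. The project is built to last decades rather than just a warranty period, and follows a do-it-yourself philosophy focused on robust, industrial-grade solutions.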

Original text

02 WHIP

Witty House Infrastructure Processor

PV — Perl Integration

First tools: Victron Modbus + ECS BMS — all in Perl

$ ecs_bms_tool -range 1-16          # query all battery modules
$ ecs_bms_tool -get cell_voltage -get cell_temperature
$ ecs_bms_tool -otype json           # JSON for pipeline integration

$ Wmodbus discover 192.168.2.0/24    # find Modbus devices on network
$ Wmodbus --host 192.168.2.201 --unit 2 read holding 0-10
$ Wmodbus --host 192.168.2.201 --profile vents-dbe900l monitor

ecs_bms_tool — ECS LiPro BMS management (SoC, cell voltage, balancing)
Wmodbus — Modbus TCP/RTU: discovery, read/write, device profiles, monitoring
Wcli — solar irradiance & PV power calculator
Wthermal — physics-based house thermal model

Scripts work. But a house is more than solar panels.

Looking for a Smarthome

Preferably in Perl, obviously.

WHIP

WHIP "I'll build my own."

The FHEM Experience

"We do not like CPAN" — dependencies create problems. So we reimplement everything ourselves. But worse.
"We do not like PBP" — contributions are done by amateurs. Too high expectations would kill contribution.
"Efficient algorithms are overrated" — "So what? That's 0.1s faster?"
"Tests? TDD? That's superfluous work!"
"I don't like you, you cannot use my GPL code"

(FHEM people: no offense. Well, maybe a little.)

MySensors

Nice idea. Wrong execution.

Open-source, DIY, community-driven

Tree topology with auto-routing & self-healing

Up to 254 nodes × 254 sensors — decent scale

Arduino — ATmega328? For a house? In 2020?

Arduino software model — Endless loop with stuff in it.

RS485 / nRF24L01+ — Master-Slave Architecture

Text protocol — semicolon-delimited ASCII over serial. In 2020.

No autonomous operation — nodes depend on gateway/controller

I felt there had to be something better.

Birth of a Node

CAN bus instead of RS485

Multi-master · Inherent collision resolution · Resilient
Good enough for cars for decades. Industrial standard.

1 MBit instead of EIB/KNX 9600 baud

Good for 20–30m runs. Plenty for a house.

STM32F103 instead of ATmega328

72 MHz ARM Cortex-M3 · 7× faster than Arduino
CAN peripheral built-in · $1.50 in 2020

FreeRTOS + libopencm3 instead of endless loops

Real tasks · Priorities · Preemption · Hardware abstraction

RobotDyn Black Pill · STM32F103C8T6

CAN peripheral built-in · $15 in 2020 (COVID!)

Blue Pill · STM32F103C8T6

CAN peripheral built-in · no stock! DIY

Green Pill · STM32F103C8T6 · DIY

Why CAN?

Why CAN? Hardware arbitration (CSMA/CR) · true multi-master · 1 Mbit/s · differential · industrial grade

Why not WiFi/Zigbee? No batteries to die. No mesh to collapse. Building for 50 years, not 5.

Why not RS485? No arbitration. Master-slave only. Two nodes transmit = garbage.

Why not KNX? 9600 baud (1990s design). Expensive. Closed ecosystem.
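
As a rough illustration of the hardware arbitration mentioned above: on CAN a dominant bit (0) overrides a recessive bit (1), every transmitter reads the bus back while sending its identifier, and a node that sent recessive but read dominant stops transmitting. The lowest identifier therefore wins and its frame survives intact. The identifiers and the function name below are made up for the example.

```perl
use strict;
use warnings;

# Simulate CAN arbitration over 11-bit identifiers.
# The bus level is the wired-AND of all transmitted bits: 0 (dominant) wins.
sub arbitrate {
    my @contenders = @_;
    for my $bit (reverse 0 .. 10) {              # identifiers go out MSB first
        my $bus = 1;
        $bus &= ($_ >> $bit) & 1 for @contenders;
        # Nodes that sent recessive (1) but read dominant (0) back off.
        @contenders = grep { (($_ >> $bit) & 1) == $bus } @contenders;
        last if @contenders == 1;
    }
    return $contenders[0];
}

my $winner = arbitrate(0x2A5, 0x123, 0x124);
printf "winner: 0x%03X\n", $winner;              # winner: 0x123
```

On RS485 there is no equivalent step, which is why two simultaneous transmitters corrupt each other.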

Birth of a Hub

So you have 20 nodes — now what?

Hub assembly · DIN rail mount

Waveshare 2-CH CAN HAT

RasPi 4B/5 · 2-ch CAN HAT · Relay Board · DIN Rail Mount · CAN/IP & CAN/CAN Gateway

WHIP Architecture

Nodes — STM32 MCUs · FreeRTOS · 1MBit CAN bus · Autonomous (C / Embedded)

Hubs — RasPi · CAN/IP gateway · Hub Aggregation · Protocol bridges · Mojolicious (Perl)

Server — Orchestration · External connectivity (Perl)

Higher layers are always a supplement, never a requirement.

Nodes — STM32 + FreeRTOS

Hardware

STM32F103 (Cortex-M3, 72 MHz)
STM32F303 (Cortex-M4F, FPU)
Native bxCAN controller

Software

FreeRTOS · libopencm3
No vendor HAL lock-in
One YAML = one firmware

115+ sensor/actuator modules

🌡️ BME280 · DS18x20 · SHT3x · NTC

INA219 · INA226 · ACS712 · ADC

💡 DALI · WS281x · PWM dimmer · SSR

🔌 PCF857x · MCP23017 · relay · GPIO

📡 LoRa · Modbus RTU · 1-Wire · SPI

🖥️ SSD1306 · ST7735 · status LEDs

🍃 SCD4x · SGP4x · PMS5003 · SEN5x

🛸 AS3935 (Franklin) · MLX90614 · VL53L0x · HX711

Dependency resolver inspired by Linux Kconfig

~5 modules per node → 153,476,148 combinations
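
The figure matches the number of ways to pick 5 distinct modules out of 115, that is C(115, 5). The slide does not state the formula, so that reading is an assumption; the sketch below only checks the arithmetic.

```perl
use strict;
use warnings;

# Binomial coefficient C(n, k), built up incrementally so that every
# intermediate value is itself an exact integer.
sub choose {
    my ($n, $k) = @_;
    my $c = 1;
    $c = $c * ($n - $k + $_) / $_ for 1 .. $k;
    return $c;
}

print choose(115, 5), "\n";   # 153476148
```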

Ganglion

Ganglion = GANG of Lightweight I/O Nodes — insect-brain model.
IF-THEN rules, timers, local variables — compiled to bytecode on the MCU.
Nodes operate autonomously even when hub/server are down.

Ganglion — In Action

DEF LightTimeout = 300       # 5 minutes

# Motion detected: light on, start timer
IF motion:detected THEN lights:on; SET $T_0 = LightTimeout

# Timer expired: light off
IF !$T_0 THEN lights:off

# Cross-node: kitchen smoke → alarm everywhere
DEF Kitchen = 42
IF Kitchen:smoke:detected THEN buzzer:alarm(1)
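
A minimal Perl sketch of how the motion/timeout pair of rules could behave over time. It assumes that `$T_0` counts down once per second and that the `!$T_0` rule fires when the timer reaches zero; the slides do not spell out these semantics, and the names below are illustrative, not part of Ganglion.

```perl
use strict;
use warnings;

my $LIGHT_TIMEOUT = 300;                  # DEF LightTimeout = 300
my %state = (lights => 0, t0 => 0);

# Advance the simulation by one second with the given motion input.
sub tick {
    my ($motion) = @_;
    if ($motion) {                        # IF motion:detected THEN lights:on; SET $T_0
        $state{lights} = 1;
        $state{t0}     = $LIGHT_TIMEOUT;
    }
    elsif ($state{t0} > 0 && --$state{t0} == 0) {
        $state{lights} = 0;               # IF !$T_0 THEN lights:off
    }
    return $state{lights};
}

tick(1);                                  # motion at t = 0
tick(0) for 1 .. 299;
print "after 299 s: $state{lights}\n";    # light is still on
tick(0);
print "after 300 s: $state{lights}\n";    # light is off
```

New motion during the countdown simply reloads the timer, so the light stays on while the room is in use.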

Toolchain:

Wgc — Compiler (Perl)
.tgc source → .bgc bytecode (10–50 bytes)

Wgi — Interpreter
Disassembly + execution trace
Same C source as on STM32

Wgs — Simulator
Perl reference impl · full memory model
Mock sensors · timer simulation · 170 tests

Hubs — a Pantheon

Specialized RasPi hubs. Named by function, not by accident.

Raijin

⚡ Thunder god — energy: Victron, BMS, MPPT, 120 kWh batteries

Lucifer

💡 Light bearer — DALI lighting: 4 buses, scenes, presence simulation

Bragi

🎵 Norse god of poetry — multiroom audio, voice, AI assist

Gaia

🌿 Earth goddess — greenhouse, garden, pond, irrigation

Tyr

⚔️ God of war — ...you can guess.

No hub is a single point of failure. Each domain runs independently.

SELV-DALI — Lighting without mains

SELV = Safety Extra Low Voltage. Under 60V DC. Safe to touch.

The trick: Entire lighting chain runs from battery storage. 48V → 24V DC/DC → LED. No 230V AC anywhere.

DALI controls at 16V. Switches, sensors, dimmers — all SELV.

Inverters fail? Lights stay on — they bypass AC entirely.

Switch next to the bathtub? No problem. No electrician needed.

WHIP — Protocols & Integrations

Protocols

CAN bus 1Mbit Modbus TCP/RTU DALI MQTT SNMP I2C 1-Wire

Modbus

17 of 21 function codes · 869 tests · 91% coverage

30+ external integrations

Victron VRM · MasterTherm · PVGIS · Discord · Nextcloud · Proxmox · UniFi · ...

All protocol handlers in Perl · Mojolicious async I/O

WHIP — In production

Villa-A (Prague) — completely off-grid

  • 40 kWp solar · 120 kWh LiFePO4 · 3× Multiplus-II 10kVA
  • MasterTherm heat pump · capillary ceiling heating/cooling
  • DALI lighting across 4 buses · distributed CAN nodes

Villa-B (Germany) — same concept, different config

Two deployments = real generalization, not "works on my machine"

Invisible when it works. Competent when it matters. Built for decades, not warranties.

04 AI does Perl

Turning the predicate around.

pperl

PetaPerl  /  ParallelPerl

A Perl 5 interpreter — designed by humans.

Written in Rust — by many AI agents.

Serious — no toy or academic exercise.

A Perl 5 interpreter Platform — designed by humans.

pperl — Not the first attempt

Topaz

1999 · C++ rewrite · Chip Salzenberg · abandoned

B::C / perlcc

1996–2016 · Perl-to-C compiler · dead

cperl

2015–2020 · Perl 5 fork · Reini Urban · dormant

RPerl

Restricted Perl → C++ · Will Braswell · dormant

WebPerl

Perl 5 → WebAssembly · runs in browser · semi-active

PerlOnJava

Perl 5 on JVM · Flavio Glock · active — talk at this GPW!

Common failure mode: underestimating Perl 5's complexity

pperl — Scope

Perl 5.42 — ish

Compatibility: strive for maximum Perl 5 compliance, currently 5.42
Performance: strive for V8 levels

XS: no, but yes
Native Rust implementations, integral to the interpreter

Linux only — all architectures

We really don't care about use v5.xx

pperl — Status

22,000+ tests total

~61–400 failures — give or take

Performance: good, bad and ugly

Quotes from the AI

13095 pass (+25 from previous 13070), 31 fail (down from 46!). The File::Path native implementation not only works, it unblocked 15 previously-failing tests that depended on File::Path. Zero regressions.

pperl — Benchmarks

| Benchmark | perl5 | pperl | ratio |
|---|---|---|---|
| list_util::sum | 191.8K | 372.8K | 1.9x |
| list_util::min | 199.8K | 772.9K | 3.9x |
| list_util::max | 201.3K | 673.7K | 3.3x |
| list_util::product | 2.7M | 4.0M | 1.5x |

Native Rust implementations — not XS, not C

pperl — Beyond Perl5

Maximum compatibility. But more.

Autoparallelization — for/map/grep via Rayon · transparent · no threads pragma

JIT Compilation — Cranelift · hot codepath detection · native code at runtime

Auto-FFI — call any C library · no XS · no compilation · Peta::FFI namespace

Pre-Compile — .plc blobs · skip parsing · near-instant startup

Daemonize — emacs-style daemon/client · shared memory · zero cold start

Autoparallelization

Powered by Rayon — Rust's data-parallelism library

Work-stealing scheduler
Divides work into tasks, idle threads steal from busy ones — automatic load balancing

One-line change in Rust
.iter() → .par_iter() — same code, parallel execution

Guaranteed data-race freedom
If it compiles, it's safe. Rust's type system enforces this at compile time.

# This just works. In parallel.
my @results = map { expensive_computation($_) } @large_list;

# No threads. No MCE. No forks.
# pperl detects safe loops → Rayon handles the rest.

--parallel flag · list ≥ 1000 items · no shared mutation
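
To make the "no shared mutation" condition concrete, here is a plain Perl sketch of one loop that could be split across threads and one that could not. How pperl actually classifies loops is not described in the slides, so treat the comments as an interpretation, not as its analysis rules.

```perl
use strict;
use warnings;

my @large_list = (1 .. 2000);

# Candidate for parallel execution: each iteration reads only $_ and
# returns a value, so the result does not depend on evaluation order.
my @squares = map { $_ * $_ } @large_list;

# Not a candidate: every iteration updates the shared $running total,
# so iteration N depends on iterations 1 .. N-1.
my $running = 0;
my @prefix  = map { $running += $_ } @large_list;

print "$squares[-1] $prefix[-1]\n";   # 4000000 2001000
```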

JIT Compilation

Just-In-Time — compile to machine code while running

How it works in pperl:

  1. Interpreter runs normally — profiling hot paths
  2. Hot loop detected → lower to Cranelift IR
  3. Cranelift compiles IR → native machine code
  4. Next iteration runs as native code — zero dispatch overhead

Cranelift — the compiler backend behind Wasmtime and Rust's alternative codegen.
Production-proven. Targets: x86-64 · AArch64 · s390x · RISC-V

# pperl detects this as a hot loop pattern
my $sum = 0;
for my $i (1 .. 1_000_000) {
    $sum += $i;
}
# → Cranelift compiles to native machine code
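
For this particular loop the closed form n(n+1)/2 gives a cheap cross-check of whatever the interpreter or JIT produces. A small sketch in plain Perl:

```perl
use strict;
use warnings;

my $n   = 1_000_000;
my $sum = 0;
$sum += $_ for 1 .. $n;

my $expected = $n * ($n + 1) / 2;    # Gauss closed form
die "mismatch: $sum != $expected\n" unless $sum == $expected;
print "$sum\n";                      # 500000500000
```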

JIT — First Win

Inner loop JIT — single hot loop compiled to native code

| Benchmark | perl5 | pperl interpreted | pperl JIT | vs perl5 |
|---|---|---|---|---|
| Mandelbrot | 133ms | 1493ms | 41ms | 3.2× faster |
| Ackermann | 13ms | 630ms | 12ms | 1.1× faster |

The JIT fired and the test passes! The answer is correct (500000500000).

Good. But only the innermost loop is compiled. What about nested loops?

$py = 0;
while ($py < $height) {
    $y0 = $y_min + $py * $y_step;
    $row_off = $py * $width;
    $px = 0;
    while ($px < $width) {
        $x0 = $x_min + $px * $x_step;
        $zr = 0.0; $zi = 0.0; $iter = 0;
        while ($iter < $max_iter) {
            $r2 = $zr * $zr;
            $i2 = $zi * $zi;
            last if ($r2 + $i2 > 4.0);
            $zi = 2.0 * $zr * $zi + $y0;
            $zr = $r2 - $i2 + $x0;
            $iter++;
        }
        $frame[$row_off + $px] = $color_lut[$iter];
        $px++;
    }
    $py++;
}

JIT — The Code

Mandelbrot set
Triple-nested while loop
19 variables · float arithmetic

Pure Perl.
No XS. No Inline::C.
No tricks.

JIT — Full Nested

All 3 loop levels compiled as one native function

| Mandelbrot 1000×1000 | perl5 | pperl interpreted | pperl JIT | vs perl5 |
|---|---|---|---|---|
| Wall time | 12,514ms | | 163ms | 76× faster |

200 million escape iterations of float arithmetic.
19 variables, 3 loop levels — Cranelift register-allocates across all of them.

Perl. With JIT. That's a sentence nobody expected.

Autoparallel JIT — Full Win

JIT + Rayon: compile to native, then split across cores

| Mandelbrot | perl5 | pperl JIT | pperl JIT + 8 threads | vs perl5 |
|---|---|---|---|---|
| 1000×1000 | 12,514ms | 163ms | 29ms | 431× faster |
| 4000×4000 | ~200s | 2,304ms | 342ms | ~580× faster |

JIT alone: 76×. Adding 8 threads: another ~7× on top.
user 2.6s vs real 0.34s — near-linear scaling across cores.
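
The ratios can be recomputed from the wall times in the table above. This sketch truncates the overall speedups the way the slide appears to, and treats the "~200s" perl5 figure as exactly 200,000 ms, which is an approximation.

```perl
use strict;
use warnings;

# Speedup of a new wall time relative to a baseline wall time.
sub speedup { my ($base, $new) = @_; return $base / $new }

# Wall times in milliseconds, copied from the table above.
my %ms = (
    '1000x1000' => { perl5 => 12_514,  jit => 163,   jit8 => 29  },
    '4000x4000' => { perl5 => 200_000, jit => 2_304, jit8 => 342 },   # perl5 is "~200s"
);

for my $size (sort keys %ms) {
    my $t = $ms{$size};
    printf "%s: JIT %dx, JIT + threads %dx, threads on top of JIT %.1fx\n",
        $size,
        speedup($t->{perl5}, $t->{jit}),
        speedup($t->{perl5}, $t->{jit8}),
        speedup($t->{jit},   $t->{jit8});
}
```

On these wall times the threads contribute roughly 5.6× (1000×1000) and 6.7× (4000×4000) on top of the JIT.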

Demo Time!

Auto-FFI

No XS. No Inline::C. No compilation. Just call C.

# Layer 0 — Raw: any library, you provide type signatures
use Peta::FFI qw(dlopen call);
my $lib = dlopen("libz.so.1");
my $ver = call($lib, "zlibVersion", "()p");
say "zlib: $ver";    # 1.3.1
# Layer 1 — Pre-baked: curated signatures, zero ceremony
use Peta::FFI::Libc qw(getpid strlen strerror uname);
say strlen("hello");           # 5
my @info = uname();
say "$info[0] $info[2]";      # Linux 6.18.6-arch1-1

Pack-style type codes: (p)L = strlen(const char*) → size_t
50+ native Rust modules already built in — Auto-FFI extends to everything else
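
The type-code strings can be read as "argument codes in parentheses, then one return code". The sketch below splits such a string in plain Perl. The grammar it accepts and the `parse_sig` helper are assumptions for illustration; how Peta::FFI parses signatures internally is not shown in the slides.

```perl
use strict;
use warnings;

# Split a signature such as "(p)L" into argument codes and a return code.
# Assumed grammar: letters only, exactly one return code.
sub parse_sig {
    my ($sig) = @_;
    my ($args, $ret) = $sig =~ /\A\(([A-Za-z]*)\)([A-Za-z])\z/
        or die "bad signature: $sig\n";
    return { args => [ split //, $args ], ret => $ret };
}

for my $sig ('(p)L', '()p') {
    my $p = parse_sig($sig);
    printf "%-5s args=[%s] ret=%s\n",
        $sig, join(',', @{ $p->{args} }), $p->{ret};
}
```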

Auto-FFI — Details

Powered by libffi — any signature works, no pre-generated stubs

| Layer | Scope | Mechanism |
|---|---|---|
| Raw (Layer 0) | Any .so on the system | dlopen + dlsym + libffi call frame |
| Pre-baked (Layer 1) | libc, libuuid, ... | Direct Rust libc::* calls — zero overhead |
| Discovery (Layer 2) | System-wide scan | scan() → hashref of { soname => path } |

# Layer 2 — What's on this system?
use Peta::FFI qw(scan dlopen call);
my $libs = scan();
say scalar(keys %$libs), " libraries found";
if (exists $libs->{"libz.so.1"}) {
    my $z = dlopen("libz.so.1");
    say "zlib: ", call($z, "zlibVersion", "()p");
}

Libc: ~30 functions (process, strings, env, math, file, time)
UUID: 6 functions via dlopen — dies with install hint if missing

Bytecode Cache (.plc)

Like Python's .pyc — but for Perl. Opt-in.

# Default: no caching (safe for development)
$ pperl script.pl

# Enable: compile once, load from cache on subsequent runs
$ pperl --cache script.pl

# Invalidate all caches
$ pperl --flush

First run: parse → codegen → execute → save .plc

Second run: load .plc → execute (no parsing, no codegen)

Bytecode Cache — Details

Storable-model: bincode deserializes directly to final runtime types. Zero intermediate conversion.

| Benchmark | perl5 | pperl | pperl --cache |
|---|---|---|---|
| three_modules | 22.3ms | 12.6ms | 9.9ms |
| mixed_native_fallback | 26.3ms | 13.0ms | 10.0ms |
| deep_deps | 18.1ms | 13.1ms | 9.9ms |

Net module-loading cost: 33–37% faster with cache. Biggest win on fallback modules. Native Rust modules already near-zero cost.

SHA-256 keyed · mtime + version validation · aggressive format versioning
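
A minimal sketch of what "SHA-256 keyed · mtime + version validation" could look like, written in plain Perl with core modules. What pperl actually hashes and how it lays out cache metadata is not stated in the slides; hashing the source path, the `$FORMAT_VERSION` constant, and all function names here are assumptions.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);
use File::Temp qw(tempfile);

my $FORMAT_VERSION = 1;    # hypothetical cache format version

# Cache key: SHA-256 over the source path (assumed input).
sub cache_key { my ($path) = @_; return sha256_hex($path) }

# Metadata recorded when the cache entry is written.
sub cache_meta {
    my ($path) = @_;
    return { mtime => (stat $path)[9], version => $FORMAT_VERSION };
}

# An entry is usable only if format version and source mtime still match.
sub cache_valid {
    my ($path, $meta) = @_;
    return $meta->{version} == $FORMAT_VERSION
        && $meta->{mtime}   == (stat $path)[9];
}

my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "print 'hello';\n";
close $fh;

my $meta = cache_meta($file);
print "key:   ", cache_key($file), "\n";
print "fresh: ", (cache_valid($file, $meta) ? "yes" : "no"), "\n";

utime 0, 0, $file;         # simulate an edit by changing the mtime
print "stale: ", (cache_valid($file, $meta) ? "no" : "yes"), "\n";
```

Bumping the format version invalidates every existing entry at once, which fits the "aggressive format versioning" note.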

Daemonize

Emacs-style daemon/client model

$ pperl --daemon script.pl   # compile, warm up, listen
$ pperl --client script.pl   # connect → fork → run → respond
$ pperl --stop   script.pl   # clean shutdown

First run: parse → codegen → execute (warm-up) → listen
Client request: connect → fork() → child inherits arenas → execute → respond

fork() gives each client a fresh address space
with all arenas already mapped — zero I/O, zero parsing, zero deserialization
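
The fork-per-request idea can be tried in plain Perl. In this sketch a hash stands in for the warmed-up arenas and a pipe stands in for the client connection; the names are illustrative and the real daemon's socket protocol is not modelled.

```perl
use strict;
use warnings;

$| = 1;    # keep STDOUT unbuffered so children do not re-flush parent output

# State built once during warm-up in the parent (stand-in for compiled arenas).
my %warm = (compiled => 1);
my $requests_seen = 0;

# Handle one request in a forked child and return its reply via a pipe.
sub handle_request {
    my ($name) = @_;
    pipe(my $reader, my $writer) or die "pipe: $!";
    my $pid = fork() // die "fork: $!";
    if ($pid == 0) {                    # child: inherits %warm copy-on-write
        close $reader;
        $requests_seen++;               # mutation stays inside the child
        print {$writer} "$name: compiled=$warm{compiled} seen=$requests_seen\n";
        close $writer;
        exit 0;
    }
    close $writer;
    my $reply = <$reader>;
    close $reader;
    waitpid($pid, 0);
    return $reply;
}

print handle_request('first');
print handle_request('second');
print "parent still sees $requests_seen requests\n";
```

Both children report `seen=1` and the parent still reports 0: each request starts from the same warmed state and nothing it changes survives into the next one.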

Daemonize — Details

| Benchmark | perl5 | pperl | --cache | --daemon |
|---|---|---|---|---|
| 5 native modules | 15.0ms | 4.3ms | 4.3ms | 4.6ms |
| fallback + native mix | 23.5ms | 15.8ms | ~10ms | 5.0ms (3.2×) |

Eliminates both startup costs: process creation (~3-4ms) + module compilation (0-15ms)
Faster than bytecode cache — no deserialization, arenas are already in memory

Unix domain socket · JSON wire protocol · copy-on-write pages via fork()

Daemonize — Prior Art

| Solution | Scope | Isolation | State leakage | Status |
|---|---|---|---|---|
| PPerl | General CLI | None | Yes | Dead (2004) |
| SpeedyCGI | CGI | None | Yes | Dead (2003) |
| mod_perl | Apache | Per-child | Per-request | Maintained |
| Starman | PSGI | Per-worker | Per-request | Maintained |
| FastCGI | Web | Per-process | Per-request | Maintained |
| pperl daemon | General CLI | Per-request (fork) | None | Active |

All prior solutions: same interpreter across requests — state leakage by design
pperl: fresh child per request via fork() — compiled arenas via COW, clean runtime state

Future pperl

Seamless GPU — restricted Perl → OpenCL/HIP/Vulkan/CUDA kernel · same code, GPU execution

pperl-mini — tailored and scaled down versions. Maybe on a Raspberry Pico one day?

pperl-compiler — Maybe code running on a STM32 one day?

When to use pperl

Good fit:

  • Workloads that benefit from JIT and/or autoparallelization
  • Scripts using native builtins (50+ Rust modules, fast)
  • Fast startup — inherently ~2× faster than perl5, plus --cache
  • pperl-specific features: Auto-FFI, Daemonize, Bytecode Cache
  • Security: different codebase — unlikely to share CVEs with perl5
  • Smaller, less complex scripts

Not yet:

  • Large, complex codebases — edge cases where pperl differs from perl5
  • We strive for maximum compatibility, but we're not 100% there yet

Rule of thumb: the longer and more complex the script,
the more likely you hit a corner case. If you don't want to touch the code — use perl5.

Correctness Case Study

How serious is "maximum compatibility"?

The bug: $, (OFS) vs $\ (ORS) in print

pperl checked both with the same flag mask. Perl5 doesn't.

perl5 — $, (OFS)

if (SvGMAGICAL(ofs) || SvOK(ofs))

Checks get-magic AND ok-flags

perl5 — $\ (ORS)

if (PL_ors_sv && SvOK(PL_ors_sv))

Checks ok-flags only. No get-magic.

pperl had:

// Same mask for both — SVS_GMG included for ORS. Wrong.
if flags & (SVF_IOK | SVF_NOK | SVF_POK | SVF_ROK | SVS_GMG) != 0

Practical impact: near zero.

To trigger this, you'd need a tie on $\ whose FETCH returns undef, while the underlying SV has get-magic set but none of IOK/NOK/POK/ROK — and then call print. Nobody writes this. Nobody has ever written this.

We fixed it anyway.

The depth of compatibility is the product's guarantee.
