The Port I Couldn't Ship

Original link: https://ammil.industries/the-port-i-couldnt-ship/

Inspired by Simon Willison's success using Claude to bring a legacy library to the web, the author attempts the same with Graph::Easy, a Perl library for generating ASCII flowcharts. The initial goal, a web app using WebPerl to showcase the library's charming, portable diagrams, succeeds beyond expectations. The author then pursues a more ambitious goal that ultimately fails: porting Graph::Easy to TypeScript using large language models (LLMs). Despite early optimism, repeated attempts (various prompting strategies, test-driven development, even splitting the task across multiple LLM "agents") never accurately reproduce the original Perl output. The core problems are the library's deep-rooted complexity, accumulated over decades, and the LLMs' inability to master the spatial reasoning that correct ASCII art demands. The author concludes that replicating years of careful development with coding agents is both disrespectful to the craft and fundamentally unrealistic, highlighting the limits of current LLMs when faced with nuanced, mature codebases.


In October I read Simon Willison’s account of bringing a 2001 Perl (and C) library to the web using Claude Code. Back in 2022 I wrote about a fantastic little Perl library called Graph::Easy that renders flowcharts as ASCII art.

Inspired by Willison, I set about bringing Graph::Easy to the web.1

Why Graph::Easy?

Here’s an example ASCII diagram generated with Graph::Easy.
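Take the minimal case from the library’s own documentation. The input

    [ Bonn ] -> [ Berlin ]

renders as:

    +------+     +--------+
    | Bonn | --> | Berlin |
    +------+     +--------+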

These diagrams are inherently portable, timeless, and to my eye more charming than any modern visualisation alternatives. Thirty years on, Graph::Easy remains unmatched at this specific problem.2

It was supposed to be so Graph::Easy

Like Willison, I was unsurprised yet delighted to find that Claude had no issue using WebPerl to make Graph::Easy run locally in the browser. With a small amount of prodding and guidance I had a slick little webapp that showcased the library.

[Screenshot: the Graph::Easy web interface rendering the Seven Bridges of Königsberg problem, with Graph::Easy syntax in the left panel and the resulting ASCII diagram of dotted-line boxes on the right.]
Is it a real side project if you don't buy a new domain name?

At this point I should have put the finishing touches to the web app, shipped it and written this blog post.

But I didn’t. I got greedy.

Chasing my own tail

One thing irked me. The WebPerl interpreter took a few seconds to initialise.

“You know what?” I thought. “I bet Claude could do a good job simply porting the library over to a different language.”

The early success of the WebPerl app convinced me the model could handle more than it could. I was lazy, curious, and assuming the jagged frontier would smooth out if I pushed hard enough.

And thus started a journey both fruitless and frustrating.

Attempt 1 - one-shot and done?

I didn’t, at first, understand the enormity of the task I had set. I asked Claude to port the Perl library to TypeScript in a new branch and let it rip.

LLMs are notoriously untrustworthy. They require scrutiny, oversight and a healthy scepticism to spot the cases where they gloss over inconvenient implementation details.

I’m human though – it’s hard to calibrate my response to their enthusiasm. Claude seemed to have the implementation in hand; who was I to question it?

Questioning came naturally to me, however, when the generated ASCII graphs from my newly vibed, black-box implementation completely failed to match the original Perl versions.

Expected:

Actual:

Attempt 2 - TDD

After my initial punt I knew I had to be smarter. Graph::Easy has over 100 reference tests, which I wired up to approximate “Test Driven Development”. Graph description in, ASCII output out, compare it to the reference example. Easy.
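The harness is barely a screenful. A sketch of the shape in TypeScript, where the t/reference layout and the renderAscii() entry point are hypothetical stand-ins for the port’s actual surface:

    // Reference-test loop: graph description in, ASCII out, compare.
    // Assumes each reference file holds an input graph and its expected
    // rendering, separated by a marker line.
    import { readFileSync, readdirSync } from "node:fs";
    import { join } from "node:path";
    import { renderAscii } from "./src/graph-easy"; // hypothetical port API

    const dir = "t/reference";
    let failures = 0;
    for (const file of readdirSync(dir).filter((f) => f.endsWith(".txt"))) {
      const [input, expected] = readFileSync(join(dir, file), "utf8").split("\n===\n");
      const actual = renderAscii(input);
      if (actual.trimEnd() !== expected.trimEnd()) {
        failures += 1;
        console.error(`FAIL ${file}\n--- expected ---\n${expected}\n--- actual ---\n${actual}`);
      }
    }
    console.log(failures === 0 ? "all green" : `${failures} failing`);
    process.exit(failures === 0 ? 0 : 1);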

At this point most of these tests were failing, but the few basic ones that passed gave me misplaced confidence that I could crank the LLM machine and get a green test bench within a few hours. In hindsight this was idealistic.

LLMs don’t see ASCII art the same way we do. They see strings made up of letters, punctuation and newlines. The spatial relationships that make Graph::Easy’s output dense and clarifying are invisible to them.

I dabbled with letting Claude generate screenshots of the reference inputs and test outputs so that its multimodal capabilities could do the work, but this felt slow, and comparing pixels seemed counterproductive when the problem was characters in a rectangle.
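Plain text, at least, is trivial to diff mechanically. A cell-by-cell comparison – a sketch, not anything Graph::Easy ships – points at exactly where two renderings diverge:

    // Compare two ASCII renderings as character grids and report the
    // first mismatching cell as row/column coordinates.
    function firstMismatch(expected: string, actual: string): string | null {
      const exp = expected.split("\n");
      const act = actual.split("\n");
      for (let row = 0; row < Math.max(exp.length, act.length); row++) {
        const e = exp[row] ?? "";
        const a = act[row] ?? "";
        for (let col = 0; col < Math.max(e.length, a.length); col++) {
          // Treat cells beyond a line's end as spaces.
          if ((e[col] ?? " ") !== (a[col] ?? " ")) {
            return `row ${row}, col ${col}: expected "${e[col] ?? " "}", got "${a[col] ?? " "}"`;
          }
        }
      }
      return null; // identical, modulo trailing whitespace
    }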

Attempt 3 - Separation of Concerns

I tried again from scratch, following the current approximation of prompt-engineering best practice and actually using my brain to reduce the complexity:

  • I reduced the scope and made sure we weren’t trying to replicate features we didn’t need (for example, SVG rendering and colour string parsing)
  • I split the work into parsing, layout, and rendering streams (see the sketch after this list) and was assured by Claude that these were independent (when will I learn not to trust the models?). I figured that the context of each agent would do a better job isolated from the concerns of others.
  • I interrogated each agent about pitfalls and refined the approach until I couldn’t spot holes
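On paper the split looked clean. In TypeScript terms it was something like the sketch below – every name here is hypothetical, and as I later learned, the real Graph::Easy internals respect none of these boundaries:

    // Hypothetical module boundaries for the three agents.
    interface GraphAst {
      nodes: { name: string; attributes: Record<string, string> }[];
      edges: { from: string; to: string; label?: string }[];
    }
    interface Layout {
      // Grid positions for nodes, plus the cells each edge routes through.
      positions: Map<string, { x: number; y: number }>;
      paths: Map<string, { x: number; y: number }[]>;
    }
    // Agent 1: parse the Graph::Easy text format into an AST.
    declare function parse(input: string): GraphAst;
    // Agent 2: place nodes on a grid and route edges between them.
    declare function layout(ast: GraphAst): Layout;
    // Agent 3: paint boxes and edge characters into a character grid.
    declare function render(laidOut: Layout): string;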

Running multiple agents at once gave a fresh contact-high of apparent productivity. But when I asked a final agent to merge the work and test the output, I was dismayed (but low-key unsurprised) to find the new run still failed on even the most basic of tests.

I took a long-overdue peek at the source codebase. Over 30,000 lines of battle-tested Perl across 28 modules. A* pathfinding for edge routing, hierarchical group rendering, port configurations for node connections, bidirectional edges, collapsing multi-edges. I hadn’t expected the sheer interwoven complexity.
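Any one of those features is a project in itself. A toy of just the routing step gives a sense of it – A* over the drawing grid, minus the ports, groups and multi-edges that make the real router hard:

    // Toy A* on a 4-connected grid: route one edge around occupied cells.
    type Cell = { x: number; y: number };

    function route(
      width: number,
      height: number,
      blocked: Set<string>, // occupied cells as "x,y" keys
      start: Cell,
      goal: Cell,
    ): Cell[] | null {
      const key = (c: Cell) => `${c.x},${c.y}`;
      const h = (c: Cell) => Math.abs(c.x - goal.x) + Math.abs(c.y - goal.y);
      const open = [{ cell: start, g: 0, f: h(start), path: [start] }];
      const seen = new Set([key(start)]);
      while (open.length > 0) {
        open.sort((a, b) => a.f - b.f); // naive stand-in for a priority queue
        const current = open.shift()!;
        if (current.cell.x === goal.x && current.cell.y === goal.y) return current.path;
        for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
          const next = { x: current.cell.x + dx, y: current.cell.y + dy };
          const k = key(next);
          if (next.x < 0 || next.y < 0 || next.x >= width || next.y >= height) continue;
          if (seen.has(k) || blocked.has(k)) continue;
          seen.add(k);
          const g = current.g + 1;
          open.push({ cell: next, g, f: g + h(next), path: [...current.path, next] });
        }
      }
      return null; // boxed in: no route for this edge
    }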

Attempt 4 - Shopping around

At this point I was desperate and resented my teammate. Claude’s peppy enthusiasm was grating and the whole project was wearing me down. Over the course of the project it appeared to gain a new skill of being self-aware about its own token use. Recognising the enormity of the task ahead of it, it would seek reassurance after every loop of work.

My guess is that this is an attempt to combat user frustration and perceived degraded capabilities of the models once the context window is saturated. It’s not unique to Claude.

The reward hacking was another problem. If I didn’t pay attention, Claude did everything possible to avoid doing the work. I was fatigued from making demands in the CLAUDE.md that it appeared to be ignoring anyway.

Cursor 2.0 was released mid-spiral and gave me an easy way to simultaneously apply multiple models to the problem. I was drawn to the recently unmasked, zippy Composer model – as if speed was the biggest blocker here.

I found that GPT-Codex-High appeared to read the Perl source more consistently and could explain why certain Perl idioms wouldn’t translate. This bought me another few days of false hope.

Attempts 5, 6, ∞

I don’t recall what happened next. I think I slipped into a malaise of models. 4-way split-paned worktrees, experiments with cloud agents, competing model runs and combative prompting. Drowning in stale markdown plans with no idea who or what had written them, documentation that was supposedly self-updating but never was.

Rituals of asking the models to ensure the documentation was up to date (despite the hooks that were supposed to do that for me) grew tiring and felt futile.

I surfaced for air and gazed back at the mess of incomplete ports.

Graph::Easy, Hard::won

I finally understood: Graph::Easy earned its complexity through decades of tweaking and development. Chewing up decades of careful work and spitting it out with a gaggle of coding agents is disrespectful to the craft.

I spent weeks casually trying to replicate what took years to build. My inability to assess the complexity of the source material was matched by the models’ inability to understand what they were generating.

A reader (or dare I say a wiser version of me), armed with a future model and dedicated to the task, will succeed with this port where I failed and that makes me uneasy.

