我被付了最低工资来解决一个不可能的问题。
I got paid minimum wage to solve an impossible problem

原始链接: https://tiespetersen.substack.com/p/i-got-paid-minimum-wage-to-solve

## 过度优化的陷阱:一个宏大的故事 一位计算机科学专业的学生,被要求清扫一家超市的地面,忍不住运用他的技能来“优化”这个过程。他将地面平面图转化为网格,构建了一个可视化编辑器,并用模拟退火算法编写了一个C++路径优化器——一种旨在寻找高效路线的复杂算法。 然而,最初的“优化”路径却是一团混乱的急转弯,对于人类清扫工来说完全不实用,尽管它是最短的距离。这凸显了一个关键的缺陷:他优化了*错误*的指标。距离不如机动性和理智重要。 在算法中添加“转向惩罚”产生了一条更现实、更易于行走的路径,展示了纯粹效率与可用性之间的权衡。这次经历变成了一个更广泛的教训:算法可以完美地实现*错误*的目标。 就像社交媒体算法最大化参与度(往往以牺牲真相和福祉为代价),或者LLM优先考虑听起来自信的答案而不是准确性一样,优化容易衡量的指标并不能保证积极的结果。关键不在于你*如何*优化,而在于你优化*什么*。最终,这位学生意识到,在解决错误问题时,技术上的完美毫无用处,有时,像正常人一样清扫地面才是最好的方法。

相关文章

原文

Sweeping the entire Albert Heijn floor. Sounds simple. And should’ve been simple.

But I’m a Computer Science student, with a problem: I can’t stop trying to optimize things that (probably) don’t need optimizing.

So instead of just doing my job and, well… sweeping… I did what any “reasonable” person would do: I turned the supermarket floor plan into a grid graph, built a visual editor and wrote a C++ path optimizer using simulated annealing.

But before we dive into how this went spectacularly wrong, and how this made me realize how this makes everyone miserable, I need you to answer a quick question:

If you were to take over my job for one day (I wouldn’t recommend it but hypothetically speaking) and needed to sweep the entire Albert Heijn floor, would you take path A or B?

Seriously. Look at them. Which one seems more efficient for sweeping a supermarket floor?

If you picked path A: congratulations, you think like an algorithm and are most likely a robot. (Good luck with CAPTCHA questions.)

But you are technically right. Path A is shorter by distance. It is absolutely useless however.

Look at those turns. Actually imagine for a second that you would walk around taking those turns. You’d look insane, like some Roomba having a seizure.

Path A is what happens when you optimize for the wrong thing.

Which, spoiler alert, is the entire point of this story. But we’ll get there, let me first explain how we got here:

First, I took the Albert Heijn floor plan and converted it into a grid. Each tile is either empty (should be swept) or an obstacle (wall, checkout counter, yoghurt package somebody threw on the ground).

I built a visual editor in Processing (a Java tool for people who like making things look cool), so I could easily map out the store and export the resulting graph.

Converting the floorplan into the grid structure was therefore quite easy.

The tiling of the actual floor helped to subdivide the area into bite-sized chunks.

This could then easily be converted into a network structure (also called a graph), by interpreting each tile as a node and then connecting them to neighboring tiles.

As you can see, I allowed for horizontal and vertical movement, as well as diagonal movement (as long as you don’t fly through walls).

The only thing to do next was to find a cycle through this network while making sure to visit all nodes (tiles). This would then be the solution to my sweeping problem.

(This problem is also called the Traveling Salesman Problem, see article for more details and why it is so hard to “solve”.)

Since it is computationally impossible to find the best path in a graph of this size, we have to resort to heuristics. Heuristics basically try to find a very good solution in a short time, instead of trying to find the perfect solution (which is more or less impossible).

So I implemented the path optimizer in C++.

The underlying heuristic algorithm: simulated annealing.

If you’re not familiar, simulated annealing is essentially trying a bunch of little changes (also called local moves).

At first, you just accept every little change (even if it makes the path worse), but throughout the algorithm you slowly get more picky and at the end only allow changes that strictly improve the path.

This is inspired by how metals cool down. By starting with a high temperature (just trying different moves) to explore a lot, and then gradually cooling down to settle into a low energy state (close to optimal).

Watch this gif. See how it start chaotic and gradually settles into something stable? That’s simulated annealing doing its thing.

For the local move, I used the 2-opt move. You take two edges in your path, remove them, and reconnect them in a different way. If this tiny change makes the path better, keep it. If not, either keep it (if the temperature is still high) or discard it.

Then just do that 1 billion times. Or well… let your computer do that 1 billion times.

After letting it run for a while, I got my first “optimized” path. Here’s what it came up with:

Look at it. That path has more sharp turns than a Christopher Nolan movie. There is no way anybody is crazy enough to actually sweep like this. You would probably throw up afterwards.

Technically, it covers the entire floor. Technically, it’s (almost) the shortest sweeping path. Technically, it’s perfect.

There are some good parts to it, but practically, it’s absolutely useless.

The algorithm did exactly what I asked for. (Thankfully, imagine if it would just do something else entirely, that would be scary.)

I just asked it the wrong question.

I quickly realized I was optimizing for the wrong thing. Distance isn’t everything.

Turns matter. Momentum matters. Not looking like a malfunctioning robot matters.

So I added a “turn penalty” to the cost function and asked it to also minimize that. Basically telling the algorithm: “Turning 90 degrees costs you extra points. Turning 180 degrees? You are out of your mind.”

This resulted in smoother routes, even if it makes the distance slightly longer.

Look at that. It’s actually… walkable. You could give this path to an actual human being and they wouldn’t quit on the spot.

We are not optimizing for distance anymore. We are optimizing for reality.

Here’s where it gets fun.

You can adjust the penalty for sharp turns. This acts as a slider between “pure efficiency” and “actually useful”.

You can literally see the trade-off. As you increase the penalty, the path gets smoother but a little longer. As you decrease it, you get efficiency but chaos.

Which path you choose is entirely up to you. It depends on things like, how easy it is for you to turn, if total distance is a priority or not, and how much dizziness you can tolerate.

This isn’t just about sweeping floors however.

This is about everything.

Social media algorithms optimize for engagement. They’re really good at it. The problem?

Engagement ≠ happiness. Engagement ≠ truth. Engagement = clicks, screen time, rage and reaction.

Consequences? Outrage, misinformation, doomscrolling, anxiety.

The algorithms are working perfectly. It’s doing exactly what it was designed to do. The cost function is just wrong. (Instagram would probably think otherwise.)

Recommendation algorithms optimize for watch time and click-through rates. Your grandma is binge-watching conspiracy theories on YouTube for 6 hours.

The algorithm crushed it. She feels like shit.

No surprises there.

Even LLMs (Large Language Models) like ChatGPT optimize for the wrong thing. They optimize for sounding confident. For sounding like they know the answer.

Not for being right. Not for being honest.

They’re trained to complete patterns, not to say “I don’t know”. So they just guess. Without any shame and with perfect grammar.

This even applies to things outside of tech.

Think of businesses. Most of them optimize for profit. Earth, the environment, morals or ethics? They are not integrated into the cost function, so won’t be optimized.

Did I use this optimized path at my actual job?

No. Obviously not. I just swept the floor like a normal person.

But building this taught me something that I think about constantly: technical correctness is worthless if you’re solving the wrong problem.

You can write perfect code. You can build flawless systems. You can optimize the sh*t out of your cost function. And you can still end up with something that sucks.

The important part isn’t the optimization algorithm. The important part is figuring out what you should be optimizing for in the first place.

Most of the time, we don’t even ask that question. We just optimize for whatever’s easy to measure and hope it works out.

Spoiler: it probably doesn’t.

If you learned something from this, great. If you just enjoyed watching me over-complicate a sweeping job, also great.

Either way, thanks for reading about my attempt to optimize a task that absolutely did not need optimizing.

Want more experiments like this? More algorithms, interesting tech and the occasional rant about productivity and the attention economy? Subscribe for free below! (No spam, only 1/2x per month)

GitHub repository (code): here

联系我们 contact @ memedata.com