So why is this so hard? Why do all these weird edge cases pop up whenever we do layout? I think the issue is that layout actually is quite hard.
Here, let me give you an example. Here's a line from my formal semantics of CSS 2.2, the specification of text-align: center:
[(is-text-align/center textalign) (= (- (left-outer f) (left-content b)) (- (right-content b) (right-outer l)))]
This is saying that if a container b has text-align: center specified, then its left gap (the x position of the left outer edge of its first child f, minus the left content edge of the container b) equals its right gap (similarly, using right edges and the last child l, with the subtraction reversed). It's a constraint! But then if you go a few lines up you'll see that before actually applying this constraint, we first check if the container is even big enough to contain the content, and if not, we left align it no matter what.
What? Really? Yep. It's a tricky little quirk of the CSS semantics, section 9.4.2 of CSS 2.1. (Actually, I think the controlling standard on this exact quirk is now CSS Text Level 3, which has a quite clear paragraph documenting this behavior.) If text is centered inside a box too small to contain it, we don't want it spilling out the left edge (it might go off-screen, where the user cannot scroll); left-aligning ensures it only spills out on the right.
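To make the rule concrete, here is a minimal Python sketch of the centering constraint together with its overflow fallback. It is not the formal semantics quoted above: the function name `centered_line_offset` is made up for illustration, and it assumes a single line of left-to-right text with no margins, padding, or floats involved.

```python
def centered_line_offset(container_width: float, content_width: float) -> float:
    """Return how far the content's left edge sits from the container's
    left content edge under a simplified `text-align: center`.

    Hypothetical helper: single line, left-to-right text, no margins.
    """
    if content_width > container_width:
        # The content does not fit: fall back to left alignment so that
        # any overflow spills out on the right only.
        return 0.0
    # The content fits: make the left gap equal to the right gap.
    return (container_width - content_width) / 2


print(centered_line_offset(200, 120))  # 40.0 (40 on the left, 40 on the right)
print(centered_line_offset(200, 260))  # 0.0  (overflows by 60, all on the right)
```

Without the first branch, the second call would return -30, pushing the text 30 units past the left edge, which is exactly the case the quirk exists to prevent.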
That's a funky quirk, but you may never have noticed it, and if you did, this edge case was probably better than what the layout would otherwise have been. Meaning, actually, building this edge case into the definition of text-align was a smart choice by the CSS designers, embedding hard-earned design wisdom into intuitive rules that people mostly use without issue. (text-align is not considered one of the bad scary parts of CSS.) By contrast, in a "clean" constraint-based system, web page designers would probably not have bothered to manually add this as a constraint, and in quite a few cases that would probably result in worse, not better, layout.
Generalizing a bit, the challenge is that we're never just "deciding what our page looks like". We are always designing a layout that is responsive to parameters like screen size, zoom level, details of font rendering (Windows and macOS render identical fonts slightly differently), operating system details, and even higher-level changes in our application like new content, new features, translations to other languages (German words are very long, Chinese ones are very short), device oddities like notches, and so on.
Designing a layout from scratch that looks good in any possible one of those contexts is basically impossible—it's hard enough to do both desktop and mobile!—and so designers are, by necessity, going to rely on implicit knowledge encoded somewhere on what to do in edge cases. There's going to be a huge amount of this implicit knowledge, and whether it's encoded in rules or weights or optimization criteria it's going to be opaque to designers and surprising at least sometimes.
For example, in CSS you can also justify text, stretching the spaces between words so that all lines in a paragraph (except the last!) have the same right edge. But, famously, if the line is too narrow and contains too few words that are too long, then the spaces between them get stretched comically far apart and it looks terrible. You can do better by enabling hyphenation (which might turn 2 really long words in a line into 3 or more moderate-size word chunks) or letter-spacing (which might also stretch the spaces between letters slightly), but those are themselves unpredictable and language-dependent and still sometimes look ugly.
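The arithmetic behind the "comically far apart" problem is simple enough to sketch. The Python snippet below is an illustration only, not how any browser actually justifies text: `justified_gap` is a hypothetical helper, the widths are in arbitrary units, and it assumes the words have already been assigned to the line and fit inside it.

```python
def justified_gap(line_width: float, word_widths: list[float]) -> float:
    """Width of each inter-word gap when a line is stretched to fill line_width.

    Hypothetical helper: assumes the words were already placed on this line
    and that their total width does not exceed line_width.
    """
    if len(word_widths) < 2:
        raise ValueError("need at least two words to have a gap to stretch")
    # All leftover space is shared equally among the gaps between words.
    leftover = line_width - sum(word_widths)
    return leftover / (len(word_widths) - 1)


# Five short words in a 200-unit column: four modest gaps.
print(justified_gap(200, [30, 25, 40, 35, 30]))  # 10.0

# Two long words in the same column: one gap absorbs all the slack.
print(justified_gap(200, [90, 80]))  # 30.0
```

Hyphenation helps precisely because it raises the number of chunks on the line, which both uses up more of the leftover width and increases the number of gaps that share what remains.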
So where did all these implicit rules about justification come from? Well, text justification comes from a long Western tradition. (That post claims Trajan's column, built 113 AD, as an early example.) So what happened in that long Western tradition, before CSS and computers, when this problem came up? Well, in the olden days, if you were a newspaper columnist and your column was ugly when justified (most newspapers justified their text and also ran it in lots of narrow columns, so it was especially a problem with newspapers), then your editor might just rephrase your text using smaller words. That technology might be slowly becoming possible but is clearly outside the bounds of what CSS would do. Generalizing, these implicit rules often draw from traditions where these edge cases simply never came up! So it's no surprise that there might be no ideal, workable set of implicit rules with no edge cases.