(comments)

Original link: https://news.ycombinator.com/item?id=41024664

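Web browsers such as Google Chrome render web pages using the GPU rather than the CPU alone. Each layer of a page is rendered separately into a pixel map, and the layers are then composited to produce the final displayed image. This approach handles static components efficiently and avoids needlessly re-rendering parts that move but do not change. However, when there are multiple layers of semi-transparent elements, browser performance can suffer because of the complexity of blending the images in a specific order so that the layered content is represented accurately on screen. As a result, many developers aim to minimize the use of transparency and to optimize element ordering during design. Although web browsers use GPU rendering, they differ fundamentally from video game engines, which continually redraw every visual element on screen with each new frame. That constant updating allows fast response with almost no latency during gameplay; browsers, by contrast, can run into serious performance problems if they over-render dynamic, highly complex pages. In addition, game engines typically handle opaque and transparent geometry separately, which allows optimization strategies tailored to each type.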

> However, using a transparent color significantly slowed down the number that can be drawn too which doesn't make as much sense to me. I'd imagine that with hardware today transparency should be somewhat free.

That's because transparency limits how you can batch draws on the GPU. With opaque draws, you can use the depth buffer and draw in any order you like, e.g. to maximize batching. With transparency, you need to draw things in the right order for blending to work (painter's order).
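To make the ordering point concrete, here is a minimal sketch of a single pixel. The function names (`blend_over`, `depth_test`) and the colors are made up for illustration and are not any real graphics API. It shows that standard alpha blending gives a different result depending on draw order, while depth-tested opaque draws end up the same either way.

```python
def blend_over(src, dst):
    """Blend a straight-alpha (r, g, b, a) source over an opaque (r, g, b) destination."""
    *rgb, a = src
    return tuple(a * s + (1.0 - a) * d for s, d in zip(rgb, dst))


def depth_test(pixel, fragment):
    """Opaque draw with a depth buffer: keep whichever (depth, color) pair is nearer."""
    return fragment if fragment[0] < pixel[0] else pixel


black = (0.0, 0.0, 0.0)
red = (1.0, 0.0, 0.0, 0.5)   # 50% transparent red
blue = (0.0, 0.0, 1.0, 0.5)  # 50% transparent blue

# Blending: the final color depends on submission order.
red_then_blue = blend_over(blue, blend_over(red, black))
blue_then_red = blend_over(red, blend_over(blue, black))

# Depth-tested opaque draws: either order leaves the nearer fragment.
empty = (float("inf"), black)
near, far = (1.0, "near"), (2.0, "far")
near_first = depth_test(depth_test(empty, near), far)
far_first = depth_test(depth_test(empty, far), near)
```

Because the opaque result does not depend on order, the renderer is free to group draws by texture or shader, which is what makes batching possible.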

I think it's more complex than that. While web browsers do use GPU rendering, they're not game engines. They don't draw every single object on the screen every frame, that could easily cause lag on a large complex page.

Chromium in particular tries to minimize the total number of layers. It renders each layer into a pixel map, then for each frame it composites all of the visible layers into the final image.

That works really well in practice because you often have lots of layers that move around but don't actually change their pixels. So those don't need to be rendered (rasterized) every frame, just composited.

If you have a bunch of box shadows without transparency, Chromium will rasterize the whole thing once as a single layer.

If you have a bunch of box shadows with transparency, Chromium might create a separate layer for each one. That's probably suboptimal in this particular case, but imagine if your partially transparent box shadows had to also slide around the page independently.
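The rasterize-once, composite-every-frame idea can be sketched roughly like this. It is a toy model only: `Layer`, `invalidate`, and `composite` are hypothetical names, not Chromium's actual classes, and a real compositor works on GPU textures rather than Python lists.

```python
class Layer:
    """A layer that caches its rasterized pixels until its content changes."""

    def __init__(self, name, paint):
        self.name = name
        self.paint = paint        # callable that produces the pixels (the expensive step)
        self.offset = (0, 0)      # position on the page; changing it needs no repaint
        self._bitmap = None
        self.raster_count = 0

    def invalidate(self):
        """Mark the content as changed so the next frame re-rasterizes it."""
        self._bitmap = None

    def bitmap(self):
        if self._bitmap is None:
            self._bitmap = self.paint()
            self.raster_count += 1
        return self._bitmap


def composite(layers):
    """Return (offset, bitmap) pairs in back-to-front order for one frame."""
    return [(layer.offset, layer.bitmap()) for layer in layers]


card = Layer("card", lambda: [["#"] * 4 for _ in range(2)])

# Slide the layer across the page for 60 frames: composited every frame,
# rasterized only once.
for frame in range(60):
    card.offset = (frame, 0)
    composite([card])
rasters_after_sliding = card.raster_count

# Changing the content itself forces one more rasterization.
card.invalidate()
composite([card])
rasters_after_change = card.raster_count
```

The trade-off is memory: every extra layer is another cached bitmap, which is presumably why a browser would want to keep the layer count down.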

> They don't draw every single object on the screen every frame, that could easily cause lag on a large complex page.

Games draw every single object on the screen every frame. They don't lag, quite the opposite in fact!

I think the above was a simplification. GUI rendering is a good example of jack of all trades, master of none. It doesn't use tight render loops like game engines, but it is much more flexible in terms of UI possibilities.

There is also the issue that GPUs are oddly terrible at generating 2D elements, of which a desktop has thousands. There are things like glyph caching, but they can only go so far.

Having the CPU doing the majority of the work with a few rasterization tasks to the GPU makes sense.

In fact that exact line of thinking was behind an effort to rewrite Firefox's rendering to be more like a game engine, a few years ago.

Not sure where that went in the end.

Games are expected to have sole access to your machine, so if they use all the CPU/GPU resources, nobody cares. If my web browser was burning up my battery re-rendering the page 90 times a second, I'd be livid.

> Games are expected to have sole access to your machine, so if they use all the CPU/GPU resources, nobody cares. If my web browser was burning up my battery re-rendering the page 90 times a second, I'd be livid.

Games 20 years ago had sole access to machines that were much less powerful than even partial access to today's machines.

And e.g. Nintendo Switch games (or games on the Steam Deck, or just mobile phone games) still deal with power limitations; people are very aware when their games burn through their batteries.

Reverse-painter's-order beats painter's-order since it lets you skip fully-occluded objects:

  Start with a buffer that's fully transparent (α=0.0)
  for each face from front to back:
    for each pixel of the face:
      draw the pixel, blending 1.0-buffer.α of the new pixel into whatever's already in the buffer
      (if buffer.α == 1.0 you can just skip it entirely, just like for depth buffering)
  go back and double check your math relating to transparent objects behind other transparent objects

The tricky part is if you have faces that overlap in a cycle (can happen legitimately), or that penetrate each other (often avoidable if you think about it).
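For anyone who wants to double check the math, here is a rough single-pixel sketch of the front-to-back loop above, next to the usual back-to-front version for comparison. It assumes straight-alpha input fragments and a premultiplied accumulation buffer; the function names are made up for this example.

```python
def composite_front_to_back(fragments):
    """fragments: straight-alpha (r, g, b, a) tuples for one pixel, nearest first.

    Returns the premultiplied result and how many fragments were actually shaded.
    """
    r = g = b = a = 0.0  # start fully transparent
    shaded = 0
    for fr, fg, fb, fa in fragments:
        if a >= 1.0:
            break  # buffer is opaque: everything behind is fully occluded
        weight = (1.0 - a) * fa  # only the still-uncovered fraction shows through
        r += weight * fr
        g += weight * fg
        b += weight * fb
        a += weight
        shaded += 1
    return (r, g, b, a), shaded


def composite_back_to_front(fragments):
    """Reference painter's-order result for the same nearest-first fragment list."""
    r = g = b = a = 0.0
    for fr, fg, fb, fa in reversed(fragments):
        r = fa * fr + (1.0 - fa) * r
        g = fa * fg + (1.0 - fa) * g
        b = fa * fb + (1.0 - fa) * b
        a = fa + (1.0 - fa) * a
    return (r, g, b, a)


# Half-transparent red in front of opaque green in front of half-transparent blue.
stack = [(1.0, 0.0, 0.0, 0.5), (0.0, 1.0, 0.0, 1.0), (0.0, 0.0, 1.0, 0.5)]
front_result, shaded = composite_front_to_back(stack)
back_result = composite_back_to_front(stack)
```

Both orders give the same color here, but the front-to-back loop stops after two fragments because the opaque green one saturates the alpha, so the blue fragment behind it is never shaded.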

The game engines I've dealt with separate opaque and transparent geometry.

It is generally good to render opaque geometry front to back to reduce overdraw, but without going so far as sorting the objects. We would do things like render the hands first in an FPS or render the skybox last in most games.

Now for the transparent layer: First occlusion is handled by the z-buffer as usual. If you render from front to back I assume you render to another buffer first and then composite to the framebuffer? If you render from back to front you don't need alpha in your framebuffer and can assume each rendered pixel is opaque, not needing that composite.

There's also order-independent transparency, though, which IIRC does need another buffer and therefore a composite, but then saves you from having to sort the objects.

I could be wrong, but I remember folks that worked on Dreamcast games that loved how you could just throw the geometry at it in any old fashion you liked and the GPU would just sort it all out as needed. Transparencies and all.

GPUs also do not like overdraw, so it's generally a good idea to avoid having many transparent elements on top of each other. It's also the reason why drawing more triangles is generally better than using a transparent texture.

My big takeaway from the whole Cities: Skylines II performance issue and the lack of LOD was that geometry processing is so cheap nowadays. So long as you aren't too reckless with geometry in terms of sub-pixel rendering, you don't really have to worry about it too much any more.

It isn't like the PS2 era, when geometry time was a real concern on render times. Even a modern low-end GPU could process a few hundred million polygons a second without sweating it; getting the results on screen is a very different issue.

Yeah, PS2's Graphics Synthesizer had a fill rate of 1.2 Gpixels/s. For comparison, the OG Xbox had 0.932 Gpixels/s, and the GameCube had 0.648 Gpixels/s. Assuming only 1 texture here.

The Xbox was released 1 year later, for context.

Sony also demo'd the GSCube once. It had 16 Graphics Synthesizers, achieving a fill rate of 37.7 Gpixels/s (no textures, half that with 1... I think). Eventually they ditched the idea in favour of Nvidia's solution.

One more thing to consider is memory bandwidth, which can be a limiting factor, especially on mobile devices.

A non-transparent draw over another draw allows, in the best case, culling all the overlapped drawing operations; in the worst case, you only use as much bandwidth as the individual draws.

With transparency, especially if you can't somehow combine the operations (from my understanding, a very hard problem), you also need to read back the entire region you're overlapping, so every transparent draw sends a bitmap of at least twice the final framebuffer size over the memory bus.

Now consider that many mobile devices did not have enough memory bandwidth to do a full redraw of the screen (i.e. blit the full framebuffer image twice) in time to maintain 60fps, and it becomes a considerable problem.
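As a back-of-the-envelope version of that argument, here is a small calculation. The resolution, pixel format, and layer counts are assumptions picked for illustration, and the model ignores culling, framebuffer compression, and on-chip tile memory, all of which real hardware may use to reduce traffic.

```python
def framebuffer_traffic(width, height, bytes_per_pixel, fps,
                        opaque_layers, blended_layers):
    """Estimate framebuffer bytes per second for full-screen draws.

    An opaque layer only writes each pixel; a blended layer has to read the
    existing pixel and write the result, so it costs twice as much.
    """
    frame_bytes = width * height * bytes_per_pixel
    per_frame = opaque_layers * frame_bytes + blended_layers * 2 * frame_bytes
    return per_frame * fps


# Assumed setup: 1920x1080, 4 bytes per pixel, 60 frames per second.
opaque_only = framebuffer_traffic(1920, 1080, 4, 60, opaque_layers=1, blended_layers=0)
one_blended = framebuffer_traffic(1920, 1080, 4, 60, opaque_layers=1, blended_layers=1)
four_blended = framebuffer_traffic(1920, 1080, 4, 60, opaque_layers=1, blended_layers=4)
```

Under these assumptions, a single opaque full-screen pass is about 0.5 GB/s, one blended layer on top triples that to about 1.5 GB/s, and four stacked blended layers reach about 4.5 GB/s.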

Yeah, somehow between comments I forgot this was about shadows and I was thinking more about drawing polygons. In that case, you can break up the polygons and work out the colors for each of the (theoretically 2^N) regions of overlap.

The "ordering" step doesn't really matter that much. You're usually doing a sort anyway prior to submitting the draw call. What hurts is the overdraw. If you're doing opaque rendering, you get to render front to back, rendering only what actually appears on the final framebuffer. The number of pixels (after the depth pass) is proportional to the size of the framebuffer. When you're doing transparent rendering, you render back to front, and you have to render a ton of the scene that will eventually be (partially) obscured by other random polys. We call that overdraw. The number of pixels going through the shader pipeline balloons to be proportional to the size of your mesh.

If you're doing non-overlapping stuff, you'd actually expect (almost) no slowdown from transparency, since you'd have to touch every pixel once anyway, and the only thing that changed is the shader formula.
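A tiny software model shows the difference in shaded-pixel counts. This is a sketch under simplifying assumptions: axis-aligned rectangles on a small grid, a perfect front-to-back sort for the opaque case, and a made-up `shaded_fragments` helper rather than anything a real GPU exposes.

```python
def shaded_fragments(rects, width, height, opaque):
    """Count fragments that reach the shader.

    rects: (x0, y0, x1, y1, depth) with exclusive x1/y1; smaller depth is nearer.
    Opaque rects are drawn nearest first with a depth test; transparent rects
    are drawn farthest first and every covered pixel must be shaded.
    """
    order = sorted(rects, key=lambda r: r[4], reverse=not opaque)
    depth = [[float("inf")] * width for _ in range(height)]
    count = 0
    for x0, y0, x1, y1, z in order:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                if opaque:
                    if z >= depth[y][x]:
                        continue  # hidden behind something already drawn
                    depth[y][x] = z
                count += 1
    return count


# Three full-screen rectangles stacked on an 8x8 framebuffer.
stacked = [(0, 0, 8, 8, 1.0), (0, 0, 8, 8, 2.0), (0, 0, 8, 8, 3.0)]
stacked_opaque = shaded_fragments(stacked, 8, 8, opaque=True)
stacked_transparent = shaded_fragments(stacked, 8, 8, opaque=False)

# Two rectangles side by side with no overlap.
tiled = [(0, 0, 4, 8, 1.0), (4, 0, 8, 8, 2.0)]
tiled_opaque = shaded_fragments(tiled, 8, 8, opaque=True)
tiled_transparent = shaded_fragments(tiled, 8, 8, opaque=False)
```

With the stacked rectangles, the opaque pass shades 64 pixels (one per framebuffer pixel) while the transparent pass shades 192. With the non-overlapping rectangles both passes shade 64, which matches the point that transparency by itself is nearly free and the overlap is what costs.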

Was confused because everything moved in 2 seconds per frame on M2 Firefox.

Switched to Chrome - suddenly everything is butter smooth.

Congrats on an article very well done!

I'm totally down for some good old fashioned impractical hacking. But just remember, we already have canvas, which can do all this easier, faster, and better.

A great, possibly the greatest article I read this year ended with "your welcome" instead of "you're". Fix asap! Or maybe I didn't get the joke, that's a possibility

But at the end of the day Firefox and Chrome are still rendering 1px box-shadow differently at 150% browser zoom. Best hopes for Baseline 2025.

For the past 30 years I got good at programming but never really did graphics because I didn't like games. I now view it as a massive oversight and have been trying to catch up for over a year.

So hard.

Most of this is about the not-so-much-anymore-exotic art of GPU programming. It's becoming important in so many fields. The last thing I want to be is some foot-dragging old windbag who forgot to stay up to date.

Well, games are seen as fun, so they can attract programmers even with low salaries.

Basically the same reason veterinarians and teachers and nurses and musicians and artists still attract plenty of candidates despite comparatively low pay.

(Playing games is more fun than working with CRUD apps. But writing games and writing CRUD apps seem about equal in their probability distributions of fun.)

> It also turns out that some smart people figured out maths hacks to draw rounded boxes for super cheap which UI peeps love because with this hack boxes can be so round as to appear as circles

Any references to learn more about these hacks?
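One widely known technique that fits this description is a signed distance function (SDF) for a rounded rectangle. Whether that is exactly what the article means is an assumption on my part. A minimal CPU-side sketch follows; in practice this would live in a fragment shader, and the function and parameter names here are made up for illustration.

```python
import math


def rounded_box_distance(px, py, half_w, half_h, radius):
    """Signed distance from point (px, py) to a rounded box centered at the origin.

    Negative inside, zero on the edge, positive outside.
    """
    # Fold into one quadrant and shrink the box by the corner radius.
    qx = abs(px) - half_w + radius
    qy = abs(py) - half_h + radius
    outside = math.hypot(max(qx, 0.0), max(qy, 0.0))
    inside = min(max(qx, qy), 0.0)
    return outside + inside - radius


def coverage(distance, pixel_size=1.0):
    """Approximate antialiased coverage: 1 inside, 0 outside, a ramp across the edge."""
    return min(max(0.5 - distance / pixel_size, 0.0), 1.0)


# With the radius equal to the half-size, the "box" is a circle of radius 1.
center = rounded_box_distance(0.0, 0.0, 1.0, 1.0, 1.0)
on_edge = rounded_box_distance(1.0, 0.0, 1.0, 1.0, 1.0)
on_circle = rounded_box_distance(0.6, 0.8, 1.0, 1.0, 1.0)

# With radius 0 it is a plain sharp-cornered box.
outside_square = rounded_box_distance(2.0, 0.0, 1.0, 1.0, 0.0)
```

The appeal is that one short formula per pixel covers sharp boxes, rounded boxes, and circles, and the distance value doubles as an antialiasing ramp.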

Come on. It's thanks to articles like these I can finally have an LLM write all those bullshit CSS hacks for me, and I even get to keep rejecting its commits until it gets it right!

(Though honestly, right now, getting GPT-4o to do responsive layout is an exercise in frustration equivalent to doing it myself.)
