Made a Rust DB run spatial queries on gaming GPU RT cores, beating an H100

Original link: https://sedona.apache.org/latest/blog/2026/06/26/sedonadb-04-gpu-accelerated-spatial-joins/

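SedonaDB 0.4 takes a significant step forward in spatial analytics by introducing **RayBooster**, an engine that uses otherwise idle GPU ray tracing cores to accelerate spatial joins. By mapping spatial intersection tests onto ray tracing primitives, SedonaDB gets large speedups: on some queries a consumer RTX 3090 even outperforms an enterprise H100.

Key technical advances:

* **GPU-friendly storage**: a Structure of Arrays layout that gives O(1) access to any geometry.
* **Global index**: Z-stacking builds one unified bounding volume hierarchy (BVH) for the whole batch.
* **Universal predicate engine**: DE-9IM topological descriptors handle any geometry combination through a single code path.
* **Memory-aware scheduling**: GPU memory is managed during complex joins to prevent out-of-memory failures.

On *SpatialBench*, RayBooster delivers speedups of up to 9.68x, cutting query times and cloud infrastructure costs. To enable it, run the official Docker image on a machine with an NVIDIA GPU and execute `SET gpu.enable = true`. The release resolves 187 issues and adds 26 functions, and it also ships a Python DataFrame API, GeoParquet write support, and N-dimensional rasters, which later posts in the series cover.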

Original article

In SedonaDB 0.4, we taught this Rust database to run spatial joins on your $1,500 gaming GPU's ray tracing cores, and it beats an H100.

SedonaDB 0.4 GPU-Accelerated Spatial Joins — ray tracing cores, repurposed for the database

The Apache Sedona community released SedonaDB 0.4.0, resolving 187 issues and adding 26 new functions from 15 contributors. SedonaDB is the first open-source, single-node analytical database that treats spatial data as a first-class citizen — the counterpart to the distributed Sedona engines for small-to-medium datasets running on a single machine.

This is the first in a series of posts diving into what's new in SedonaDB 0.4. We'll be covering more of the release — the Python DataFrame API, the R dplyr interface, Geography support, GeoParquet write support, N-dimensional rasters and Zarr, and more — in the posts to come; for the full rundown, see the 0.4.0 release blog post. We're kicking things off with the feature we're most excited about: GPU-accelerated spatial joins.

GPU-Accelerated Spatial Joins

Architecture of RayBooster: a storage layer of GPU-friendly geometry arrays, an indexing layer, and a refinement layer feeding the RT shaders and the OptiX ray tracing execution engine

Gaming GPUs contain dedicated ray tracing cores designed for video game lighting — and they sit idle during database queries. Spatial joins are about finding intersecting geometries, which maps naturally onto ray tracing primitives. We built RayBooster, an extension that brings ray tracing core acceleration into SedonaDB.

The accompanying research paper, "RayBooster: A Ray Tracing Engine to Accelerate SedonaDB," was accepted to VLDB 2026 (Industry Track), developed in collaboration with The Ohio State University.

How it works: four components

A single BVH tree is built over the build-side geometries, then probe-side geometries cast rays through it and matched pairs are written back to the intersection buffer
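
The build/probe flow in the figure can be sketched on the CPU. The Python below is a minimal illustration and not RayBooster's code: it assumes a point-in-polygon join, gives every build-side polygon its own Z layer in one shared scene, and replaces the hardware BVH traversal with a brute-force scan. All names (`build_scene`, `ray_hits`, `point_in_polygon_join`) are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # ring of vertices, implicitly closed


def build_scene(polygons: List[Polygon]) -> List[Tuple[int, float, float, float, float]]:
    """Flatten every polygon edge into one shared scene.

    Each edge is tagged with z = index of its polygon, so a single
    structure covers the whole build side (assumption: one layer per polygon).
    """
    scene = []
    for z, ring in enumerate(polygons):
        for i in range(len(ring)):
            (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
            scene.append((z, x1, y1, x2, y2))
    return scene


def ray_hits(scene, z: int, px: float, py: float) -> int:
    """Count edges on layer z crossed by a ray from (px, py) toward +x.

    A real engine would let the BVH prune edges; this scan is a stand-in.
    """
    hits = 0
    for ez, x1, y1, x2, y2 in scene:
        if ez != z:
            continue  # a ray on one layer never meets edges on another
        if (y1 > py) != (y2 > py):  # edge straddles the ray's y coordinate
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                hits += 1
    return hits


def point_in_polygon_join(polygons: List[Polygon], points: List[Point]) -> List[Tuple[int, int]]:
    """Return (point_id, polygon_id) pairs, the 'intersection buffer' of the sketch."""
    scene = build_scene(polygons)
    pairs = []
    for pid, (px, py) in enumerate(points):
        for z in range(len(polygons)):
            if ray_hits(scene, z, px, py) % 2 == 1:  # odd crossings: inside
                pairs.append((pid, z))
    return pairs


polygons = [
    [(0, 0), (4, 0), (4, 4), (0, 4)],  # square
    [(10, 0), (14, 0), (12, 4)],       # triangle
]
points = [(2, 2), (12, 1), (20, 20)]
print(point_in_polygon_join(polygons, points))  # [(0, 0), (1, 1)]
```

The sketch tests every point against every layer, which is quadratic work; the point of a single hardware-traversed BVH is to avoid exactly that.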

1. GPU-friendly storage layout. Instead of the stream-oriented WKB format, RayBooster uses a Structure of Arrays organization that separates offsets, vertices, and types, enabling O(1) random access to any geometry.

2. A single monolithic index. Rather than building millions of tiny index trees, it uses Z-stacking — encoding each geometry's ID into the unused Z-axis of the ray tracing scene and building one global BVH for the entire batch.

3. A universal predicate engine. RelateEngine computes the DE-9IM matrix (a topological descriptor) on RT cores, giving one code path that resolves any geometry/predicate combination instead of hardcoding 500+ kernel variants.

4. Memory-aware execution. A scheduling and spilling layer keeps joins within GPU memory budgets on irregular real-world workloads, preventing out-of-memory failures.
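
To make the first component concrete, here is a minimal Python sketch of a Structure of Arrays layout with separate type, offset, and vertex arrays. It is an illustration under assumptions, not RayBooster's actual format: the class name `GeometryBatch`, its field names, and the 2D-only coordinates are all hypothetical.

```python
from array import array


class GeometryBatch:
    """Toy Structure of Arrays container for 2D geometries (hypothetical layout)."""

    def __init__(self):
        self.types = []                 # one type tag per geometry
        self.offsets = array("q", [0])  # offsets[i]..offsets[i+1] bound geometry i
        self.xs = array("d")            # all x coordinates, back to back
        self.ys = array("d")            # all y coordinates, back to back

    def append(self, geom_type, coords):
        """Add one geometry by appending its vertices to the shared arrays."""
        self.types.append(geom_type)
        for x, y in coords:
            self.xs.append(x)
            self.ys.append(y)
        self.offsets.append(len(self.xs))

    def __len__(self):
        return len(self.types)

    def vertices(self, i):
        """Locate geometry i with two offset lookups, then copy its vertices."""
        start, end = self.offsets[i], self.offsets[i + 1]
        return list(zip(self.xs[start:end], self.ys[start:end]))


batch = GeometryBatch()
batch.append("Point", [(5.0, 5.0)])
batch.append("LineString", [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
batch.append("Polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)])
print(batch.types[1], batch.vertices(1))
```

Finding where geometry `i` starts takes two array reads regardless of how many geometries precede it. A stream-oriented encoding such as WKB would instead require parsing every earlier geometry to find the same position.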

Casting rays to test point-in-polygon and polygon intersection, then assembling the DE-9IM intersection matrices that resolve any topological predicate
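
Once a DE-9IM matrix is available, resolving a named predicate comes down to pattern matching on its nine cells, which is why one code path can serve many predicates. The Python sketch below uses the standard OGC Simple Features patterns. It does not reflect how RelateEngine computes or stores the matrix, and the function names are hypothetical.

```python
# Standard DE-9IM patterns from the OGC Simple Features model.
# Cells are ordered II, IB, IE, BI, BB, BE, EI, EB, EE
# (I = interior, B = boundary, E = exterior).
PREDICATES = {
    "intersects": ["T********", "*T*******", "***T*****", "****T****"],
    "disjoint": ["FF*FF****"],
    "within": ["T*F**F***"],
    "contains": ["T*****FF*"],
    "equals": ["T*F**FFF*"],
}


def matches(matrix: str, pattern: str) -> bool:
    """Check a 9-character DE-9IM matrix against a 9-character pattern."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM matrices and patterns have exactly 9 cells")
    for cell, want in zip(matrix, pattern):
        if want == "*":
            continue                 # any value is accepted
        if want == "T":
            if cell == "F":
                return False         # 'T' needs a non-empty intersection
        elif cell != want:
            return False             # 'F', '0', '1', '2' must match exactly
    return True


def evaluate(predicate: str, matrix: str) -> bool:
    """A predicate holds if any of its patterns matches the matrix."""
    return any(matches(matrix, p) for p in PREDICATES[predicate])


overlapping = "212101212"  # two partially overlapping polygons
inside = "2FF1FF212"       # polygon A strictly inside polygon B
apart = "FF2FF1212"        # two polygons with nothing in common

print(evaluate("intersects", overlapping))  # True
print(evaluate("within", inside))           # True
print(evaluate("disjoint", apart))          # True
```

Adding a predicate in this scheme means adding a pattern string rather than a new kernel, which is the design trade-off the post describes.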

Performance

Testing on SpatialBench:

  • Up to 5.93x speedup on heavy joins, with a 59.02% cost reduction on AWS
  • Q11 cross-zone trip join: 7.51s (CPU) → 1.61s on a consumer RTX 3090 — a 4.66x speedup
  • 10x scale: 53.34s reduced to under 7s
  • Heavy joins at scale: 4.93x to 9.68x speedups across GPU models
  • Consumer RTX 3090 vs. H100: on some queries the gaming card beat the H100 (1.26s vs 1.77s on Q10), which has no RT cores
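
The speedups above are ratios of the baseline time to the accelerated time. This short Python check uses only the timings quoted in the list; the helper name `speedup` is illustrative.

```python
def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Speedup factor: how many times faster the accelerated run is."""
    return baseline_s / accelerated_s


# Q11: 7.51s on CPU vs 1.61s on the RTX 3090
print(round(speedup(7.51, 1.61), 2))  # 4.66

# Q10: 1.77s on the H100 vs 1.26s on the RTX 3090
print(round(speedup(1.77, 1.26), 2))  # 1.4

# 10x scale: 53.34s down to "under 7s" implies a speedup above this bound
print(round(speedup(53.34, 7.0), 2))  # 7.62
```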

Cost-effectiveness on SpatialBench across CPU, L40S, A10, and L4: per-query cost and total workload cost on AWS

Using it

On a machine with an NVIDIA GPU, pull the official Docker image and enable the feature with a single SQL setting:

ctx.sql("SET gpu.enable = true")

The GPU Acceleration guide walks through launching the Docker image on NVIDIA GPU machines and lists the supported compute capabilities.

Citation

Liang Geng, Rubao Lee, Dewey Dunnington, Feng Zhang, Jia Yu, and Xiaodong Zhang. "RayBooster: A Ray Tracing Engine to Accelerate SedonaDB." PVLDB, 2026 (Industry Track).
