等待SQL:202y: 按所有分组
Waiting for SQL:202y: Group by All

原始链接: http://peter.eisentraut.org/blog/2025/11/11/waiting-for-sql-202y-group-by-all

## 简化 SQL 的 GROUP BY 子句 一项备受期待的 SQL 功能 `GROUP BY ALL` 正在接近标准化,以简化分组操作。目前,用户必须在 `GROUP BY` 子句中显式列出所有非聚合列,这对于复杂的查询来说可能既繁琐又容易出错。 `GROUP BY ALL` 会自动扩展以包含 `SELECT` 列表中的所有非聚合列。例如,`SELECT a, avg(b) FROM t1 GROUP BY ALL` 变为 `GROUP BY a`。但是,它*不会*自动处理更复杂的表达式;如果表达式包含聚合函数,用户仍然需要显式定义分组。 虽然 `GROUP BY ALL` 提供了便利,但也存在风险。对 `SELECT` 列表的更改可能会隐式地改变分组,从而可能导致意想不到的结果——类似于使用 `SELECT *` 的陷阱。因此,建议谨慎使用,尤其是在复杂或经常修改的查询中。 Oracle 和 PostgreSQL 等数据库中已经出现了实现,预计在 SQL 标准最终发布后,可用性将进一步提高。

等待SQL:202y: 按所有分组 (eisentraut.org) 6点 由 ingve 2小时前 | 隐藏 | 过去 | 收藏 | 2评论 Exuma 4分钟前 [–] 也只是让我引用该别名在GROUP BY中,去你的 回复 mberning 0分钟前 | 父评论 [–] 有些数据库可以。用类似于ORDER BY的序号引用也会很好。对于快速且简单的查询非常有用。我明白问题在于人们可能会过度依赖它。 回复 指南 | 常见问题 | 列表 | API | 安全 | 法律 | 申请YC | 联系 搜索:
相关文章

原文

Making GROUP BY a bit easier to use is in my experience among the top three requested features in SQL.

Like, if you do

CREATE TABLE t1 (a int, b int, ...);

SELECT a, avg(b) FROM t1 GROUP BY a;

the column list in the GROUP BY clause doesn’t convey much information. Of course you wanted to group by a, there is no other reasonable alternative. You can’t not group by a because that would be an error, and you can’t group by things besides a, because there is nothing else in the select list other than the aggregate.

The problem gets worse if you have a longer select list or complicated expressions, because you need to repeat these in the GROUP BY clause and carefully keep them updated if you change something. (Or you can try to work around this by using subqueries.) Could be easier!

A number of implementations have adopted the syntax GROUP BY ALL to simplify this.

SELECT a, avg(b) FROM t1 GROUP BY ALL;

The SQL standard working group discussed this feature informally at the June 2025 meeting, and there was consensus about going forward with it.

At the September 2025 meeting, a formal change proposal was brought forward and accepted. (So, technically, it’s just a working draft right now, and it won’t be final until the standard is released.)

The formal meaning of GROUP BY ALL is that it expands to a list of the elements of the select list that do not contain aggregate functions. So in

SELECT a, avg(b) FROM t1 GROUP BY ALL;

a does not contain an aggregate function, but avg(b) does, so GROUP BY ALL resolves to GROUP BY a, as expected.

This doesn’t completely remove the need for explicit GROUP BY lists. Consider a more complicated case like

CREATE TABLE t1 (a int, b int, c int, d int, ...);

SELECT a, avg(b) + c + d FROM t1 GROUP BY ALL;

Here, a does not contain an aggregate function, but avg(b) + c + d does contain an aggregate function, so the query resolves to

SELECT a, avg(b) + c + d FROM t1 GROUP BY a;

But that is not valid, because you need to account for the ungrouped columns c and d somehow. It’s not clear what GROUP BY ALL should do with this. You could have meant GROUP BY a, c + d or maybe GROUP BY a, c, d or maybe even GROUP BY a + c + d. GROUP BY ALL, as currently specified, intentionally does not handle that and leaves it up to the query author to write explicitly what they want.

Here is another example:

INSERT INTO t1 VALUES (1, 2, 100), (2, 1, 200), (2, 2, 300);

SELECT a + b, avg(c) FROM t1 GROUP BY ALL;

This means

SELECT a + b, avg(c) FROM t1 GROUP BY a + b;

not

SELECT a + b, avg(c) FROM t1 GROUP BY a, b;

which returns a different result.

I have found that the documentations of some of the products with existing implementations are quite handwavy and superficial about details like this, but the standardization needs to be precise.

The SQL standard doesn’t actually support expressions in the GROUP BY list, only column references, so this last query is technically not conforming. But there is an intent to fix this soon. But then you also need to consider more special cases like invocation of user-defined functions or subqueries, so some detail work will be required.

All that said, this kind of shortcut syntax does have risks. Similar to other shortcuts like SELECT *, this might cause queries to change unintentionally if changes are made elsewhere in the command. For example, if you have

SELECT a, avg(b) FROM t1 GROUP BY ALL;

and then change it to

SELECT a, c, avg(b) FROM t1 GROUP BY ALL;

this will cause the GROUP BY clause to change implicitly. Maybe this is not what you meant. This example query is very small, but often times these aggregation queries are long and nested, and someone editing it might not be fully aware that grouping applies to the query.

So it would probably be good to use this syntax with a bit of caution, and maybe use it mainly for interactively written queries and structurally simple queries.

In the meantime, Oracle Database has also implemented and released GROUP BY ALL support. (And they did all the work writing the SQL standard change proposal.)

PostgreSQL implemented it while I was waiting at the airport to fly home from the working group meeting. There had already been a patch pending, but the hackers group was hesitant to move forward without standardization. So once that was done, the patch was quickly accepted. This will be released next year.


联系我们 contact @ memedata.com