Is it possible to use AGG functions in a SQL query this way?

Posted on 2024-12-11 06:35:14

Assuming I have the following aggregate functions:

  • AGG1
  • AGG2
  • AGG3
  • AGG4

Is it possible to write valid SQL (in a db agnostic way) like this:

SELECT [COL1, COL2 ....], AGG1(param1), AGG2(param2) FROM [SOME TABLES]
WHERE [SOME CRITERIA]
HAVING AGG3(param2) >-1 and AGG4(param4) < 123
GROUP BY COL1, COL2, ... COLN
ORDER BY COL1, COLN ASC
LIMIT 10

Where COL1 ... COLN are columns in the tables being queried, and param1 ... paramX are parameters passed to the AGG funcs.

Note: AGG1 and AGG2 are returned in the results as columns (but do not appear in the HAVING clause), and AGG3 and AGG4 appear in the HAVING clause but are not returned in the result set.

Ideally, I want a DB-agnostic solution, but if I have to be tied to a database, I am using PostgreSQL (v9.x).

Edit

Just a matter of clarification: I am not opposed to using GROUP BY in the query. My SQL is not very good, so the example SQL above may have been slightly misleading. I have edited the pseudo sql statement above to hopefully make my intent more clear.

The main thing I wanted to find out was whether a select query that used AGG functions could:

  • Have agg function values in the returned columns without them being specified in a HAVING clause.
  • Have agg functions specified in a HAVING clause that are not returned in the result set.

From the answers I have received so far, it would seem the answer to both questions is YES. The only thing I have to do to correct my SQL is to add a GROUP BY clause to make sure that the returned rows are unique.

Comments (3)

别想她 2024-12-18 06:35:14

A PostgreSQL major version includes the first digit after the dot, so "PostgreSQL (v9.x)" is not specific enough. As @kekekela said, there is no (cheap) completely db-agnostic way. Even between PostgreSQL 9.0 and 9.1 there is an important syntactical difference.

If you had only the aggregated values AGG1(param1), AGG2(param2), you would get away without providing an explicit GROUP BY clause. Since you mix aggregated and non-aggregated columns, you have to provide a GROUP BY clause listing all non-aggregated columns that appear in the SELECT. That's true for any version of PostgreSQL. Read about GROUP BY and HAVING in the manual.
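The two cases in the paragraph above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in database and a hypothetical orders table; neither comes from the question. Note that SQLite is more lenient than PostgreSQL about plain columns missing from GROUP BY, so this only shows the valid forms, not the error PostgreSQL would raise.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("ann", "east", 10), ("ann", "east", 30), ("bob", "west", 5), ("cat", "west", 50)],
)

# Aggregates only: no GROUP BY needed, the whole table forms a single group.
totals = conn.execute("SELECT SUM(amount), MAX(amount) FROM orders").fetchall()

# Plain columns mixed with aggregates: every plain column is listed in GROUP BY.
per_customer = conn.execute(
    """
    SELECT customer, region, SUM(amount), MAX(amount)
    FROM orders
    GROUP BY customer, region
    ORDER BY customer
    """
).fetchall()

print(totals)
print(per_customer)
```

The first query returns exactly one row; the second returns one row per distinct (customer, region) pair.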

Starting with version 9.1, however, once you list a primary key in the GROUP BY you can skip additional columns for this table and still use them in the SELECT list. The release notes for version 9.1 tell us:

Allow non-GROUP BY columns in the query target list when the primary
key is specified in the GROUP BY clause (Peter Eisentraut)

Concerning parameters

Do you intend to feed a constant value to an aggregate function? What's the point? The docs tell us:

An aggregate function computes a single result from multiple input rows.

Or do you want those parameters to be column names? That kind of dynamic SQL works as long as the statement is generated before it is sent to the database. It does not work for prepared statements or for plain sql or plpgsql functions. You have to use EXECUTE in a plpgsql function for that purpose.

As a safeguard against SQL injection, use the USING clause with $1, $2 placeholders for values and quote_ident() for your column or table names.
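The same split between identifiers and values can be sketched on the client side. This is not the plpgsql EXECUTE ... USING form mentioned above but a Python analogue using the built-in sqlite3 module: values travel as bound parameters, while the column name is checked against a whitelist and quoted. The orders table, the ALLOWED_COLUMNS set, and the local quote_ident helper are all hypothetical names introduced for illustration.

```python
import sqlite3

# Hypothetical whitelist of column names a caller may aggregate over.
ALLOWED_COLUMNS = {"amount", "discount"}


def quote_ident(name: str) -> str:
    """Local helper (not PostgreSQL's function): double-quote an identifier."""
    return '"' + name.replace('"', '""') + '"'


def sum_column(conn: sqlite3.Connection, column: str, min_value: int):
    """Sum one column; the column name is dynamic, the threshold is bound."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {column!r}")
    col = quote_ident(column)
    # The identifier is spliced into the statement text; the value never is.
    sql = f"SELECT SUM({col}) FROM orders WHERE {col} >= ?"
    return conn.execute(sql, (min_value,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount INTEGER, discount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (30, 0), (5, 2)])

result = sum_column(conn, "amount", 10)
print(result)
```

The statement text is fully built before it reaches the database, which is the condition the answer describes for this kind of dynamic SQL.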

━╋う一瞬間旳綻放 2024-12-18 06:35:14

The only way to aggregate over columns without using GROUP BY is to use windowing functions. You left out details of your problem, but the following might be what you are looking for:

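SELECT *
FROM (
    SELECT [COL1, COL2 ....], 
           AGG1(param1) over (partition by some_grouping_column) as agg1, 
           AGG2(param2) over (partition by some_grouping_column) as agg2,
           -- agg3 and agg4 must be computed here so the outer WHERE can see them
           AGG3(param2) over (partition by some_grouping_column) as agg3,
           AGG4(param4) over (partition by some_grouping_column) as agg4,
           row_number() over () as rn
    FROM [SOME TABLES]
    WHERE [SOME CRITERIA]
    ORDER BY COL1
)  t
WHERE agg3 > -1 
  AND agg4 < 123
  AND rn <= 10
ORDER BY col1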

This is standard ANSI SQL and works on most databases, including PostgreSQL (since 8.4).

Note that you do not need to use the same grouping column for both aggregates in the partition by clause.

If you want to stick with ANSI SQL, then you should use the row_number() function to limit the result. If you run this only on PostgreSQL (or another DBMS that supports LIMIT in some way), move the LIMIT clause into the derived table (the inner query).
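A small runnable sketch of the window-function approach follows. It assumes Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer) and a hypothetical orders table. SUM and COUNT stand in for the unnamed AGG functions; one window aggregate is returned, the other is used only as a filter in the outer query.

```python
import sqlite3

# Hypothetical table and data; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ann", 10), (2, "ann", 30), (3, "bob", 5), (4, "cat", 50)],
)

rows = conn.execute(
    """
    SELECT id, customer, amount, total
    FROM (
        SELECT id, customer, amount,
               SUM(amount) OVER (PARTITION BY customer) AS total,
               COUNT(*)    OVER (PARTITION BY customer) AS n
        FROM orders
    ) t
    WHERE n > 1      -- filter on a window aggregate that is not returned
    ORDER BY id
    """
).fetchall()

print(rows)
```

Unlike GROUP BY, the window form keeps every input row, so each of ann's orders appears with her total attached.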

神经暖 2024-12-18 06:35:14

That should work from a high-level perspective, except that you'd need COL1, COL2, etc. in a GROUP BY clause, or else they won't be valid in the SELECT list. Having AGG1, etc. in the SELECT list and not in the HAVING clause is not a problem.
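As a concrete instance of that shape, here is a minimal sketch assuming Python's built-in sqlite3 module and a hypothetical sales table. SUM and AVG play the role of AGG1 and AGG2 (returned), MIN and MAX the role of AGG3 and AGG4 (used only in HAVING). GROUP BY is written before HAVING, which is the order SQL requires, and LIMIT is the PostgreSQL/SQLite spelling rather than a portable one.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, qty INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("east", "pen", 3, 2),
        ("east", "pen", 5, 2),
        ("east", "ink", 1, 200),
        ("west", "pen", 4, 3),
        ("west", "ink", 2, 9),
    ],
)

rows = conn.execute(
    """
    SELECT region, product, SUM(qty), AVG(price)  -- returned, not in HAVING
    FROM sales
    WHERE qty > 0
    GROUP BY region, product                      -- must come before HAVING
    HAVING MIN(qty) > -1 AND MAX(price) < 123     -- filter only, not returned
    ORDER BY region, product
    LIMIT 10
    """
).fetchall()

print(rows)
```

The (east, ink) group is dropped by the HAVING condition on MAX(price), even though MAX(price) never appears in the result columns.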

As far as being db agnostic goes, you're going to have to tweak the syntax no matter what you do (LIMIT, for example, is written differently in PostgreSQL, SQL Server and Oracle, to name the ones I know off the top of my head), but you could build logic to construct the statements properly for each database, provided your high-level representation is solid.
