PostgreSQL 查询分解
我无法分解简单的 SQL 查询。我使用 PostgreSQL,但我的问题也与其他 RDBMS 有关。
考虑以下示例。我们有表格订单,我们想要找到总金额超过某个限制的第一个订单:
drop table if exists orders cascade;
/**
Table with clients' orders
*/
create table orders(
date timestamp,
amount integer
/**
Other columns omitted
*/
);
/**
Populate with test data
*/
insert into orders(date,amount)
values
('2011-01-01',50),
('2011-01-02',49),
('2011-01-03',2),
('2011-01-04',1000);
/**
Selects first order that caused exceeding of limit
*/
create view first_limit_exceed
as
select min(date) from
(
select o1.date
from orders o1,
orders o2
where o2.date<=o1.date
group by o1.date
having sum(o2.amount) > 100
) limit_exceed;
/**
returns "2011-01-03 00:00:00"
*/
select * from first_limit_exceed;
现在让我们让问题变得更难一些。考虑我们只想查找满足某些谓词的行的总金额。我们有很多这样的谓词,创建单独版本的视图first_limit_exceed将是可怕的代码重复。因此,我们需要某种方法来创建参数化视图,并将过滤后的行集或谓词本身传递给它。 在 Postgres 中,我们可以使用查询语言函数作为参数化视图。但是 Postgres 不允许函数将行集或另一个函数作为参数。 我仍然可以在客户端或 plpgsql 函数中使用字符串插值,但它很容易出错并且难以测试和调试。 有什么建议吗?
I fail to decompose simple SQL queries. I use PostgreSQL but my question is also related to other RDBMS.
Consider the following example. We have table orders and we want to find first order after which total amount exceeded some limit:
drop table if exists orders cascade;
/**
Table with clients' orders
*/
create table orders(
date timestamp,
amount integer
/**
Other columns omitted
*/
);
/**
Populate with test data
*/
insert into orders(date,amount)
values
('2011-01-01',50),
('2011-01-02',49),
('2011-01-03',2),
('2011-01-04',1000);
/**
Selects first order that caused exceeding of limit
*/
create view first_limit_exceed
as
select min(date) from
(
select o1.date
from orders o1,
orders o2
where o2.date<=o1.date
group by o1.date
having sum(o2.amount) > 100
) limit_exceed;
/**
returns "2011-01-03 00:00:00"
*/
select * from first_limit_exceed;
Now let's make the problem a little harder. Consider we want to find total amount only for rows that satisfy some predicate. We have a lot of such predicates and creating separate version of view first_limit_exceed would be terrible code duplication. So we need some way to create parameterized view and pass either filtered set of rows or predicate itself to it.
In Postgres we can use query language functions as parameterized views. But Postgres does not allow function to get as argument neither set of row nor another function.
I still can use string interpolation on client's side or in plpgsql function, but it is error-prone and hard to test and debug.
Any advice?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
在 PostgreSQL 8.4 及更高版本中:
将所需的任何谓词添加到内部查询中:
In
PostgreSQL 8.4
and later:Add any predicates you want into the inner query:
听起来有点像您试图将太多代码放入数据库中。如果您对满足特定谓词的特定关系的行感兴趣,只需在客户端代码中执行带有适当
where
子句的select
语句即可。拥有将谓词作为参数的视图正在重新发明轮子,而 sql 已经很好地解决了这个问题。另一方面,我可以看到将查询本身存储在数据库中的论点,以便它们可以组成更大的报告。这两个仍然由应用程序代码更好地处理。我可能会通过使用擅长动态 sql 生成的库(例如 sqlalchemy)来解决类似的问题,然后将查询表示(sqlalchemy 表达式对象是“pickleable”)作为 blob 存储在数据库中。
换句话说,数据库是事实的代表,您可以在其中存储知识。应用程序有责任响应用户请求,当您发现自己定义数据转换时,实际上更多的是预测和实现实际用户的请求,而不仅仅是忠实地保存知识。
当模式不可避免地发生变化时,最好使用视图,这样您就可以让不需要了解新模式的旧应用程序处于工作状态。
It sounds a bit like you're trying to put too much code into the database. If you are interested in the rows of a certain relation that satisfy a particular predicate, just execute a
select
statement with an appropriatewhere
clause in the client code. Having views that take predicates as parameters is reinventing the wheel that sql already solves nicely.On the other hand, I can see an argument for storing queries themselves in the database, so that they can be composed into larger reports. This two is still better handled by application code. I might approach a problem like that by using a library that's good at dynamic sql generatation, (for example sqlalchemy), and then storing the query representations (sqlalchemy expression objects are 'pickleable') as blobs in the database.
To put it another way, databases are representers of facts, You store knowledge in them. applications have the duty of acting on user requests, When you find yourself defining transformations on the data, that's really more a matter of anticipating and implementing the requests of actual users, rather than just faithfully preserving knowledge.
Views are best used when the schema inevitably changes, so you can leave older applications that don't need to know about the new schema in a working state.