SQL: grouping 2 tables as 1 with a join, union, and then?
I have 5 tables:
customer
id - name
p_orders
id - id_customer - code - date
p_items
id - id_order - description - price
and h_orders and h_items, which are exact copies of p_orders and p_items.
When the p_ tables reach a large number of rows, I move the oldest rows to the h_ tables, which serve as history.
So, my problem is: how do I retrieve the data from both the p_ and h_ tables, treating them as one single table?
For example, I want to retrieve the number of orders for each customer and the total price (of all the customer's orders), and I use this query:
SELECT
customer.id,
customer.name,
count(DISTINCT p_orders.id) AS num_orders,
sum(p_items.price) AS total_money
FROM
customer
INNER JOIN p_orders ON p_orders.id_customer = customer.id
INNER JOIN p_items ON p_items.id_order = p_orders.id
GROUP BY
customer.id,
customer.name,
p_orders.id_customer
ORDER BY
customer.id
It works for just one 'set' of tables (p_ or h_), but I want them both.
I've tried to use a UNION:
(
SELECT
customer.id,
customer.name,
count(DISTINCT p_orders.id) AS num_orders,
sum(p_items.price) AS total_money
FROM
customer
INNER JOIN p_orders ON p_orders.id_customer = customer.id
INNER JOIN p_items ON p_items.id_order = p_orders.id
GROUP BY
customer.id,
customer.name,
p_orders.id_customer
)
UNION
(
SELECT
customer.id,
customer.name,
count(DISTINCT h_orders.id) AS num_orders,
sum(h_items.price) AS total_money
FROM
customer
INNER JOIN h_orders ON h_orders.id_customer = customer.id
INNER JOIN h_items ON h_items.id_order = h_orders.id
GROUP BY
customer.id,
customer.name,
h_orders.id_customer
)
ORDER BY id ASC
This one works, but if a customer has orders in both the p_ tables and the h_ tables, I get 2 rows for that customer with 2 different num_orders and total_money values (coming from the p_ tables and the h_ tables respectively).
I've tried to add a GROUP BY id outside the union:
(
--SELECT 2
)
UNION
(
--SELECT 1
)
GROUP BY id
ORDER BY id ASC
but the query fails with ERROR: syntax error at or near "GROUP" at character 948. It seems that GROUP BY cannot be used that way.
Any suggestions?
EDIT:
For uriDium: yes, all the tables have the id column as primary key, and the referencing fields (e.g. p_orders.id_customer) are foreign keys too.
Here is the test db structure dump (I added some indexes and foreign keys after the table creation, but I don't think that matters):
CREATE TABLE customer (
id serial NOT NULL,
name character(50)
);
CREATE TABLE p_orders (
id serial NOT NULL,
id_customer integer NOT NULL,
date date DEFAULT now(),
code character(5)
);
CREATE TABLE p_items (
id serial NOT NULL,
id_order integer NOT NULL,
descr character(250),
price money
);
CREATE TABLE h_orders (
id integer NOT NULL,
id_customer integer NOT NULL,
date date,
code character(5)
);
CREATE TABLE h_items (
id integer NOT NULL,
id_order integer NOT NULL,
descr character(250),
price money
);
CREATE UNIQUE INDEX id_h_orders ON h_orders USING btree (id);
CREATE INDEX id_h_o_c ON h_orders USING btree (id_customer);
CREATE UNIQUE INDEX id_items_h ON h_items USING btree (id);
CREATE INDEX id_ordinr_dsve ON h_items USING btree (id_order);
ALTER TABLE ONLY customer
ADD CONSTRAINT customer_pkey PRIMARY KEY (id);
ALTER TABLE ONLY p_orders
ADD CONSTRAINT p_orders_pkey PRIMARY KEY (id);
ALTER TABLE ONLY p_items
ADD CONSTRAINT p_items_pkey PRIMARY KEY (id);
ALTER TABLE ONLY stats
ADD CONSTRAINT stats_pkey PRIMARY KEY (id);
ALTER TABLE ONLY p_orders
ADD CONSTRAINT "$1" FOREIGN KEY (id_customer) REFERENCES customer(id) ON DELETE CASCADE;
ALTER TABLE ONLY p_items
ADD CONSTRAINT "$1" FOREIGN KEY (id_order) REFERENCES p_orders(id) ON DELETE CASCADE;
ALTER TABLE ONLY h_orders
ADD CONSTRAINT "$1" FOREIGN KEY (id_customer) REFERENCES customer(id) ON DELETE CASCADE;
ALTER TABLE ONLY h_items
ADD CONSTRAINT "$1" FOREIGN KEY (id_order) REFERENCES h_orders(id) ON DELETE CASCADE;
Comments (6)
You should probably create views over the two tables:
Now you can just join to those views. I included the types (P & H) so that you know what the "id" column now refers to. If the ids in your two tables ("h" and "p") can have duplicates, then you will have to join the Orders table right in the All_Order_Items view. Otherwise you will have a lot of trouble joining between the two views. Hopefully your id columns are intelligently designed and not just auto-incrementing or identity columns.
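The original view definitions are not preserved in this copy of the answer. As a rough sketch of the idea (PostgreSQL, using the tables from the question), the views could look like the following. The view name all_orders and the column name src are assumptions for illustration; joining on src as well as on the id is one way to keep the join unambiguous even if ids repeat between the p_ and h_ tables.

```sql
-- Hypothetical view names; src marks which physical table a row came from.
CREATE VIEW all_orders AS
    SELECT CAST('P' AS text) AS src, id, id_customer, code, date FROM p_orders
    UNION ALL
    SELECT CAST('H' AS text) AS src, id, id_customer, code, date FROM h_orders;

CREATE VIEW all_order_items AS
    SELECT CAST('P' AS text) AS src, id, id_order, descr, price FROM p_items
    UNION ALL
    SELECT CAST('H' AS text) AS src, id, id_order, descr, price FROM h_items;

-- Orders per customer and total price across current and history rows.
-- count(DISTINCT (src, id)) counts an order once per (source, id) pair.
SELECT c.id,
       c.name,
       count(DISTINCT (o.src, o.id)) AS num_orders,
       sum(i.price)                  AS total_money
FROM customer c
JOIN all_orders      o ON o.id_customer = c.id
JOIN all_order_items i ON i.src = o.src AND i.id_order = o.id
GROUP BY c.id, c.name
ORDER BY c.id;
```

As in the query from the question, the inner joins leave out customers without orders and orders without items.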
You could try this:
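The snippet that followed is not preserved in this copy. One common way to get a GROUP BY "outside" a union, which may or may not be what was originally posted, is to wrap the union in a derived table and aggregate its rows again. A minimal sketch, assuming each order lives in exactly one of the two table sets (the alias u is arbitrary):

```sql
SELECT u.id,
       u.name,
       sum(u.num_orders)  AS num_orders,
       sum(u.total_money) AS total_money
FROM (
    -- per-customer totals from the current tables
    SELECT c.id, c.name,
           count(DISTINCT o.id) AS num_orders,
           sum(i.price)         AS total_money
    FROM customer c
    JOIN p_orders o ON o.id_customer = c.id
    JOIN p_items  i ON i.id_order = o.id
    GROUP BY c.id, c.name

    UNION ALL

    -- per-customer totals from the history tables
    SELECT c.id, c.name,
           count(DISTINCT o.id) AS num_orders,
           sum(i.price)         AS total_money
    FROM customer c
    JOIN h_orders o ON o.id_customer = c.id
    JOIN h_items  i ON i.id_order = o.id
    GROUP BY c.id, c.name
) AS u
GROUP BY u.id, u.name
ORDER BY u.id;
```

UNION ALL matters here: with plain UNION, a customer whose p_ and h_ subtotals happen to be identical would have one of the two rows dropped before the outer sum.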
Create a view with the union of the two queries but without the aggregate functions. Use UNION ALL, as the same record is not in both tables and you don't need the server to waste time checking for that. You will probably have other times when you want to access both tables in a query.
Then write your query using the view.
The view code would be something like this (you may want other fields for other purposes as well):
Then the call for your query would be something like this (none of this is tested; it may need adjustment):
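The code from this answer is missing here, so the following is only a sketch of what is described above, not the answerer's exact code. The view name customer_order_items and the column aliases are assumptions, and it presumes order ids are unique across p_orders and h_orders (which holds if rows are moved, not copied).

```sql
-- Un-aggregated rows from both table sets in one view.
CREATE VIEW customer_order_items AS
    SELECT c.id AS customer_id, c.name, o.id AS order_id, i.price
    FROM customer c
    JOIN p_orders o ON o.id_customer = c.id
    JOIN p_items  i ON i.id_order = o.id
    UNION ALL
    SELECT c.id, c.name, o.id, i.price
    FROM customer c
    JOIN h_orders o ON o.id_customer = c.id
    JOIN h_items  i ON i.id_order = o.id;

-- The aggregation then runs once, over the combined rows.
SELECT customer_id,
       name,
       count(DISTINCT order_id) AS num_orders,
       sum(price)               AS total_money
FROM customer_order_items
GROUP BY customer_id, name
ORDER BY customer_id;
```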
As far as I know, SQL Server should automatically eliminate duplicates when you use UNION. Using UNION ALL will include duplicates. I would imagine that SQL Server uses the primary keys as a means of working out what is a duplicate. Are the primary keys on these tables made up of the same datatype, and is ID 1 in your p table also ID 1 in your h table?
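For reference, in standard SQL the duplicate check done by UNION compares whole result rows, not primary keys. A small sketch with literal values (the column names id and code are only for illustration):

```sql
-- One row: both branches produce the identical row (1, 'a').
SELECT 1 AS id, 'a' AS code
UNION
SELECT 1, 'a';

-- Two rows: the id matches but code differs, so the rows are not duplicates.
SELECT 1 AS id, 'a' AS code
UNION
SELECT 1, 'b';

-- Two rows: UNION ALL skips the duplicate check entirely.
SELECT 1 AS id, 'a' AS code
UNION ALL
SELECT 1, 'a';
```

This is why the question's UNION query returns two rows for a customer with orders in both table sets: the two rows share id and name but differ in num_orders or total_money.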
The easiest way to do what you are looking for would be to create views (say "a_orders" and "a_items"). The views would just be defined like:
If you delete rows from p_orders as you insert them into h_orders (so a given order is never in both tables), it would be quite a bit more efficient to use UNION ALL instead of UNION.
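The view definitions themselves are missing from this copy. A minimal sketch of such views, assuming the column lists from the schema dump in the question and that each order lives in only one of the two table sets (hence UNION ALL):

```sql
CREATE VIEW a_orders AS
    SELECT id, id_customer, date, code FROM p_orders
    UNION ALL
    SELECT id, id_customer, date, code FROM h_orders;

CREATE VIEW a_items AS
    SELECT id, id_order, descr, price FROM p_items
    UNION ALL
    SELECT id, id_order, descr, price FROM h_items;

-- The query from the question, pointed at the views instead of the p_ tables.
SELECT customer.id,
       customer.name,
       count(DISTINCT a_orders.id) AS num_orders,
       sum(a_items.price)          AS total_money
FROM customer
INNER JOIN a_orders ON a_orders.id_customer = customer.id
INNER JOIN a_items  ON a_items.id_order = a_orders.id
GROUP BY customer.id, customer.name
ORDER BY customer.id;
```

If an order id could ever appear in both p_orders and h_orders, the join between the two views would mix up items, so this relies on rows being moved rather than copied.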
Thanks for all the replies, guys.
Both the 'views way' and the 'subquery way' by Jimmie R. Houts work perfectly; maybe the views are just more convenient to use, and both should take the same time (or not?).
So I'll mark the first answer about the views as the best answer.
Anyway, if I may, could I ask whether the structure and indexes I used are good or could be optimized?