What SQL magic do I need to turn one column into several?

Posted on 2024-10-08 11:52:56

I need to print some tickets, each of which has enough room to hold one set of customer details along with codes for up to five items ordered by that customer. Customers who have ordered more than five items get multiple tickets. So from an orders table like this,

Customer | Item
---------|------
Bob      | FTMCH
Bob      | ZORP
Bob      | KLUGE
Carol    | FTMCH
Carol    | MEEP
Carol    | ZORP
Ted      | FOON
Ted      | SMOCK
Alice    | ORGO
Carol    | SQICK
Carol    | BLECH
Carol    | KLUGE
Carol    | GLURP

I need a query that returns this:

Customer | Item1 | Item2 | Item3 | Item4 | Item5
---------|-------|-------|-------|-------|------
Alice    | ORGO  | null  | null  | null  | null
Bob      | FTMCH | ZORP  | KLUGE | null  | null
Carol    | FTMCH | MEEP  | ZORP  | SQICK | BLECH
Carol    | KLUGE | GLURP | null  | null  | null
Ted      | FOON  | SMOCK | null  | null  | null

Can some kind soul help me with the SQL for this? HSQL embedded database in OpenOffice.org Base, if it makes a difference.

Comments (5)

自找没趣 2024-10-15 11:52:56

OK, this works well enough:

SELECT
    "Customer",
    MAX(CASE WHEN "Slot" = 0 THEN "Item" END) AS "Item1",
    MAX(CASE WHEN "Slot" = 1 THEN "Item" END) AS "Item2",
    MAX(CASE WHEN "Slot" = 2 THEN "Item" END) AS "Item3",
    MAX(CASE WHEN "Slot" = 3 THEN "Item" END) AS "Item4",
    MAX(CASE WHEN "Slot" = 4 THEN "Item" END) AS "Item5"
FROM (
    SELECT
        l."Customer" AS "Customer",
        l."Item" AS "Item",
        COUNT(r."Item") / 5 AS "Ticket",
        MOD(COUNT(r."Item"), 5) AS "Slot"
    FROM "Orders" AS l
    LEFT JOIN "Orders" AS r
    ON r."Customer" = l."Customer" AND r."Item" < l."Item"
    GROUP BY "Customer", "Item"
)
GROUP BY "Customer", "Ticket"
ORDER BY "Customer", "Ticket"

It makes this:

Customer | Item1 | Item2 | Item3 | Item4 | Item5 
---------|-------|-------|-------|-------|-------
Alice    | ORGO  |       |       |       |       
Bob      | FTMCH | KLUGE | ZORP  |       |       
Carol    | BLECH | FTMCH | GLURP | KLUGE | MEEP  
Carol    | SQICK | ZORP  |       |       |       
Ted      | FOON  | SMOCK |       |       |       

Thanks to all who helped, both here and at Ask Metafilter.
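
For anyone who wants to check the first query against the toy data without opening Base, here is a minimal sketch using Python's built-in sqlite3 module. Running it on SQLite rather than HSQL is an assumption made purely for convenience, so a few details differ from the query above: `%` stands in for MOD, the identifiers are unquoted, the derived table gets an alias (`ranked`, a made-up name), and the inner GROUP BY is qualified with `l.` to avoid ambiguity. The ranking idea is unchanged: an item's rank is the number of the same customer's items that sort before it.

```python
import sqlite3

# Toy data from the question; the table name Orders is taken from the query above.
ORDERS = [
    ("Bob", "FTMCH"), ("Bob", "ZORP"), ("Bob", "KLUGE"),
    ("Carol", "FTMCH"), ("Carol", "MEEP"), ("Carol", "ZORP"),
    ("Ted", "FOON"), ("Ted", "SMOCK"), ("Alice", "ORGO"),
    ("Carol", "SQICK"), ("Carol", "BLECH"), ("Carol", "KLUGE"),
    ("Carol", "GLURP"),
]

QUERY = """
SELECT
    Customer,
    MAX(CASE WHEN Slot = 0 THEN Item END) AS Item1,
    MAX(CASE WHEN Slot = 1 THEN Item END) AS Item2,
    MAX(CASE WHEN Slot = 2 THEN Item END) AS Item3,
    MAX(CASE WHEN Slot = 3 THEN Item END) AS Item4,
    MAX(CASE WHEN Slot = 4 THEN Item END) AS Item5
FROM (
    -- Rank = how many of this customer's items sort before this one.
    SELECT
        l.Customer AS Customer,
        l.Item AS Item,
        COUNT(r.Item) / 5 AS Ticket,   -- integer division: which ticket
        COUNT(r.Item) % 5 AS Slot      -- position on that ticket
    FROM Orders AS l
    LEFT JOIN Orders AS r
        ON r.Customer = l.Customer AND r.Item < l.Item
    GROUP BY l.Customer, l.Item
) AS ranked
GROUP BY Customer, Ticket
ORDER BY Customer, Ticket
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Customer TEXT NOT NULL, Item TEXT NOT NULL)")
conn.executemany("INSERT INTO Orders VALUES (?, ?)", ORDERS)

# One tuple per ticket; empty slots come back as None.
tickets = conn.execute(QUERY).fetchall()
for row in tickets:
    print(row)
```

Note that the ranking only works while each (Customer, Item) pair is unique, which is exactly what breaks in the follow-up.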

(Followup edit:)

Jesus, this just gets worse :-(

Turns out the business rules allow the same customer to order the same item on multiple occasions, and that all outstanding orders are to be included on the one set of tickets. So my toy table should have looked more like this:

ID  | Customer | Item 
159 | Bob      | FTMCH
264 | Bob      | ZORP 
265 | Bob      | KLUGE
288 | Carol    | FTMCH
314 | Carol    | MEEP 
323 | Carol    | ZORP 
327 | Ted      | FOON 
338 | Ted      | SMOCK
358 | Alice    | ORGO 
419 | Carol    | SQICK
716 | Carol    | MEEP 
846 | Carol    | BLECH
939 | Carol    | MEEP 
950 | Carol    | GLURP
979 | Carol    | KLUGE

Carol's multiple MEEPs bugger the ranking logic in the original solution, and I've ended up with the following hideous monster:

SELECT
    "Customer",
    MAX(CASE WHEN "Slot" = 0 THEN "Item" END) AS "Item0",
    MAX(CASE WHEN "Slot" = 1 THEN "Item" END) AS "Item1",
    MAX(CASE WHEN "Slot" = 2 THEN "Item" END) AS "Item2",
    MAX(CASE WHEN "Slot" = 3 THEN "Item" END) AS "Item3",
    MAX(CASE WHEN "Slot" = 4 THEN "Item" END) AS "Item4",
    MAX(CASE WHEN "Slot" = 0 THEN "Quantity" END) AS "Qty0",
    MAX(CASE WHEN "Slot" = 1 THEN "Quantity" END) AS "Qty1",
    MAX(CASE WHEN "Slot" = 2 THEN "Quantity" END) AS "Qty2",
    MAX(CASE WHEN "Slot" = 3 THEN "Quantity" END) AS "Qty3",
    MAX(CASE WHEN "Slot" = 4 THEN "Quantity" END) AS "Qty4"
FROM (
    SELECT
        "Customer",
        "Item",
        COUNT("ID") AS "Quantity",
        "Rank" / 5 AS "Ticket",
        MOD("Rank", 5) AS "Slot"
    FROM (
        SELECT
            main."ID" AS "ID",
            main."Customer" AS "Customer",
            main."Item" AS "Item",
            COUNT(less."Item") AS "Rank"
        FROM "Orders" AS main
        LEFT JOIN (
            SELECT DISTINCT
                "Customer",
                "Item"
            FROM "Orders") AS less
        ON less."Customer" = main."Customer" AND less."Item" < main."Item"
        GROUP BY "ID", "Customer", "Item"
    )
    GROUP BY "Customer", "Item", "Rank"
)
GROUP BY "Customer", "Ticket"

which makes this:

Customer | Item0 | Item1 | Item2 | Item3 | Item4 | Qty0 | Qty1 | Qty2 | Qty3 | Qty4
Bob      | FTMCH | KLUGE | ZORP  |       |       | 1    | 1    | 1    |      |
Carol    | BLECH | FTMCH | GLURP | KLUGE | MEEP  | 1    | 1    | 1    | 1    | 3
Carol    | SQICK | ZORP  |       |       |       | 1    | 1    |      |      |
Ted      | FOON  | SMOCK |       |       |       | 1    | 1    |      |      |
Alice    | ORGO  |       |       |       |       | 1    |      |      |      |

It does the job, I guess, but I'm feeling pretty lucky that the database involved is always going to be quite small (a few thousand rows).

Spiritually I'm an embedded-systems guy, not a database guy. Can anybody who does this for a living tell me whether this kind of nonsense is common? Would a query with four nested SELECTs and a LEFT JOIN merit a mention on the Daily WTF?
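
On engines that support window functions, the ranking self-join can usually be replaced by ROW_NUMBER(), which flattens the nesting considerably. What follows is a hedged sketch of that swapped-in technique, not the query above: it runs on SQLite through Python's sqlite3 module (an assumption; it needs SQLite 3.25 or later, and the HSQL version embedded in OpenOffice.org Base may well not accept it). The CTE names `counted` and `ranked` and the column `Rnk` are made up for illustration.

```python
import sqlite3

# Toy data from the follow-up edit, including Carol's repeat MEEP orders.
ORDERS = [
    (159, "Bob", "FTMCH"), (264, "Bob", "ZORP"), (265, "Bob", "KLUGE"),
    (288, "Carol", "FTMCH"), (314, "Carol", "MEEP"), (323, "Carol", "ZORP"),
    (327, "Ted", "FOON"), (338, "Ted", "SMOCK"), (358, "Alice", "ORGO"),
    (419, "Carol", "SQICK"), (716, "Carol", "MEEP"), (846, "Carol", "BLECH"),
    (939, "Carol", "MEEP"), (950, "Carol", "GLURP"), (979, "Carol", "KLUGE"),
]

QUERY = """
WITH counted AS (
    -- Collapse repeat orders into one row per customer and item.
    SELECT Customer, Item, COUNT(*) AS Quantity
    FROM Orders
    GROUP BY Customer, Item
),
ranked AS (
    -- Zero-based position of each item in its customer's sorted list.
    SELECT Customer, Item, Quantity,
           ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Item) - 1 AS Rnk
    FROM counted
)
SELECT
    Customer,
    MAX(CASE WHEN Rnk % 5 = 0 THEN Item END) AS Item0,
    MAX(CASE WHEN Rnk % 5 = 1 THEN Item END) AS Item1,
    MAX(CASE WHEN Rnk % 5 = 2 THEN Item END) AS Item2,
    MAX(CASE WHEN Rnk % 5 = 3 THEN Item END) AS Item3,
    MAX(CASE WHEN Rnk % 5 = 4 THEN Item END) AS Item4,
    MAX(CASE WHEN Rnk % 5 = 0 THEN Quantity END) AS Qty0,
    MAX(CASE WHEN Rnk % 5 = 1 THEN Quantity END) AS Qty1,
    MAX(CASE WHEN Rnk % 5 = 2 THEN Quantity END) AS Qty2,
    MAX(CASE WHEN Rnk % 5 = 3 THEN Quantity END) AS Qty3,
    MAX(CASE WHEN Rnk % 5 = 4 THEN Quantity END) AS Qty4
FROM ranked
GROUP BY Customer, Rnk / 5       -- integer division: one group per ticket
ORDER BY Customer, Rnk / 5
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (ID INTEGER PRIMARY KEY, "
    "Customer TEXT NOT NULL, Item TEXT NOT NULL)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", ORDERS)

tickets = conn.execute(QUERY).fetchall()
for row in tickets:
    print(row)
```

Counting before ranking is the design choice that matters here: once repeat orders are collapsed, each (Customer, Item) pair is unique again and a plain row number is enough.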

若水微香 2024-10-15 11:52:56

I believe this is only usable for T-SQL, but you can use PIVOT: http://msdn.microsoft.com/en-us/library/ms177410.aspx

I did something similar with a list of dates becoming the columns for calculations.

养猫人 2024-10-15 11:52:56

Not exactly what you asked for, and MySQL rather than OpenOffice, but it might give you an idea, or someone else could work on it:

select
    u.Customer,
    group_concat(u.Item) items
from
    (select
        t.Item,
        @n:=if(@c=t.Customer and @n<4,@n+1,0) c1,
        @m:=if(@n,@m,@m+1) g,
        @c:=t.Customer as Customer
    from
        t1 t, (select @m:=0) init
    order
        by t.Customer
    ) u
group by
    u.g

Output:

+----------+------------------------------+
| Customer | items                        |
+----------+------------------------------+
| Alice    | ORGO                         | 
| Bob      | FTMCH,ZORP,KLUGE             | 
| Carol    | KLUGE,ZORP,BLECH,SQICK,GLURP | 
| Carol    | MEEP,FTMCH                   | 
| Ted      | FOON,SMOCK                   | 
+----------+------------------------------+

吹梦到西洲 2024-10-15 11:52:56

This gets you most of the way there, but does not handle the duplicate order for Carol. That would be easy to do if there was something else to group on, like OrderID or OrderDate. Can you post the full schema?

select m1.Customer, 
    min(m1.Item) as Item1, 
    min(m2.item) as Item2, 
    min(m3.item) as Item3, 
    min(m4.item) as Item4, 
    min(m5.item) as Item5
from CustomerOrder m1
left outer join CustomerOrder m2 on m1.Customer = m2.Customer 
    and m2.item > m1.item
left outer join CustomerOrder m3 on m1.Customer = m3.Customer 
    and m3.item > m2.item
left outer join CustomerOrder m4 on m1.Customer = m4.Customer 
    and m4.item > m3.item
left outer join CustomerOrder m5 on m1.Customer = m5.Customer 
    and m5.item > m4.item
group by m1.Customer

Output:

Customer       Item1      Item2      Item3      Item4      Item5
-------------- ---------- ---------- ---------- ---------- ----------
Alice          ORGO       NULL       NULL       NULL       NULL
Bob            FTMCH      KLUGE      ZORP       NULL       NULL
Carol          BLECH      FTMCH      GLURP      KLUGE      MEEP
Ted            FOON       SMOCK      NULL       NULL       NULL

小ぇ时光︴ 2024-10-15 11:52:56

The requirement is not uncommon, and can be supplied reasonably in SQL. But you have two issues blocking you.

1) You've used the SQL tag, which means ISO/IEC/ANSI Standard SQL. The correct method to use is a cursor or cursor substitute (a while loop, which does the same thing but is faster). That avoids all these outer joins and handling massive result sets, then beating them into submission with GROUP BYs, etc. It also handles duplicates, mainly because it does not create them in the first place (via those five versions of the aliased table). And yes, it will keep getting worse, and when the database is reasonably populated it will be a performance hog.

2) Duplicates are not allowed in a Relational database, i.e. in your source tables; you need to make the rows unique (and those keys/columns are not shown). No use trying to eliminate duplicates via code. If that is corrected, then all duplicates (real ones and those created by the poor code) can be eliminated.
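
A minimal sketch of the cursor-style approach from point 1, written in Python with the sqlite3 module rather than as a stored procedure (both the language and the engine are assumptions, and the function name `tickets_from_cursor` is made up). The database only sorts; the loop walks the cursor once, folds repeat orders into a quantity, and starts a new ticket whenever the customer changes or the current ticket is full.

```python
import sqlite3

# Toy data from the asker's follow-up table (ID, Customer, Item).
ORDERS = [
    (159, "Bob", "FTMCH"), (264, "Bob", "ZORP"), (265, "Bob", "KLUGE"),
    (288, "Carol", "FTMCH"), (314, "Carol", "MEEP"), (323, "Carol", "ZORP"),
    (327, "Ted", "FOON"), (338, "Ted", "SMOCK"), (358, "Alice", "ORGO"),
    (419, "Carol", "SQICK"), (716, "Carol", "MEEP"), (846, "Carol", "BLECH"),
    (939, "Carol", "MEEP"), (950, "Carol", "GLURP"), (979, "Carol", "KLUGE"),
]


def tickets_from_cursor(conn, slots=5):
    """Return a list of (customer, [[item, quantity], ...]) tickets."""
    cursor = conn.execute(
        "SELECT Customer, Item FROM Orders ORDER BY Customer, Item"
    )
    tickets = []
    for customer, item in cursor:
        ticket = tickets[-1] if tickets else None
        same_customer = ticket is not None and ticket[0] == customer
        if same_customer and ticket[1][-1][0] == item:
            # Repeat order of the item just placed: bump its quantity.
            ticket[1][-1][1] += 1
        elif same_customer and len(ticket[1]) < slots:
            # Room left on this customer's current ticket.
            ticket[1].append([item, 1])
        else:
            # New customer, or the current ticket is full.
            tickets.append((customer, [[item, 1]]))
    return tickets


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (ID INTEGER PRIMARY KEY, "
    "Customer TEXT NOT NULL, Item TEXT NOT NULL)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)", ORDERS)

tickets = tickets_from_cursor(conn)
for customer, lines in tickets:
    print(customer, lines)
```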

This requirement can also be supplied more elegantly using Subqueries, except that here you need two levels of nesting: one to build each Item column, and a second to obtain the rank or Position. And that (a standard SQL construct) presupposes that you have a Relational database (no duplicate rows). High Eek factor if you are not used to SQL, which is why most coders use a cursor or cursor substitute.

But if you do not have SQL and its basic capabilities (HSQL being some sub-standard implementation), then we are not using the same tool kit. The SQL code I can provide will not run for you, and we will keep going back and forth.

(Maybe we should have a "pseudo-SQL" tag.)

ID Column Prevents Duplicates ???

There is a myth that is prevalent in some parts of the industry, to that effect, due to books written by database beginners. As usual, myths have no scientific basis. Let's try a simple test.

    CREATE TABLE Person (
        PersonId  IDENTITY NOT NULL
            PRIMARY KEY,
        FirstName CHAR(30) NOT NULL,
        LastName  CHAR(30) NOT NULL
        )
    INSERT Person VALUES ("Fred", "Astaire")
    1 row(s) affected
    INSERT Person VALUES ("Ginger", "Rogers")
    1 row(s) affected
    INSERT Person VALUES ("Fred", "Astaire")
    1 row(s) affected
    SELECT * FROM Person
    PersonId FirstName                      LastName
    ======== ============================== ==============================
           1 Fred                           Astaire
           2 Ginger                         Rogers
           3 Fred                           Astaire
    3 row(s) affected


That's a pure, unarguable duplicate row. The simple fact is, the Id column provides a row number, but does nothing to prevent duplicate rows. For that you need a Unique Index on the columns that determine uniqueness, as identified in the data model, for every relational table in the database (by definition, if the rows are not unique, it is not a Relational table). Otherwise it is just a file storage system.

    CREATE UNIQUE NONCLUSTERED INDEX U_Name
       ON Person (LastName, FirstName)
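
The same experiment can be sketched on SQLite through Python's sqlite3 module. This is an assumption for illustration only: SQLite has no IDENTITY or NONCLUSTERED keywords, so the table uses INTEGER PRIMARY KEY and a plain UNIQUE INDEX, and the inserts use parameter binding instead of double-quoted literals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person (
        PersonId  INTEGER PRIMARY KEY,   -- auto-assigned row number
        FirstName TEXT NOT NULL,
        LastName  TEXT NOT NULL
    )
    """
)

insert = "INSERT INTO Person (FirstName, LastName) VALUES (?, ?)"
for name in [("Fred", "Astaire"), ("Ginger", "Rogers"), ("Fred", "Astaire")]:
    conn.execute(insert, name)

# The surrogate key accepted the duplicate without complaint.
rows_before = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print("rows with only an ID column:", rows_before)

# Remove the duplicate, then declare what actually makes a row unique.
conn.execute("DELETE FROM Person WHERE PersonId = 3")
conn.execute("CREATE UNIQUE INDEX U_Name ON Person (LastName, FirstName)")

try:
    conn.execute(insert, ("Fred", "Astaire"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print("duplicate rejected by unique index:", duplicate_rejected)
```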

There is another form of data integrity (duplication) which I might identify while I am at it.

    INSERT Person VALUES ("Fred", "Astair")
    1 row(s) affected
    INSERT Person VALUES ("Astaire", "Fred")
    1 row(s) affected


All are preventable in SQL.
