从非规范化表中选择最新记录
我设置了一个包含以下列的表:
- fname
- lname
- 地址
- city
- state
- zip
- 客户 ID
- date_modified
数据基本上是非规范化的,因此看起来
Ben -- Smith--***123first*** st -- NY -- NY -- 12101 -- 123 -- 1-1-2011 Ben -- Smith--***123 1st st*** -- NY -- NY -- 12101 -- 123 -- 1-1-2011 Sara -- Smith -- BLAH BLAH BLAh
我正在尝试将这些记录复制到新表中,但我只想要每个客户 ID 一条记录。
我尝试做类似的事情,
insert into new_table(fname,lname,address,city,state,zip,Customer_ID,
date_modified)
select fname,lname,address,city,state,zip,Customer_ID,date_modified
group by (fname,lname,address,city,state,zip,Customer_ID,date_modified)
问题是有太多地址和其他列具有相同的含义但不同的文本(第一个与第一个)。因此分组依据将这两条记录保留在新表中。我如何通过选择最大值(修改日期)来为每个客户 ID 获取一条记录。基本上我只想按 customer_id 进行分组,而不是按其余列进行分组,但这在 Oracle 中是不允许的。
I have a table set up with columns:
- fname
- lname
- address
- city
- state
- zip
- Customer ID
- date_modified
the data is basically denormalized so looks like
Ben -- Smith--***123first*** st -- NY -- NY -- 12101 -- 123 -- 1-1-2011 Ben -- Smith--***123 1st st*** -- NY -- NY -- 12101 -- 123 -- 1-1-2011 Sara -- Smith -- BLAH BLAH BLAh
I am trying to copy these records to a new table but I only want one record per Customer ID.
I tried doing something like
insert into new_table(fname,lname,address,city,state,zip,Customer_ID,
date_modified)
select fname,lname,address,city,state,zip,Customer_ID,date_modified
group by (fname,lname,address,city,state,zip,Customer_ID,date_modified)
the issue is there are too many addresses and other columns that have the same meaning but different text (first vs 1st). so the group by leaves both of these records in the new table. how do i esentialy get one record for each customer id by choosing the max(date modified). Basically I would want to group by just the customer_id and not the rest of the columns but that isn't allowed in oracle.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
以下是获得您想要的东西的几种方法之一。请记住,这不会标准化您的数据库。您仍然有一个包含名字、姓氏和地址的
customer_id
。我可能会将其转换为多个插入,一个用于获取所有唯一的客户 ID 以及“客户”表的最新名称信息,然后另一个用于获取地址的插入。如果您想要更改的历史信息,那么您需要进行适当的调整。此外,如果您有两行具有完全相同的客户 ID 和相同的确切
date_modified
,则下面的代码将无法正常工作。如果您遇到这种情况,您将需要想出适当的业务逻辑来处理该情况。The following is one of several ways to get what you seem to want. Keep in mind though that this does not normalize your database. You still have a
customer_id
in with a first and last name and an address. I'd probably turn this into several inserts, one to get all of the unique customer IDs along with the latest name information for the Customers table, then another insert for the addresses. If you want historical information of changes then you would need to adjust appropriately.Also, the code below will not work properly if you have two rows that have the same exact customer ID and the same exact
date_modified
. If you run into that case you'll need to come up with the appropriate business logic to handle that.通过使用分析(又称窗口)函数为每个客户选择第一行,这非常容易。如果两行具有相同的 date_modified,则未定义采用哪一行。
This is quite easy by using analytical (aka windowing) functions to select the first row for each customer. In case two rows have the same date_modified, it is not defined which one is taken.