SQL query to return one record for each unique value in a column
I have a table in SQL Server 2000 that I am trying to query in a specific way. The best way to show this is with example data.
Behold, [Addresses]:
Name  Street            City      State
--------------------------------------------------------
Bob   123 Fake Street   Peoria    IL
Bob   234 Other Street  Fargo     ND
Jim   345 Main Street   St Louis  MO
This is actually a simplified example of the structure of the actual table. The structure of the table is completely beyond my control. I need a query that will return a single address per name. It doesn't matter which address, just that there is only one. The result could be this:
Name  Street            City      State
--------------------------------------------------------
Bob   123 Fake Street   Peoria    IL
Jim   345 Main Street   St Louis  MO
I found a similar question here, but none of the solutions given work in my case because I do not have access to CROSS APPLY, and calling MIN() on each column will mix different addresses together. Although I don't care which record is returned, it must be one intact row, not a mix of different rows.
Recommendations to change the table structure will not help me. I agree that this table is terrible (it's worse than shown here), but this is part of a major ERP database that I cannot change.
There are about 3000 records in this table. There is no primary key.
Any ideas?
Comments (14)
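-- Note: ROW_NUMBER() requires SQL Server 2005 or later; it is not available in SQL Server 2000.
SELECT Name, Street, City, State
FROM (
    SELECT Name, Street, City, State,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
    FROM [Addresses]
) AS t
WHERE rn = 1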
This is ugly as hell, but it sounds like your predicament is ugly, too... so here goes...
A temporary table solution would be as follows
I think this is a good candidate for a cursor-based solution. It's been so long since I've used a cursor that I won't attempt to write the T-SQL, but here's the idea:
I don't think that you can do that, given your constraints. You can pull out distinct combinations of those fields. But if someone spelled Bob and Bobb with the same address, you'd end up with two records. [GIGO] You are correct that any grouping (short of grouping on all of the fields, which is equivalent to DISTINCT) will mix rows. It's too bad that you don't have a unique identifier for each customer.
You might be able to nest queries together in such a way as to select the top 1 for each name and join all of those together.
A slight modification on the above should work.
Now this won't work if you have the same street but the other pieces of information are different (e.g. with typos).
OR a more complete hash would include all the fields (but you likely have too many for performance):
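For illustration only (this is a sketch, not the answerer's query), the "key built from all the fields" idea can be tried out against the question's sample data with Python's built-in sqlite3 module. The query keeps the row whose concatenated Street/City/State key is the smallest for its Name, so each result is one intact row. Assumptions: SQLite stands in for SQL Server 2000 (in T-SQL the `||` operator would be `+`), the `|` separator never appears in the data, and no column is NULL. The helper name `one_row_per_name` is hypothetical.

```python
import sqlite3

# Sample rows copied from the question's [Addresses] example.
ROWS = [
    ("Bob", "123 Fake Street", "Peoria", "IL"),
    ("Bob", "234 Other Street", "Fargo", "ND"),
    ("Jim", "345 Main Street", "St Louis", "MO"),
]

# Correlated subquery: keep only the row whose concatenated key is the
# minimum for its Name. DISTINCT collapses rows that are exact duplicates.
# SQLite spelling; T-SQL would use + instead of || for concatenation.
QUERY = """
SELECT DISTINCT a.Name, a.Street, a.City, a.State
FROM Addresses AS a
WHERE a.Street || '|' || a.City || '|' || a.State = (
    SELECT MIN(b.Street || '|' || b.City || '|' || b.State)
    FROM Addresses AS b
    WHERE b.Name = a.Name
)
ORDER BY a.Name
"""


def one_row_per_name(rows):
    """Load rows into an in-memory table and return one intact row per Name."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Addresses (Name TEXT, Street TEXT, City TEXT, State TEXT)"
        )
        conn.executemany("INSERT INTO Addresses VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in one_row_per_name(ROWS):
        print(row)
```

With about 3000 rows the repeated subquery should be tolerable, but it does rescan the table for every row.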
And still another way:
Well, this will give you pretty bad performance, but I think it'll work
Use a temp table or table variable and select a distinct list of names into that. Use that structure then to select the top 1 of each record in the original table for each distinct name.
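A rough sketch of that two-step idea, using Python's built-in sqlite3 module as a stand-in for SQL Server 2000 (an assumption for the sake of a runnable example): first collect the distinct names, then fetch a single row for each one. In T-SQL the inner query would be `SELECT TOP 1 ...` rather than `... LIMIT 1`, and the loop would be a cursor or WHILE loop over the temp table. The helper name `first_row_per_name` is hypothetical.

```python
import sqlite3

# Sample rows copied from the question's [Addresses] example.
ROWS = [
    ("Bob", "123 Fake Street", "Peoria", "IL"),
    ("Bob", "234 Other Street", "Fargo", "ND"),
    ("Jim", "345 Main Street", "St Louis", "MO"),
]


def first_row_per_name(rows):
    """Return one intact row per distinct Name; which row is unspecified."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Addresses (Name TEXT, Street TEXT, City TEXT, State TEXT)"
        )
        conn.executemany("INSERT INTO Addresses VALUES (?, ?, ?, ?)", rows)

        # Step 1: the distinct list of names (the temp table in the answer).
        names = [
            r[0]
            for r in conn.execute(
                "SELECT DISTINCT Name FROM Addresses ORDER BY Name"
            )
        ]

        # Step 2: one row per name. LIMIT 1 is SQLite; T-SQL uses TOP 1.
        result = []
        for name in names:
            row = conn.execute(
                "SELECT Name, Street, City, State FROM Addresses "
                "WHERE Name = ? LIMIT 1",
                (name,),
            ).fetchone()
            result.append(row)
        return result
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_row_per_name(ROWS):
        print(row)
```

This runs one query per distinct name, which is fine for a table of about 3000 rows but would scale poorly.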
If you can use a temp table:
This solution would not be advisable for much larger tables.
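One way a temp-table approach could look, sketched here with Python's built-in sqlite3 module rather than SQL Server 2000 (an assumption made so the example is runnable): copy the rows into a temporary table that adds an auto-numbered column, then keep the row with the lowest number for each Name. In T-SQL the temp table would be named with a `#` prefix and the column declared as `INT IDENTITY(1,1)`. The table name `AddressesNumbered`, the column `RowId`, and the helper name are all hypothetical.

```python
import sqlite3

# Sample rows copied from the question's [Addresses] example.
ROWS = [
    ("Bob", "123 Fake Street", "Peoria", "IL"),
    ("Bob", "234 Other Street", "Fargo", "ND"),
    ("Jim", "345 Main Street", "St Louis", "MO"),
]


def one_row_per_name_via_temp_table(rows):
    """Number the rows in a temp table, then keep the lowest number per Name."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Addresses (Name TEXT, Street TEXT, City TEXT, State TEXT)"
        )
        conn.executemany("INSERT INTO Addresses VALUES (?, ?, ?, ?)", rows)

        # The temp copy supplies the unique key that the real table lacks.
        conn.execute(
            "CREATE TEMP TABLE AddressesNumbered ("
            "RowId INTEGER PRIMARY KEY AUTOINCREMENT, "
            "Name TEXT, Street TEXT, City TEXT, State TEXT)"
        )
        conn.execute(
            "INSERT INTO AddressesNumbered (Name, Street, City, State) "
            "SELECT Name, Street, City, State FROM Addresses"
        )

        # MIN(RowId) per Name identifies exactly one whole row for each name.
        return conn.execute(
            "SELECT Name, Street, City, State FROM AddressesNumbered "
            "WHERE RowId IN ("
            "SELECT MIN(RowId) FROM AddressesNumbered GROUP BY Name) "
            "ORDER BY Name"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in one_row_per_name_via_temp_table(ROWS):
        print(row)
```

Because the numbering follows whatever order the rows were copied in, which address is kept for a given name is arbitrary, which matches the requirement in the question.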