A more efficient way of querying this data?

Posted on 2024-09-15 09:44:15

I have a table with some data in it:

 ColA | ColB | ColC
------+------+------
  1   |  A   |  X
  2   |  A   |  Y
  3   |  B   |  Y
  4   |  C   |  Y
  5   |  C   |  Z
  6   |  D   |  Y
  7   |  D   |  Z

I want to query to get all of the rows where ColB and ColC as a pair match a condition:

SELECT * FROM [Table]
WHERE (ColB = 'A' AND ColC = 'Y')
   OR (ColB = 'B' AND ColC = 'Y')
   OR (ColB = 'C' AND ColC = 'Y')
   OR (ColB = 'D' AND ColC = 'Z')

This should return rows 2, 3, 4 and 7.


The number of (ColB, ColC) pairs to match could be large (around 100). Is there a more efficient way of querying for this data than one large query with lots of OR conditions?

I am hoping there is a way of using an equivalent of a Tuple, meaning I can do something like:

SELECT * FROM [Table] 
WHERE (ColB, ColC) IN ({A, Y}, {B, Y}, {C, Y}, {D, Z})

Any ideas?


EDIT: (to answer some questions in the comments)

ColB and ColC store GUIDs and are declared as uniqueidentifier.
This needs to work on SQL Server 2005 upwards (all editions).
The table has on the order of millions of rows, and I'm not averse to adding any indexes that are required for this to work.

Comments (3)

嘦怹 2024-09-22 09:44:15

You could do this. (If you are on SQL Server 2008 or later, you could use a VALUES row constructor rather than the UNION ALLs.)

You'd need to check the query plan to see whether it is any more efficient.

SELECT * /* But don't use star; list the columns you need */
FROM [Table]
JOIN
(
    -- One SELECT per (ColB, ColC) pair to match
    SELECT 'A' AS BMatch, 'Y' AS CMatch UNION ALL
    SELECT 'B' AS BMatch, 'Y' AS CMatch UNION ALL
    SELECT 'C' AS BMatch, 'Y' AS CMatch UNION ALL
    SELECT 'D' AS BMatch, 'Z' AS CMatch /* UNION ALL ... */
) Matches
ON ColB = BMatch AND ColC = CMatch

You say in the comments that the combination of ColB and ColC is unique, so (assuming your table already has a clustered index) I would expect that, if you create a unique nonclustered index on either (ColB, ColC) or (ColC, ColB), the query above gives you a plan with 100 index seeks followed by 100 bookmark lookups. If it doesn't, you could try adding an index hint to make it use the new index. You'd need to compare the I/O for that plan with the I/O of a full scan, which is what lots of ORs would likely give you.
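
A minimal sketch of that index and hint, assuming the pair really is unique. The index name IX_Table_ColB_ColC is made up for illustration, and only two pairs are listed to keep it short:

```sql
-- Sketch only: the index name is hypothetical.
CREATE UNIQUE NONCLUSTERED INDEX IX_Table_ColB_ColC
    ON [Table] (ColB, ColC);
GO

SET STATISTICS IO ON;  -- reports logical reads per table with the query messages

SELECT t.ColA, t.ColB, t.ColC
FROM [Table] AS t WITH (INDEX (IX_Table_ColB_ColC))  -- hint; remove it if the optimizer already picks the index
JOIN (
    SELECT 'A' AS BMatch, 'Y' AS CMatch UNION ALL
    SELECT 'D', 'Z'
) AS Matches
  ON t.ColB = Matches.BMatch
 AND t.ColC = Matches.CMatch;

SET STATISTICS IO OFF;
```

Running the OR version of the query with STATISTICS IO switched on gives the read counts to compare against.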

The cost of the bookmark lookups could potentially be avoided by including the additional required columns in the nonclustered index. You've used * though, so I don't know how viable this would be. You'd need to balance the benefit to this query against the possible cost to data modification operations.
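
A covering index along those lines could be sketched as follows. The index name is hypothetical, and ColA stands in for whichever columns the query actually selects:

```sql
-- Sketch only: INCLUDE is available from SQL Server 2005 onwards.
CREATE UNIQUE NONCLUSTERED INDEX IX_Table_ColB_ColC_Covering
    ON [Table] (ColB, ColC)
    INCLUDE (ColA);  -- non-key columns stored at the leaf level, so no lookup is needed for them
```

If ColA happens to be the clustered index key, it is already carried in every nonclustered index, so the INCLUDE clause only matters for other columns.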

爱你是孤单的心事 2024-09-22 09:44:15

Have you considered adding a computed column that concatenates both columns?

It would simplify your SELECT statement and allow an index to be added.

CREATE TABLE [dbo].[Table] (
  ColA  INTEGER
  , ColB VARCHAR(1)
  , ColC VARCHAR(1)
  , ColBC AS ColB + ColC
)  

CREATE UNIQUE INDEX IX_TABLE_COLBC ON [dbo].[Table] (ColBC)

INSERT INTO [Table] VALUES(1, 'A', 'X')
INSERT INTO [Table] VALUES(2, 'A', 'Y')
INSERT INTO [Table] VALUES(3, 'B', 'Y')
INSERT INTO [Table] VALUES(4, 'C', 'Y')
INSERT INTO [Table] VALUES(5, 'C', 'Z')
INSERT INTO [Table] VALUES(6, 'D', 'Y')
INSERT INTO [Table] VALUES(7, 'D', 'Z')

Not exactly the equivalent of a tuple, but it does allow you to change your SELECT to

SELECT * FROM [Table] 
WHERE (ColBC) IN ('AY', 'BY', 'CY', 'DZ')

and it uses the index.
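
The question states that ColB and ColC are uniqueidentifier columns, and the + operator is not defined for that type, so the concatenation would need a cast first. The following is only a sketch under that assumption: the index name, the variable names, and the two GUID literals are placeholders invented for illustration, and GO is the batch separator used by tools such as SSMS and sqlcmd.

```sql
-- Sketch only: assumes ColB and ColC are uniqueidentifier, as the question says.
ALTER TABLE [Table]
    ADD ColBC AS (CAST(ColB AS char(36)) + CAST(ColC AS char(36))) PERSISTED;
GO

-- Hypothetical index name; UNIQUE assumes each (ColB, ColC) pair occurs once.
CREATE UNIQUE NONCLUSTERED INDEX IX_Table_ColBC
    ON [Table] (ColBC);
GO

-- Placeholder GUIDs, not real data. DECLARE and SET are separate for SQL Server 2005.
DECLARE @b uniqueidentifier, @c uniqueidentifier;
SET @b = '11111111-1111-1111-1111-111111111111';
SET @c = '22222222-2222-2222-2222-222222222222';

-- Build the search key with the same expression as the computed column.
SELECT ColA, ColB, ColC
FROM [Table]
WHERE ColBC = CAST(@b AS char(36)) + CAST(@c AS char(36));
```

Each index key is then 72 bytes instead of the 32 bytes of two raw GUIDs, which is worth weighing against a plain composite index on (ColB, ColC).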

上课铃就是安魂曲 2024-09-22 09:44:15

Why not create a new table to filter your result set? With around 100 values to filter by, this solution would be more flexible should your filter conditions change in the future: you simply change the filter table rather than the WHERE clause embedded in your SELECT statement.

-- Drop the filter table if it already exists.
if exists (select * from sys.objects where object_id = object_id(N'dbo.Filter') and type in (N'U'))
    drop table dbo.Filter

create table dbo.Filter (
    ColB char(1) not null,
    ColC char(1) not null,
    constraint pkFilter primary key clustered (ColB, ColC) with (fillfactor = 100))

-- Note: a multi-row VALUES list needs SQL Server 2008 or later.
insert into dbo.Filter (ColB, ColC) values ('A','Y'),('B','Y'),('C','Y'),('D','Z')

-- t.* returns only the columns of [Table], not the duplicated filter columns.
select t.* from [Table] as t
inner join dbo.Filter as f on t.ColB = f.ColB and t.ColC = f.ColC
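
Multi-row VALUES lists were introduced in SQL Server 2008, so on the 2005 instances mentioned in the question the filter table could be filled with INSERT ... SELECT ... UNION ALL instead. The sketch below assumes dbo.Filter exists as defined above and is empty. It also shows an EXISTS form of the query, which is probably the closest T-SQL gets to the tuple IN syntax the question asks for:

```sql
-- Sketch only: 2005-compatible way to populate the filter table.
INSERT INTO dbo.Filter (ColB, ColC)
SELECT 'A', 'Y' UNION ALL
SELECT 'B', 'Y' UNION ALL
SELECT 'C', 'Y' UNION ALL
SELECT 'D', 'Z';

-- Semi-join: returns each row of [Table] whose (ColB, ColC) pair is in the filter.
SELECT t.ColA, t.ColB, t.ColC
FROM [Table] AS t
WHERE EXISTS (SELECT 1
              FROM dbo.Filter AS f
              WHERE f.ColB = t.ColB
                AND f.ColC = t.ColC);
```

For the real data the filter columns would be declared as uniqueidentifier rather than char(1).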
