这个执行计划是重新思考我的主键的动机吗

发布于 2024-07-18 21:51:47 字数 639 浏览 8 评论 0原文

当我进入当前(雇主)公司时,设计了一个新的数据库模式,该模式将成为许多未来正在/将要创建的工具的基础。 以我有限的 SQL 知识,我认为该表设计得相当好。 我唯一担心的是几乎每个表都有一个多部分主键。 每个表至少有一个它自己的 CustomerId 和键。 虽然这些确实是针对某个记录的定义,但我感觉多个键(我们这里讨论的是四键)效率非常低。

今天,我在一个简单的重复查询中看到了一些难以想象的 CPU 使用率,该查询连接两个表,从第一个表中选择一个字符串字段并区分它们。

select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
   on f.id = b.fId
where b.cId = @id;

检查执行计划(我不是EP英雄)我注意到有三个主要的CPU点。 不同的(如预期的)和两个对不相关的寻求。 我个人认为索引搜索应该非常快,但它们各自占用了 18% 的成本。 这是正常的吗? 是由于(四重)聚集索引吗?

--更新--
该查询用于创建 Lucene 索引。 这是大约每周进行一次的一次性处理(我知道,这听起来很矛盾)。 据我所知,我无法在这里重复使用任何结果。

When I entered my current (employer's) company a new database schema was designed and will be the base of a lot of the future tools that are/will be created.
With my limited SQL knowledge I think the table is rather well designed.
My only concern is that almost every table has a multy-part primary key. Every table has at least a CustomerId and key of it's own. While these are indeed defining for a certain record, I have the feeling that multiple keys (we're talking quadruple here) are very inefficient.

Today I was seeing some unimaginable CPU usage over a simple, repeated query that joins two tables, selects a single string field from the first and distincts them.

select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
   on f.id = b.fId
where b.cId = @id;

Checking the execution plan (I'm no EP Hero) I noticed that there are three major CPU points. The distinct (as expected) and two seeks over the indeces.
I would personally think that the indices seek should be extremely fast, but they take up 18% of the cost each. Is this normal? Is it due to the (quadruple) clustered indexes?

--UPDATE--
The query is used for creating a Lucene index. It's a one-time processing that happens about weekly (sounds contradictive, I know). I can't re-use any results here as far as I see.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

溇涏 2024-07-25 21:51:47

这种查询看起来很熟悉。 我在这里猜测,但是,它可能正在填充 web/winform ui 上的组合框,而该组合框受到了很大的打击。

也许您应该在应用程序端缓存结果,这样您就不会经常执行它。 最糟糕的情况是,您可以将其缓存在 sql 服务器端,但这是一个巨大的混乱。

This kind of query looks familiar. Im guessing here, but, it's probably populating a combo box on a web/winform ui that is being hit pretty hard.

Perhaps you should be caching the results on the application side so you don't end up executing it so often. Worse case scenario you could cache this on sql servers side, but its a massive kludge.

一个人的夜不怕黑 2024-07-25 21:51:47

您能否运行以下查询并发布其输出:

SELECT  COUNT(*), COUNT(DISTINCT fieldname)
FROM    foo

SELECT  COUNT(*), COUNT(DISTINCT cId), COUNT(DISTINCT fId)
FROM    bar

这将有助于估计哪些索引最适合您的需求。

同时确保您有以下索引:

foo (FIELDNAME)
bar (cId, fId)

并重写您的查询:

SELECT  DISTINCT(fieldname)
FROM    foo f
WHERE   EXISTS (
        SELECT  1
        FROM    bar b
        WHERE   b.fId = f.id
                AND b.cId = @id
        )

此查询应使用 f.FIELDNAME 上的索引来构建 DISTINCT 列表和 上的索引bar 过滤掉不存在的值。

Could you please run the following queries and post their output:

SELECT  COUNT(*), COUNT(DISTINCT fieldname)
FROM    foo

SELECT  COUNT(*), COUNT(DISTINCT cId), COUNT(DISTINCT fId)
FROM    bar

This will help to estimate which indexes best suit your needs.

Meanwhile make sure you have the following indexes:

foo (FIELDNAME)
bar (cId, fId)

and rewrite your query:

SELECT  DISTINCT(fieldname)
FROM    foo f
WHERE   EXISTS (
        SELECT  1
        FROM    bar b
        WHERE   b.fId = f.id
                AND b.cId = @id
        )

This query should use an index on f.FIELDNAME to build the DISTINCT list and the index on bar to filter out the non-existent values.

后eg是否自 2024-07-25 21:51:47

在大多数数据库中,如果未列出索引中的第一列,则不会使用索引。 您说 customerId 是每个主键的一部分,但您不将其用于查询中的联接。 为了正确回答您的问题,我们确实需要查看 foo 和 bar 的创建表输出,或者至少显示索引

也就是说,如果您像这样更改它,您的查询可能会更快:

select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
   on f.id = b.fId
   and f.cId = b.cId #Using this part of the key will speed it up
where b.cId = @id;

我的评论假设您的主键按“cId,fId”排序有效,这意味着您的查询不必检查每个 cId,而只需检查那些这是索引的一部分。

In most databases, indexes aren't used if the first column in the index isn't listed. You say that the customerId is part of every primary key, but you don't use it for the join in your query. To properly answer your question, we really need to see the create table output for foo and bar, or at least show index from.

That said, your query may be faster if you change it like so:

select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
   on f.id = b.fId
   and f.cId = b.cId #Using this part of the key will speed it up
where b.cId = @id;

My comment assumes that your primary key is ordered as "cId, fId" Effectively, that will mean that your query doesn't have to check every cId, only the ones that are part of the index.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文