sybase - 除非字符串是硬编码的,否则无法使用索引
我正在使用 Sybase 12.5.3 (ASE);尽管我已经广泛使用过 MSSQL,但我还是 Sybase 的新手。我遇到了存储过程确实非常慢的情况。我已将问题追溯到相对较大的表的单个 SELECT stmt。修改该语句极大地提高了过程的性能(并且恢复它会极大地减慢速度;即,SELECT stmt 绝对是罪魁祸首)。
-- Sybase optimizes and uses multi-column index... fast!<br>
SELECT ID,status,dateTime
FROM myTable
WHERE status in ('NEW','SENT')
ORDER BY ID
-- Sybase does not use index and does very slow table scan<br>
SELECT ID,status,dateTime
FROM myTable
WHERE status in (select status from allowableStatusValues)
ORDER BY ID
上面的代码是实际代码的改编/简化版本。请注意,我已经尝试过重新编译过程、更新统计信息等。
我不知道为什么 Sybase ASE 仅在字符串硬编码时才选择索引,而在从另一个表中选择时选择表扫描。请有人给我一个线索,提前谢谢你。
I'm using Sybase 12.5.3 (ASE); I'm new to Sybase though I've worked with MSSQL pretty extensively. I'm running into a scenario where a stored procedure is really very slow. I've traced the issue to a single SELECT stmt for a relatively large table. Modifying that statement dramatically improves the performance of the procedure (and reverting it drastically slows it down; i.e., the SELECT stmt is definitely the culprit).
-- Sybase optimizes and uses multi-column index... fast!<br>
SELECT ID,status,dateTime
FROM myTable
WHERE status in ('NEW','SENT')
ORDER BY ID
-- Sybase does not use index and does very slow table scan<br>
SELECT ID,status,dateTime
FROM myTable
WHERE status in (select status from allowableStatusValues)
ORDER BY ID
The code above is an adapted/simplified version of the actual code. Note that I've already tried recompiling the procedure, updating statistics, etc.
I have no idea why Sybase ASE would choose an index only when strings are hard-coded and choose a table scan when choosing from another table. Someone please give me a clue, and thank you in advance.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
1.这里的问题是编码不良。在您的版本中,糟糕的代码和糟糕的表设计是优化器做出错误决策的主要原因(98%)(两者是相辅相成的,我还没有弄清楚各自的比例)。两者
都是
不合格的,因为在这两种情况下,它们都会导致 ASE 为括号之间的内容创建一个工作表,这很容易避免(并且可以避免所有随之而来的问题)。工作表上不可能有统计信息,因为 t.status 或 s.status 上的统计信息丢失(AdamH 在这一点上是正确的),它正确地选择了表扫描。
子查询有其用武之地,但永远不能替代纯连接(表相关)。更正是:
和
2.语句
|Now you don't have to add an index to get stats on a columns, but it might's the best way.
是不正确的。切勿创建您不会使用的索引。如果您想要更新某列的统计信息,只需执行以下
操作即可: 3. 确保您拥有 (a) 所有索引列和 (b) 所有联接列的最新统计信息非常重要。
4.是的,在每个要发布的代码段上,SHOWPLAN 是不可替代的,对于任何性能有问题的代码来说更是如此。您还可以
SET NOEXEC ON
,以避免执行,例如。对于大型结果集。1.The issue here is poor coding. In your release, poor code and poor table design are the main reasons (98%) the optimiser makes incorrect decisions (the two go hand-in-hand, I have not figured out the proportion of each). Both:
and
are substandard, because in both cases they cause ASE to create a worktable for the contents between the brackets, which can easily be avoided (and all consequential issues avoided with it). There is no possibility of statistics on a worktable, since the statistics on either t.status or s.status is missing (AdamH is correct re that point), it correctly chooses a table scan.
Subqueries have their place, but never as a substitute for a pure (the tables are related) join. The corrections are:
and
2.The statement
|Now you don't have to add an index to get statistics on a column, but it's probably the best way.
is incorrect. Never create Indices that you will not use. If you want statistics updated on a column, simply
3.It is important to ensure that you have current statistics on (a) all indexed columns and (b) all join columns.
4.Yes, there is no substitute for SHOWPLAN on every code segment that is intended for release, doubly so for any code with questionable performance. You can also
SET NOEXEC ON
, to avoid execution, eg. for large result sets.索引提示可以解决这个问题,但可能不是解决方案。
首先,我想知道 allowedStatusValues.status 上是否有索引,如果有,那么 sybase 将对其进行统计,并且对其中的值的数量有一个很好的了解。
如果不是,那么优化器可能不知道 Status 可能采用多少个不同的值。然后必须假设您将从 myTable 中提取几乎所有行,而执行此操作的最佳方法是表扫描(如果没有覆盖索引)。
现在您不必添加索引来获取列的统计信息,但这可能是最好的方法。
如果您确实在 allowedStatusValues.status 上有索引,那么我想知道您的统计数据有多好。为自己获取一份 sp__optdiag 的副本。您可能还需要调整“直方图调整因子”和“直方图步长数”的值,从默认值稍微增加这些值将为您提供更详细的统计数据,这始终有助于优化器。
An index hint will work around it, but is probably not the solution.
Firstly I'd like to know if there is an index on allowableStatusValues.status, if there is then sybase will have stats on it and will have a good idea on the number of values in there.
If not then the optimiser probably won't have a good idea how many different values Status may take. It's then having to make the assumption that you're going to be extracting almost all of the rows from myTable, and the best way of doing this is a table scan (if no covering index).
Now you don't have to add an index to get statistics on a column, but it's probably the best way.
If you do have an index on allowableStatusValues.status, then i'd wonder how good your stats are. Get yourself a copy of sp__optdiag. You probably also need to tune the values of "histogram tuning factor" and "number of histogram steps", increasing these slightly from the defaults will give you more detailed statistics which always helps the optimiser.
如果用联接替换子查询,它是否仍然会进行表扫描:
Does it still do a table scan if you replace the subquery with a join:
我强烈建议 Sybase 向您显示每个查询的执行计划,而不是依赖于对查询运行时间的实验观察,例如:
使用
SET showplan ON
,Sybase 会生成执行计划对于它执行的每条语句。这些对于帮助确定查询未使用适当索引的位置非常宝贵。对于Sybase中的存储过程,整个过程的执行计划是在存储过程编译后第一次执行时生成的。如果您发布每个查询的计划,我们也许能够更清楚地了解该问题。
Rather than relying on experimental observations of how long a query takes to run, I would highly recommend getting Sybase to show you the execution plans for each query, for example:
With
SET showplan ON
, Sybase generates execution plans for every statement it executes. These can be invaluable in helping to identify where queries are not making use of appropriate indexes. For stored procedures in Sybase, the execution plan for the entire procedure is generated when the stored procedure is first executed after being compiled.If you post the plans for each of your queries we might be able to shed more light on the problem.
令人惊讶的是,使用索引提示解决了这个问题(请参阅下面的(index myIndexName)行 - 下面重写/简化的代码:
奇怪的是我有
Amazingly, using an index hint resolves the issue (see the (index myIndexName) line below - re-written/simplififed code below:
Weird that I have to use this technique to avoid a table scan, but there ya go.
Garrett,通过仅显示简化的代码,您可能已经完全删除了能够阐明问题根源的信息。
我的第一个猜测是 allowedStatusValues.status 和 myTable.status 之间的类型不匹配。然而,这并不是唯一的可能性。正如 nineside 所说,完整的查询计划(使用 showplan 和 fmtonly 标志)以及实际的表定义和存储过程源更有可能产生有用的答案。
Garrett, by showing only the simplified code, you have likely stripped out exactly the information that would illuminate the source of the problem.
My first guess would be a type mismatch between allowableStatusValues.status and myTable.status. However, that is not the only possibility. As ninesided stated, the complete query plans (using showplan and fmtonly flags), as well as the actual table definitions and stored procedure source, is much more likely to produce a useful answer.