SQLite 中的按位掩码与 IN() 效率?
我有两种方法从数据库中选择一组条目:
SELECT ... WHERE `level` IN (1,2,4,8) LIMIT ...;
或者
SELECT ... WHERE `level` & mask LIMIT ...;
总共有 4 个“级别”,编号为 1、2、4、8(因为能够在其他地方也使用相同的掩码)。 IN()
的大括号或 mask
都可以包含 4 个级别中的一个或多个级别中的任意一组。该列已建立索引。查询所花费的时间仍然比舒适的要长,我们正在尝试优化它以提高速度。
昨天,有人说决定使用朴素的 IN() 会导致最多四次比较,而我应该使用位掩码来代替。今天我听说位掩码将完全阻碍列上索引的优势,并且速度会慢得多。
你能告诉我哪种方法会更快吗?
I have two ways to select a set of entries from the database:
SELECT ... WHERE `level` IN (1,2,4,8) LIMIT ...;
or
SELECT ... WHERE `level` & mask LIMIT ...;
There are 4 'levels' total, numbered 1,2,4,8 (for reasons of ability to use the same mask elsewhere too). Both the braces of IN()
or the mask
can contain any set of one or more of the 4 levels. The column is indexed. The query is still taking way longer than comfortable, and we're trying to optimize it for speed.
Yesterday one person said decided using naive IN() results in up to four comparisons and that I should be using a bit mask instead. Today I heard the bit mask will completely thwart advantages from index on the column, and will be much slower.
Can you tell me which approach will be faster?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
你的问题很老了,但我仍然会回答它。
位掩码很可能会更慢,因为它必须计算按位 AND 的计算,而
IN
将使用level
的索引值来计算在括号内的参数中查找它(我相信,这应该是一个O(log(n))
操作)。现在,您可能会错过的一件事是,他们不做同样的事情。
您的第一个查询将简单地检查
level
是否为 1、2、4 或 8。您的第二个查询,或者实际上类似于:
能够查找包含以下内容的
levels
您想要的掩码以及可能更多的掩码,在您的情况下,它可能会检查 1 到 15 之间的所有值组合。因此,性能会受到影响。至于 @AlanFoster 建议的暴力基准测试,我不同意他的观点。
最好在查询中添加以下前缀:
EXPLAIN
或EXPLAIN QUERY PLAN
并检查 SQLite 匹配的行数。
更新
EXPLAIN QUERY PLAN SELECT * FROM ... WHERE level IN (2, 3);
EXPLAIN QUERY PLAN SELECT * FROM ... WHERE (level & 2) = 2;
code>如您所见,按位AND运算符需要进行全表扫描。
Your question is quite old but I'm still gonna answer it nonetheless.
The bitmask will most probably be slower as it has to work out the computation of the bitwise AND whereas
IN
will use the indexed value oflevel
to look it up in the arguments enclosed within the parentheses (which, I believe, should be a singleO(log(n))
operation).Now, the thing that you may be missing, is that they don't do the same thing.
Your first query will simply check if
level
is either 1, 2, 4 or 8.Your second query, or actually something like:
Has the ability to lookup
levels
that contain the mask you want and potentially some more, in your case it could be checking all value combinations between 1 and 15. Hence the performance hit.As for the bruteforced benchmark @AlanFoster suggested, I don't agree with him.
It's far better to prefix the query with either:
EXPLAIN
, orEXPLAIN QUERY PLAN
And inspect how many rows SQLite is matching.
Update
EXPLAIN QUERY PLAN SELECT * FROM ... WHERE level IN (2, 3);
EXPLAIN QUERY PLAN SELECT * FROM ... WHERE (level & 2) = 2;
As you can see, the bitwise AND operator needs a full-table scan.