mysql:使用混合优先排序优化查询
我有一个大表(现在约为 1M 行,很快约为 10M),其中有两个排名列(除了常规数据之外):
avg_visited
,一个浮点 0-1 代表 %age 流行度;alexa_rank
越高越好- ,一个 1-N 的整数,给出先验排名
先验排名来自外部来源,因此无法更改。许多行还没有流行度(因为还没有用户点击它),因此先验排名是后备排序。然而,流行度确实经常变化 - 既更新旧条目,又为以前仅具有先验排名的条目添加流行度(如果某些用户实际上点击了它)。
我经常运行 SELECT id, url, alexa_rank, avg_visited FROM
sitesORDER BY avg_visited desc, alexa_rank asc LIMIT 49500, 500
(对于 49500 的各种值)。
但是,ORDER BY 不能使用具有混合优先顺序的索引 http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
这是在mysql 5.1,innodb中。
我怎样才能最好地改变这种情况,为我提供一个健全的、完全索引的查询?
I have a large table (~1M rows now, soon ~10M) that has two ranked columns (in addition to the regular data):
avg_visited
, a float 0-1 representing a %age popularity; higher is betteralexa_rank
, an integer 1-N giving an a priori ranking
The a priori ranking is from external sources so can't be changed. Many rows have no popularity yet (as no user has yet hit it), so the a priori ranking is the fallback ordering. The popularity however does change very frequently - both to update old entries and to add a popularity to ones that previously only had the a priori ranking, if some user actually hits it.
I frequently run SELECT id, url, alexa_rank, avg_visited FROM
sitesORDER BY avg_visited desc, alexa_rank asc LIMIT 49500, 500
(for various values of 49500).
However, ORDER BY cannot use an index with mixed ascendency per http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
This is in mysql 5.1, innodb.
How can I best change this situation to give me a sane, fully indexed query?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
不幸的是,MySQL 不支持索引中的 DESC 子句,也不支持派生表达式上的索引。
您可以将负面受欢迎程度与正面受欢迎程度一起存储,并在
ORDER BY
中使用它:Unfortunately,
MySQL
does not supportDESC
clauses in the indexes, neither does it support indexes on derived expressions.You can store the negative popularity along with the positive one and use it in the
ORDER BY
:只是一个简单的技巧:
因为流行度是 0 到 1 之间的浮点数。您可以将其乘以 -1,数字将在 -1 到 0 之间。
这样您就可以将流行度的排序顺序反转为
ORDER BY流行度 ASC,a_priori ASC
不确定开销是否能衡量收益。
这让我想起了以相反形式存储电子邮件的技巧。
Just a simple hack:
Since since popularity is a float between 0 to 1. You can multiply it by -1 and the number will be between -1 to 0.
This way you can reverse the sort order of popularity to
ORDER BY popularity ASC, a_priori ASC
Not sure the overhead out weighs the gain.
This reminds me of the hack of storing emails in reverse form.