查询顾客与列出的产品一起购买了什么

发布于 2024-09-07 11:14:16 字数 1628 浏览 4 评论 0原文

我正在尝试优化一个非常旧的查询,我无法理解它。我想要存档的结果是,我想向网上商店的访问者推荐其他客户感兴趣的内容,即他们与访问者正在查看的产品一起购买的其他内容。

我有一个子查询,但它非常慢,大约需要 15 秒处理大约 8 000 000 行。

布局是,放入用户购物篮中的所有产品都保存在表 wsBasket 中,并由 basketid 分隔(在另一个表中与成员关联)。

在这个例子中,我想列出用户与productid 427一起购买的所有最受欢迎的产品,但不列出productid 427本身。

SELECT productid, SUM(quantity) AS qty 
FROM wsBasket 
WHERE basketid IN 
    (SELECT basketid 
     FROM wsBasket 
     WHERE productid=427) AND productid!=427 
GROUP by productid 
ORDER BY qty 
DESC LIMIT 0,4;

非常感谢任何帮助!希望这至少对某人有意义:)

 

更新1: 感谢您的评论,伙计们,这是我的答案,它们不适合评论字段。

在上面的查询上使用 EXPLAIN 我得到了以下结果。请注意,我的表上没有任何索引(id 字段上的主键除外),我想修改查询以从索引中受益并将索引放置在正确的键上。

+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+
| id | select_type        | table    | type | possible_keys | key  | key_len | ref  | rows | Extra                                        |
+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+
|  1 | PRIMARY            | wsBasket | ALL  | NULL          | NULL | NULL    | NULL | 2821 | Using where; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | wsBasket | ALL  | NULL          | NULL | NULL    | NULL | 2821 | Using where                                  |
+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+

i'm trying to get optimize an very old query that i can't wrap my head around. the result that i want to archive is that i want to recommend the visitor on a web shop what other customers have shown interest in, i.e. what else they have bought together with the product that the visitor is looking at.

i have a subquery but it's very slow, takes ~15s on ~8 000 000 rows.

the layout is that all products that are put in a users basket are kept in a table wsBasket and separated by a basketid (which in another table is associated with a member).

in this example i want to list all the most popular products that users have bought together with productid 427, but not list the productid 427 itself.

SELECT productid, SUM(quantity) AS qty 
FROM wsBasket 
WHERE basketid IN 
    (SELECT basketid 
     FROM wsBasket 
     WHERE productid=427) AND productid!=427 
GROUP by productid 
ORDER BY qty 
DESC LIMIT 0,4;

any help is much appreciated! hope this makes any sense at all to at least someone :)

 

UPDATE 1:
thanks for your comments guys here are my answers, they didn't fit in the comments-field.

Using EXPLAIN on the above query i got the fllowing. Please note, I do not have any indexes on the table (except for primary key on the id-field), i want to modify the query to benefit from indexes and place indexes on the right keys.

+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+
| id | select_type        | table    | type | possible_keys | key  | key_len | ref  | rows | Extra                                        |
+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+
|  1 | PRIMARY            | wsBasket | ALL  | NULL          | NULL | NULL    | NULL | 2821 | Using where; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | wsBasket | ALL  | NULL          | NULL | NULL    | NULL | 2821 | Using where                                  |
+----+--------------------+----------+------+---------------+------+---------+------+------+----------------------------------------------+

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

陌路终见情 2024-09-14 11:14:16

需要添加两个明显的索引:一个在篮子 ID 上,第二个在产品 ID 上:然后重试查询和一个新的 EXPLAIN 以查看索引是否正在使用

Two obvious indexes to add: one on basketid and a second on productid: then retry the query and a new EXPLAIN to see that the indexes are being used

愛上了 2024-09-14 11:14:16

除了确保 productidbasketid 上存在合适的索引之外,将查询构造为简单联接而不是子查询通常也会使您受益匪浅,尤其是在 MySQL 中。

SELECT b1.productid, SUM(b1.quantity) AS qty
FROM wsBasket AS b0
JOIN wsBasket AS b1 ON b1.basketid=b0.basketid
WHERE b0.productid=427 AND b1.productid<>427
GROUP BY b1.productid
ORDER BY qty DESC
LIMIT 4

对我来说,在可能相似的数据集上,连接在 EXPLAIN 输出中产生了两个 select_type: SIMPLE 行,而子查询方法则产生了可怕的性能问题依赖子查询。因此,连接速度快了一个数量级以上。

As well as ensuring that suitable indexes exist on productid and basketid, you will often benefit from structuring your query as a simple join rather than a subquery, especially in MySQL.

SELECT b1.productid, SUM(b1.quantity) AS qty
FROM wsBasket AS b0
JOIN wsBasket AS b1 ON b1.basketid=b0.basketid
WHERE b0.productid=427 AND b1.productid<>427
GROUP BY b1.productid
ORDER BY qty DESC
LIMIT 4

For me, on a possibly-similar dataset, the join resulted in two select_type: SIMPLE rows in the EXPLAIN output, whereas the subquery method spat out a horrible-for-performance DEPENDENT SUBQUERY. Consequently the join was well over an order of magnitude faster.

情话墙 2024-09-14 11:14:16

在此查询中主要用于搜索的两个字段是productid 和basketid。

当您搜索productid等于427的记录时,数据库不知道在哪里可以找到该记录。它甚至不知道如果确实找到了一个匹配项,则不会再有另一个匹配项,因此它必须查找整个表,可能有数千条记录。

索引是一个已排序的单独文件,并且仅包含您感兴趣的排序字段。因此创建索引可以节省大量时间!

The two fields you mainly use for searching in this query are productid and basketid.

When you search for records having productid equal to 427, Database has no clue where to find this record. It doesn't even know that if it does find one matching, that there will not be another matching one, so it has to look through the entire table, potentially thousands of records.

An index is a separate file that is sorted, and contains only the field/s you're interested in sorting on. so creating an index saves a immense amount of time!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文