Help me change this single complex query to use temporary tables

Posted 2024-09-05 03:17:33

About the system:
- Tutors create classes and packs.
- Search is tag-based. Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searchable). For details, see the section "How tags work in this system?" below.

Following is the query in question. Can anybody suggest an approach using temporary tables? We have indexed all the relevant fields, and this looks like the least time achievable with the current approach:

 SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%"
                      OR tt.tag LIKE "%Dictatorship%"
                      OR ttt.tag LIKE "%Dictatorship%" )) AS key_1_total_matches
       ,
       SUM(DISTINCT( t.tag LIKE "%democracy%"
                      OR tt.tag LIKE "%democracy%"
                      OR ttt.tag LIKE "%democracy%" ))    AS key_2_total_matches
       ,
       COUNT(DISTINCT( od.id_od ))                        AS tutor_popularity,
       CASE
         WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1
                                       AND wc.wc_type = 0
                                       AND wc.class_date > '2010-06-01 22:00:56'
                                       AND wccp.status = 1
                                       AND ( wccp.country_code = 'IE'
                                              OR wccp.country_code IN ( 'INT' )
                                           ) ), 0)
              ) THEN 1
         ELSE 0
       END                                                AS 'classes_published'
       ,
       CASE
         WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1
                                       AND lp.published = 1
                                       AND lpcp.status = 1
                                       AND ( lpcp.country_code = 'IE'
                                              OR lpcp.country_code IN ( 'INT' )
                                           ) ), 0)
              ) THEN 1
         ELSE 0
       END                                                AS 'packs_published',
       td.*,
       u.*
FROM   tutor_details AS td
       JOIN users AS u
         ON u.id_user = td.id_user
       LEFT JOIN learning_packs_tag_relations AS lptagrels
         ON td.id_tutor = lptagrels.id_tutor
       LEFT JOIN learning_packs AS lp
         ON lptagrels.id_lp = lp.id_lp
       LEFT JOIN learning_packs_categories AS lpc
         ON lpc.id_lp_cat = lp.id_lp_cat
       LEFT JOIN learning_packs_categories AS lpcp
         ON lpcp.id_lp_cat = lpc.id_parent
       LEFT JOIN learning_pack_content AS lpct
         ON ( lp.id_lp = lpct.id_lp )
       LEFT JOIN webclasses_tag_relations AS wtagrels
         ON td.id_tutor = wtagrels.id_tutor
       LEFT JOIN webclasses AS wc
         ON wtagrels.id_wc = wc.id_wc
       LEFT JOIN learning_packs_categories AS wcc
         ON wcc.id_lp_cat = wc.id_wp_cat
       LEFT JOIN learning_packs_categories AS wccp
         ON wccp.id_lp_cat = wcc.id_parent
       LEFT JOIN order_details AS od
         ON td.id_tutor = od.id_author
       LEFT JOIN orders AS o
         ON od.id_order = o.id_order
       LEFT JOIN tutors_tag_relations AS ttagrels
         ON td.id_tutor = ttagrels.id_tutor
       LEFT JOIN tags AS t
         ON t.id_tag = ttagrels.id_tag
       LEFT JOIN tags AS tt
         ON tt.id_tag = lptagrels.id_tag
       LEFT JOIN tags AS ttt
         ON ttt.id_tag = wtagrels.id_tag
WHERE  ( u.country = 'IE'
          OR u.country IN ( 'INT' ) )
       AND CASE
             WHEN ( ( tt.id_tag = lptagrels.id_tag )
                    AND ( lp.id_lp > 0 ) ) THEN lp.id_status = 1
                                                AND lp.published = 1
                                                AND lpcp.status = 1
                                                AND ( lpcp.country_code = 'IE'
                                                       OR lpcp.country_code IN (
                                                          'INT'
                                                          ) )
             ELSE 1
           END
       AND CASE
             WHEN ( ( ttt.id_tag = wtagrels.id_tag )
                    AND ( wc.id_wc > 0 ) ) THEN wc.wc_api_status = 1
                                                AND wc.wc_type = 0
                                                AND
             wc.class_date > '2010-06-01 22:00:56'
                                                AND wccp.status = 1
                                                AND ( wccp.country_code = 'IE'
                                                       OR wccp.country_code IN (
                                                          'INT'
                                                          ) )
             ELSE 1
           END
       AND CASE
             WHEN ( od.id_od > 0 ) THEN od.id_author = td.id_tutor
                                        AND o.order_status = 'paid'
                                        AND CASE
             WHEN ( od.id_wc > 0 ) THEN od.can_attend_class = 1
             ELSE 1
                                            END
             ELSE 1
           END
       AND ( t.tag LIKE "%Dictatorship%"
              OR t.tag LIKE "%democracy%"
              OR tt.tag LIKE "%Dictatorship%"
              OR tt.tag LIKE "%democracy%"
              OR ttt.tag LIKE "%Dictatorship%"
              OR ttt.tag LIKE "%democracy%" )
GROUP  BY td.id_tutor
HAVING key_1_total_matches = 1
       AND key_2_total_matches = 1
ORDER  BY tutor_popularity DESC,
          u.surname ASC,
          u.name ASC
LIMIT  0, 20  

The problem

The results returned by the above query are correct (the AND logic works as expected), but the query time rises alarmingly as the data grows. With my current data it takes about 10 seconds, against normal query timings on the order of 0.0002 to 0.005 seconds, which makes it totally unusable.

Somebody suggested the following in my previous question:

  • create a temporary table and insert here all relevant data that might end up in the final result set
  • run several updates on this table, joining the required tables one at a time instead of all of them at the same time
  • finally perform a query on this temporary table to extract the end result

All this was done in a stored procedure; the end result has passed unit tests and is blazing fast.
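
The three steps above could be sketched roughly as follows in MySQL. This is only an outline under assumptions, not a drop-in replacement: the temporary table name tmp_tutor_matches and its column types are hypothetical, only the tutor-tag and pack-tag sources are shown (web classes would be handled by a third UPDATE of the same shape), and it does not reproduce every edge case of the CASE logic in the original query.

```sql
-- Step 1: seed a temporary table with every tutor that could appear in the result.
-- Table name and column types are assumptions made for this sketch.
DROP TEMPORARY TABLE IF EXISTS tmp_tutor_matches;
CREATE TEMPORARY TABLE tmp_tutor_matches (
    id_tutor         INT UNSIGNED NOT NULL PRIMARY KEY,
    key_1_match      TINYINT      NOT NULL DEFAULT 0,
    key_2_match      TINYINT      NOT NULL DEFAULT 0,
    tutor_popularity INT UNSIGNED NOT NULL DEFAULT 0
);

INSERT INTO tmp_tutor_matches (id_tutor)
SELECT td.id_tutor
FROM   tutor_details AS td
       JOIN users AS u ON u.id_user = td.id_user
WHERE  u.country IN ('IE', 'INT');

-- Step 2a: flag keyword matches coming from the tutors' own tags.
UPDATE tmp_tutor_matches AS m
       JOIN (SELECT r.id_tutor,
                    MAX(t.tag LIKE '%Dictatorship%') AS k1,
                    MAX(t.tag LIKE '%democracy%')    AS k2
             FROM   tutors_tag_relations AS r
                    JOIN tags AS t ON t.id_tag = r.id_tag
             WHERE  t.tag LIKE '%Dictatorship%'
                     OR t.tag LIKE '%democracy%'
             GROUP  BY r.id_tutor) AS s
         ON s.id_tutor = m.id_tutor
SET    m.key_1_match = GREATEST(m.key_1_match, s.k1),
       m.key_2_match = GREATEST(m.key_2_match, s.k2);

-- Step 2b: flag keyword matches coming from published packs.
UPDATE tmp_tutor_matches AS m
       JOIN (SELECT r.id_tutor,
                    MAX(t.tag LIKE '%Dictatorship%') AS k1,
                    MAX(t.tag LIKE '%democracy%')    AS k2
             FROM   learning_packs_tag_relations AS r
                    JOIN tags AS t ON t.id_tag = r.id_tag
                    JOIN learning_packs AS lp ON lp.id_lp = r.id_lp
                    JOIN learning_packs_categories AS lpc
                      ON lpc.id_lp_cat = lp.id_lp_cat
                    JOIN learning_packs_categories AS lpcp
                      ON lpcp.id_lp_cat = lpc.id_parent
             WHERE  ( t.tag LIKE '%Dictatorship%'
                       OR t.tag LIKE '%democracy%' )
                    AND lp.id_status = 1
                    AND lp.published = 1
                    AND lpcp.status = 1
                    AND lpcp.country_code IN ('IE', 'INT')
             GROUP  BY r.id_tutor) AS s
         ON s.id_tutor = m.id_tutor
SET    m.key_1_match = GREATEST(m.key_1_match, s.k1),
       m.key_2_match = GREATEST(m.key_2_match, s.k2);

-- Step 2c: count paid orders per tutor.
UPDATE tmp_tutor_matches AS m
       JOIN (SELECT od.id_author,
                    COUNT(DISTINCT od.id_od) AS n
             FROM   order_details AS od
                    JOIN orders AS o ON o.id_order = od.id_order
             WHERE  o.order_status = 'paid'
                    AND ( od.id_wc IS NULL
                           OR od.id_wc <= 0
                           OR od.can_attend_class = 1 )
             GROUP  BY od.id_author) AS s
         ON s.id_author = m.id_tutor
SET    m.tutor_popularity = s.n;

-- Step 3: read the final page of results from the small temporary table.
SELECT td.*,
       u.*,
       m.tutor_popularity
FROM   tmp_tutor_matches AS m
       JOIN tutor_details AS td ON td.id_tutor = m.id_tutor
       JOIN users AS u ON u.id_user = td.id_user
WHERE  m.key_1_match = 1
       AND m.key_2_match = 1
ORDER  BY m.tutor_popularity DESC,
          u.surname ASC,
          u.name ASC
LIMIT  0, 20;

DROP TEMPORARY TABLE IF EXISTS tmp_tutor_matches;
```

Each UPDATE joins one source at a time, so the row counts of the tag, pack and order joins are never multiplied against each other the way they are in the single large query.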

I have never worked with temporary tables until now. I would appreciate some hints, or some kind of schematic outline, so that I have something to start with.

Is there something faulty with the query?
What could be the reason behind the 10+ seconds of execution time?

How tags work in this system?

  • When a tutor registers, tags are entered and tag relations are created from the tutor's details, such as name and surname.
  • When a tutor creates a pack, tags are again entered and tag relations are created from the pack's details, such as pack name and description.
  • Tag relations for tutors are stored in tutors_tag_relations and those for packs in learning_packs_tag_relations. All individual tags are stored in the tags table.
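
For readers who want to picture the layout, the three tables described above might look roughly like this. The column names are taken from the query; the column types, key choices and index names are assumptions made for illustration only.

```sql
-- All individual tags.
CREATE TABLE tags (
    id_tag INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    tag    VARCHAR(255) NOT NULL
);

-- One row per (tutor, tag) pair, created when a tutor registers.
CREATE TABLE tutors_tag_relations (
    id_tutor INT UNSIGNED NOT NULL,
    id_tag   INT UNSIGNED NOT NULL,
    PRIMARY KEY (id_tutor, id_tag),
    KEY idx_ttr_tag (id_tag)
);

-- One row per (pack, tag) pair, created when a tutor creates a pack.
CREATE TABLE learning_packs_tag_relations (
    id_tutor INT UNSIGNED NOT NULL,
    id_lp    INT UNSIGNED NOT NULL,
    id_tag   INT UNSIGNED NOT NULL,
    PRIMARY KEY (id_lp, id_tag),
    KEY idx_lptr_tutor (id_tutor),
    KEY idx_lptr_tag (id_tag)
);
```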

Comments (1)

谁把谁当真 2024-09-12 03:17:33

Temporary tables are not a silver bullet. The fundamental problem with your query lies in patterns like this:

   t.tag LIKE "%Dictatorship%"
OR tt.tag LIKE "%Dictatorship%"
OR ttt.tag LIKE "%Dictatorship%"

Wildcarding the left side of a LIKE comparison guarantees that an index cannot be used to narrow the search. Effectively, you're scanning all three joined copies of the tags table (t, tt and ttt)...
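
One way to see this for yourself is to compare the execution plans of a leading-wildcard pattern and a prefix pattern. The sketch below assumes an ordinary B-tree index on tags.tag; the index name idx_tags_tag is hypothetical, and the exact plan output depends on your MySQL version and data.

```sql
-- Hypothetical index name; skip this if tags.tag is already indexed.
CREATE INDEX idx_tags_tag ON tags (tag);

-- Leading wildcard: the optimizer has no prefix to seek on, so expect a
-- full scan of the table or of the whole index.
EXPLAIN SELECT id_tag FROM tags WHERE tag LIKE '%Dictatorship%';

-- Prefix pattern: the index can be used for a range lookup.
EXPLAIN SELECT id_tag FROM tags WHERE tag LIKE 'Dictatorship%';
```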

You need to leverage full-text search (FTS), either MySQL's native FTS or a third-party engine such as Sphinx. Every FTS implementation I know of returns a score/rank value indicating the strength of the match; the MySQL documentation covers the algorithm details. That score is not the same as what you compute with SUM(DISTINCT ... LIKE ...), but you could get an equivalent count using something like:

  -- Requires a FULLTEXT index on tags.tag
  SELECT t.id_tag, 
         COUNT(*) AS num_matches 
    FROM tags AS t
   WHERE MATCH(t.tag) AGAINST ('Dictatorship')
GROUP BY t.id_tag
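
To carry the "both keywords must match" logic over to full-text search, something along these lines might work. It is a sketch under assumptions: the index name ft_tags_tag is hypothetical, only tutor tags are covered (pack and web class tags would need similar branches), and FTS matches whole words, so results can differ from the substring matching that LIKE '%...%' performs. InnoDB supports FULLTEXT indexes from MySQL 5.6 onward; older versions need a MyISAM table.

```sql
-- Hypothetical index name.
ALTER TABLE tags ADD FULLTEXT INDEX ft_tags_tag (tag);

-- Tutors that have at least one tag matching each of the two keywords.
SELECT hits.id_tutor
FROM   (SELECT r.id_tutor, 1 AS keyword_no
        FROM   tutors_tag_relations AS r
               JOIN tags AS t ON t.id_tag = r.id_tag
        WHERE  MATCH(t.tag) AGAINST ('Dictatorship')
        UNION
        SELECT r.id_tutor, 2 AS keyword_no
        FROM   tutors_tag_relations AS r
               JOIN tags AS t ON t.id_tag = r.id_tag
        WHERE  MATCH(t.tag) AGAINST ('democracy')) AS hits
GROUP  BY hits.id_tutor
HAVING COUNT(DISTINCT hits.keyword_no) = 2;
```

Each branch of the UNION contains a single MATCH ... AGAINST in its WHERE clause, so each can use the FULLTEXT index on its own; the outer query only counts how many distinct keywords were hit per tutor.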