How to turn this SELECT into a DELETE

Posted 2024-12-11 08:46:45

I have a set of data that I basically need to "clean".

A user enters their favorite fragrance into a form, and the form has autocomplete to aid the choice and, hopefully, cut down on spelling mistakes, which is vitally important to the task at hand.

For example, the following is the data I have for Paco Rabanne 1 Million products. I want users to choose Paco Rabanne 1 Million, but I don't want to offer them the various lotions, soaps, deodorants, etc.; it's the brand and the fragrance that I'm interested in.

An example of the data I have is shown below.

Paco Rabanne 1 Million Gift Set
Paco Rabanne 1 Million Deodorant Stick 75ml
Paco Rabanne 1 Million Deodorant
Paco Rabanne 1 Million Eau de Toilette
Paco Rabanne 1 Million Gift Set 50ml
Paco Rabanne 1 Million Shower Gel 150ml
Paco Rabanne 1 Million Eau De Toilette Spray 100ml
Paco Rabanne 1 Million After Shave Lotion 100ml

I have a query which takes out certain words within the item title and shows what's left

The query:

SELECT DISTINCT( TRIM( LEFT(title,
    IF( LOCATE('Deodorant', title), LOCATE('Deodorant', title) - 1,
        IF( LOCATE('Shower', title), LOCATE('Shower', title) - 1,
            IF( LOCATE('Refillable', title), LOCATE('Refillable', title) - 1,
                IF( LOCATE('Spray', title), LOCATE('Spray', title) - 1,
                    IF( LOCATE('ml', title), LOCATE('ml', title) - 1,
                        999
                    )
                )
            )
        )
    )
))) FROM `PRprod_FRAGRANCES`

This basically selects from the table while collapsing any duplicate line with Deodorant, Shower, etc. in the title, but I want to permanently delete the records instead of running this query every time, which is expensive as the table has over 200,000 rows.

Essentially, I want to turn the query around so that, rather than just not showing the lines I don't want, it actually deletes them.

Unless there is another way to attack this issue?

This is the table definition:

Field         | Type         | Null | Key | Default           | Extra

autoc_id      | int(11)      | NO   | PRI |                   | auto_increment
title         | varchar(128) | YES  |     |                   |
genre         | varchar(128) | YES  |     |                   |
author        | varchar(128) | YES  |     |                   |
actors        | varchar(256) | YES  |     |                   |
artist        | varchar(128) | YES  |     |                   |
main_category | varchar(128) | NO   |     |                   |
dateadded     | timestamp    | NO   |     | CURRENT_TIMESTAMP |

迟到的我 2024-12-18 08:46:45

Your question is still a bit vague.

In your example you have the data...

Paco Rabanne 1 Million Gift Set
Paco Rabanne 1 Million Deodorant Stick 75ml
Paco Rabanne 1 Million Deodorant
Paco Rabanne 1 Million Eau de Toilette
Paco Rabanne 1 Million Gift Set 50ml
Paco Rabanne 1 Million Shower Gel 150ml
Paco Rabanne 1 Million Eau De Toilette Spray 100ml
Paco Rabanne 1 Million After Shave Lotion 100ml

Are there any other fields in your table though? Such as an id, etc? Or is it just one field?

And do the examples above only ever appear once, or do they appear multiple times each?

Breaking your data down into groups based on your code gives the following groups...

Original                                           | After Processing

Paco Rabanne 1 Million Deodorant                   | Paco Rabanne 1 Million
Paco Rabanne 1 Million Deodorant Stick 75ml        | Paco Rabanne 1 Million
Paco Rabanne 1 Million Shower Gel 150ml            | Paco Rabanne 1 Million

Paco Rabanne 1 Million Eau De Toilette Spray 100ml | Paco Rabanne 1 Million Eau De Toilette
Paco Rabanne 1 Million Eau de Toilette             | Paco Rabanne 1 Million Eau de Toilette

Paco Rabanne 1 Million Gift Set 50ml               | Paco Rabanne 1 Million Gift Set 50

Paco Rabanne 1 Million After Shave Lotion 100ml    | Paco Rabanne 1 Million After Shave Lotion 100

Paco Rabanne 1 Million Gift Set                    | Paco Rabanne 1 Million Gift Set

Note: the Spray rule is tested before the ml rule, so the Spray row is cut at Spray and lands next to the plain Eau de Toilette row (the two differ only in case, which DISTINCT ignores under a case-insensitive collation). The ml rule removes only the letters ml: LOCATE returns the 1-based position of the m, so LEFT(title, LOCATE('ml', title) - 1) keeps everything before it and 100ml becomes 100. The size number stays behind, which I doubt is what you want.

In the case of the first two groups, do you want all records deleted and replaced with a single record saying just Paco Rabanne 1 Million (or Paco Rabanne 1 Million Eau de Toilette)? Or do you want all but one of them deleted? And if so, how should we determine which to keep?

In the case of the next two groups, each group only has 1 record in it, and the filter has chopped the ml off the end but left the number. Do you want to replace those records with the shortened version? Or do nothing with them?

The final group also has only 1 record, and the filter did nothing to it. Does it get left alone?
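
A grouping like this can be double-checked without touching the table. The following is only a sketch in Python, not MySQL: `locate` and `clean_title` are hypothetical stand-ins for LOCATE and for the nested IF/LEFT/TRIM expression, and the case-insensitive matching assumes the title column uses one of MySQL's case-insensitive collations.

```python
# Python stand-ins for the MySQL calls in the query (an emulation, not MySQL itself).
STOP_WORDS = ["Deodorant", "Shower", "Refillable", "Spray", "ml"]  # same order as the nested IFs


def locate(substr, text):
    """Like LOCATE(substr, text): 1-based position of the first match, 0 if absent.
    Case-insensitive, which assumes a case-insensitive collation on the column."""
    return text.lower().find(substr.lower()) + 1


def clean_title(title):
    """Cut the title just before the first stop word that matches, then trim."""
    for word in STOP_WORDS:
        pos = locate(word, title)
        if pos:
            return title[: pos - 1].strip()  # LEFT(title, pos - 1), then TRIM
    return title[:999].strip()  # the query's fallback: LEFT(title, 999)


titles = [
    "Paco Rabanne 1 Million Gift Set",
    "Paco Rabanne 1 Million Deodorant Stick 75ml",
    "Paco Rabanne 1 Million Deodorant",
    "Paco Rabanne 1 Million Eau de Toilette",
    "Paco Rabanne 1 Million Gift Set 50ml",
    "Paco Rabanne 1 Million Shower Gel 150ml",
    "Paco Rabanne 1 Million Eau De Toilette Spray 100ml",
    "Paco Rabanne 1 Million After Shave Lotion 100ml",
]

# Group the originals by cleaned title; lower() mimics a case-insensitive DISTINCT.
groups = {}
for t in titles:
    groups.setdefault(clean_title(t).lower(), []).append(t)

for key, members in groups.items():
    print(repr(key), "<-", members)
```

Each printed line is one group: the cleaned title on the left and the original rows that collapse into it on the right.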

Then, once you've processed this, will new data get added again and again? Do you need code that you can run repeatedly, or do you need to do this just as a single run?
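
If the answer turns out to be "shorten every title, keep one row per shortened title, delete the rest, and be able to run it again later", the shape of the cleanup could look like the sketch below. It is an illustration under assumptions, not a drop-in fix: it runs against an in-memory SQLite database through Python's `sqlite3` module rather than MySQL, the two-column `fragrances` table and the `clean_title` and `dedupe` helpers are hypothetical, and "keep the lowest autoc_id" is just one possible rule for choosing the survivor.

```python
import sqlite3

STOP_WORDS = ("Deodorant", "Shower", "Refillable", "Spray", "ml")


def clean_title(title):
    """Python stand-in for the nested IF/LOCATE/LEFT/TRIM expression in the query."""
    if title is None:  # the title column is nullable
        return None
    lowered = title.lower()
    for word in STOP_WORDS:
        pos = lowered.find(word.lower())
        if pos != -1:
            return title[:pos].strip()
    return title.strip()


def dedupe(conn):
    """Shorten all titles, then keep the lowest autoc_id per title and delete the rest.
    Returns the number of deleted rows. Safe to run repeatedly."""
    conn.create_function("clean_title", 1, clean_title)
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("UPDATE fragrances SET title = clean_title(title)")
        cur = conn.execute(
            """
            DELETE FROM fragrances
            WHERE title IS NOT NULL
              AND autoc_id NOT IN (
                  SELECT MIN(autoc_id) FROM fragrances
                  GROUP BY title COLLATE NOCASE  -- case-insensitive grouping
              )
            """
        )
    return cur.rowcount


# Demo on a throwaway table holding the sample rows from the question.
TITLES = [
    "Paco Rabanne 1 Million Gift Set",
    "Paco Rabanne 1 Million Deodorant Stick 75ml",
    "Paco Rabanne 1 Million Deodorant",
    "Paco Rabanne 1 Million Eau de Toilette",
    "Paco Rabanne 1 Million Gift Set 50ml",
    "Paco Rabanne 1 Million Shower Gel 150ml",
    "Paco Rabanne 1 Million Eau De Toilette Spray 100ml",
    "Paco Rabanne 1 Million After Shave Lotion 100ml",
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fragrances (autoc_id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO fragrances (title) VALUES (?)", [(t,) for t in TITLES])

deleted = dedupe(conn)
remaining = [row[0] for row in conn.execute("SELECT title FROM fragrances ORDER BY autoc_id")]
print(deleted, remaining)
```

Two caveats if this is carried over to MySQL: MySQL refuses a DELETE whose subquery reads from the table being deleted from unless the subquery is wrapped in a derived table, and a cleanup like this is best tried on a copy of the table first, since the deleted rows are gone for good.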

Unfortunately, without a lot more information I don't think we're going to be able to help much.
