在 MySQL 查询中使用 Like 和 Concat

发布于 2024-11-09 15:39:44 字数 1197 浏览 2 评论 0原文

我正在将文本文件加载到数据库中,并尝试在列出组织名称的表和保存文本文件以及与这些组织的潜在匹配项的表之间进行一些快速匹配。

我使用 LOAD INFILE CONCURRENT 加载文件,并且没有任何问题。

这种变化来自于这样一个事实:我试图在原始文本表中匹配的字段(职业或隶属关系)不仅仅包含组织名称。所以我尝试使用 LIKE 和通配符来匹配字符串。

为了匹配文本,我尝试使用此查询:

UPDATE raw_faca JOIN orgs AS o
    ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE CONCAT('%',o.org_name,'%')
SET raw_faca.org_id = o.org_id;

我也尝试过不使用 CONCAT:

UPDATE raw_faca JOIN orgs AS o
    ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE ('%' + o.org_name + '%')
SET raw_faca.org_id = o.org_id;

raw_faca 表有约 40,000 行,orgs< /code> 表有约 20,000 行。我对所有查询都有索引。查询已经运行了几个小时左右——这对于操作来说似乎太长了。我试图进行的比较是否效率低下,或者我在这里做了一些非常愚蠢的事情?我希望避免逐行使用外部 php 或 python 脚本。

回应下面有关使用 Match 的评论。 。 。反对,我也尝试了以下查询:

UPDATE raw_faca JOIN orgs AS o ON raw_faca.org_id IS NULL AND MATCH(raw_faca.occupationoraffiliation) AGAINST (o.org_name IN NATURAL LANGUAGE MODE)
SET raw_faca.org_id = o.org_id; 

它给了我这个错误:

incorrect arguments to AGAINST

有什么想法吗?

I'm loading text files into my db and trying to do some quick matching between a table that lists names of organizations, and a table that holds the text file and potential matches to those organizations.

I load the file using LOAD INFILE CONCURRENT and don't have any problems with that.

The twist comes from the fact that the field I'm trying to match in the raw text table (occupationoraffiliation) has more than just organization names in it. So I'm trying to use LIKE with wildcards to match the strings.

To match the text, I'm trying to use this query:

UPDATE raw_faca JOIN orgs AS o
    ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE CONCAT('%',o.org_name,'%')
SET raw_faca.org_id = o.org_id;

I've also tried without CONCAT:

UPDATE raw_faca JOIN orgs AS o
    ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE ('%' + o.org_name + '%')
SET raw_faca.org_id = o.org_id;

The raw_faca table has ~40,000 rows and the orgs table has ~ 20,000 rows. I have indexes on all the The query has been running for a couple of hours or so -- this seems like way too long for the operation. Is the comparison I'm trying to run just that inefficient or am I doing something spectacularly stupid here? I was hoping to avoid going line-by-line with an external php or python script.

In response to comments below about using Match . . . Against, I've tried the following query as well:

UPDATE raw_faca JOIN orgs AS o ON raw_faca.org_id IS NULL AND MATCH(raw_faca.occupationoraffiliation) AGAINST (o.org_name IN NATURAL LANGUAGE MODE)
SET raw_faca.org_id = o.org_id; 

And it's giving me this error:

incorrect arguments to AGAINST

Any thoughts?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

三生一梦 2024-11-16 15:39:44

带有前导通配符的 LIKE 子句将无法利用任何索引。

A LIKE clause with a leading wild card is not going to be able to take advantage of any indexes.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文