在 MySQL 查询中使用 Like 和 Concat
我正在将文本文件加载到数据库中,并尝试在列出组织名称的表和保存文本文件以及与这些组织的潜在匹配项的表之间进行一些快速匹配。
我使用 LOAD INFILE CONCURRENT 加载文件,并且没有任何问题。
这种变化来自于这样一个事实:我试图在原始文本表中匹配的字段(职业或隶属关系
)不仅仅包含组织名称。所以我尝试使用 LIKE
和通配符来匹配字符串。
为了匹配文本,我尝试使用此查询:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE CONCAT('%',o.org_name,'%')
SET raw_faca.org_id = o.org_id;
我也尝试过不使用 CONCAT:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE ('%' + o.org_name + '%')
SET raw_faca.org_id = o.org_id;
raw_faca
表有约 40,000 行,orgs< /code> 表有约 20,000 行。我对所有查询都有索引。查询已经运行了几个小时左右——这对于操作来说似乎太长了。我试图进行的比较是否效率低下,或者我在这里做了一些非常愚蠢的事情?我希望避免逐行使用外部 php 或 python 脚本。
回应下面有关使用 Match 的评论。 。 。反对,我也尝试了以下查询:
UPDATE raw_faca JOIN orgs AS o ON raw_faca.org_id IS NULL AND MATCH(raw_faca.occupationoraffiliation) AGAINST (o.org_name IN NATURAL LANGUAGE MODE)
SET raw_faca.org_id = o.org_id;
它给了我这个错误:
incorrect arguments to AGAINST
有什么想法吗?
I'm loading text files into my db and trying to do some quick matching between a table that lists names of organizations, and a table that holds the text file and potential matches to those organizations.
I load the file using LOAD INFILE CONCURRENT
and don't have any problems with that.
The twist comes from the fact that the field I'm trying to match in the raw text table (occupationoraffiliation
) has more than just organization names in it. So I'm trying to use LIKE
with wildcards to match the strings.
To match the text, I'm trying to use this query:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE CONCAT('%',o.org_name,'%')
SET raw_faca.org_id = o.org_id;
I've also tried without CONCAT
:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE ('%' + o.org_name + '%')
SET raw_faca.org_id = o.org_id;
The raw_faca
table has ~40,000 rows and the orgs
table has ~ 20,000 rows. I have indexes on all the The query has been running for a couple of hours or so -- this seems like way too long for the operation. Is the comparison I'm trying to run just that inefficient or am I doing something spectacularly stupid here? I was hoping to avoid going line-by-line with an external php or python script.
In response to comments below about using Match . . . Against
, I've tried the following query as well:
UPDATE raw_faca JOIN orgs AS o ON raw_faca.org_id IS NULL AND MATCH(raw_faca.occupationoraffiliation) AGAINST (o.org_name IN NATURAL LANGUAGE MODE)
SET raw_faca.org_id = o.org_id;
And it's giving me this error:
incorrect arguments to AGAINST
Any thoughts?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
带有前导通配符的
LIKE
子句将无法利用任何索引。A
LIKE
clause with a leading wild card is not going to be able to take advantage of any indexes.