Indexing text - MySQL vs. MS SQL

Posted on 2024-10-16 12:57:29

Imagine you have an application like this: one database table with a few int fields, a few small varchar fields, and about 10 TEXT fields (content length varies: some values are about 50 characters long, most are about 100-200, some about 1000, and very few exceed 1000). The row count is in the tens of thousands to hundreds of thousands.
Now, I need an efficient way to run a query like this (meta-language):

SELECT (1 IF textfield1 LIKE %param1% ELSE 0) AS r1, (1 IF textfield2 LIKE %param2% ELSE 0) AS r2, ... and so on, typically for most of the text fields in a single query (it is dynamic: it may include two of them, or all of them).
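The meta-language query above could be built dynamically along these lines. This is only a minimal sketch under stated assumptions: the table name `docs`, the `id` column, and the column names `textfield1` to `textfield10` are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for MySQL or MSSQL so that the example is runnable. The `CASE WHEN ... LIKE ? THEN 1 ELSE 0 END` form is standard SQL, but the placeholder style (`?` here) depends on the driver.

```python
import sqlite3

# Hypothetical schema: these names are assumptions for illustration only.
TABLE = "docs"
ALLOWED_COLUMNS = {f"textfield{i}" for i in range(1, 11)}


def build_match_query(params):
    """Build one SELECT returning a 0/1 flag per requested text field.

    `params` maps a column name to the substring to look for. Column names
    cannot be bound as query parameters, so they are checked against a
    whitelist; the search strings themselves are bound as parameters.
    Note: '%' and '_' inside a search string are not escaped here.
    """
    flags, values = [], []
    for n, (column, needle) in enumerate(params.items(), start=1):
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
        flags.append(f"CASE WHEN {column} LIKE ? THEN 1 ELSE 0 END AS r{n}")
        values.append(f"%{needle}%")
    sql = f"SELECT id, {', '.join(flags)} FROM {TABLE} ORDER BY id"
    return sql, values


def run_match_query(conn, params):
    """Execute the generated query and return all rows as tuples."""
    sql, values = build_match_query(params)
    return conn.execute(sql, values).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, textfield1 TEXT, textfield2 TEXT)"
    )
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?, ?)",
        [(1, "hello world", "foo bar"), (2, "nothing here", "bazfoo")],
    )
    for row in run_match_query(conn, {"textfield1": "wor", "textfield2": "foo"}):
        print(row)
```

Each requested field adds one flag column, so the same function covers the case of two fields as well as all ten. Every row is still scanned for every pattern; the sketch only shows how to assemble the query safely, not how to make it faster.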

Now the question: which is better for me, MySQL or MSSQL (probably Express while possible, upgrading to the full edition if really needed)?

I know that MySQL has nice text indexes, which you can define on a custom number of leading characters, so I can balance it for the typical scenario (like this: http://fernandoipar.com/2009/08/12/indexing-text-columns-in-mysql/)

MSSQL has only full-text indexing, which I have no experience with. Note that I do NOT need features like word proximity or similar words (run = ran; some stemming would be nice, but because the data is multilingual it is impossible anyway). I just need the common LIKE %word% behavior, that's all. I also have to be able to find short substrings (2 characters).

In effect, the goal is to run as many of these queries as possible per hour/day (there will never be enough results, because they should be refreshed as often as possible), so treat this kind of efficiency as a requirement :)

Thanks!

UPDATE: well, apparently there is no way to use an index to optimize LIKE %foo% queries. So the new question is: is there any other way to speed up this type of query? (Please omit things like "buy more RAM or an SSD" :)

Comments (2)

哆啦不做梦 2024-10-23 12:57:29

A LIKE '%foo%' expression, with its leading wildcard, cannot use an ordinary B-tree index, so it forces a scan of every row.

You need full-text indexes, in MySQL or in SQL Server.

"I need just common LIKE %word% system"

Then choose any DBMS you want, because they will all perform poorly on such a clause ;-)

浅浅 2024-10-23 12:57:29

Today many applications use an external index and search engine.

Have a look at http://lucene.apache.org/
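To give a rough idea of what an external index can do for substring search, here is a toy in-memory sketch. It is not Lucene's API and not how Lucene is implemented; the class `NgramIndex` and its methods are hypothetical names invented for this illustration. It assumes an n-gram inverted index with n = 2, chosen because the question needs to match substrings as short as 2 characters. Matching is case-sensitive in this sketch.

```python
from collections import defaultdict


class NgramIndex:
    """Toy inverted index from character n-grams to row ids (illustrative only)."""

    def __init__(self, n=2):
        self.n = n
        self.postings = defaultdict(set)  # n-gram -> ids of rows containing it
        self.texts = {}                   # row id -> original text

    def add(self, row_id, text):
        """Index one row by recording every n-gram that occurs in its text."""
        self.texts[row_id] = text
        for i in range(len(text) - self.n + 1):
            self.postings[text[i:i + self.n]].add(row_id)

    def search(self, needle):
        """Return the sorted ids of rows whose text contains `needle`."""
        if len(needle) < self.n:
            # Too short to contain an n-gram: fall back to checking every row.
            candidates = set(self.texts)
        else:
            grams = {
                needle[i:i + self.n]
                for i in range(len(needle) - self.n + 1)
            }
            # A row can contain the needle only if it contains all its n-grams.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Verify each candidate: sharing every n-gram does not guarantee
        # that the n-grams appear contiguously and in the right order.
        return sorted(r for r in candidates if needle in self.texts[r])


if __name__ == "__main__":
    index = NgramIndex(n=2)
    index.add(1, "hello world")
    index.add(2, "low tide")
    index.add(3, "wollow")
    print(index.search("lo"))
    print(index.search("low"))
```

The index narrows the search to a set of candidate rows, and only those candidates are checked with a real substring test. The trade-off is extra memory and the need to update the index whenever a row changes.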
