MySQL FULLTEXT not working

Posted on 2024-07-28 21:52:37

I'm attempting to add search support to my PHP web app using MySQL's FULLTEXT indexes.

I created a test table (using the MyISAM engine, with a single text field a) and entered some sample data. If I'm right, the following query should return both of those rows:

SELECT * FROM test WHERE MATCH(a) AGAINST('databases')

However, it returns none. I've done a bit of research and, as far as I can tell, I'm doing everything right: the table is a MyISAM table and the FULLTEXT indexes are set. I've tried running the query from the prompt and from phpMyAdmin, with no luck. Am I missing something crucial?


UPDATE: OK, while Cody's solution worked in my test case, it doesn't seem to work on my actual table:

CREATE TABLE IF NOT EXISTS `uploads` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` text NOT NULL,
  `size` int(11) NOT NULL,
  `type` text NOT NULL,
  `alias` text NOT NULL,
  `md5sum` text NOT NULL,
  `uploaded` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;

And the data I'm using:

INSERT INTO `uploads` (`id`, `name`, `size`, `type`, `alias`, `md5sum`, `uploaded`) VALUES
(1, '04 Sickman.mp3', 5261182, 'audio/mp3', '1', 'df2eb6a360fbfa8e0c9893aadc2289de', '2009-07-14 16:08:02'),
(2, '07 Dirt.mp3', 5056435, 'audio/mp3', '2', 'edcb873a75c94b5d0368681e4bd9ca41', '2009-07-14 16:08:08'),
(3, 'header_bg2.png', 16765, 'image/png', '3', '5bc5cb5c45c7fa329dc881a8476a2af6', '2009-07-14 16:08:30'),
(4, 'page_top_right2.png', 5299, 'image/png', '4', '53ea39f826b7c7aeba11060c0d8f4e81', '2009-07-14 16:08:37'),
(5, 'todo.txt', 392, 'text/plain', '5', '7ee46db77d1b98b145c9a95444d8dc67', '2009-07-14 16:08:46');

The query I'm now running is:

SELECT * FROM `uploads` WHERE MATCH(name) AGAINST ('header' IN BOOLEAN MODE)

This should return row 3, header_bg2.png. Instead, I get another empty result set. My options for boolean searching are below:

mysql> show variables like 'ft_%';
+--------------------------+----------------+
| Variable_name            | Value          |
+--------------------------+----------------+
| ft_boolean_syntax        | + -><()~*:""&| |
| ft_max_word_len          | 84             |
| ft_min_word_len          | 4              |
| ft_query_expansion_limit | 20             |
| ft_stopword_file         | (built-in)     |
+--------------------------+----------------+
5 rows in set (0.02 sec)

"header" is within the word length restrictions and I doubt it's a stop word (I'm not sure how to get the list). Any ideas?

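One possibility worth checking for the reasoning above: the MySQL manual describes the built-in full-text parser as treating letters, digits, and underscores as word characters. If that applies here, header_bg2.png would be indexed as the single word header_bg2, and a whole-word search for header would not match it. The Python sketch below is only a toy model of that tokenization, not MySQL's actual parser. The names indexed_words and boolean_match are hypothetical, the regular expression is an approximation that ignores apostrophe handling, and the length limits are taken from the ft_min_word_len and ft_max_word_len values shown above.

```python
import re

# Assumption: approximates the built-in parser's rule that letters, digits
# and underscores are word characters (apostrophe handling is ignored).
WORD_RE = re.compile(r"[A-Za-z0-9_]+")


def indexed_words(text, min_len=4, max_len=84):
    """Return the lowercase tokens that survive the word length limits."""
    words = [w.lower() for w in WORD_RE.findall(text)]
    return [w for w in words if min_len <= len(w) <= max_len]


def boolean_match(term, text):
    """Toy whole-word match; a trailing '*' is treated as a prefix match."""
    words = indexed_words(text)
    if term.endswith("*"):
        prefix = term[:-1].lower()
        return any(w.startswith(prefix) for w in words)
    return term.lower() in words


name = "header_bg2.png"
print(indexed_words(name))             # ['header_bg2']
print(boolean_match("header", name))   # False
print(boolean_match("header*", name))  # True
```

Under this model, "png" is dropped for being shorter than four characters, and only the prefix form header* finds the row. In MySQL itself the corresponding thing to try would be the Boolean-mode truncation operator, i.e. AGAINST ('header*' IN BOOLEAN MODE).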

Comments (2)

猫弦 2024-08-04 21:53:40

There are two modes for MySQL full-text searching: natural language mode and Boolean mode. A restriction of natural language mode is that "... words that are present in 50% or more of the rows are considered common and do not match. Full-text searches are natural language searches if no modifier is given." Natural language is the default mode. This is documented in the full-text docs:

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

If you switch your query to using Boolean mode:

SELECT * FROM test WHERE MATCH(a) AGAINST('databases' IN BOOLEAN MODE)

Then the two rows are returned.

Boolean mode has its own restrictions, a common one being that it does not automatically return rows in order of relevance. Overall, it offers more features and flexibility than natural language mode, so you'll probably end up using it.

If your application is going to rely heavily on full-text searching, you might want to consider more full-featured packages such as Lucene/Solr or Sphinx.

往日情怀 2024-08-04 21:53:30

Add more data. By default, MySQL ignores any word that appears in 50% or more of the rows in the table, because it treats such a word as a 'noise' word.

With very few rows in a table, it is easy to hit this 50% limit (i.e., if you have two rows, every word is in at least 50% of the rows!).
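The arithmetic behind the 50% rule can be sketched in a few lines of Python. This is a toy model of the threshold only, not MySQL's relevance calculation. The function name searchable_words and the sample rows are made up for illustration, since the original test data is not shown.

```python
def searchable_words(rows, threshold=0.5):
    """Toy model: keep only words that occur in fewer than `threshold` of the rows."""
    total = len(rows)
    counts = {}
    for row in rows:
        # Count each word once per row, however often it repeats in that row.
        for word in set(row.lower().split()):
            counts[word] = counts.get(word, 0) + 1
    return {w for w, n in counts.items() if n / total < threshold}


# Hypothetical sample data standing in for the two-row test table.
two_rows = ["MySQL databases are fast", "I like databases"]
print(searchable_words(two_rows))  # set(): every word is in at least 1 of 2 rows

# Adding rows that do not contain the word brings it under the threshold.
more_rows = two_rows + ["PHP web app", "full text search", "sample data"]
print("databases" in searchable_words(more_rows))  # True: 2 of 5 rows is 40%
```

With two rows, any word that appears at all is in at least half of them, so nothing is searchable. Once "databases" appears in only 2 of 5 rows, it falls below the limit.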
