MySQL full-text search: accent-insensitive matching for German umlauts
Let's take an example hotels table:
CREATE TABLE `hotels` (
`HotelNo` varchar(4) character set latin1 NOT NULL default '0000',
`Hotel` varchar(80) character set latin1 NOT NULL default '',
`City` varchar(100) character set latin1 default NULL,
`CityFR` varchar(100) character set latin1 default NULL,
`Region` varchar(50) character set latin1 default NULL,
`RegionFR` varchar(100) character set latin1 default NULL,
`Country` varchar(50) character set latin1 default NULL,
`CountryFR` varchar(50) character set latin1 default NULL,
`HotelText` text character set latin1,
`HotelTextFR` text character set latin1,
`tagsforsearch` text character set latin1,
`tagsforsearchFR` text character set latin1,
PRIMARY KEY (`HotelNo`),
FULLTEXT KEY `fulltextHotelSearch` (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`,`HotelText`,`HotelTextFR`,`tagsforsearch`,`tagsforsearchFR`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci;
In this table we have, for example, only one hotel, with Region = "Graubünden" (note the umlaut ü).
Now I want both of these search phrases to match it:
'graubunden' and
'graubünden'
This is simple in regular searches using MySQL's built-in
collations, as follows:
SELECT *
FROM `hotels`
WHERE `Region` LIKE CONVERT(_utf8 '%graubunden%' USING latin1)
COLLATE latin1_german1_ci
This works fine for both 'graubunden' and 'graubünden' and
returns the expected row. The problem appears
when we use MySQL full-text search.
What's wrong with this SQL statement?
SELECT
*
FROM
hotels
WHERE
MATCH (`HotelNo`,`Hotel`,`Address`,`City`,`CityFR`,`Region`,`RegionFR`,`Country`,`CountryFR`, `HotelText`, `HotelTextFR`, `tagsforsearch`, `tagsforsearchFR`)
AGAINST( CONVERT('+graubunden' USING latin1) COLLATE latin1_german1_ci IN BOOLEAN MODE)
ORDER BY Country ASC, Region ASC, City ASC
This doesn't return any results.
Any ideas where the dog is buried?
Comments (1)
When you define an individual CHARACTER SET for each column, you override the default collation set at the table level. Each of your columns gets the default collation for latin1, which is latin1_swedish_ci. You can see this by running SHOW CREATE TABLE.
In FULLTEXT queries, indexed columns have a COERCIBILITY of 0, so every full-text query is converted to the collation used in the index, not vice versa.
You need to either remove the CHARACTER SET definitions from your columns or explicitly set all columns to latin1_german1_ci.
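The answer's original statement is not shown here, so what follows is only a minimal sketch of one way to apply the advice, not the answerer's exact code. It assumes the `hotels` table from the question and uses `ALTER TABLE ... CONVERT TO CHARACTER SET`, which changes every character column at once and rebuilds the indexes. The `Address` column is left out of `MATCH` because it appears in neither the table definition nor the FULLTEXT index shown in the question; that omission is an assumption about the intended schema.

```sql
-- Check which collation each column really uses.
-- Columns declared with only "character set latin1" are expected to
-- report latin1_swedish_ci rather than the table default.
SHOW FULL COLUMNS FROM `hotels`;

-- Convert all character columns to latin1_german1_ci in one step.
-- This rewrites the table, so the FULLTEXT index is rebuilt with the
-- new collation.
ALTER TABLE `hotels`
  CONVERT TO CHARACTER SET latin1 COLLATE latin1_german1_ci;

-- Retry the search. The column list mirrors the FULLTEXT index
-- definition from the question (no `Address`).
SELECT *
FROM `hotels`
WHERE MATCH (`HotelNo`,`Hotel`,`City`,`CityFR`,`Region`,`RegionFR`,
             `Country`,`CountryFR`,`HotelText`,`HotelTextFR`,
             `tagsforsearch`,`tagsforsearchFR`)
      AGAINST ('+graubunden' IN BOOLEAN MODE)
ORDER BY `Country` ASC, `Region` ASC, `City` ASC;
```

Once the index uses latin1_german1_ci, the CONVERT and COLLATE wrappers around the search term should no longer be needed, since the term is compared using the index's collation anyway.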