SQL counts by string incorrectly
OK so I checked and it doesn't seem someone asked this question.
So I have two words:
thiep cuoi
thiệp cưới
The problem is that when I COUNT() these, MySQL combines the two as if they were the same. For instance, with this SQL:
# Let's assume these two words have an id of 1 and 2 and that the column name
# in the table is `word`
SELECT `word`, COUNT(`word`)
FROM table_name
WHERE `id` IN(1,2)
GROUP BY `word`;
This returns the two words as one row with a count of 2. These are not the same words in UTF-8, so how can I bypass this behavior in MySQL? Doesn't MySQL group by the UTF-8 value rather than converting to ASCII? :/
Comments (1)
MySQL uses the collation you set on the column to determine equality of letters in words. In whatever collation you've set on this column, those letters are considered equal for the purposes of comparison. MySQL is not doing any kind of conversion or throwing away any data.
http://dev.mysql.com/doc/refman/5.5/en/charset-general.html
You're probably using something common like latin1_general_ci or utf8_general_ci. If you want those letters treated as different, then you probably want a binary collation such as utf8_bin. Run a query like
SHOW COLLATION LIKE 'utf8%'
to see what's available on your server.
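As a rough sketch of how that could look in practice, the queries below first check the column's current collation and then group with a binary collation. They assume the column's character set is utf8 (with utf8mb4 the matching collation would be utf8mb4_bin); the alias `word_bin` and the `VARCHAR(255)` type are made up for illustration and should be adjusted to the real table.

```sql
-- See which collation the `word` column currently uses (Collation column).
SHOW FULL COLUMNS FROM table_name LIKE 'word';

-- Option 1: apply a binary collation for this query only.
-- Assumes the column's character set is utf8; `word_bin` is a hypothetical alias.
SELECT `word` COLLATE utf8_bin AS word_bin, COUNT(*) AS occurrences
FROM table_name
WHERE `id` IN (1, 2)
GROUP BY word_bin;

-- Option 2: change the column's collation permanently.
-- VARCHAR(255) is an assumption; keep the column's real type and length.
ALTER TABLE table_name
  MODIFY `word` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;
```

Keep in mind that a binary collation also makes comparisons case-sensitive, so 'Thiep' and 'thiep' would be counted separately too.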