MySQL: selecting on a column with accented UTF-8 strings in Java
I ran into a problem selecting rows from a MySQL table in Java when filtering on a UTF-8 text column. Interestingly, the equivalent Python code works, but the Java code does not.
The table looks as follows:
CREATE TABLE `x` (`id` int(10) unsigned NOT NULL AUTO_INCREMENT, `text` varchar(255) COLLATE utf8_bin NOT NULL, PRIMARY KEY (`id`)) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
The query looks like this:
SELECT * FROM x WHERE text = 'ěščřž'
The Java code that doesn't work as expected is the following:
public class test {
    public static void main(String[] args) {
        java.sql.Connection conn = null;
        System.out.println("SQL Test");
        try {
            Class.forName("com.mysql.jdbc.Driver").newInstance();
            conn = java.sql.DriverManager.getConnection(
                "jdbc:mysql://127.0.0.1/x?user=root&password=root&characterSet=utf8&useUnicode=true&characterEncoding=utf-8&characterSetResults=utf8");
        } catch (Exception e) {
            System.out.println(e);
            System.exit(0);
        }
        System.out.println("Connection established");
        try {
            java.sql.Statement s = conn.createStatement();
            java.sql.ResultSet r = s.executeQuery("SELECT * FROM x WHERE text = 'ěščřž'");
            while (r.next()) {
                System.out.println(
                    r.getString("id") + " " +
                    r.getString("text")
                );
            }
        } catch (Exception e) {
            System.out.println(e);
            System.exit(0);
        }
    }
}
The Python code is:
# encoding: utf8
import MySQLdb

conn = MySQLdb.connect(host="127.0.0.1",
                       port=3307,
                       user="root",
                       passwd="root",
                       db="x")
cursor = conn.cursor()
cursor.execute("SELECT * FROM x where text = 'ěščřž'")
row = cursor.fetchone()
print row
cursor.close()
conn.close()
Both source files are saved on the filesystem as UTF-8 (checked with hexedit). I have tried different versions of mysql-connector (currently using 5.1.15). mysqld is 5.1.54.
The mysqld query log for the Java code (connection 1) and the Python code (connection 2), respectively:
110427 12:45:07 1 Connect root@localhost on x
110427 12:45:08 1 Query /* mysql-connector-java-5.1.15 ( Revision: ${bzr.revision-id} ) */SHOW VARIABLES WHERE Variable_name ='language' OR Variable_name = 'net_write_timeout' OR Variable_name = 'interactive_timeout' OR Variable_name = 'wait_timeout' OR Variable_name = 'character_set_client' OR Variable_name = 'character_set_connection' OR Variable_name = 'character_set' OR Variable_name = 'character_set_server' OR Variable_name = 'tx_isolation' OR Variable_name = 'transaction_isolation' OR Variable_name = 'character_set_results' OR Variable_name = 'timezone' OR Variable_name = 'time_zone' OR Variable_name = 'system_time_zone' OR Variable_name = 'lower_case_table_names' OR Variable_name = 'max_allowed_packet' OR Variable_name = 'net_buffer_length' OR Variable_name = 'sql_mode' OR Variable_name = 'query_cache_type' OR Variable_name = 'query_cache_size' OR Variable_name = 'init_connect'
1 Query /* mysql-connector-java-5.1.15 ( Revision: ${bzr.revision-id} ) */SELECT @@session.auto_increment_increment
1 Query SHOW COLLATION
1 Query SET autocommit=1
1 Query SET sql_mode='STRICT_TRANS_TABLES'
1 Query SELECT * FROM x WHERE text = 'ěščřž'
110427 12:45:22 2 Connect root@localhost on x
2 Query set autocommit=0
2 Query SELECT * FROM x where text = 'ěščřž'
2 Quit
Does anybody have a suggestion as to why the Python code works and the Java code does not? (By "not working" I mean that the query does not find the desired data; the connection itself works fine.)
Many thanks.
Comments (2)
Okay, my bad. The database was built incorrectly. It was populated through the mysql client, which uses latin1 by default, so the data in the database ended up encoded as UTF-8 twice.
The major difference between the two programs is that the Python code doesn't set a connection character set (so it is latin1), whereas the Java code does (so it is utf8). Over the latin1 connection the search literal gets double-encoded the same way as the stored data, so it matches; the correctly encoded literal sent by Java does not. It was a coincidence of several factors that made me think something peculiar was going on.
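The double-encoding effect described above can be reproduced without a database. The following is a minimal sketch, not the server's actual conversion code: the class and method names are made up for illustration, and ISO-8859-1 is assumed as a stand-in for MySQL's latin1 (which is closer to Windows-1252, so the exact bytes for some characters may differ on a real server; the mechanism is the same).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; nothing here talks to MySQL.
public class DoubleEncodingDemo {

    // The search text from the question, written with Unicode escapes so the
    // demo does not depend on the encoding of this source file.
    static final String TEXT = "\u011b\u0161\u010d\u0159\u017e";

    /** Correct UTF-8 bytes of the string. */
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Simulates double encoding: the UTF-8 bytes are misread as latin1
     * characters (assumption: ISO-8859-1) and then encoded as UTF-8 again.
     */
    static byte[] doubleEncoded(String s) {
        String misread = new String(utf8(s), StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    /** Upper-case hex dump of a byte array. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] stored = doubleEncoded(TEXT);     // what the wrongly built table holds
        byte[] viaLatin1 = doubleEncoded(TEXT);  // literal sent over a latin1 connection
        byte[] viaUtf8 = utf8(TEXT);             // literal sent over a utf8 connection

        System.out.println("correct utf8 : " + hex(viaUtf8));
        System.out.println("stored       : " + hex(stored));
        System.out.println("latin1 connection matches: " + Arrays.equals(stored, viaLatin1));
        System.out.println("utf8 connection matches  : " + Arrays.equals(stored, viaUtf8));
    }
}
```

In this simulation the stored value is twice as long as the correct UTF-8 form, which is why a binary (utf8_bin) comparison against the correctly encoded literal finds nothing. On a real server, `SELECT HEX(text) FROM x` shows the stored bytes for comparison.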
Thanks for your responses anyway.
Use a PreparedStatement and bind your search string as a positional parameter of that statement.
Read this tutorial about PreparedStatements -> http://download.oracle.com/javase/tutorial/jdbc/basics/prepared.html
Also, never create a String literal in Java code that contains non-ASCII characters.
If you want to pass non-ASCII characters, write them as Unicode escapes.
This should give you an idea of what I am talking about -> http://en.wikibooks.org/wiki/Java_Programming/Syntax/Unicode_Escape_Sequences
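A minimal sketch of both suggestions together, assuming the table `x` from the question. The class name `AccentedLookup`, the method `findByText`, and passing the JDBC URL as a command-line argument are illustrative choices, not part of the original code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class for the table x from the question.
public class AccentedLookup {

    // "ěščřž" written as Unicode escapes, so the literal does not depend on
    // the encoding the compiler assumes for this source file.
    static final String SEARCH_TEXT = "\u011b\u0161\u010d\u0159\u017e";

    // The search value is bound as positional parameter 1 instead of being
    // concatenated into the SQL text.
    static final String QUERY = "SELECT id, text FROM x WHERE text = ?";

    /** Returns "id text" for every row whose text column equals the given value. */
    static List<String> findByText(Connection conn, String text) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setString(1, text);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(rs.getLong("id") + " " + rs.getString("text"));
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.out.println("usage: java AccentedLookup <jdbc-url>");
            return;
        }
        // The JDBC URL is taken from the command line (an assumption of this sketch).
        try (Connection conn = DriverManager.getConnection(args[0])) {
            for (String row : findByText(conn, SEARCH_TEXT)) {
                System.out.println(row);
            }
        }
    }
}
```

It could be run with a URL like the one in the question, for example `jdbc:mysql://127.0.0.1/x?user=root&password=root&useUnicode=true&characterEncoding=UTF-8`, with the MySQL driver on the classpath. Note that this only takes source-file encoding and string quoting out of the picture; if the stored data is double-encoded, as the other answer describes, the lookup will still return no rows until the data is fixed.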