ÅØÆ - Problem with text encoding: HTML -> Java -> MySQL -> Java -> HTML
I am currently working on my homepage, which is backed entirely by a number of Java classes and a MySQL database.
I have an HTML form where viewers can enter a comment. The text is then passed by a CGI script to a Java class, where I read it with:
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String[] data = {in.readLine()};
The comment is then passed on to the database with the following:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Database {

    // Opens a connection to the pagebuilder database, asking the driver to use UTF-8.
    public static Connection getConnection() throws Exception {
        Connection conn;
        Class.forName("com.mysql.jdbc.Driver").newInstance();
        //String url = "jdbc:mysql://localhost/pagebuilder";
        String url = "jdbc:mysql://localhost/pagebuilder?useUnicode=true&characterEncoding=utf-8";
        //String url = "jdbc:mysql://localhost/pagebuilder?characterEncoding=utf-8";
        String userName = "username";
        String password = "password";
        conn = DriverManager.getConnection(url, userName, password);
        return conn;
    }

    public static void closeConnection(Connection conn) throws SQLException {
        conn.close();
    }

    // Inserts one comment row; the first column is filled with 0, the second with the current time.
    public static void comment(String image, String name, String comment, String email) {
        Connection conn = null;
        try {
            conn = Database.getConnection();
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        if (conn != null) {
            try {
                java.sql.Timestamp sqlDate = new java.sql.Timestamp(new java.util.Date().getTime());
                PreparedStatement pstmt1 = conn.prepareStatement("INSERT INTO comment VALUES(0,?,?,?,?,?)");
                pstmt1.setTimestamp(1, sqlDate);
                pstmt1.setString(2, image);
                pstmt1.setString(3, name);
                pstmt1.setString(4, comment);
                pstmt1.setString(5, email);
                pstmt1.executeUpdate();
                conn.close();
            }
            catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }
}
If I enter special Danish characters like æøå, or even commas, the output is as follows:
Input: ,æøå
Output: %2C%C3%A6%C3%B8%C3%A5
How do I keep input and output the same?
I have made several attempts at setting the HTML, the Java connection and the database to UTF-8, but with no luck.
What can I do?
Comments (1)
Your CGI program is receiving the text in URL-encoded form.
%2C is the urlencoded version of a comma (0x2C in hex, 32+12 = 44 in decimal; 44 is the ASCII code for a comma, see http://www.asciitable.com/index/asciifull.gif).
%C3%A6 is the urlencoded version of the UTF-8 encoded version of æ
%C3%B8 is the urlencoded version of the UTF-8 encoded version of ø
%C3%A5 is the urlencoded version of the UTF-8 encoded version of å
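To see where those byte sequences come from, here is a minimal sketch that runs the same input through the standard java.net.URLEncoder with UTF-8. The class name UrlEncodeSketch and the helper encodeUtf8 are made up for illustration; the input is written with Unicode escapes so that the result does not depend on the source file's encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, not part of the original code.
public class UrlEncodeSketch {

    // Converts the text to UTF-8 bytes and percent-encodes every byte
    // outside URLEncoder's safe set (letters, digits, '.', '-', '*', '_').
    public static String encodeUtf8(String text) throws UnsupportedEncodingException {
        return URLEncoder.encode(text, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // A comma followed by the three Danish letters, as Unicode escapes.
        String input = ",\u00E6\u00F8\u00E5";
        System.out.println(encodeUtf8(input)); // prints %2C%C3%A6%C3%B8%C3%A5
    }
}
```

Each Danish letter takes two bytes in UTF-8, which is why each one shows up as two %XX groups, while the comma is a single byte.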
What you need to do is:
(a) convert your raw urlencoded stream to a urldecoded stream; and then
(b) interpret your urldecoded stream as UTF-8
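In Java, steps (a) and (b) can probably be done in one call, since java.net.URLDecoder.decode takes the character encoding as its second argument. The following is only a sketch: the class UrlDecodeSketch, the helpers decodeUtf8 and parseForm, and the field names "name" and "comment" are assumptions, and it supposes the CGI script hands the Java class the raw form body as name=value pairs joined by '&'.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the original code.
public class UrlDecodeSketch {

    // Undoes the percent-encoding and interprets the resulting bytes as UTF-8.
    public static String decodeUtf8(String raw) throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, "UTF-8");
    }

    // Splits a form body such as "name=Bob&comment=%2C%C3%A6" into decoded fields.
    // The split happens before decoding, so an encoded '&' or '=' inside a value is safe.
    public static Map<String, String> parseForm(String body) throws UnsupportedEncodingException {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            fields.put(decodeUtf8(key), decodeUtf8(value));
        }
        return fields;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String decoded = decodeUtf8("%2C%C3%A6%C3%B8%C3%A5");
        // Compare against the expected text written as Unicode escapes.
        System.out.println(decoded.equals(",\u00E6\u00F8\u00E5")); // prints true
    }
}
```

The decoded string is what would then be handed to the comment(...) method, so that the database receives the actual characters rather than the %XX sequences.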