ÅØÆ - Problem with text encoding HTML -> Java -> MySQL -> Java -> HTML

Posted on 2024-12-04 16:06:34 · 1879 characters · 0 views · 0 comments

I am currently working on my homepage, which is backed entirely by a number of Java classes and a MySQL database.

I have an HTML form where I allow viewers to enter a comment. This text is then passed by a CGI script to the Java class, where I read the text with:

BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String[] data = {in.readLine()};

The comment is then passed on to the database with the following:

    public static Connection getConnection() throws Exception {
        Connection conn;
        // Load the MySQL JDBC driver
        Class.forName("com.mysql.jdbc.Driver").newInstance();
        //String url = "jdbc:mysql://localhost/pagebuilder";
        String url = "jdbc:mysql://localhost/pagebuilder?useUnicode=true&characterEncoding=utf-8";
        //String url = "jdbc:mysql://localhost/pagebuilder?characterEncoding=utf-8";
        String userName = "username";
        String password = "password";
        conn = DriverManager.getConnection(url, userName, password);

        return conn;
    }

    public static void closeConnection(Connection conn) throws SQLException {
        conn.close();
    }

    public static void comment(String image, String name, String comment, String email) {

        Connection conn = null;
        try {
            conn = Database.getConnection();
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        if (conn != null) {
            try {
                java.sql.Timestamp sqlDate = new java.sql.Timestamp(new java.util.Date().getTime());

                // Insert the comment; the strings are stored exactly as received
                PreparedStatement pstmt1 = conn.prepareStatement("INSERT INTO comment VALUES(0,?,?,?,?,?)");
                pstmt1.setTimestamp(1, sqlDate);
                pstmt1.setString(2, image);
                pstmt1.setString(3, name);
                pstmt1.setString(4, comment);
                pstmt1.setString(5, email);

                pstmt1.executeUpdate();
                conn.close();
            }
            catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }

If I enter special Danish characters like æøå, or even commas, the output is as follows:

Input: ,æøå

Output: %2C%C3%A6%C3%B8%C3%A5

How do I keep input and output the same?

I have made several attempts at setting the HTML, the Java connection and the database to UTF-8, but with no luck.

What can I do?

Comments (1)

多像笑话 2024-12-11 16:06:34

Your CGI program is receiving the text encoded like that.

%2C is the urlencoded version of a comma (0x2C in hex, 2×16 + 12 = 44 in decimal; 44 is the ASCII code for a comma, see http://www.asciitable.com/index/asciifull.gif).

%C3%A6 is the urlencoded version of the UTF-8 encoded version of æ

%C3%B8 is the urlencoded version of the UTF-8 encoded version of ø

%C3%A5 is the urlencoded version of the UTF-8 encoded version of å
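
To check these mappings yourself, here is a minimal sketch in Java that runs the same input through java.net.URLEncoder with UTF-8. The class name EncodeCheck and the helper encode are made up for illustration; the input is written with Unicode escapes so the source file encoding does not matter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, not part of the question's code.
final class EncodeCheck {

    // URL-encodes text using UTF-8 bytes, as a browser does for a UTF-8 form.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // ",æøå" written as escapes: comma, U+00E6, U+00F8, U+00E5
        String input = ",\u00E6\u00F8\u00E5";
        System.out.println(encode(input)); // prints %2C%C3%A6%C3%B8%C3%A5
    }
}
```

Each Danish letter becomes two bytes in UTF-8, which is why every one of them shows up as two %XX groups.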

What you need to do is:
(a) convert your raw urlencoded stream to a urldecoded stream; and then
(b) interpret your urldecoded stream as UTF-8
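
A minimal sketch of steps (a) and (b) in Java follows. It assumes the CGI script hands the raw form body to the program on standard input as name=value pairs joined by & (the question does not show the exact body), and the class name CommentDecoder, the methods decode and parseForm, and the field name comment are made up for illustration. java.net.URLDecoder.decode with "UTF-8" performs both steps in one call.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the question's code.
final class CommentDecoder {

    // Turns %XX escapes back into bytes and reads those bytes as UTF-8.
    static String decode(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Splits an assumed "a=1&b=2" body first, then decodes each key and value,
    // so an encoded & or = inside a value cannot break the split.
    static Map<String, String> parseForm(String body) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            fields.put(decode(key), decode(value));
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // The percent-encoded body contains only ASCII characters.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in, "US-ASCII"));
        String line = in.readLine();
        if (line != null) {
            Map<String, String> fields = parseForm(line);
            System.out.println(fields.get("comment")); // "comment" is an assumed field name
        }
    }
}
```

The decoded strings can then be passed to the existing setString calls unchanged. Whether the characters survive the trip into MySQL and back out to the page still depends on the connection, table and page encodings being UTF-8 as well.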
