Unicode text through a socket in Java

Posted on 2025-01-07 23:39:17

I am facing a tiny issue (I believe) in socket programming. When sending text in non-English languages, I get garbled results. After a lot of research on Google, I made some corrections: I changed getBytes() to getBytes("UTF-8") and tried to send some Arabic text.

When the sockets are connected locally, it works fine and I see the Arabic text I expected. But when testing online, the results display strange/garbled characters.

Here is the text I tried:

"مرحبا" (the Arabic word for "hello"), which was displayed to me as garbled characters such as "Ù…Ø±ØØ¨Ø§".

Please help me in resolving this issue.

Comments (4)

却一份温柔 2025-01-14 23:39:17

This is some Java code I had lying around that sets the stream encodings on the byte streams of a child process. You could do the same with a single stream, at least assuming you're using TCP stream sockets rather than UDP datagrams.

    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.nio.charset.Charset;

    Process slave_process = Runtime.getRuntime().exec("cmdname -opts cmdargs");

    // Bytes going to the child's stdin, wrapped so that chars are encoded as UTF-8.
    OutputStream __bytes_into_his_stdin = slave_process.getOutputStream();
    OutputStreamWriter chars_into_his_stdin = new OutputStreamWriter(
            __bytes_into_his_stdin,
            /* DO NOT OMIT! */ Charset.forName("UTF-8").newEncoder());

    // Bytes coming from the child's stdout, decoded as UTF-8.
    InputStream __bytes_from_his_stdout = slave_process.getInputStream();
    InputStreamReader chars_from_his_stdout = new InputStreamReader(
            __bytes_from_his_stdout,
            /* DO NOT OMIT! */ Charset.forName("UTF-8").newDecoder());

    // Bytes coming from the child's stderr, decoded as UTF-8.
    InputStream __bytes_from_his_stderr = slave_process.getErrorStream();
    InputStreamReader chars_from_his_stderr = new InputStreamReader(
            __bytes_from_his_stderr,
            /* DO NOT OMIT! */ Charset.forName("UTF-8").newDecoder());
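
For a TCP socket the same wrapping applies to `getOutputStream()` and `getInputStream()`. The following is only a sketch under assumptions: the class name `Utf8SocketDemo`, the helper `roundTrip`, and the loopback setup are made up for illustration, and the message is terminated by a newline so that `readLine()` knows where it ends. The Arabic text is written with `\u` escapes so the result does not depend on the encoding of the source file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class Utf8SocketDemo {

    /** Sends one line over a loopback TCP connection and returns what the receiver decoded. */
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Sender: connects to the server and writes the line encoded as UTF-8.
            Thread sender = new Thread(() -> {
                try (Socket socket = new Socket(server.getInetAddress(), server.getLocalPort());
                     BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                             socket.getOutputStream(), StandardCharsets.UTF_8))) {
                    out.write(message);
                    out.newLine();
                    out.flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sender.start();

            // Receiver: decodes the incoming bytes with the same charset.
            try (Socket peer = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         peer.getInputStream(), StandardCharsets.UTF_8))) {
                String line = in.readLine();
                sender.join();
                return line;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hello = "\u0645\u0631\u062D\u0628\u0627"; // the Arabic word from the question
        System.out.println(hello.equals(roundTrip(hello)));
    }
}
```

The point of the sketch is that both ends name the charset explicitly, so neither side falls back to its platform default.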
大姐,你呐 2025-01-14 23:39:17

Perhaps you forgot to specify the encoding when creating the string from the received bytes.

byte[] utf8bytes = yourString.getBytes("UTF-8");       // encoding
String otherString = new String(utf8bytes, "UTF-8");   // decoding
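
One possible reason it works locally but not online is that `new String(bytes)` without a charset uses the platform default, which can differ between machines. The sketch below, with the made-up class name `MojibakeDemo`, decodes the same UTF-8 bytes once correctly and once as ISO-8859-1 to show how the garbling arises. Whether this is what happened in the question is an assumption.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /** Decodes the bytes with the charset they were encoded in. */
    static String decodeCorrectly(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    /** Decodes the same bytes with a mismatched single-byte charset. */
    static String decodeWrongly(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String hello = "\u0645\u0631\u062D\u0628\u0627"; // the Arabic word from the question
        byte[] wire = hello.getBytes(StandardCharsets.UTF_8);

        // 10 bytes: each of the 5 Arabic letters takes 2 bytes in UTF-8.
        System.out.println(wire.length);

        // Same charset on both sides: the text survives.
        System.out.println(hello.equals(decodeCorrectly(wire)));

        // Mismatched charset: every byte becomes its own character, so 10 chars of garbage.
        System.out.println(decodeWrongly(wire).length());
    }
}
```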
呆° 2025-01-14 23:39:17

I think the easiest way to solve this would be to send a Serializable object that holds your Arabic text in a String field.

Don't write the bytes directly; instead use:

ObjectOutputStream oos = new ObjectOutputStream(yourSocket.getOutputStream());
oos.writeObject(yourContainer);

Then on the receiving end, do this:

Object receivedObject = new ObjectInputStream(yourSocket.getInputStream()).readObject();
if (receivedObject instanceof YourContainer) {
    // get the Arabic string out of the container
}
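
A minimal sketch of this approach follows. The names `ContainerDemo` and `TextContainer` are hypothetical, and in-memory byte streams stand in for the socket streams so the example runs on its own. It assumes both ends are Java programs that share the container class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ContainerDemo {

    /** Hypothetical container; both sender and receiver need this class. */
    static class TextContainer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;

        TextContainer(String text) {
            this.text = text;
        }
    }

    /** Serializes a container, deserializes it again, and returns the text it carried. */
    static String roundTrip(String text) {
        try {
            // Stand-in for yourSocket.getOutputStream().
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(wire)) {
                oos.writeObject(new TextContainer(text));
            }

            // Stand-in for yourSocket.getInputStream() on the receiving end.
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(wire.toByteArray()))) {
                Object received = ois.readObject();
                if (received instanceof TextContainer) {
                    return ((TextContainer) received).text;
                }
                throw new IllegalStateException("Unexpected object type: " + received.getClass());
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hello = "\u0645\u0631\u062D\u0628\u0627"; // the Arabic word from the question
        System.out.println(hello.equals(roundTrip(hello)));
    }
}
```

Java serialization handles the string encoding itself, which is why no charset appears here. The trade-off is that the wire format is Java-specific.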
囍孤女 2025-01-14 23:39:17

If anyone is still trying to solve this:

In your socket response:

HTTP/1.1 200 OK\r\n
Content-Type: text/html; charset=utf-8\r\n\r\n

Just don't forget the Content-Type header with the charset set to utf-8, and send the body as UTF-8 bytes.
It should then work with Arabic letters.
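
If the socket is serving HTTP by hand, the response could be assembled roughly like this. It is a sketch only: the class `HttpResponseDemo` and the method `buildResponse` are invented for illustration, and the added `Content-Length` header is an assumption about what a client would expect, computed from the encoded byte count rather than the character count.

```java
import java.nio.charset.StandardCharsets;

public class HttpResponseDemo {

    /** Builds a complete HTTP/1.1 response whose body is the given HTML encoded as UTF-8. */
    static byte[] buildResponse(String html) {
        byte[] body = html.getBytes(StandardCharsets.UTF_8);

        // Header lines are plain ASCII; Content-Length counts bytes, not chars.
        String head = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);

        // Concatenate header bytes and body bytes into one array.
        byte[] response = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, response, 0, headBytes.length);
        System.arraycopy(body, 0, response, headBytes.length, body.length);
        return response;
    }

    public static void main(String[] args) {
        // "<p>" is 3 bytes, the 5 Arabic letters are 10 bytes, "</p>" is 4 bytes.
        byte[] response = buildResponse("<p>\u0645\u0631\u062D\u0628\u0627</p>");
        System.out.println(new String(response, StandardCharsets.UTF_8));
    }
}
```

The resulting array can be written to `socket.getOutputStream()` in one call, so no Writer with an implicit charset is involved.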
