Encoding problem between a C# TCP server and a Java TCP client

Posted on 2024-12-01 18:28:01


I'm facing an encoding issue that I haven't been able to solve.

I have a C# TCP server, running as a Windows service, that receives and responds with XML. The problem appears when the output contains special characters, such as Spanish accented characters (á, é, í and others).

The server response is encoded as UTF-8, and the Java client reads it as UTF-8. But when I print the output, the character is completely different.

This problem only happens in the Java client (the C# TCP client works as expected).

The following snippet of the server code shows the encoding issue.

C# Server:

    // Encode the test character as UTF-8 and write the raw bytes to the socket stream.
    byte[] destBytes = System.Text.Encoding.UTF8.GetBytes("á");
    try
    {
        clientStream.Write(destBytes, 0, destBytes.Length);
        clientStream.Flush();
    }
    catch (Exception ex)
    {
        LogErrorMessage("Error en SendResponseToClient: Detalle::", ex);
    }

Java Client:

    socket.connect(new InetSocketAddress(param.getServerIp(), param.getPort()), 20000);
    InputStream sockInp = socket.getInputStream();
    // Decode the incoming bytes explicitly as UTF-8.
    InputStreamReader streamReader = new InputStreamReader(sockInp, Charset.forName("UTF-8"));
    sockReader = new BufferedReader(streamReader);
    String tmp = null;
    // Print each line as it arrives; readLine() returns null when the stream ends.
    while ((tmp = sockReader.readLine()) != null) {
        System.out.println(tmp);
    }

For this simple test, the output shown is:

ß
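One possible explanation, which the post itself does not confirm, is that the ß comes from the printing step and not from the socket read. The byte 0xE1 is á in ISO-8859-1 and windows-1252, but it is ß in code page 850, an OEM code page used by some Western European Windows consoles. The sketch below reproduces that mismatch under the assumption that `System.out` writes ISO-8859-1 and the console displays IBM850; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleMojibakeDemo {
    // Encode text with one charset and decode the same bytes with another,
    // imitating a program and a console that disagree about the encoding.
    static String misdisplay(String text, Charset written, Charset shown) {
        return new String(text.getBytes(written), shown);
    }

    public static void main(String[] args) {
        // Assumption: the console uses code page 850 (IBM850).
        Charset console = Charset.forName("IBM850");
        String seen = misdisplay("\u00E1", StandardCharsets.ISO_8859_1, console);
        // Print the code point so the result does not depend on this console.
        System.out.println(Integer.toHexString(seen.charAt(0))); // df, i.e. U+00DF (ß)
    }
}
```

If this is what is happening, the string held by the Java client is already correct, and printing its code points instead of the characters would be one way to check.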

I did some testing by printing the byte[] on each side. In C#, á is output as:
195, 161

In Java, the byte[] that was read prints as:
-61, -95

Does this have to do with the byte type being signed in Java and unsigned in C#?
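As a small sketch of the signed versus unsigned point: Java's `byte` is signed and C#'s `byte` is unsigned, so the same bit patterns print as -61, -95 in Java and 195, 161 in C#. Masking with `0xFF` recovers the unsigned values, and decoding those two bytes as UTF-8 yields U+00E1 (á). The class and helper names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class SignedByteDemo {
    // Reinterpret a signed Java byte as the unsigned value (0..255) that C# prints.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Decode raw bytes as UTF-8.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] received = { (byte) -61, (byte) -95 }; // values printed by the Java client
        for (byte b : received) {
            System.out.println(b + " -> " + toUnsigned(b)); // -61 -> 195, -95 -> 161
        }
        String decoded = decodeUtf8(received);
        // Print the code point so the result does not depend on the console encoding.
        System.out.println((int) decoded.charAt(0)); // 225, i.e. U+00E1 (á)
    }
}
```

So the difference in the printed numbers is only a matter of how each language displays a byte; the data on the wire is the same.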

Any feedback is greatly appreciated.


Comments (2)

笑忘罢 2024-12-08 18:28:01


To me this seems like an endianness problem. You can check that by reversing the bytes in Java before printing the string.

This would usually be solved by including a BOM; see http://de.wikipedia.org/wiki/Byte_Order_Mark
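The check suggested in this comment can be sketched as follows, using the two bytes from the question. Note that UTF-8 is defined as a sequence of single bytes and has no byte-order variants, so reversing the bytes is not expected to help here; the sketch only makes that observable. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class ReverseBytesCheck {
    // Return a copy of the input with the byte order reversed.
    static byte[] reversed(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[in.length - 1 - i];
        }
        return out;
    }

    // Decode raw bytes as UTF-8; malformed input becomes U+FFFD.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] received = { (byte) 0xC3, (byte) 0xA1 }; // 195, 161 from the question
        String asReceived = decodeUtf8(received);
        String swapped = decodeUtf8(reversed(received));
        // Compare against the expected character without relying on console output.
        System.out.println("as received is U+00E1: " + asReceived.equals("\u00E1"));
        System.out.println("reversed is U+00E1: " + swapped.equals("\u00E1"));
    }
}
```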

Saygoodbye 2024-12-08 18:28:01


Are you sure that's not a Unicode character you are attempting to encode to bytes as UTF-8 data?

I found that the post below has a useful way of testing whether the data in that string is correct UTF-8 before you send it.

How to test an application for correct encoding (e.g. UTF-8)
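The method from the linked post is not reproduced here. As a rough stand-in, this is a minimal sketch of a strict UTF-8 validity check in Java, which could be run on the bytes the client receives. It uses a `CharsetDecoder` configured to report errors instead of silently substituting replacement characters; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    // Return true only if the whole array is well-formed UTF-8.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = { (byte) 0xC3, (byte) 0xA1 }; // á encoded as UTF-8
        byte[] latin1 = { (byte) 0xE1 };            // á encoded as ISO-8859-1
        System.out.println(isValidUtf8(utf8));   // true
        System.out.println(isValidUtf8(latin1)); // false
    }
}
```

With the bytes from the question (195, 161), this check passes, which suggests the server is sending well-formed UTF-8.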
