Greek text is not displaying correctly

Posted 2024-07-30 19:30:49

We have an application that used the C++ zApp framework for UI (forms, fonts, everything). We have slowly converted it to use the .net framework and recently found that Greek characters are no longer displaying correctly.

In one version of the application I have a C# .net form and a C++ zApp form that both display the same data. The project is compiled with MS Visual Studio 2005 and uses .net 2.0. In the .net form the Greek is not displayed correctly. I can copy the text from the .net form, paste it into the zApp form and it will display correctly in the zApp form. This tells me that the data is being loaded okay and all the correct information is there in the string.

I tried changing the font used in the .net code. The zApp code creates a font using a LOGFONT struct for the control displaying the Greek. I took the exact values that zApp uses, created a LOGFONT with those values, and set the .net form's font from that structure (this.Font = Font.FromLogFont((object)lFont);). I used the same facename, charset, etc.; every field in the LOGFONT structure is set. The Greek was still displayed wrong. I can tell that the font I created is being used, because if I set underline the text is underlined, and if I look at the properties of the control's font (this.Font) after setting it from the LOGFONT, they are what I'd expect. I did initially have issues with a font that wasn't a TrueType font, but I then switched the zApp font to a TrueType font (Microsoft Sans Serif), confirmed it still displayed fine there, and used that font for my tests.

Also, if I type Greek characters from the keyboard, they display correctly in both the .net form and the zApp form. However, characters entered in the .net form and saved to the database then show as garbage in the zApp forms, and the stored data differs from what the zApp form saves. Again, if I copy the text that looks like garbage from the .net form and paste it into the zApp form, it displays just fine (no loss of data).

Does anyone have any ideas?

自演自醉 2024-08-06 19:30:49

I created a small test app in C# and made a button with some Greek text: ελληνικά. As soon as I set the text in the button, Visual Studio asked me if I wanted to switch to Unicode, and I said 'yes'. After that, the Greek text showed on my button.

I suspect that there's a setting either in Visual Studio or some property of your application configuration that needs to be set correctly.

Edit:

The further information in your answer leads me to believe the text from the Oracle database might be UTF-8. If it is, then some of the high-order bits are used to indicate whether more bytes follow in the given character. Thus, not all characters are the same byte length! Your solution might not work. I suggest trying to load it using

Encoding.UTF8.GetString()
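To make the variable-length point concrete, here is a minimal sketch. It assumes the bytes really are UTF-8 (which is only a guess at this point in the thread); the class name Utf8Demo and the helper DecodeUtf8 are made up for illustration. The byte values are the UTF-8 encoding of the Greek word "Ταμείο".

```csharp
using System;
using System.Text;

public static class Utf8Demo
{
    // Decodes raw bytes as UTF-8. Each Greek letter occupies two bytes.
    public static string DecodeUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    public static void Main()
    {
        // UTF-8 bytes of "Ταμείο" (six Greek letters, two bytes each).
        byte[] raw = { 0xCE, 0xA4, 0xCE, 0xB1, 0xCE, 0xBC, 0xCE, 0xB5, 0xCE, 0xAF, 0xCE, 0xBF };
        string text = DecodeUtf8(raw);

        Console.WriteLine(raw.Length);  // 12 (bytes)
        Console.WriteLine(text.Length); // 6 (characters)
    }
}
```

If the decoded string still looks wrong, the data is probably in a single-byte code page rather than UTF-8.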

不打扰别人 2024-08-06 19:30:49

I figured out how to get the text to display correctly in the .net form. It actually had nothing to do with the font and more to do with converting the data for .net. I changed code that was basically like this:

string Name = reader.GetString(column);

to this:

string Name = System.Text.Encoding.Default.GetString(reader.GetOracleString(column).GetNonUnicodeBytes());

I will still have to verify that this does not cause problems for any of the other languages clients use that have been working fine, but so far it looks good with Greek and English.
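The same idea can be sketched without the Oracle types. Note that Encoding.Default depends on the machine: on .NET Framework it is the system ANSI code page, and on .NET Core and later it is UTF-8, so the sketch below names the code page explicitly. It assumes the stored bytes are Windows-1253 (Greek) and that the mis-decoded string holds one raw byte per char; the class MojibakeRepair and the method Redecode are hypothetical names. The RegisterProvider call is needed on .NET Core / .NET 5+ and should be dropped on older .NET Framework versions, where code page 1253 is available directly.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Re-interprets a string whose chars each hold one raw byte (0-255)
    // as text in the given single-byte code page.
    public static string Redecode(string misdecoded, int codePage)
    {
        // Makes Windows code pages such as 1253 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // ISO-8859-1 maps chars 0-255 to the identical byte values.
        byte[] raw = Encoding.GetEncoding("iso-8859-1").GetBytes(misdecoded);
        return Encoding.GetEncoding(codePage).GetString(raw);
    }

    public static void Main()
    {
        // Six chars with values D4 E1 EC E5 DF EF, as a mis-decoded string would hold them.
        string garbled = "\u00D4\u00E1\u00EC\u00E5\u00DF\u00EF";
        string repaired = Redecode(garbled, 1253);
        Console.WriteLine(repaired == "\u03A4\u03B1\u03BC\u03B5\u03AF\u03BF"); // True
    }
}
```

Passing the code page as a parameter is one way to handle the other client languages mentioned above, since each would need its own code page.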

Now I need to reverse that process when adding the OracleCommand parameter for saving. The original code went something like this:

cmd.Parameters.Add(new OracleParameter(":name", Name));

which saves garbage. The value of the string "Name" looks fine. The unmanaged C++ code that works just puts together a sql statement in a character array (the Greek text is always handled in a char array too) and executes it with a call to an OCI function (Oracle's API). The .net code is using ODAC (Oracle Data Access Client) for database access.

UPDATE:

I have solved the second part of my problem (saving) and learned more about what is happening.

The data coming in to .net from Oracle looks like this in memory when I put it into a .net string data type without doing any conversion:

00 0a 33 79 07 00 00 00 06 00 00 00 d4 00 e1 00 ec 00 e5 00 df 00 ef 00 00 00 00 00 00 00 00 00 00 00 00 ..3y........Τ.α.μ.ε.ί.ο............

This string displays incorrectly in .net as:
Ôáìåßï

The memory contents of the .net string after the conversion (conversion code shown above):
00 0a 33 79 07 00 00 00 06 00 00 00 a4 03 b1 03 bc 03 b5 03 af 03 bf 03 00 00 00 00 00 00 00 00 00 00 00 ..3y........¤.±.Ό.µ.―.Ώ............

You can see that for every character, 3 has been taken from the high nibble of the low byte and put into the high byte.
The string now displays correctly in .net as:
Ταμείο
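The two memory dumps can be reproduced from the six raw bytes alone. This is a sketch under the assumption that the database bytes are Windows-1253; the class MemoryLayoutDemo and its helpers are illustrative names. A .NET string is stored as UTF-16, so widening each byte unchanged gives "d4 00 ...", while decoding through the Greek code page gives "a4 03 ...". The RegisterProvider call is only needed on .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

public static class MemoryLayoutDemo
{
    // Returns the UTF-16LE bytes of a string as lowercase hex, e.g. "d4 00 e1 00".
    public static string Utf16Hex(string s)
    {
        byte[] bytes = Encoding.Unicode.GetBytes(s);
        return BitConverter.ToString(bytes).Replace("-", " ").ToLowerInvariant();
    }

    // Widens each byte to the char with the same numeric value (no real decoding).
    public static string Widen(byte[] raw)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }

    // Decodes the bytes as Windows-1253 (Greek).
    public static string DecodeGreek(byte[] raw)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1253).GetString(raw);
    }

    public static void Main()
    {
        byte[] raw = { 0xD4, 0xE1, 0xEC, 0xE5, 0xDF, 0xEF };
        Console.WriteLine(Utf16Hex(Widen(raw)));       // d4 00 e1 00 ec 00 e5 00 df 00 ef 00
        Console.WriteLine(Utf16Hex(DecodeGreek(raw))); // a4 03 b1 03 bc 03 b5 03 af 03 bf 03
    }
}
```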

As the information above shows, it seems that .net represents characters differently than unmanaged C++ and Oracle. I did some tests and found that the breaking point is 160 (hex value a0). So when using character values of 0 to 159 (00 to 9f), there is no difference. As soon as a value of 160 or higher is used, there will be a difference.

My solution will only work for character values between 0 and 255 because I'm dropping the high byte of the character in my conversions. This should work for our application though since we have never supported multibyte character sets anyway.
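One way to catch text that falls outside this limit before it is saved is to encode with a strict encoder that throws instead of silently substituting characters. The sketch below is an illustration only: the class CodePageCheck and the method FitsCodePage are hypothetical names, code page 1253 is an assumption about the target data, and the RegisterProvider call is only needed on .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

public static class CodePageCheck
{
    // Returns true if every character of text can be encoded in the given code page.
    public static bool FitsCodePage(string text, int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Exception fallbacks make the encoder throw rather than emit '?'.
        Encoding strict = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FitsCodePage("\u03A4\u03B1\u03BC\u03B5\u03AF\u03BF", 1253)); // True: Greek letters
        Console.WriteLine(FitsCodePage("\u4E2D", 1253));                               // False: a CJK character
    }
}
```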

The simplified version of what I'm doing to convert the string back to a format for saving to Oracle is:

// "name" is the .net string containing the data to save

char[] textChars = new char[4000]; // 4000 is the max varchar2 column size in Oracle
int index = 0;
// Encode with the system ANSI code page, then widen each byte to a char unchanged
byte[] textBytes = System.Text.Encoding.Default.GetBytes(name);
foreach (byte textByte in textBytes)
{
    textChars[index++] = (char)textByte;
}
string textString = new string(textChars, 0, index);
cmd.Parameters.Add(new OracleParameter(":name", (object)textString));

This whole thing is such a hack - if anyone has a better way, please share it. It seems like there ought to be some simple way of handling this entire problem.
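For what it is worth, the byte-widening loop can be written as two encoding calls. This is only a tidier form of the same workaround and not a fix for the underlying mismatch between the driver and the database. It assumes the data is Windows-1253; the class SaveConversion and the method ToByteChars are hypothetical names, and the RegisterProvider call is only needed on .NET Core / .NET 5+.

```csharp
using System;
using System.Text;

public static class SaveConversion
{
    // Encodes text in the given code page, then widens each byte to a char
    // with the same value, mirroring the (char)textByte loop.
    public static string ToByteChars(string text, int codePage)
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] raw = Encoding.GetEncoding(codePage).GetBytes(text);
        // ISO-8859-1 maps every byte 0-255 to the char with the same value.
        return Encoding.GetEncoding("iso-8859-1").GetString(raw);
    }

    public static void Main()
    {
        string toSave = ToByteChars("\u03A4\u03B1\u03BC\u03B5\u03AF\u03BF", 1253);
        Console.WriteLine(toSave == "\u00D4\u00E1\u00EC\u00E5\u00DF\u00EF"); // True
    }
}
```

The result would then be passed to OracleParameter in place of textString. Because the length follows the input, no fixed 4000-char buffer is needed.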
