Unicode strings containing Hebrew letters and digits in .NET

Posted on 2024-11-18 19:54:16

I am seeing strange behavior when creating a string that contains a Hebrew letter and digits. The digits are always displayed to the left of the letter. For example:

string A = "\u05E9"; //A Hebrew letter
string B = "23";
string AB = A + B;
textBlock1.Text = AB;
//Output bug: B is displayed to the left of A.

The bug only happens when a Hebrew letter and digits are used together. When either one is replaced with something else, the bug does not occur:

string A = "\u20AA"; //Some random Unicode.
string B = "23";
string AB = A + B;
textBlock1.Text = AB;
//Output OK.

string A = "\u05E9"; //A Hebrew letter.
string B = "HELLO";
string AB = A + B;
textBlock1.Text = AB;
//Output OK.

I tried playing with the FlowDirection property, but it didn't help.

A workaround that gets the text in the first code example displayed properly would be welcome.

Comments (4)

别把无礼当个性 2024-11-25 19:54:16

The Unicode characters "RTL mark" (U+200F) and "LTR mark" (U+200E) were created precisely for this purpose.

In your example, simply place an LTR mark after the Hebrew character, and the digits will then be displayed to the right of the Hebrew character, as you wish.

So your code would be adjusted as follows:

string A = "\u05E9"; //A Hebrew letter
string LTRMark = "\u200E"; //Left-to-right mark (invisible)
string B = "23";
string AB = A + LTRMark + B;
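
Because U+200E is invisible, it can be hard to tell whether the mark actually ended up in the string. A minimal sketch for checking, assuming a plain console program rather than the WPF application from the question; `DescribeCodePoints` is a hypothetical helper name, not part of .NET:

```csharp
using System;
using System.Linq;

static class LtrMarkDemo
{
    // Hypothetical helper: lists each UTF-16 code unit of s as U+XXXX.
    public static string DescribeCodePoints(string s) =>
        string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));

    static void Main()
    {
        string A = "\u05E9";        // A Hebrew letter
        string LTRMark = "\u200E";  // Left-to-right mark (invisible)
        string B = "23";
        string AB = A + LTRMark + B;

        Console.WriteLine(DescribeCodePoints(AB)); // U+05E9 U+200E U+0032 U+0033
        Console.WriteLine(AB.Length);              // 4, the mark counts as a character
    }
}
```

Note that the mark is a real character in the string: it adds to `Length`, and an ordinal comparison against the unmarked string will not match, so it may be worth adding the mark only at display time.
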
我是男神闪亮亮 2024-11-25 19:54:16

This is because of the Unicode Bidirectional Algorithm. If I understand it correctly, every Unicode character carries a directional property (its bidirectional class) that determines where it is placed relative to its neighbors.

In this case \u05E9 is a right-to-left letter, so the digits that follow it are placed to its left. Even if you do:

var ab = string.Format("{0}{1}", a, b);

you will still get the digits on the left, because the order is decided when the text is displayed, not when the string is built. However, a character that is not a right-to-left letter, such as \u20AA from the question, does not have this effect, so the digits stay to its right.

This is how the script is laid out, and the layout engine renders the text according to those rules.
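
The per-character direction information described above can be approximated in code. As far as I know, .NET does not expose the bidirectional class through a public API, so the sketch below is only a rough stand-in: it treats the Hebrew letters U+05D0 to U+05EA as right-to-left, ASCII digits as European numbers, and lumps everything else together. `RoughClass` and `Classify` are hypothetical names, and the real classes come from the Unicode Character Database, not from range checks like these.

```csharp
using System;

// Hypothetical, heavily simplified stand-in for the Unicode bidirectional classes.
enum RoughClass { RightToLeft, EuropeanNumber, Other }

static class BidiSketch
{
    public static RoughClass Classify(char c)
    {
        if (c >= '\u05D0' && c <= '\u05EA') return RoughClass.RightToLeft;   // Hebrew letters alef..tav
        if (c >= '0' && c <= '9') return RoughClass.EuropeanNumber;          // ASCII digits
        return RoughClass.Other;                                             // everything else (not modeled)
    }

    static void Main()
    {
        // The string from the question: a Hebrew letter followed by "23".
        foreach (char c in "\u05E9" + "23")
        {
            Console.WriteLine($"U+{(int)c:X4} {Classify(c)}");
        }
    }
}
```

A run of European numbers that directly follows a right-to-left letter is laid out as part of the right-to-left text, which is why the digits end up to the left of the letter in a left-to-right paragraph.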

黎夕旧梦 2024-11-25 19:54:16

That strange behavior has an explanation. Digits that follow a right-to-left character are laid out as part of the same right-to-left text, and since Hebrew is read right to left, this scenario:

string A = "\u05E9"; //A Hebrew letter
string B = "23";
string AB = A + B;

displays B first, followed by A.

Second scenario:

string A = "\u20AA"; //Some random Unicode.
string B = "23";
string AB = A + B;

Here A is a Unicode character that does not belong to a script read right to left, so the output is A followed by B.

Now consider my own scenario:

string A = "\u05E9";
string B = "\u05EA";
string AB = A + B;

Both A and B belong to a script that is read right to left, so AB is displayed as B followed by A, not A followed by B.

Edited to answer the comment:

Taking this scenario into account:

string A = "\u05E9"; //A Hebrew letter
string B = "23";
string AB = A + B;

The only solution I can see to get the letter followed by the digits is: string AB = B + A;

That is probably not a solution that will work in general, so I guess you have to implement some checks and build the string according to your requirements.
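
The checks mentioned above could look roughly like this. It is a sketch under stated assumptions, not a general solution: instead of swapping the operands it borrows the LTR mark (U+200E) suggested in another answer, it only handles a Hebrew letter (U+05D0 to U+05EA) directly followed by an ASCII digit, it expects non-null arguments, and `JoinForLtrDisplay` is a hypothetical name.

```csharp
using System;

static class BidiJoin
{
    const char LtrMark = '\u200E'; // Left-to-right mark (invisible)

    static bool IsHebrewLetter(char c) => c >= '\u05D0' && c <= '\u05EA';
    static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';

    // Hypothetical helper: concatenates left and right, inserting an LTR mark
    // only when a Hebrew letter would otherwise be directly followed by a digit.
    public static string JoinForLtrDisplay(string left, string right)
    {
        if (left.Length > 0 && right.Length > 0 &&
            IsHebrewLetter(left[left.Length - 1]) && IsAsciiDigit(right[0]))
        {
            return left + LtrMark + right;
        }
        return left + right;
    }

    static void Main()
    {
        Console.WriteLine(JoinForLtrDisplay("\u05E9", "23").Length);    // 4: mark inserted
        Console.WriteLine(JoinForLtrDisplay("\u20AA", "23").Length);    // 3: no mark needed
        Console.WriteLine(JoinForLtrDisplay("\u05E9", "HELLO").Length); // 6: no mark needed
    }
}
```

Only the first case changes the string, which matches the three scenarios in the question.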

向日葵 2024-11-25 19:54:16
string A = "\u05E9"; //A Hebrew letter
string B = "23";
string AB = B + A; //Note the swapped order: digits first.
textBlock1.Text = AB;
textBlock1.FlowDirection = FlowDirection.RightToLeft;
//Output OK: A is displayed to the left of B, as intended.