icu4c: ushape.c is missing a character form when shaping?
In our language we write with Arabic characters, with some differences.
ICU's ushape.c (the Arabic shaper) only works with the main Arabic characters and doesn't shape my language-specific characters (e.g. 0x6D5). I've changed ushape.c to work with my language, and it works well except for one character, 0x649: in Arabic it has only 2 shapes, but in my language it has 4.
I changed line 183 from
1 + 256 * 0x7F,/*0x0649*/
to
1+2+8 + 256 * 0x98 /*0x649*/
and changed line 121 from
static const UChar yehHamzaToYeh[] =
{
/* isolated*/ 0xFEEF,
/* final */ 0xFEF0
};
to
static const UChar yehHamzaToYeh[] =
{
/* isolated */0xFEEF,
0xFBE8, // my language specific
0xFBE9,// my language specific
/* final */ 0xFEF0
};
(both in ushape.c).
Now it produces 3 shapes with no problem (initial, isolated and final), but the medial shape is displayed as a square (missing character).
I tried replacing "* 0x98" with other numbers, but this is the best I can get.
What should I do?
Comments (2)
Uighur? I've discussed Uighur rendering with a couple of people, not this particular issue but in general.
When you say you get a square, what Unicode character do you get?
What you really should do is file a bug with ICU and discuss it there. This is a feature request, not a usage question.
My rusty recollection is that Uighur makes different use of shaping, and you will basically want a different mode on the shaper.
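To answer the "what Unicode character do you get?" question, it helps to dump the shaped buffer as code points instead of looking at the rendered glyphs. Below is a minimal sketch of such a dump. It assumes the shaped text is available as an array of 16-bit UTF-16 code units (uint16_t is used here as a stand-in for ICU's UChar), and format_code_units is a hypothetical helper name, not an ICU function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of ICU): writes each UTF-16 code unit of
 * `text` into `out` as "U+XXXX", separated by single spaces.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if `out` is too small.
 */
int format_code_units(const uint16_t *text, size_t len,
                      char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        /* Prefix every unit except the first with a space. */
        int n = snprintf(out + used, out_size - used, "%sU+%04X",
                         i ? " " : "", (unsigned)text[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1; /* output would be truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling it on the output of the shaper for a word that contains the medial form shows whether the square is a wrong code point coming out of the shaper or a correct code point that the font simply has no glyph for.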
ICU does indeed seem to have problems shaping some languages, e.g. Urdu.
Your specific character U+0649, however, is probably not the character you are looking for.
U+0649 is alef maksura, which looks identical to Farsi yeh (U+06CC), and that one is shaped properly by ICU.
They do have different presentation forms:
Alef maksura only has isolated and final forms: U+FEEF U+FEF0
Farsi yeh has all four forms: U+FBFC U+FBFD U+FBFE U+FBFF
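The forms listed above can be written down as an explicit per-position table. The sketch below is only an illustration and is not how ushape.c stores its data: the struct, the enum and lookup_form are hypothetical names. For U+0649 it also includes U+FBE8 and U+FBE9 from the question, which Unicode names as the Uighur Kazakh Kirghiz alef maksura initial and medial forms. Note that those two sit in Arabic Presentation Forms-A while U+FEEF and U+FEF0 sit in Forms-B, so the four forms of U+0649 are not contiguous, whereas the four Farsi yeh forms are.

```c
#include <stddef.h>
#include <stdint.h>

/* Positional shapes, in the order Unicode lists presentation forms. */
enum shape_pos {
    POS_ISOLATED = 0,
    POS_FINAL    = 1,
    POS_INITIAL  = 2,
    POS_MEDIAL   = 3,
    POS_COUNT    = 4
};

/* Hypothetical table row: a base letter and its four presentation forms. */
struct form_row {
    uint16_t base;
    uint16_t forms[POS_COUNT];
};

static const struct form_row FORM_TABLE[] = {
    /* U+0649 alef maksura: isolated/final in Forms-B,
       initial/medial (Uighur, Kazakh, Kirghiz) in Forms-A. */
    { 0x0649, { 0xFEEF, 0xFEF0, 0xFBE8, 0xFBE9 } },
    /* U+06CC Farsi yeh: four consecutive code points in Forms-A. */
    { 0x06CC, { 0xFBFC, 0xFBFD, 0xFBFE, 0xFBFF } },
};

/*
 * Returns the presentation form of `base` for the given position.
 * Letters not in the table, or out-of-range positions, are returned
 * unchanged.
 */
uint16_t lookup_form(uint16_t base, enum shape_pos pos)
{
    size_t rows = sizeof FORM_TABLE / sizeof FORM_TABLE[0];

    if ((int)pos < 0 || pos >= POS_COUNT) {
        return base;
    }
    for (size_t i = 0; i < rows; i++) {
        if (FORM_TABLE[i].base == base) {
            return FORM_TABLE[i].forms[pos];
        }
    }
    return base;
}
```

For example, lookup_form(0x0649, POS_MEDIAL) yields 0xFBE9 and lookup_form(0x06CC, POS_MEDIAL) yields 0xFBFF. Because the U+0649 forms are split across two blocks, a scheme that derives all four forms from one base code point plus a position index cannot reach them without a special case, which may be related to why only some of the shapes come out right.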