Copying text (Indian language, Gujarati) from a Word document into a web page textarea
I am developing a site in an Indian language (Gujarati).
My problem is as follows:
My client wants to be able to copy Gujarati text from a Word document and paste it into a textarea.
But when I copy text from the Word document and paste it into the textarea, it gets converted to English letters.
http://www.chanakyanipothi.com/gujchanakya/Gopika.ttf
Above is the link to the font I am using.
I can provide demo code for you to work with.
Comments (2)
You should set TextArea's font to Gopika.
I was unable to replicate this: I can paste Gujarati text (ગુજરાતી) into textareas without any apparent issue. The problem may be that you are not actually using Gujarati characters but English characters, or, more probably, that Word is interfering in some way.
I suggest you paste the textarea's code here and, if possible, upload the Word file or a similar example (not that I would be able to read it) so we can try to replicate the issue.
Update:
First test scenario: I pasted some text from Wikipedia that included Gujarati, and it displayed correctly both in the textarea and after posting. Judging by the image, I assume you are doing some testing of your own:
Screenshot: http://img8.imageshack.us/img8/5140/gujaratitest12768464518.png
Second test scenario: I copied the text from the docx file, and when I pasted it into the textarea it appeared as English letters. Why? Because those are not Gujarati characters; they are English characters that look like Gujarati characters.
That is, even though they look Gujarati, they are still the same ASCII codes underneath, and once they are pasted into the textarea as plain text they lose their 'look'. You should try with some real Gujarati text.
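One way to check which case a pasted string falls into is to look at its code points rather than at how it renders. The following is a minimal sketch, assuming the diagnosis above is correct (the Word text is stored as Latin code points and only drawn with Gujarati-looking glyphs). The helper names `is_gujarati_char` and `looks_like_legacy_font_text` are hypothetical and chosen for illustration only.

```python
# Unicode assigns the Gujarati script the block U+0A80..U+0AFF.
GUJARATI_BLOCK = range(0x0A80, 0x0B00)


def is_gujarati_char(ch: str) -> bool:
    """Return True if the character's code point lies in the Gujarati block."""
    return ord(ch) in GUJARATI_BLOCK


def looks_like_legacy_font_text(text: str) -> bool:
    """Hypothetical helper: True if the text is non-empty but contains
    no Gujarati-block characters at all.

    Text typed with a font that maps Latin codes to Gujarati-looking glyphs
    is stored as Latin code points, so it has nothing from the Gujarati block.
    """
    visible = [ch for ch in text if not ch.isspace()]
    return bool(visible) and not any(is_gujarati_char(ch) for ch in visible)


# Real Unicode Gujarati (the sample from this answer) versus plain Latin text.
print(looks_like_legacy_font_text("ગુજરાતી"))  # False
print(looks_like_legacy_font_text("abc"))      # True
```

If a paste from the Word document makes this return `True`, the text was most likely never Unicode Gujarati in the first place, which would match the second test scenario.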
For example, depending on the font you're using, the letter 'a' will look different: it could be drawn as a bird, a tree, a Gujarati character, or a motorcycle for all we care. But if it is copied and pasted somewhere that accepts only plain text rather than font-based text, we will still see the letter 'a', since it is always ASCII character 97. To test this yourself, go to your Word document and press ALT + 97 on the numeric keypad (then let go of ALT). This enters the letter 'a', whether it looks like one or not.
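The same point can be seen outside Word. This short sketch uses Python's standard `unicodedata` module to print the code point and official Unicode name of a character; neither depends on any font. The function name `describe` is a hypothetical helper for illustration.

```python
import unicodedata


def describe(ch: str) -> tuple[int, str]:
    """Return the code point and Unicode name of a single character."""
    return ord(ch), unicodedata.name(ch)


# 'a' is code point 97 no matter which font draws it.
print(describe("a"))   # (97, 'LATIN SMALL LETTER A')

# A real Gujarati letter has its own code point in the Gujarati block.
print(describe("ગ"))   # (2711, 'GUJARATI LETTER GA')

# Going the other way: code 97 always maps back to 'a'.
print(chr(97))         # a
```

A font only decides which shape is drawn for a code point; copying and pasting carries the code point, not the shape.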
I hope that makes sense.
Real Gujarati, Chinese, or any other Unicode text will work correctly. Text that only looks Gujarati because of its font won't, unless you set the textarea's font to that particular font.
If that doesn't convince you, consider this: if you set the textarea to the Gujarati font, any text that really is English, say a comment quoting English, would also be drawn with Gujarati glyphs and would make no sense at all.
Last but not least, open the character map, view the Gujarati font's map, and then view any other font's map. You will see that the characters are in fact the same.
But I give up on trying to convince those who don't want to see.