Converting French to ASCII (French speakers needed)
I need to convert French text to its most correct ASCII analogue. Let me explain: in German you should convert ä to ae, which is not simply removing the diacritic but finding the most correct analogue. Please help me with French. I found that there is no programmatic way to do this, so I am creating a Dictionary<char, string>.
Characters to convert (plus their capitals): é, à, è, ù, â, ê, î, ô, û, ë, ï, ü, ÿ, ç, and any others you suggest! Please write the suggested substitutions in ASCII.
Thanks, Andrey.
PS: Please don't point to "How do I remove diacritics (accents) from a string in .NET?". That method is great but language-agnostic: it just strips diacritics. I plan to use it as a default when I don't have a good analogue.
PPS: Please don't close the question. It is related to programming, since I am implementing a multilingual app.
Comments (2)
As far as I know, when accents aren't available in French (i.e., when converting to ASCII) you simply type the unaccented ASCII letter (unlike German, where you can add an e after the vowel that carries the umlaut). As for the characters you listed, I've never seen ÿ used in French. Don't forget æ and œ.
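The rule described above (drop the accent, keep the base letter, and handle the ligatures separately) could be sketched roughly as follows. This is only an illustration in Python rather than .NET; the names `FRENCH_SPECIAL` and `french_to_ascii` are made up for this example, and the choice of "ae"/"oe" (and "AE"/"OE" for capitals) for the ligatures is an assumption based on common practice, not something stated in the thread. A .NET `Dictionary<char, string>` with a diacritic-stripping fallback would follow the same shape.

```python
import unicodedata

# Hypothetical table: characters that do NOT decompose into
# "base letter + combining accent", so they need explicit entries.
# The ae/oe spellings are an assumption (common practice).
FRENCH_SPECIAL = {
    "æ": "ae", "Æ": "AE",
    "œ": "oe", "Œ": "OE",
}

def french_to_ascii(text: str) -> str:
    """Replace ligatures from the table, then strip accents from the rest."""
    out = []
    for ch in text:
        if ch in FRENCH_SPECIAL:
            out.append(FRENCH_SPECIAL[ch])
            continue
        # NFD splits e.g. "é" into "e" followed by a combining acute accent.
        decomposed = unicodedata.normalize("NFD", ch)
        # Keep only the non-combining code points (the base letter).
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)

print(french_to_ascii("Élève, cœur, français, Noël"))
```

Characters that have neither a table entry nor a decomposition are passed through unchanged, so the result is only guaranteed to be ASCII for the French characters discussed here.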
Normally, when accents aren't available, we simply don't write them.
If you want to retain the information, you need to use some kind of encoding that indicates which character set is being used, and use more than ASCII (that is, use characters 128 to 255 of the charset).
Alternatively, you could encode in a form of your own. SPARCstations had a way of entering accented characters:
But it's an encoding method, for storing the data, not a transliteration method, for writing it down for French readers. I'm afraid we haven't adopted an alternative to the accents yet.
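To make the point about "characters 128 to 255 of the charset" concrete, here is a small Python sketch. It assumes ISO-8859-1 (Latin-1) as the character set, which the answer does not actually name, and the helper `fits_in_ascii` is a hypothetical name used only here.

```python
def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text is representable in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

word = "élève"

# In Latin-1 (an assumed charset) each accented letter is one byte >= 128.
encoded = word.encode("latin-1")
high_bytes = [b for b in encoded if b >= 128]

print(fits_in_ascii(word))        # False: é and è are outside ASCII
print(high_bytes)                 # [233, 232], i.e. 0xE9 and 0xE8
print(encoded.decode("latin-1"))  # round-trips back to "élève"
```

Note that this only stores the accents; it does not produce anything readable in pure ASCII. Also, Latin-1 itself has no œ, so a real application would probably pick a different charset or a Unicode encoding.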