Converting hex UTF-8 bytes to a hex code point

Posted 2024-12-09 12:33:41

How can I convert the hex UTF-8 bytes E0 A4 A4 to the hex code point 0924?

Reference: http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=e0+a4+a4&mode=bytes

I need this because when I read Unicode data in C#, it is treated as a single-byte sequence and displays 3 characters instead of 1. I need it treated as a 3-byte sequence (read 3 bytes and display a single character). I have tried many solutions without success.

If I can display or store a 3-byte UTF-8 character as it is, then I don't need the conversion.

The scenario is like this:

    string str = getivrresult();

In str I have a word in which each character is a 3-byte UTF-8 sequence.

Edit:

    string str = "\u00E0\u00A4\u00A4"; // currently the three characters à ¤ ¤
    // I want it as "त" in str.

    Character                                   त
    Character name                              DEVANAGARI LETTER TA
    Hex code point                              0924
    Decimal code point                          2340
    Hex UTF-8 bytes                             E0 A4 A4
    Octal UTF-8 bytes                           340 244 244
    UTF-8 bytes as Latin-1 characters bytes     à ¤ ¤  

Thank You.
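
The mapping from E0 A4 A4 to 0924 in the table above follows from the UTF-8 bit layout for 3-byte sequences, 1110xxxx 10xxxxxx 10xxxxxx: the 4 + 6 + 6 payload bits are concatenated to form the code point. Below is a minimal sketch of that arithmetic in C#, assuming a well-formed 3-byte sequence. The names `Utf8Manual` and `DecodeThreeBytes` are illustrative and not part of any library, and the sketch does not reject overlong or surrogate encodings.

```csharp
using System;

public static class Utf8Manual
{
    // Decodes one 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx)
    // into its code point. Hypothetical helper, for illustration only.
    public static int DecodeThreeBytes(byte b0, byte b1, byte b2)
    {
        // Check the marker bits of the lead byte and both continuation bytes.
        if ((b0 & 0xF0) != 0xE0 || (b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80)
            throw new ArgumentException("Not a 3-byte UTF-8 sequence.");

        return ((b0 & 0x0F) << 12)   // 4 payload bits from the lead byte
             | ((b1 & 0x3F) << 6)    // 6 payload bits from the first continuation byte
             | (b2 & 0x3F);          // 6 payload bits from the second continuation byte
    }
}
```

For the bytes in the question, `Utf8Manual.DecodeThreeBytes(0xE0, 0xA4, 0xA4).ToString("X4")` gives `0924`.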

Comments (2)

谁的新欢旧爱 2024-12-16 12:33:42

Use the GetString method of the Encoding class:

    // Requires: using System.Text;
    byte[] data = { 0xE0, 0xA4, 0xA4 };
    string str = Encoding.UTF8.GetString(data);

The string now contains a single character with the character code 0x924.
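
To get from the decoded string to the hex code point the question asks for, one option is `char.ConvertToUtf32`, which returns the code point at a given index as an `int`. Below is a small sketch. `CodePointDemo` and `HexCodePoint` are hypothetical names, and the sketch assumes the input is valid UTF-8 that decodes to at least one character.

```csharp
using System.Text;

public static class CodePointDemo
{
    // Decodes UTF-8 bytes and returns the first code point as hex,
    // padded to at least four digits.
    public static string HexCodePoint(byte[] utf8Bytes)
    {
        string s = Encoding.UTF8.GetString(utf8Bytes);
        // ConvertToUtf32 also combines a surrogate pair into one code point.
        int codePoint = char.ConvertToUtf32(s, 0);
        return codePoint.ToString("X4");
    }
}
```

Calling `CodePointDemo.HexCodePoint(new byte[] { 0xE0, 0xA4, 0xA4 })` returns `"0924"`.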

那小子欠揍 2024-12-16 12:33:42

    // Requires: using System; using System.Text;
    // Input: UTF-8 bytes that were misread one byte per char (à ¤ ¤).
    string str = "\u00E0\u00A4\u00A4";
    byte[] data = new byte[str.Length];

    for (int i = 0; i < str.Length; i++)
    {
        // Each char holds one original byte value; Convert.ToByte throws
        // OverflowException if a char is outside the range 0x00-0xFF.
        data[i] = Convert.ToByte(str[i]);
    }

    // Decode the recovered bytes as UTF-8; stp now contains "त".
    string stp = Encoding.UTF8.GetString(data);
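
The loop above turns each char value back into a byte, which is what encoding the string as ISO-8859-1 (Latin-1) does. Below is a shorter sketch of the same repair, assuming the bytes were originally misread as Latin-1. If they were misread as Windows-1252 or another code page, that encoding would be needed instead. `MojibakeRepair` and `Reinterpret` are hypothetical names.

```csharp
using System.Text;

public static class MojibakeRepair
{
    // Re-encodes the misread string as Latin-1 to recover the raw bytes,
    // then decodes those bytes as UTF-8.
    public static string Reinterpret(string misread)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] raw = latin1.GetBytes(misread);
        return Encoding.UTF8.GetString(raw);
    }
}
```

`MojibakeRepair.Reinterpret("\u00E0\u00A4\u00A4")` returns the single character U+0924. One difference from the loop: characters above U+00FF are silently replaced with `?` by the Latin-1 encoder rather than raising an exception, so this variant suits input already known to be byte-per-char.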