Extending UTF-8 encoding

Posted on 2025-01-20 07:59:53

I'm trying to encode and decode a string. I realised that some characters are encoded as 2 bytes and others as only 1 byte. Is there a way to pad those 1-byte characters with leading zeros?

        // Assumes: using System; using System.Collections.Generic; using System.Text;
        public byte[] ToByte()
        {
            List<byte> result = new List<byte>();

            // The first four bytes hold the command
            result.AddRange(BitConverter.GetBytes((int)cmdCommand));

            // Length of the name (String.Length counts UTF-16 code units, not encoded bytes)
            if (strName != null)
                result.AddRange(BitConverter.GetBytes(strName.Length));
            else
                result.AddRange(BitConverter.GetBytes(0));

            // Length of the message (again in UTF-16 code units)
            if (strMessage != null)
                result.AddRange(BitConverter.GetBytes(strMessage.Length));
            else
                result.AddRange(BitConverter.GetBytes(0));

            //Console.WriteLine("name length: " + strName.Length + "  message length: " + strMessage.Length);

            // The name, encoded as UTF-32 (4 bytes per code point)
            if (strName != null)
                result.AddRange(Encoding.UTF32.GetBytes(strName));

            // Finally, the message text, also encoded as UTF-32
            if (strMessage != null)
                result.AddRange(Encoding.UTF32.GetBytes(strMessage));

            return result.ToArray();
        }

Comments (1)

千鲤 2025-01-27 07:59:53

Just prefixing 1-byte characters with a 0 byte would not convert UTF-8 into any useful encoding that I know of.
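To illustrate the point, here is a minimal sketch that pads only the single-byte (ASCII) characters of a UTF-8 byte array with a leading zero and compares the result with real big-endian UTF-16. The class `ZeroPaddingDemo` and its two methods are hypothetical names made up for this example; only the `System.Text.Encoding` calls are framework APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ZeroPaddingDemo
{
    // Hypothetical helper: puts a 0x00 byte in front of every single-byte
    // (ASCII) character and copies multi-byte UTF-8 sequences unchanged.
    public static byte[] PadAsciiBytes(byte[] utf8Bytes)
    {
        var result = new List<byte>();
        foreach (byte b in utf8Bytes)
        {
            // In UTF-8, values below 0x80 only ever occur as 1-byte characters.
            if (b < 0x80)
                result.Add(0x00);
            result.Add(b);
        }
        return result.ToArray();
    }

    // Returns true when the padded UTF-8 bytes happen to equal
    // the big-endian UTF-16 encoding of the same text.
    public static bool PaddingMatchesUtf16(string text)
    {
        byte[] padded = PadAsciiBytes(Encoding.UTF8.GetBytes(text));
        byte[] utf16 = Encoding.BigEndianUnicode.GetBytes(text);
        return padded.SequenceEqual(utf16);
    }
}
```

`ZeroPaddingDemo.PaddingMatchesUtf16("A")` returns true, because `00 41` is also the big-endian UTF-16 form of "A". For "é" it returns false: the UTF-8 bytes are `C3 A9`, while UTF-16 stores the code point U+00E9 as `00 E9`. Padding therefore only appears to work for pure ASCII text.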

In addition, UTF-8 is a variable-width encoding that uses 1, 2, 3 or 4 bytes per character (see https://en.wikipedia.org/wiki/UTF-8). If your input is a UTF-8 byte array and you want to convert it to UTF-16 (2 bytes per character within the Basic Multilingual Plane and 4 bytes for characters outside it, so not strictly fixed-width either), you could use something like this:

// Decode the UTF-8 bytes into a .NET string
var inputAsString = System.Text.Encoding.UTF8.GetString(inputByteArray);
// Re-encode the string as UTF-16 (little-endian)
var utf16ByteArray = System.Text.Encoding.GetEncoding("utf-16").GetBytes(inputAsString);
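To make the byte widths concrete, the following sketch reports how many bytes a string takes in UTF-8, UTF-16 and UTF-32, and does the UTF-8 to UTF-16 conversion in one step with `Encoding.Convert`. The class name `EncodingWidths` and its method names are assumptions chosen for illustration.

```csharp
using System;
using System.Text;

public static class EncodingWidths
{
    // Returns the encoded size of the text, in bytes, for three encodings.
    public static (int Utf8, int Utf16, int Utf32) ByteCounts(string text)
    {
        return (Encoding.UTF8.GetByteCount(text),
                Encoding.Unicode.GetByteCount(text),   // Encoding.Unicode is UTF-16 little-endian
                Encoding.UTF32.GetByteCount(text));
    }

    // Converts a UTF-8 byte array directly to UTF-16 (little-endian) bytes.
    public static byte[] ConvertUtf8ToUtf16(byte[] utf8Bytes)
    {
        return Encoding.Convert(Encoding.UTF8, Encoding.Unicode, utf8Bytes);
    }
}
```

For example, `EncodingWidths.ByteCounts("A")` yields (1, 2, 4) and `EncodingWidths.ByteCounts("€")` yields (3, 2, 4). A character outside the Basic Multilingual Plane such as "\U0001F600" yields (4, 4, 4). Of the three, only UTF-32 uses the same number of bytes for every code point, which is what the `Encoding.UTF32` calls in the question already rely on.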

Here is a list of encodings you can use in .NET: https://learn.microsoft.com/en-us/dotnet/api/system.text.encoding?view=net-6.0#list-of-encodings. Alternatively, you can check which encodings are available on your system by calling System.Text.Encoding.GetEncodings().
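As a rough sketch of that check, the helper below collects the names reported by `Encoding.GetEncodings()`. The class name `AvailableEncodings` is hypothetical. The result depends on the runtime: as far as I know, .NET Core and .NET 5+ report only a small built-in set unless an additional encoding provider has been registered, whereas .NET Framework reports many more.

```csharp
using System;
using System.Linq;
using System.Text;

public static class AvailableEncodings
{
    // Returns the names of all encodings the current runtime reports,
    // sorted alphabetically without regard to case.
    public static string[] Names()
    {
        return Encoding.GetEncodings()
                       .Select(info => info.Name)
                       .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
                       .ToArray();
    }
}
```

Printing the list is then a one-liner: `foreach (var name in AvailableEncodings.Names()) Console.WriteLine(name);`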
