C#:用于解码 Quoted-Printable 编码的类?

发布于 2024-08-20 16:55:53 字数 900 浏览 5 评论 0原文

C# 中是否存在可以将 Quoted-Printable 编码转换为 String 的现有类?单击上面的链接以获取有关编码的更多信息。

以下内容摘自上述链接,方便您参考。

可以对任何 8 位字节值进行编码 包含 3 个字符,后跟一个“=” 两个十六进制数字(0–9 或 A–F) 表示字节的数值。 例如,US-ASCII 换页符 字符(十进制值12)可以是 由“=0C”和 US-ASCII 表示 等号(十进制值 61)是 用“=3D”表示。所有角色 除了可打印的 ASCII 字符或 行尾字符必须进行编码 以这种方式。

所有可打印的 ASCII 字符 (33 到 126 之间的十进制值) 可以由他们自己代表, 除了“=”(十进制 61)。

ASCII 制表符和空格字符, 十进制值 9 和 32 可能是 由他们自己代表,除非 这些字符出现在末尾 一条线。如果这些字符之一 出现在一行的末尾,它必须 编码为“=09”(制表符)或“=20” (空格)。

如果正在编码的数据包含 有意义的换行符,它们必须是 编码为 ASCII CR LF 序列, 不作为它们的原始字节值。 相反,如果字节值 13 和 10 具有行尾以外的含义 那么它们必须被编码为 =0D 并且 =0A。

引用可打印编码数据行 不得超过 76 个字符。 为了满足这个要求,无需 改变编码文本,软线 可以根据需要添加中断。软软的 换行符由“=”组成 编码行的末尾,并且不 导致解码后换行 文本。

Is there an existing class in C# that can convert Quoted-Printable encoding to String? Click on the above link to get more information on the encoding.

The following is quoted from the above link for your convenience.

Any 8-bit byte value may be encoded
with 3 characters, an "=" followed by
two hexadecimal digits (0–9 or A–F)
representing the byte's numeric value.
For example, a US-ASCII form feed
character (decimal value 12) can be
represented by "=0C", and a US-ASCII
equal sign (decimal value 61) is
represented by "=3D". All characters
except printable ASCII characters or
end of line characters must be encoded
in this fashion.

All printable ASCII characters
(decimal values between 33 and 126)
may be represented by themselves,
except "=" (decimal 61).

ASCII tab and space characters,
decimal values 9 and 32, may be
represented by themselves, except if
these characters appear at the end of
a line. If one of these characters
appears at the end of a line it must
be encoded as "=09" (tab) or "=20"
(space).

If the data being encoded contains
meaningful line breaks, they must be
encoded as an ASCII CR LF sequence,
not as their original byte values.
Conversely if byte values 13 and 10
have meanings other than end of line
then they must be encoded as =0D and
=0A.

Lines of quoted-printable encoded data
must not be longer than 76 characters.
To satisfy this requirement without
altering the encoded text, soft line
breaks may be added as desired. A soft
line break consists of an "=" at the
end of an encoded line, and does not
cause a line break in the decoded
text.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(15

作业与我同在 2024-08-27 16:55:53

框架库中有执行此操作的功能,但它似乎没有完全公开。该实现位于内部类System.Net.Mime.QuotedPrintableStream中。此类定义了一个名为 DecodeBytes 的方法,它可以执行您想要的操作。该方法似乎仅由一种用于解码 MIME 标头的方法使用。该方法也是内部方法,但在几个地方相当直接地调用,例如 Attachment.Name setter。演示:

using System;
using System.Net.Mail;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Attachment attachment = Attachment.CreateAttachmentFromString("", "=?iso-8859-1?Q?=A1Hola,_se=F1or!?=");
            Console.WriteLine(attachment.Name);
        }
    }
}

产生输出:

¡你好,_先生!

您可能需要进行一些测试以确保回车符等得到正确处理,尽管在快速测试中我似乎确实如此。但是,依赖此功能可能并不明智,除非您的用例足够接近 MIME 标头字符串的解码,并且您认为对库所做的任何更改都不会破坏它。您最好编写自己的引用可打印解码器。

There is functionality in the framework libraries to do this, but it doesn't appear to be cleanly exposed. The implementation is in the internal class System.Net.Mime.QuotedPrintableStream. This class defines a method called DecodeBytes which does what you want. The method appears to be used by only one method which is used to decode MIME headers. This method is also internal, but is called fairly directly in a couple of places, e.g., the Attachment.Name setter. A demonstration:

using System;
using System.Net.Mail;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Attachment attachment = Attachment.CreateAttachmentFromString("", "=?iso-8859-1?Q?=A1Hola,_se=F1or!?=");
            Console.WriteLine(attachment.Name);
        }
    }
}

Produces the output:

¡Hola,_señor!

You may have to do some testing to ensure carriage returns, etc are treated correctly although in a quick test I did they seem to be. However, it may not be wise to rely on this functionality unless your use-case is close enough to decoding of a MIME header string that you don't think it will be broken by any changes made to the library. You might be better off writing your own quoted-printable decoder.

故事灯 2024-08-27 16:55:53

我扩展了马丁·墨菲的解决方案,我希望它适用于所有情况。

private static string DecodeQuotedPrintables(string input, string charSet)
{           
    if (string.IsNullOrEmpty(charSet))
    {
        var charSetOccurences = new Regex(@"=\?.*\?Q\?", RegexOptions.IgnoreCase);
        var charSetMatches = charSetOccurences.Matches(input);
        foreach (Match match in charSetMatches)
        {
            charSet = match.Groups[0].Value.Replace("=?", "").Replace("?Q?", "");
            input = input.Replace(match.Groups[0].Value, "").Replace("?=", "");
        }
    }

    Encoding enc = new ASCIIEncoding();
    if (!string.IsNullOrEmpty(charSet))
    {
        try
        {
            enc = Encoding.GetEncoding(charSet);
        }
        catch
        {
            enc = new ASCIIEncoding();
        }
    }

    //decode iso-8859-[0-9]
    var occurences = new Regex(@"=[0-9A-Z]{2}", RegexOptions.Multiline);
    var matches = occurences.Matches(input);
    foreach (Match match in matches)
    {
        try
        {
            byte[] b = new byte[] { byte.Parse(match.Groups[0].Value.Substring(1), System.Globalization.NumberStyles.AllowHexSpecifier) };
            char[] hexChar = enc.GetChars(b);
            input = input.Replace(match.Groups[0].Value, hexChar[0].ToString());
        }
        catch { }
    }

    //decode base64String (utf-8?B?)
    occurences = new Regex(@"\?utf-8\?B\?.*\?", RegexOptions.IgnoreCase);
    matches = occurences.Matches(input);
    foreach (Match match in matches)
    {
        byte[] b = Convert.FromBase64String(match.Groups[0].Value.Replace("?utf-8?B?", "").Replace("?UTF-8?B?", "").Replace("?", ""));
        string temp = Encoding.UTF8.GetString(b);
        input = input.Replace(match.Groups[0].Value, temp);
    }

    input = input.Replace("=\r\n", "");
    return input;
}

I extended the solution of Martin Murphy and I hope it will work in every case.

private static string DecodeQuotedPrintables(string input, string charSet)
{           
    if (string.IsNullOrEmpty(charSet))
    {
        var charSetOccurences = new Regex(@"=\?.*\?Q\?", RegexOptions.IgnoreCase);
        var charSetMatches = charSetOccurences.Matches(input);
        foreach (Match match in charSetMatches)
        {
            charSet = match.Groups[0].Value.Replace("=?", "").Replace("?Q?", "");
            input = input.Replace(match.Groups[0].Value, "").Replace("?=", "");
        }
    }

    Encoding enc = new ASCIIEncoding();
    if (!string.IsNullOrEmpty(charSet))
    {
        try
        {
            enc = Encoding.GetEncoding(charSet);
        }
        catch
        {
            enc = new ASCIIEncoding();
        }
    }

    //decode iso-8859-[0-9]
    var occurences = new Regex(@"=[0-9A-Z]{2}", RegexOptions.Multiline);
    var matches = occurences.Matches(input);
    foreach (Match match in matches)
    {
        try
        {
            byte[] b = new byte[] { byte.Parse(match.Groups[0].Value.Substring(1), System.Globalization.NumberStyles.AllowHexSpecifier) };
            char[] hexChar = enc.GetChars(b);
            input = input.Replace(match.Groups[0].Value, hexChar[0].ToString());
        }
        catch { }
    }

    //decode base64String (utf-8?B?)
    occurences = new Regex(@"\?utf-8\?B\?.*\?", RegexOptions.IgnoreCase);
    matches = occurences.Matches(input);
    foreach (Match match in matches)
    {
        byte[] b = Convert.FromBase64String(match.Groups[0].Value.Replace("?utf-8?B?", "").Replace("?UTF-8?B?", "").Replace("?", ""));
        string temp = Encoding.UTF8.GetString(b);
        input = input.Replace(match.Groups[0].Value, temp);
    }

    input = input.Replace("=\r\n", "");
    return input;
}
烟织青萝梦 2024-08-27 16:55:53

我写得很快。

    public static string DecodeQuotedPrintables(string input)
    {
        var occurences = new Regex(@"=[0-9A-H]{2}", RegexOptions.Multiline);
        var matches = occurences.Matches(input);
        var uniqueMatches = new HashSet<string>(matches);
        foreach (string match in uniqueMatches)
        {
            char hexChar= (char) Convert.ToInt32(match.Substring(1), 16);
            input =input.Replace(match, hexChar.ToString());
        }
        return input.Replace("=\r\n", "");
    }       

I wrote this up real quick.

    public static string DecodeQuotedPrintables(string input)
    {
        var occurences = new Regex(@"=[0-9A-H]{2}", RegexOptions.Multiline);
        var matches = occurences.Matches(input);
        var uniqueMatches = new HashSet<string>(matches);
        foreach (string match in uniqueMatches)
        {
            char hexChar= (char) Convert.ToInt32(match.Substring(1), 16);
            input =input.Replace(match, hexChar.ToString());
        }
        return input.Replace("=\r\n", "");
    }       
非要怀念 2024-08-27 16:55:53

我一直在寻找动态解决方案,并花了 2 天尝试不同的解决方案。此解决方案将支持日语字符和其他标准字符集

private static string Decode(string input, string bodycharset) {
        var i = 0;
        var output = new List<byte>();
        while (i < input.Length) {
            if (input[i] == '=' && input[i + 1] == '\r' && input[i + 2] == '\n') {
                //Skip
                i += 3;
            } else if (input[i] == '=') {
                string sHex = input;
                sHex = sHex.Substring(i + 1, 2);
                int hex = Convert.ToInt32(sHex, 16);
                byte b = Convert.ToByte(hex);
                output.Add(b);
                i += 3;
            } else {
                output.Add((byte)input[i]);
                i++;
            }
        }


        if (String.IsNullOrEmpty(bodycharset))
            return Encoding.UTF8.GetString(output.ToArray());
        else {
            if (String.Compare(bodycharset, "ISO-2022-JP", true) == 0)
                return Encoding.GetEncoding("Shift_JIS").GetString(output.ToArray());
            else
                return Encoding.GetEncoding(bodycharset).GetString(output.ToArray());
        }

    }

然后您可以使用此调用该函数

Decode("=E3=82=AB=E3=82=B9=E3", "utf-8")

最初发现此处

I was looking for a dynamic solution and spent 2 days trying different solutions. This solution will support Japanese characters and other standard character sets

private static string Decode(string input, string bodycharset) {
        var i = 0;
        var output = new List<byte>();
        while (i < input.Length) {
            if (input[i] == '=' && input[i + 1] == '\r' && input[i + 2] == '\n') {
                //Skip
                i += 3;
            } else if (input[i] == '=') {
                string sHex = input;
                sHex = sHex.Substring(i + 1, 2);
                int hex = Convert.ToInt32(sHex, 16);
                byte b = Convert.ToByte(hex);
                output.Add(b);
                i += 3;
            } else {
                output.Add((byte)input[i]);
                i++;
            }
        }


        if (String.IsNullOrEmpty(bodycharset))
            return Encoding.UTF8.GetString(output.ToArray());
        else {
            if (String.Compare(bodycharset, "ISO-2022-JP", true) == 0)
                return Encoding.GetEncoding("Shift_JIS").GetString(output.ToArray());
            else
                return Encoding.GetEncoding(bodycharset).GetString(output.ToArray());
        }

    }

Then you can call the function with

Decode("=E3=82=AB=E3=82=B9=E3", "utf-8")

This was originally found here

她如夕阳 2024-08-27 16:55:53

如果您使用 UTF-8 编码解码引用的可打印序列,则需要注意,如果存在一起运行的引用的可打印字符,则无法一次解码每个引用的可打印序列,正如其他人所显示的那样。

例如 - 如果您有以下序列 =E2=80=99 并使用 UTF8 一次解码它,您会得到三个“奇怪”的字符 - 如果您改为构建一个三个字节的数组并将这三个字节转换为UTF8 编码你会得到一个单撇号。

显然,如果您使用 ASCII 编码,那么一次一个是没有问题的,但是解码运行意味着无论使用什么文本编码器,您的代码都可以工作。

哦,别忘了 =3D 是一种特殊情况,这意味着您需要再次解码您拥有的任何内容......这是一个疯狂的陷阱!

希望有帮助

If you are decoding quoted-printable with UTF-8 encoding you will need to be aware that you cannot decode each quoted-printable sequence one-at-a-time as the others have shown if there are runs of quoted printable characters together.

For example - if you have the following sequence =E2=80=99 and decode this using UTF8 one-at-a-time you get three "weird" characters - if you instead build an array of three bytes and convert the three bytes with the UTF8 encoding you get a single aphostrope.

Obviously if you are using ASCII encoding then one-at-a-time is no problem however decoding runs means your code will work regardless of the text encoder used.

Oh and don't forget =3D is a special case that means you need to decode whatever you have one more time... That is a crazy gotcha!

Hope that helps

合约呢 2024-08-27 16:55:53
    private string quotedprintable(string data, string encoding)
    {
        data = data.Replace("=\r\n", "");
        for (int position = -1; (position = data.IndexOf("=")) != -1;)
        {
            string leftpart = data.Substring(0, position);
            System.Collections.ArrayList hex = new System.Collections.ArrayList();
            hex.Add(data.Substring(1 + position, 2));
            while (position + 3 < data.Length && data.Substring(position + 3, 1) == "=")
            {
                position = position + 3;
                hex.Add(data.Substring(1 + position, 2));
            }
            byte[] bytes = new byte[hex.Count];
            for (int i = 0; i < hex.Count; i++)
            {
                bytes[i] = System.Convert.ToByte(new string(((string)hex[i]).ToCharArray()), 16);
            }
            string equivalent = System.Text.Encoding.GetEncoding(encoding).GetString(bytes);
            string rightpart = data.Substring(position + 3);
            data = leftpart + equivalent + rightpart;
        }
        return data;
    }
    private string quotedprintable(string data, string encoding)
    {
        data = data.Replace("=\r\n", "");
        for (int position = -1; (position = data.IndexOf("=")) != -1;)
        {
            string leftpart = data.Substring(0, position);
            System.Collections.ArrayList hex = new System.Collections.ArrayList();
            hex.Add(data.Substring(1 + position, 2));
            while (position + 3 < data.Length && data.Substring(position + 3, 1) == "=")
            {
                position = position + 3;
                hex.Add(data.Substring(1 + position, 2));
            }
            byte[] bytes = new byte[hex.Count];
            for (int i = 0; i < hex.Count; i++)
            {
                bytes[i] = System.Convert.ToByte(new string(((string)hex[i]).ToCharArray()), 16);
            }
            string equivalent = System.Text.Encoding.GetEncoding(encoding).GetString(bytes);
            string rightpart = data.Substring(position + 3);
            data = leftpart + equivalent + rightpart;
        }
        return data;
    }
淡看悲欢离合 2024-08-27 16:55:53

这个引用的可打印解码器效果很好!

public static byte[] FromHex(byte[] hexData)
    {
        if (hexData == null)
        {
            throw new ArgumentNullException("hexData");
        }

        if (hexData.Length < 2 || (hexData.Length / (double)2 != Math.Floor(hexData.Length / (double)2)))
        {
            throw new Exception("Illegal hex data, hex data must be in two bytes pairs, for example: 0F,FF,A3,... .");
        }

        MemoryStream retVal = new MemoryStream(hexData.Length / 2);
        // Loop hex value pairs
        for (int i = 0; i < hexData.Length; i += 2)
        {
            byte[] hexPairInDecimal = new byte[2];
            // We need to convert hex char to decimal number, for example F = 15
            for (int h = 0; h < 2; h++)
            {
                if (((char)hexData[i + h]) == '0')
                {
                    hexPairInDecimal[h] = 0;
                }
                else if (((char)hexData[i + h]) == '1')
                {
                    hexPairInDecimal[h] = 1;
                }
                else if (((char)hexData[i + h]) == '2')
                {
                    hexPairInDecimal[h] = 2;
                }
                else if (((char)hexData[i + h]) == '3')
                {
                    hexPairInDecimal[h] = 3;
                }
                else if (((char)hexData[i + h]) == '4')
                {
                    hexPairInDecimal[h] = 4;
                }
                else if (((char)hexData[i + h]) == '5')
                {
                    hexPairInDecimal[h] = 5;
                }
                else if (((char)hexData[i + h]) == '6')
                {
                    hexPairInDecimal[h] = 6;
                }
                else if (((char)hexData[i + h]) == '7')
                {
                    hexPairInDecimal[h] = 7;
                }
                else if (((char)hexData[i + h]) == '8')
                {
                    hexPairInDecimal[h] = 8;
                }
                else if (((char)hexData[i + h]) == '9')
                {
                    hexPairInDecimal[h] = 9;
                }
                else if (((char)hexData[i + h]) == 'A' || ((char)hexData[i + h]) == 'a')
                {
                    hexPairInDecimal[h] = 10;
                }
                else if (((char)hexData[i + h]) == 'B' || ((char)hexData[i + h]) == 'b')
                {
                    hexPairInDecimal[h] = 11;
                }
                else if (((char)hexData[i + h]) == 'C' || ((char)hexData[i + h]) == 'c')
                {
                    hexPairInDecimal[h] = 12;
                }
                else if (((char)hexData[i + h]) == 'D' || ((char)hexData[i + h]) == 'd')
                {
                    hexPairInDecimal[h] = 13;
                }
                else if (((char)hexData[i + h]) == 'E' || ((char)hexData[i + h]) == 'e')
                {
                    hexPairInDecimal[h] = 14;
                }
                else if (((char)hexData[i + h]) == 'F' || ((char)hexData[i + h]) == 'f')
                {
                    hexPairInDecimal[h] = 15;
                }
            }

            // Join hex 4 bit(left hex cahr) + 4bit(right hex char) in bytes 8 it
            retVal.WriteByte((byte)((hexPairInDecimal[0] << 4) | hexPairInDecimal[1]));
        }

        return retVal.ToArray();
    }
    public static byte[] QuotedPrintableDecode(byte[] data)
    {
        if (data == null)
        {
            throw new ArgumentNullException("data");
        }

        MemoryStream msRetVal = new MemoryStream();
        MemoryStream msSourceStream = new MemoryStream(data);

        int b = msSourceStream.ReadByte();
        while (b > -1)
        {
            // Encoded 8-bit byte(=XX) or soft line break(=CRLF)
            if (b == '=')
            {
                byte[] buffer = new byte[2];
                int nCount = msSourceStream.Read(buffer, 0, 2);
                if (nCount == 2)
                {
                    // Soft line break, line splitted, just skip CRLF
                    if (buffer[0] == '\r' && buffer[1] == '\n')
                    {
                    }
                    // This must be encoded 8-bit byte
                    else
                    {
                        try
                        {
                            msRetVal.Write(FromHex(buffer), 0, 1);
                        }
                        catch
                        {
                            // Illegal value after =, just leave it as is
                            msRetVal.WriteByte((byte)'=');
                            msRetVal.Write(buffer, 0, 2);
                        }
                    }
                }
                // Illegal =, just leave as it is
                else
                {
                    msRetVal.Write(buffer, 0, nCount);
                }
            }
            // Just write back all other bytes
            else
            {
                msRetVal.WriteByte((byte)b);
            }

            // Read next byte
            b = msSourceStream.ReadByte();
        }

        return msRetVal.ToArray();
    }

This Quoted Printable Decoder works great!

public static byte[] FromHex(byte[] hexData)
    {
        if (hexData == null)
        {
            throw new ArgumentNullException("hexData");
        }

        if (hexData.Length < 2 || (hexData.Length / (double)2 != Math.Floor(hexData.Length / (double)2)))
        {
            throw new Exception("Illegal hex data, hex data must be in two bytes pairs, for example: 0F,FF,A3,... .");
        }

        MemoryStream retVal = new MemoryStream(hexData.Length / 2);
        // Loop hex value pairs
        for (int i = 0; i < hexData.Length; i += 2)
        {
            byte[] hexPairInDecimal = new byte[2];
            // We need to convert hex char to decimal number, for example F = 15
            for (int h = 0; h < 2; h++)
            {
                if (((char)hexData[i + h]) == '0')
                {
                    hexPairInDecimal[h] = 0;
                }
                else if (((char)hexData[i + h]) == '1')
                {
                    hexPairInDecimal[h] = 1;
                }
                else if (((char)hexData[i + h]) == '2')
                {
                    hexPairInDecimal[h] = 2;
                }
                else if (((char)hexData[i + h]) == '3')
                {
                    hexPairInDecimal[h] = 3;
                }
                else if (((char)hexData[i + h]) == '4')
                {
                    hexPairInDecimal[h] = 4;
                }
                else if (((char)hexData[i + h]) == '5')
                {
                    hexPairInDecimal[h] = 5;
                }
                else if (((char)hexData[i + h]) == '6')
                {
                    hexPairInDecimal[h] = 6;
                }
                else if (((char)hexData[i + h]) == '7')
                {
                    hexPairInDecimal[h] = 7;
                }
                else if (((char)hexData[i + h]) == '8')
                {
                    hexPairInDecimal[h] = 8;
                }
                else if (((char)hexData[i + h]) == '9')
                {
                    hexPairInDecimal[h] = 9;
                }
                else if (((char)hexData[i + h]) == 'A' || ((char)hexData[i + h]) == 'a')
                {
                    hexPairInDecimal[h] = 10;
                }
                else if (((char)hexData[i + h]) == 'B' || ((char)hexData[i + h]) == 'b')
                {
                    hexPairInDecimal[h] = 11;
                }
                else if (((char)hexData[i + h]) == 'C' || ((char)hexData[i + h]) == 'c')
                {
                    hexPairInDecimal[h] = 12;
                }
                else if (((char)hexData[i + h]) == 'D' || ((char)hexData[i + h]) == 'd')
                {
                    hexPairInDecimal[h] = 13;
                }
                else if (((char)hexData[i + h]) == 'E' || ((char)hexData[i + h]) == 'e')
                {
                    hexPairInDecimal[h] = 14;
                }
                else if (((char)hexData[i + h]) == 'F' || ((char)hexData[i + h]) == 'f')
                {
                    hexPairInDecimal[h] = 15;
                }
            }

            // Join hex 4 bit(left hex cahr) + 4bit(right hex char) in bytes 8 it
            retVal.WriteByte((byte)((hexPairInDecimal[0] << 4) | hexPairInDecimal[1]));
        }

        return retVal.ToArray();
    }
    public static byte[] QuotedPrintableDecode(byte[] data)
    {
        if (data == null)
        {
            throw new ArgumentNullException("data");
        }

        MemoryStream msRetVal = new MemoryStream();
        MemoryStream msSourceStream = new MemoryStream(data);

        int b = msSourceStream.ReadByte();
        while (b > -1)
        {
            // Encoded 8-bit byte(=XX) or soft line break(=CRLF)
            if (b == '=')
            {
                byte[] buffer = new byte[2];
                int nCount = msSourceStream.Read(buffer, 0, 2);
                if (nCount == 2)
                {
                    // Soft line break, line splitted, just skip CRLF
                    if (buffer[0] == '\r' && buffer[1] == '\n')
                    {
                    }
                    // This must be encoded 8-bit byte
                    else
                    {
                        try
                        {
                            msRetVal.Write(FromHex(buffer), 0, 1);
                        }
                        catch
                        {
                            // Illegal value after =, just leave it as is
                            msRetVal.WriteByte((byte)'=');
                            msRetVal.Write(buffer, 0, 2);
                        }
                    }
                }
                // Illegal =, just leave as it is
                else
                {
                    msRetVal.Write(buffer, 0, nCount);
                }
            }
            // Just write back all other bytes
            else
            {
                msRetVal.WriteByte((byte)b);
            }

            // Read next byte
            b = msSourceStream.ReadByte();
        }

        return msRetVal.ToArray();
    }
锦欢 2024-08-27 16:55:53

唯一对我有用的。

http://sourceforge.net/apps/trac/syncmldotnet/wiki/Quoted%20Printable

如果您只需要解码 QP,请从上面的链接中提取代码中的这三个函数:

    HexDecoderEvaluator(Match m)
    HexDecoder(string line)
    Decode(string encodedText)

然后:

var humanReadable = Decode(myQPString);

享受

The only one that worked for me.

http://sourceforge.net/apps/trac/syncmldotnet/wiki/Quoted%20Printable

If you just need to decode the QPs, pull inside of your code those three functions from the link above:

    HexDecoderEvaluator(Match m)
    HexDecoder(string line)
    Decode(string encodedText)

And then just:

var humanReadable = Decode(myQPString);

Enjoy

那支青花 2024-08-27 16:55:53

更好的解决方案

    private static string DecodeQuotedPrintables(string input, string charSet)
    {
        try
        {
            enc = Encoding.GetEncoding(CharSet);
        }
        catch
        {
            enc = new UTF8Encoding();
        }

        var occurences = new Regex(@"(=[0-9A-Z]{2}){1,}", RegexOptions.Multiline);
        var matches = occurences.Matches(input);

    foreach (Match match in matches)
    {
            try
            {
                byte[] b = new byte[match.Groups[0].Value.Length / 3];
                for (int i = 0; i < match.Groups[0].Value.Length / 3; i++)
                {
                    b[i] = byte.Parse(match.Groups[0].Value.Substring(i * 3 + 1, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
                }
                char[] hexChar = enc.GetChars(b);
                input = input.Replace(match.Groups[0].Value, hexChar[0].ToString());
        }
            catch
            { ;}
        }
        input = input.Replace("=\r\n", "").Replace("=\n", "").Replace("?=", "");

        return input;
}

Better solution

    private static string DecodeQuotedPrintables(string input, string charSet)
    {
        try
        {
            enc = Encoding.GetEncoding(CharSet);
        }
        catch
        {
            enc = new UTF8Encoding();
        }

        var occurences = new Regex(@"(=[0-9A-Z]{2}){1,}", RegexOptions.Multiline);
        var matches = occurences.Matches(input);

    foreach (Match match in matches)
    {
            try
            {
                byte[] b = new byte[match.Groups[0].Value.Length / 3];
                for (int i = 0; i < match.Groups[0].Value.Length / 3; i++)
                {
                    b[i] = byte.Parse(match.Groups[0].Value.Substring(i * 3 + 1, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
                }
                char[] hexChar = enc.GetChars(b);
                input = input.Replace(match.Groups[0].Value, hexChar[0].ToString());
        }
            catch
            { ;}
        }
        input = input.Replace("=\r\n", "").Replace("=\n", "").Replace("?=", "");

        return input;
}
以为你会在 2024-08-27 16:55:53

有时,EML 文件中的字符串由多个编码部分组成。这是一个在这些情况下使用 Dave 方法的函数:

public string DecodeQP(string codedstring)
{
    Regex codified;

    codified=new Regex(@"=\?((?!\?=).)*\?=", RegexOptions.IgnoreCase);
    MatchCollection setMatches = codified.Matches(cadena);
    if(setMatches.Count > 0)
    {
        Attachment attdecode;
        codedstring= "";
        foreach (Match match in setMatches)
        {
            attdecode = Attachment.CreateAttachmentFromString("", match.Value);
            codedstring+= attdecode.Name;

        }                
    }
    return codedstring;
}

Sometimes the string into an EML file is composed by several encoded parts. This is a function to use the Dave's method for these cases:

public string DecodeQP(string codedstring)
{
    Regex codified;

    codified=new Regex(@"=\?((?!\?=).)*\?=", RegexOptions.IgnoreCase);
    MatchCollection setMatches = codified.Matches(cadena);
    if(setMatches.Count > 0)
    {
        Attachment attdecode;
        codedstring= "";
        foreach (Match match in setMatches)
        {
            attdecode = Attachment.CreateAttachmentFromString("", match.Value);
            codedstring+= attdecode.Name;

        }                
    }
    return codedstring;
}
絕版丫頭 2024-08-27 16:55:53

请注意:
互联网上到处都有“input.Replace”的解决方案,但它们仍然不正确。

看,如果您有一个解码符号,然后使用“替换”
“输入”中的所有符号将被替换,然后所有后续解码将被破坏。

更正确的解决方案:

public static string DecodeQuotedPrintable(string input, string charSet)
    {

        Encoding enc;

        try
        {
            enc = Encoding.GetEncoding(charSet);
        }
        catch
        {
            enc = new UTF8Encoding();
        }

        input = input.Replace("=\r\n=", "=");
        input = input.Replace("=\r\n ", "\r\n ");
        input = input.Replace("= \r\n", " \r\n");
        var occurences = new Regex(@"(=[0-9A-Z]{2})", RegexOptions.Multiline); //{1,}
        var matches = occurences.Matches(input);

        foreach (Match match in matches)
        {
            try
            {
                byte[] b = new byte[match.Groups[0].Value.Length / 3];
                for (int i = 0; i < match.Groups[0].Value.Length / 3; i++)
                {
                    b[i] = byte.Parse(match.Groups[0].Value.Substring(i * 3 + 1, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
                }
                char[] hexChar = enc.GetChars(b);
                input = input.Replace(match.Groups[0].Value, new String(hexChar));

            }
            catch
            { Console.WriteLine("QP dec err"); }
        }
        input = input.Replace("?=", ""); //.Replace("\r\n", "");

        return input;
    }

Please note:
solutions with "input.Replace" are all over Internet and still they are not correct.

See, if you have ONE decoded symbol and then use "replace",
ALL symbols in "input" will be replaced, and then all following decoding will be broken.

More correct solution:

public static string DecodeQuotedPrintable(string input, string charSet)
    {

        Encoding enc;

        try
        {
            enc = Encoding.GetEncoding(charSet);
        }
        catch
        {
            enc = new UTF8Encoding();
        }

        input = input.Replace("=\r\n=", "=");
        input = input.Replace("=\r\n ", "\r\n ");
        input = input.Replace("= \r\n", " \r\n");
        var occurences = new Regex(@"(=[0-9A-Z]{2})", RegexOptions.Multiline); //{1,}
        var matches = occurences.Matches(input);

        foreach (Match match in matches)
        {
            try
            {
                byte[] b = new byte[match.Groups[0].Value.Length / 3];
                for (int i = 0; i < match.Groups[0].Value.Length / 3; i++)
                {
                    b[i] = byte.Parse(match.Groups[0].Value.Substring(i * 3 + 1, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
                }
                char[] hexChar = enc.GetChars(b);
                input = input.Replace(match.Groups[0].Value, new String(hexChar));

            }
            catch
            { Console.WriteLine("QP dec err"); }
        }
        input = input.Replace("?=", ""); //.Replace("\r\n", "");

        return input;
    }
笑着哭最痛 2024-08-27 16:55:53

我知道这是老问题,但这应该有帮助

    private static string GetPrintableCharacter(char character)
    {
        switch (character)
        {
            case '\a':
            {
                return "\\a";
            }
            case '\b':
            {
                return "\\b";
            }
            case '\t':
            {
                return "\\t";
            }
            case '\n':
            {
                return "\\n";
            }
            case '\v':
            {
                return "\\v";
            }
            case '\f':
            {
                return "\\f";
            }
            case '\r':
            {
                return "\\r";
            }
            default:
            {
                if (character == ' ')
                {
                    break;
                }
                else
                {
                    throw new InvalidArgumentException(Resources.NOTSUPPORTCHAR, new object[] { character });
                }
            }
        }
        return "\\x20";
    }

    public static string GetPrintableText(string text)
    {
        StringBuilder stringBuilder = new StringBuilder(1024);
        if (text == null)
        {
            return "[~NULL~]";
        }
        if (text.Length == 0)
        {
            return "[~EMPTY~]";
        }
        stringBuilder.Remove(0, stringBuilder.Length);
        int num = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '\a' || text[i] == '\b' || text[i] == '\f' || text[i] == '\v' || text[i] == '\t' || text[i] == '\n' || text[i] == '\r' || text[i] == ' ')
            {
                num += 3;
            }
        }
        int length = text.Length + num;
        if (stringBuilder.Capacity < length)
        {
            stringBuilder = new StringBuilder(length);
        }
        string str = text;
        for (int j = 0; j < str.Length; j++)
        {
            char chr = str[j];
            if (chr > ' ')
            {
                stringBuilder.Append(chr);
            }
            else
            {
                stringBuilder.Append(StringHelper.GetPrintableCharacter(chr));
            }
        }
        return stringBuilder.ToString();
    }

I know its old question, but this should help

    private static string GetPrintableCharacter(char character)
    {
        switch (character)
        {
            case '\a':
            {
                return "\\a";
            }
            case '\b':
            {
                return "\\b";
            }
            case '\t':
            {
                return "\\t";
            }
            case '\n':
            {
                return "\\n";
            }
            case '\v':
            {
                return "\\v";
            }
            case '\f':
            {
                return "\\f";
            }
            case '\r':
            {
                return "\\r";
            }
            default:
            {
                if (character == ' ')
                {
                    break;
                }
                else
                {
                    throw new InvalidArgumentException(Resources.NOTSUPPORTCHAR, new object[] { character });
                }
            }
        }
        return "\\x20";
    }

    public static string GetPrintableText(string text)
    {
        StringBuilder stringBuilder = new StringBuilder(1024);
        if (text == null)
        {
            return "[~NULL~]";
        }
        if (text.Length == 0)
        {
            return "[~EMPTY~]";
        }
        stringBuilder.Remove(0, stringBuilder.Length);
        int num = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '\a' || text[i] == '\b' || text[i] == '\f' || text[i] == '\v' || text[i] == '\t' || text[i] == '\n' || text[i] == '\r' || text[i] == ' ')
            {
                num += 3;
            }
        }
        int length = text.Length + num;
        if (stringBuilder.Capacity < length)
        {
            stringBuilder = new StringBuilder(length);
        }
        string str = text;
        for (int j = 0; j < str.Length; j++)
        {
            char chr = str[j];
            if (chr > ' ')
            {
                stringBuilder.Append(chr);
            }
            else
            {
                stringBuilder.Append(StringHelper.GetPrintableCharacter(chr));
            }
        }
        return stringBuilder.ToString();
    }
走野 2024-08-27 16:55:53

Martin Murphy 的(非工作)代码的一点改进版本:

static Regex reQuotHex = new Regex(@"=[0-9A-H]{2}", RegexOptions.Multiline|RegexOptions.Compiled);

public static string DecodeQuotedPrintable(string input)
{
    var dic = new Dictionary<string, string>();
    foreach (var qp in new HashSet<string>(reQuotHex.Matches(input).Cast<Match>().Select(m => m.Value)))
        dic[qp] = ((char)Convert.ToInt32(qp.Substring(1), 16)).ToString();
        
    foreach (string qp in dic.Keys) {
        input = input.Replace(qp, dic[qp]);
    }
    return input.Replace("=\r\n", "");
}

A bit improved version of (non-working) code from Martin Murphy:

static Regex reQuotHex = new Regex(@"=[0-9A-H]{2}", RegexOptions.Multiline|RegexOptions.Compiled);

public static string DecodeQuotedPrintable(string input)
{
    var dic = new Dictionary<string, string>();
    foreach (var qp in new HashSet<string>(reQuotHex.Matches(input).Cast<Match>().Select(m => m.Value)))
        dic[qp] = ((char)Convert.ToInt32(qp.Substring(1), 16)).ToString();
        
    foreach (string qp in dic.Keys) {
        input = input.Replace(qp, dic[qp]);
    }
    return input.Replace("=\r\n", "");
}
酒中人 2024-08-27 16:55:53

从 @Dave 解决方案开始,这会使用多种编码来解码带引号的可打印字符串,例如
"=?utf-8?Q?Firststring?=\t=?utf-8?Q?_-_1.250=2C50_=E2=82=AC=5F1000=5F2646.pdf?="

public static string DecodeQuotedPrintable(string text)
{
    Regex quotedPrintableEncodingRegex = new Regex(@"=\?((?!\?=).)*\?=", RegexOptions.IgnoreCase);

    MatchCollection quotedPrintableEncodingMatches = quotedPrintableEncodingRegex.Matches(text);

    if (quotedPrintableEncodingMatches.Count <= 0) 
        return text;

    var decodedText = "";
    foreach (Match match in quotedPrintableEncodingMatches)
    {
        Attachment decodedTextPart = Attachment.CreateAttachmentFromString("", match.Value);
        decodedText += decodedTextPart.Name;
    }
    return decodedText;
}

Starting from @Dave solution, this decodes quoted printable strings with more than one encoding, for example
"=?utf-8?Q?Firststring?=\t=?utf-8?Q?_-_1.250=2C50_=E2=82=AC=5F1000=5F2646.pdf?="

public static string DecodeQuotedPrintable(string text)
{
    Regex quotedPrintableEncodingRegex = new Regex(@"=\?((?!\?=).)*\?=", RegexOptions.IgnoreCase);

    MatchCollection quotedPrintableEncodingMatches = quotedPrintableEncodingRegex.Matches(text);

    if (quotedPrintableEncodingMatches.Count <= 0) 
        return text;

    var decodedText = "";
    foreach (Match match in quotedPrintableEncodingMatches)
    {
        Attachment decodedTextPart = Attachment.CreateAttachmentFromString("", match.Value);
        decodedText += decodedTextPart.Name;
    }
    return decodedText;
}
小伙你站住 2024-08-27 16:55:53
public static string DecodeQuotedPrintables(string input, Encoding encoding)
    {
        var regex = new Regex(@"\=(?<Symbol>[0-9A-Z]{2})", RegexOptions.Multiline);
        var matches = regex.Matches(input);
        var bytes = new byte[matches.Count];

        for (var i = 0; i < matches.Count; i++)
        {
            bytes[i] = Convert.ToByte(matches[i].Groups["Symbol"].Value, 16);
        }

        return encoding.GetString(bytes);
    }
public static string DecodeQuotedPrintables(string input, Encoding encoding)
    {
        var regex = new Regex(@"\=(?<Symbol>[0-9A-Z]{2})", RegexOptions.Multiline);
        var matches = regex.Matches(input);
        var bytes = new byte[matches.Count];

        for (var i = 0; i < matches.Count; i++)
        {
            bytes[i] = Convert.ToByte(matches[i].Groups["Symbol"].Value, 16);
        }

        return encoding.GetString(bytes);
    }
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文