C# email subject parsing

Posted on 2024-10-01 00:01:49, 538 characters, 9 views, 0 comments

I'm building a system for reading emails in C#. I've got a problem parsing the subject, a problem which I think is related to encoding.

The subject I'm reading is as follows: =?ISO-8859-1?Q?=E6=F8sd=E5f=F8sdf_sdfsdf?=, the original subject sent is æøsdåføsdf sdfsdf (Norwegian characters in there).

Any ideas how I can change encoding or parse this correctly? So far I've tried to use the C# encoding conversion techniques to encode the subject to utf8, but without any luck.

Here is one of the solutions I tried:

Encoding iso = Encoding.GetEncoding("iso-8859-1");
Encoding utf = Encoding.UTF8;
string decodedSubject =
    utf.GetString(Encoding.Convert(utf, iso,
                                   iso.GetBytes(m.Subject.Split('?')[3])));
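
To see why this attempt has no effect: `Split('?')[3]` yields the payload `=E6=F8sd=E5f=F8sdf_sdfsdf`, which consists only of ASCII characters, and ASCII bytes are identical in ISO-8859-1 and UTF-8. The `=XX` escapes are never turned into bytes, so the conversion is a round trip. A minimal sketch reproducing this, where the class `AttemptDemo` and method `Attempt` are hypothetical names and `m.Subject` is replaced by a plain string parameter:

```csharp
using System;
using System.Text;

public static class AttemptDemo
{
    // Same conversion chain as in the question, applied to a raw subject string.
    public static string Attempt(string subject)
    {
        Encoding iso = Encoding.GetEncoding("iso-8859-1");
        Encoding utf = Encoding.UTF8;

        // Index 3 of the '?'-split is the payload between "?Q?" and "?=".
        string payload = subject.Split('?')[3];

        // The payload is pure ASCII, so converting its bytes changes nothing.
        return utf.GetString(Encoding.Convert(utf, iso, iso.GetBytes(payload)));
    }

    public static void Main()
    {
        // Prints the payload with its =XX escapes still in place:
        // =E6=F8sd=E5f=F8sdf_sdfsdf
        Console.WriteLine(Attempt("=?ISO-8859-1?Q?=E6=F8sd=E5f=F8sdf_sdfsdf?="));
    }
}
```

The escapes have to be decoded into bytes first; only then does the charset named in the header matter.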

Comments (2)

霓裳挽歌倾城醉 2024-10-08 00:01:49
    // Requires: using System; using System.Collections.Generic;
    //           using System.Text; using System.Text.RegularExpressions;
    public static string DecodeEncodedWordValue(string mimeString)
    {
        var regex = new Regex(@"=\?(?<charset>.*?)\?(?<encoding>[qQbB])\?(?<value>.*?)\?=");
        var encodedString = mimeString;
        var decodedString = new StringBuilder();

        while (encodedString.Length > 0)
        {
            var match = regex.Match(encodedString);
            if (!match.Success)
            {
                // No (further) encoded-word: copy the remaining text unchanged
                decodedString.Append(encodedString);
                break;
            }

            // If the match isn't at the start of the string, copy the leading chars to the output
            decodedString.Append(encodedString.Substring(0, match.Index));

            var charset = Encoding.GetEncoding(match.Groups["charset"].Value);
            var encoding = match.Groups["encoding"].Value.ToUpperInvariant();
            var value = match.Groups["value"].Value;

            if (encoding == "B")
            {
                // Encoded value is Base-64
                decodedString.Append(charset.GetString(Convert.FromBase64String(value)));
            }
            else
            {
                // Encoded value is "Q" (similar to Quoted-Printable): '_' means space and
                // =XX is a hex byte. Collect all bytes first and decode once, so that
                // multi-byte charsets such as UTF-8 are handled correctly.
                var bytes = new List<byte>();
                for (var i = 0; i < value.Length; i++)
                {
                    if (value[i] == '=' && i + 2 < value.Length
                        && Uri.IsHexDigit(value[i + 1]) && Uri.IsHexDigit(value[i + 2]))
                    {
                        bytes.Add(Convert.ToByte(value.Substring(i + 1, 2), 16));
                        i += 2;
                    }
                    else
                    {
                        bytes.Add(value[i] == '_' ? (byte)' ' : (byte)value[i]);
                    }
                }
                decodedString.Append(charset.GetString(bytes.ToArray()));
            }

            // Trim off up to and including the match, then loop and try matching again.
            encodedString = encodedString.Substring(match.Index + match.Length);
        }
        return decodedString.ToString();
    }
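
The subject in the question uses the "Q" form, so the Base-64 branch is not exercised by it. As a rough, standalone illustration of that branch, the sketch below decodes a "B" payload; the class `BEncodingDemo`, the method `DecodeB` and the sample payload are made up for this example and do not come from the question:

```csharp
using System;
using System.Text;

public static class BEncodingDemo
{
    // Decodes the payload of a "B" encoded-word:
    // Base64 text -> raw bytes -> string in the declared charset.
    public static string DecodeB(string charset, string payload)
    {
        byte[] bytes = Convert.FromBase64String(payload);
        return Encoding.GetEncoding(charset).GetString(bytes);
    }

    public static void Main()
    {
        // "5vhzZA==" is the Base64 form of the bytes E6 F8 73 64,
        // which are the characters æ ø s d in ISO-8859-1.
        Console.WriteLine(DecodeB("ISO-8859-1", "5vhzZA=="));
    }
}
```

A full header carrying this payload would read `=?ISO-8859-1?B?5vhzZA==?=`.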
深府石板幽径 2024-10-08 00:01:49

The encoding is the "Q" form of a MIME encoded-word (RFC 2047), which closely resembles quoted-printable.

See the answers to this question.

Adapted from the accepted answer:

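// Requires: using System.Net.Mail;
public string DecodeQuotedPrintable(string value)
{
    // Relies on System.Net.Mail decoding an encoded-word that is supplied as the attachment name.
    using (Attachment attachment = Attachment.CreateAttachmentFromString("", value))
    {
        return attachment.Name;
    }
}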

When passed the string =?ISO-8859-1?Q?=E6=F8sd=E5f=F8sdf_sdfsdf?= this returns "æøsdåføsdf_sdfsdf". Note that the underscore, which stands for the space in the original subject, is left in place.
