How do I convert a string to RTF in C#?

Posted on 2024-10-13 17:48:29

Question

How do I convert the string "Européen" to the RTF-formatted string "Europ\'e9en"?

[TestMethod]
public void Convert_A_Word_To_Rtf()
{
    // Arrange
    string word = "Européen";
    string expected = @"Europ\'e9en"; // verbatim string, so the backslash is kept
    string actual = string.Empty;

    // Act
    // actual = ... // How?

    // Assert
    Assert.AreEqual(expected, actual);
}

What I have found so far

RichTextBox

RichTextBox can be used for certain things. Example:

RichTextBox richTextBox = new RichTextBox();
richTextBox.Text = "Européen";
string rtfFormattedString = richTextBox.Rtf;

But then rtfFormattedString turns out to be the entire RTF-formatted document, not just the string "Europ\'e9en".

I've also searched Stack Overflow and Google and found a bunch of other resources on the web, but nothing quite solved my problem.

Answer

Brad Christie's answer

I had to add Trim() to remove the preceding space in the result. Other than that, Brad Christie's solution seems to work.

I'll run with this solution for now, even though I have a bad gut feeling about it, since we have to Substring and Trim the heck out of RichTextBox's output to get an RTF-formatted string.

Test case:

[TestMethod]
public void Test_To_Verify_Brad_Christies_Stackoverflow_Answer()
{
    Assert.AreEqual(@"Europ\'e9en", "Européen".ConvertToRtf());
    Assert.AreEqual(@"d\'e9finitif", "définitif".ConvertToRtf());
    Assert.AreEqual(@"\'e0", "à".ConvertToRtf());
    Assert.AreEqual(@"H\'e4user", "Häuser".ConvertToRtf());
    Assert.AreEqual(@"T\'fcren", "Türen".ConvertToRtf());
    Assert.AreEqual(@"B\'f6den", "Böden".ConvertToRtf());
}

Logic as an extension method:

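public static class StringExtensions
{
    public static string ConvertToRtf(this string value)
    {
        // RichTextBox is IDisposable, so release it once the RTF has been read
        using (RichTextBox richTextBox = new RichTextBox())
        {
            richTextBox.Text = value;
            string rtf = richTextBox.Rtf;

            // The body starts right after the font selection. "\f0\fs17" is what the
            // default font produced here (offset = 118); another font or size changes it.
            int offset = rtf.IndexOf(@"\f0\fs17") + 8;
            int len = rtf.LastIndexOf(@"\par") - offset;
            return rtf.Substring(offset, len).Trim();
        }
    }
}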
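
For comparison, the same result can be produced without a RichTextBox at all. The following is only a minimal sketch and not part of the solution above: the class name RtfEscaper and the method Escape are made up for illustration, and it assumes the target document declares a Windows-1252 or Latin-1 code page (e.g. \ansicpg1252), where the characters U+00A0 to U+00FF have the same byte value as their code point. Every other non-ASCII character falls back to the \uN? form.

```csharp
using System;
using System.Text;

public static class RtfEscaper
{
    // Hypothetical helper: escapes a plain string for use inside an RTF body.
    public static string Escape(string value)
    {
        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (c == '\\' || c == '{' || c == '}')
                sb.Append('\\').Append(c);              // RTF special characters
            else if (c < 0x80)
                sb.Append(c);                           // plain ASCII passes through
            else if (c >= 0xA0 && c <= 0xFF)
                sb.Append(@"\'").Append(((int)c).ToString("x2")); // assumed Latin-1 range
            else
                sb.Append(@"\u").Append((short)c).Append('?');    // signed \uN plus fallback
        }
        return sb.ToString();
    }
}
```

With this sketch, RtfEscaper.Escape("Européen") yields Europ\'e9en, which matches the expected value in the test case above.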

Comments (8)

凡尘雨 2024-10-20 17:48:29

Doesn't RichTextBox always produce the same header/footer? You could just read the content based on its offset and continue parsing from there. (I think? Please correct me if I'm wrong.)

There are libraries available, but I've never had good luck with them personally (though I've always found another method before fully exhausting the possibilities). In addition, most of the better ones usually come with a nominal fee.


EDIT
Kind of a hack, but this should get you through what you need to get through (I hope):

RichTextBox rich = new RichTextBox();
Console.Write(rich.Rtf);

String[] words = { "Européen", "Apple", "Carrot", "Touché", "Résumé", "A Européen eating an apple while writing his Résumé, Touché!" };
foreach (String word in words)
{
    rich.Text = word;
    Int32 offset = rich.Rtf.IndexOf(@"\f0\fs17") + 8;
    Int32 len = rich.Rtf.LastIndexOf(@"\par") - offset;
    Console.WriteLine("{0,-15} : {1}", word, rich.Rtf.Substring(offset, len).Trim());
}

EDIT 2

The RTF control codes break down as follows:

  • Header
    • \f0 - Use the font at index 0 (the first font in the list, typically Microsoft Sans Serif, as noted in the font table in the header: {\fonttbl{\f0\fnil\fcharset0 Microsoft Sans Serif;}})
    • \fs17 - Font size of 17, measured in half-points (i.e. 8.5 pt)
  • Footer
    • \par - Marks the end of a paragraph.

Hopefully that clears some things up. ;-)
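
Since the font size in the header depends on the control's font, hard-coding \fs17 can break. As a rough sketch of a slightly more tolerant variant (the class RtfBody and the method Extract are hypothetical names, and the sample RTF used to try it out is only an assumption about what RichTextBox emits), the size can be matched with a regular expression on the RTF string itself:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RtfBody
{
    // Hypothetical helper: returns the text between the first "\f0\fsN" and the last "\par".
    public static string Extract(string rtf)
    {
        // Accept any font size instead of hard-coding \fs17
        Match fontSelection = Regex.Match(rtf, @"\\f0\\fs\d+");
        if (!fontSelection.Success)
            throw new FormatException("No \\f0\\fsN control words found.");

        int start = fontSelection.Index + fontSelection.Length;
        int end = rtf.LastIndexOf(@"\par", StringComparison.Ordinal);
        if (end < start)
            throw new FormatException("No closing \\par found after the font selection.");

        return rtf.Substring(start, end - start).Trim();
    }
}
```

It would be called as RtfBody.Extract(rich.Rtf) in place of the IndexOf/Substring arithmetic. It is still a hack in the same spirit: text that itself contains formatting changes would not survive this kind of slicing.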

心的憧憬 2024-10-20 17:48:29

I found a nice solution that actually uses the RichTextBox itself to do the conversion:

private static string FormatAsRTF(string DirtyText)
{
    System.Windows.Forms.RichTextBox rtf = new System.Windows.Forms.RichTextBox();
    rtf.Text = DirtyText;
    return rtf.Rtf;
}

http://www.baltimoreconsulting.com/blog/development/easily-convert-a-string-to-rtf-in-net/

庆幸我还是我 2024-10-20 17:48:29

This is how I did it:

private string ConvertString2RTF(string input)
{
    //first take care of special RTF chars
    StringBuilder backslashed = new StringBuilder(input);
    backslashed.Replace(@"\", @"\\");
    backslashed.Replace(@"{", @"\{");
    backslashed.Replace(@"}", @"\}");

    //then convert the string char by char
    StringBuilder sb = new StringBuilder();
    foreach (char character in backslashed.ToString())
    {
        if (character <= 0x7f)
            sb.Append(character);
        else
            // \uN takes a signed 16-bit value, so chars above 0x7FFF must be written as negative numbers
            sb.Append("\\u" + (short)character + "?");
    }
    return sb.ToString();
}

I avoid RichTextBox for two reasons:
1) it is overkill for this
2) I grew to dislike it after spending days trying to make it work with an RTF document created in Word.
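
One detail of the \uN form is easy to miss: the RTF specification defines N as a signed 16-bit number, so code units above 0x7FFF have to be written as negative values, and the trailing ? is the fallback character shown by readers that do not understand \u (assuming the default \uc1). A small sketch of just that conversion, with RtfUnicode.Escape being a hypothetical name:

```csharp
using System;
using System.Globalization;

public static class RtfUnicode
{
    // Hypothetical helper: formats one UTF-16 code unit as an RTF \uN escape.
    public static string Escape(char c)
    {
        // The cast maps 0x8000..0xFFFF onto -32768..-1, which is what \uN expects
        short value = unchecked((short)c);
        return @"\u" + value.ToString(CultureInfo.InvariantCulture) + "?";
    }
}
```

For example, é (U+00E9) becomes \u233?, while U+FFFD becomes \u-3? rather than \u65533?.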

一影成城 2024-10-20 17:48:29

Here's an improved version of @Vladislav Zalesak's answer:

public static string ConvertToRtf(string text)
{
    // using default template from wiki
    StringBuilder sb = new StringBuilder(@"{\rtf1\ansi\ansicpg1250\deff0{\fonttbl\f0\fswiss Helvetica;}\f0\pard ");
    foreach (char character in text)
    {
        if (character <= 0x7f)
        {
            // escaping rtf characters
            switch (character)
            {
                case '\\':
                case '{':
                case '}':
                    sb.Append('\\');
                    break;
                case '\r':
                    sb.Append("\\par");
                    break;
            }

            sb.Append(character);
        }
        // converting special characters
        else
        {
            // \uN takes a signed 16-bit value, so chars above 0x7FFF must be written as negative numbers
            sb.Append("\\u" + (short)character + "?");
        }
    }
    sb.Append("}");
    return sb.ToString();
}

怪我闹别瞎闹 2024-10-20 17:48:29

Below is an ugly example of converting a string to an RTF string:

class Program
{
    static RichTextBox generalRTF = new RichTextBox();

    static void Main()
    {
        string foo = @"Européen";
        string output = ToRtf(foo);
        Trace.WriteLine(output);
    }

    private static string ToRtf(string foo)
    {
        string bar = string.Format("!!@@!!{0}!!@@!!", foo);
        generalRTF.Text = bar;
        int pos1 = generalRTF.Rtf.IndexOf("!!@@!!");
        int pos2 = generalRTF.Rtf.LastIndexOf("!!@@!!");
        if (pos1 != -1 && pos2 != -1 && pos2 > pos1 + "!!@@!!".Length)
        {
            pos1 += "!!@@!!".Length;
            return generalRTF.Rtf.Substring(pos1, pos2 - pos1);
        }
        throw new Exception("Not sure how this happened...");
    }
}

聚集的泪 2024-10-20 17:48:29

I know it has been a while, but I hope this helps.

This code worked for me after I tried every conversion snippet I could get my hands on:

titleText and contentText are plain text entered in a regular TextBox.

var rtb = new RichTextBox();
rtb.AppendText(titleText);
rtb.AppendText(Environment.NewLine);
rtb.AppendText(contentText);

rtb.Refresh();

rtb.Rtf now holds the RTF text.

The following code saves the RTF text to a file, so you can open it, edit it and then load it back into a RichTextBox again:

rtb.SaveFile(path, RichTextBoxStreamType.RichText);

绾颜 2024-10-20 17:48:29

Not the most elegant method, but a reasonably efficient and fast one:

public static string PlainTextToRtf(string plainText)
{
    if (string.IsNullOrEmpty(plainText))
        return "";

    string escapedPlainText = plainText.Replace(@"\", @"\\").Replace("{", @"\{").Replace("}", @"\}");
    escapedPlainText = EncodeCharacters(escapedPlainText);

    string rtf = @"{\rtf1\ansi\ansicpg1250\deff0{\fonttbl\f0\fswiss Helvetica;}\f0\pard ";
    rtf += escapedPlainText.Replace(Environment.NewLine, "\\par\r\n ");
    rtf += " }";
    return rtf;
}

The method that encodes the Polish characters (as Windows-1250 bytes, matching \ansicpg1250):

private static string EncodeCharacters(string text)
{
    if (string.IsNullOrEmpty(text))
        return "";

    return text
        .Replace("ą", @"\'b9")
        .Replace("ć", @"\'e6")
        .Replace("ę", @"\'ea")
        .Replace("ł", @"\'b3")
        .Replace("ń", @"\'f1")
        .Replace("ó", @"\'f3")
        .Replace("ś", @"\'9c")
        .Replace("ź", @"\'9f")
        .Replace("ż", @"\'bf")
        .Replace("Ą", @"\'a5")
        .Replace("Ć", @"\'c6")
        .Replace("Ę", @"\'ca")
        .Replace("Ł", @"\'a3")
        .Replace("Ń", @"\'d1")
        .Replace("Ó", @"\'d3")
        .Replace("Ś", @"\'8c")
        .Replace("Ź", @"\'8f")
        .Replace("Ż", @"\'af");
}
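
The lookup table above lists the Windows-1250 byte of each Polish letter by hand. As a sketch of how the same mapping could be derived instead of typed out (CodePageRtf.Encode is a hypothetical helper, and the availability of CodePagesEncodingProvider on your target framework is an assumption to check), the text can be run through the code page's encoder. Like EncodeCharacters, it expects text whose \, { and } have already been escaped:

```csharp
using System;
using System.Text;

public static class CodePageRtf
{
    // Hypothetical helper: writes non-ASCII characters as \'xx bytes of the given code page.
    public static string Encode(string text, int codePage)
    {
        // Needed on .NET Core / .NET 5+, where legacy code pages are not registered by default
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding encoding = Encoding.GetEncoding(codePage);

        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (c < 0x80)
            {
                sb.Append(c); // ASCII is passed through unchanged
                continue;
            }
            // Characters the code page cannot represent come back as '?' (0x3f)
            foreach (byte b in encoding.GetBytes(c.ToString()))
                sb.Append(@"\'").Append(b.ToString("x2"));
        }
        return sb.ToString();
    }
}
```

CodePageRtf.Encode("ą", 1250) gives \'b9, the same value as the first entry of the table, and passing a different code page (together with the matching \ansicpg in the header) covers other languages without a new table.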

记忆之渊 2024-10-20 17:48:29

private static string ConvertToRtf(string text)
{
    // Match every character outside the 7-bit ASCII range
    const string pattern = @"[^\x00-\x7F]";
    return Regex.Replace(text, pattern, m =>
    {
        char c = m.Value[0];
        // U+0080..U+00FF: \'xx hex escape (assumes the document code page agrees with Latin-1);
        // everything else: \uN with N as a signed 16-bit value, followed by a '?' fallback
        return c > 255 ? @"\u" + (short)c + "?" : @"\'" + ((int)c).ToString("x2");
    });
}