C#: an efficient way to add escapes to certain characters in a string

Posted 2025-02-07 00:23:40 | 294 words | 1 view | 0 comments

I need to replace some characters in a string with a \ plus the original character.

So given this string and array:

string origin = @"words&sales -test\strange";
string[] specialChars = new string[]{@"\", "&", "-", "?",......};

I want to get:

@"words\&sales \-test\\strange"

Note that the \ itself is a character to find and replace.

Thanks

Comments (1)

昇り龍 2025-02-14 00:23:40

Generally speaking, the fastest way to build String values in C#/.NET is with a StringBuilder, even if you're transforming another String value.

The other problem is the "best" way to determine which char values should be escaped: if the set of escapable characters is fixed at compile-time, then use a switch statement, as the compiler can lower it to a jump table or a short chain of comparisons, which is usually faster than using a runtime HashSet<Char> for determining set membership:

e.g.:


static String Escape( String input )
{
    StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 ); // Assuming 25% length increase due to escaping.

    foreach( Char c in input )
    {
        switch( c )
        {
        case '\\':
        case '&':
        case '-':
        case '?':
            _ = sb.Append( '\\' ).Append( c );
            break;
        default:
            _ = sb.Append( c );
            break;
        }
    }

    return sb.ToString();
}

If the set of escapable characters is defined at runtime then using a HashSet<Char> will likely be the best overall option, though if you know you're only processing chars with Unicode code points within a limited range (say ASCII-compatible chars in the range 0x00 to 0x7F) then you could use a Boolean[128] array (one flag per code point) to store the escape flag map.
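The lookup-table idea can be sketched roughly as follows. This is only an illustration, not a tuned implementation: BuildEscapeMap and EscapeWithMap are hypothetical names, the table has 128 entries so that it covers 0x00 to 0x7F inclusive, and the sketch assumes every escapable character is ASCII (anything else is rejected when the table is built). It is written as a C# 9+ top-level program so it can be run as-is.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: builds a 128-entry flag table, one flag per ASCII code point.
// Assumption: escapable characters are all in the range 0x00..0x7F.
static bool[] BuildEscapeMap(IEnumerable<char> escapableChars)
{
    bool[] flags = new bool[128];
    foreach (char c in escapableChars)
    {
        if (c >= 128)
        {
            throw new ArgumentException("Only ASCII characters are supported.", nameof(escapableChars));
        }
        flags[c] = true;
    }
    return flags;
}

// Hypothetical helper: escapes input using a table produced by BuildEscapeMap.
static string EscapeWithMap(string input, bool[] escapeMap)
{
    StringBuilder sb = new StringBuilder(capacity: 5 * input.Length / 4); // Same 25% guess as the answer.
    foreach (char c in input)
    {
        // Chars beyond the end of the table are never escaped.
        if (c < escapeMap.Length && escapeMap[c])
        {
            sb.Append('\\');
        }
        sb.Append(c);
    }
    return sb.ToString();
}

bool[] asciiMap = BuildEscapeMap(new[] { '\\', '&', '-', '?' });
Console.WriteLine(EscapeWithMap(@"words&sales -test\strange", asciiMap));
// prints: words\&sales \-test\\strange
```

The table is built once and can be reused across calls, so the per-character cost is a bounds check and an array read rather than a hash lookup.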

Using a HashSet<Char>, it would be like this:

static String Escape( String input, IEnumerable<Char> escapableChars )
{
    HashSet<Char> escapeThese = new HashSet<Char>( escapableChars );

    StringBuilder sb = new StringBuilder( capacity: 5 * input.Length / 4 ); // Assuming 25% length increase due to escaping.

    foreach( Char c in input )
    {
        if( escapeThese.Contains( c ) )
        {
            _ = sb.Append( '\\' ).Append( c );
        }
        else
        {
            _ = sb.Append( c );
        }
    }

    return sb.ToString();
}

Of course, the above code can be optimized further. Some suggestions:

  • First check to see if the String input even has any escapable characters in the first place: if none of its characters are escapable then just return input directly without having created a new StringBuilder.
  • Create an (on-demand) pool of StringBuilder instances instead of creating new instances on every call.
  • Allow ReadOnlySpan<Char> instead of String for input and writing output to Span<Char> - you'll need an initial step to calculate the required minimum size of the Span<Char> first though, and pass that info back to the caller.
    • The same minimum-size calculation can be done to have an exactly correct capacity: value for the StringBuilder instead of my (lazy) 25% estimate.
  • Add memoization: use a Bloom filter and output cache keyed by the input value.
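A few of the suggestions above (the early return, the exact size calculation, and the Span<Char> output) might be combined along these lines. This is a sketch under stated assumptions, not the answer author's code: NeedsEscape, GetEscapedLength and TryEscape are hypothetical names, and the escapable set is hard-coded to the four characters used in the earlier examples. It is written as a C# 9+ top-level program.

```csharp
using System;

// Hypothetical helper: the same four characters as the switch example.
static bool NeedsEscape(char c)
{
    return c == '\\' || c == '&' || c == '-' || c == '?';
}

// First pass: exact output length (input length plus one backslash per escapable char).
static int GetEscapedLength(ReadOnlySpan<char> input)
{
    int length = input.Length;
    foreach (char c in input)
    {
        if (NeedsEscape(c))
        {
            length++;
        }
    }
    return length;
}

// Second pass: writes into a caller-supplied buffer.
// Returns false (writing nothing) if the buffer is too small.
static bool TryEscape(ReadOnlySpan<char> input, Span<char> destination, out int written)
{
    written = 0;
    if (destination.Length < GetEscapedLength(input))
    {
        return false;
    }
    foreach (char c in input)
    {
        if (NeedsEscape(c))
        {
            destination[written++] = '\\';
        }
        destination[written++] = c;
    }
    return true;
}

// String wrapper: returns the original instance when nothing needs escaping.
static string Escape(string input)
{
    int escapedLength = GetEscapedLength(input);
    if (escapedLength == input.Length)
    {
        return input; // Fast path: no allocation at all.
    }
    char[] buffer = new char[escapedLength]; // Exactly sized, no 25% guess.
    TryEscape(input, buffer, out int written);
    return new string(buffer, 0, written);
}

Console.WriteLine(Escape(@"words&sales -test\strange"));
// prints: words\&sales \-test\\strange
```

The trade-off is that the input is scanned twice; whether that beats a single pass with an estimated capacity depends on the inputs and would need measuring.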