C# Sanitize File Name

Posted on 2024-07-08 23:00:31

I recently have been moving a bunch of MP3s from various locations into a repository. I had been constructing the new file names using the ID3 tags (thanks, TagLib-Sharp!), and I noticed that I was getting a System.NotSupportedException:

"The given path's format is not supported."

This was generated by either File.Copy() or Directory.CreateDirectory().

It didn't take long to realize that my file names needed to be sanitized. So I did the obvious thing:

public static string SanitizePath_(string path, char replaceChar)
{
    string dir = Path.GetDirectoryName(path);
    foreach (char c in Path.GetInvalidPathChars())
        dir = dir.Replace(c, replaceChar);

    string name = Path.GetFileName(path);
    foreach (char c in Path.GetInvalidFileNameChars())
        name = name.Replace(c, replaceChar);

    return dir + name;
}

To my surprise, I continued to get exceptions. It turned out that ':' is not in the set returned by Path.GetInvalidPathChars(), because it is valid in a path root. I suppose that makes sense, but this has to be a pretty common problem. Does anyone have some short code that sanitizes a path? The most thorough version I've come up with is this, but it feels like it is probably overkill.

    // replaces invalid characters with replaceChar
    public static string SanitizePath(string path, char replaceChar)
    {
        // construct a list of characters that can't show up in filenames.
        // need to do this because ":" is not in InvalidPathChars
        if (_BadChars == null)
        {
            _BadChars = new List<char>(Path.GetInvalidFileNameChars());
            _BadChars.AddRange(Path.GetInvalidPathChars());
            _BadChars = Utility.GetUnique<char>(_BadChars);
        }

        // remove root
        string root = Path.GetPathRoot(path);
        path = path.Remove(0, root.Length);

        // split on the directory separator character. Need to do this
        // because the separator is not valid in a filename.
        List<string> parts = new List<string>(path.Split(new char[]{Path.DirectorySeparatorChar}));

        // check each part to make sure it is valid.
        for (int i = 0; i < parts.Count; i++)
        {
            string part = parts[i];
            foreach (char c in _BadChars)
            {
                part = part.Replace(c, replaceChar);
            }
            parts[i] = part;
        }

        return root + Utility.Join(parts, Path.DirectorySeparatorChar.ToString());
    }

Any improvements to make this function faster and less baroque would be much appreciated.


Comments (16)

I think the problem is that you first call Path.GetDirectoryName on the bad string. If it contains non-filename characters, .NET can't tell which parts of the string are directories and throws. You have to do string comparisons.

Assuming it's only the filename that is bad, not the entire path, try this:

public static string SanitizePath(string path, char replaceChar)
{
    int filenamePos = path.LastIndexOf(Path.DirectorySeparatorChar) + 1;
    var sb = new System.Text.StringBuilder();
    sb.Append(path.Substring(0, filenamePos));
    for (int i = filenamePos; i < path.Length; i++)
    {
        char filenameChar = path[i];
        foreach (char c in Path.GetInvalidFileNameChars())
            if (filenameChar.Equals(c))
            {
                filenameChar = replaceChar;
                break;
            }

        sb.Append(filenameChar);
    }

    return sb.ToString();
}
那请放手 2024-07-15 23:00:32

I have had success with this in the past.

Nice, short and static :-)

    public static string returnSafeString(string s)
    {
        foreach (char character in Path.GetInvalidFileNameChars())
        {
            s = s.Replace(character.ToString(),string.Empty);
        }

        foreach (char character in Path.GetInvalidPathChars())
        {
            s = s.Replace(character.ToString(), string.Empty);
        }

        return (s);
    }
聆听风音 2024-07-15 23:00:32

Here's an efficient lazy loading extension method based on Andre's code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace LT
{
    public static class Utility
    {
        static string invalidRegStr;

        public static string MakeValidFileName(this string name)
        {
            if (invalidRegStr == null)
            {
                var invalidChars = System.Text.RegularExpressions.Regex.Escape(new string(System.IO.Path.GetInvalidFileNameChars()));
                invalidRegStr = string.Format(@"([{0}]*\.+$)|([{0}]+)", invalidChars);
            }

            return System.Text.RegularExpressions.Regex.Replace(name, invalidRegStr, "_");
        }
    }
}
忆悲凉 2024-07-15 23:00:32

Your code would be cleaner if you appended the directory and filename together and sanitized that rather than sanitizing them independently. As for sanitizing away the :, just take the 2nd character in the string. If it is equal to "replacechar", replace it with a colon. Since this app is for your own use, such a solution should be perfectly sufficient.
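The suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: `PathSanitizer`, `SanitizeWholePath` and the fixed `BadChars` set are hypothetical names, and the set is the Windows file-name blacklist minus the directory separators. Unlike the description, the sketch checks the original string for a drive colon instead of checking whether the second character equals the replacement character, so a relative name such as "a_b" is not turned into "a:b".

```csharp
using System;

public static class PathSanitizer
{
    // Assumed set: characters Windows rejects in file names, without the
    // directory separators, which have to survive in a full path.
    private static readonly char[] BadChars = { '<', '>', ':', '"', '|', '?', '*' };

    public static string SanitizeWholePath(string path, char replaceChar)
    {
        if (string.IsNullOrEmpty(path)) return path;

        char[] chars = path.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            // Replace blacklisted characters and control characters.
            if (chars[i] < ' ' || Array.IndexOf(BadChars, chars[i]) >= 0)
                chars[i] = replaceChar;
        }

        // Put the drive colon back if the original started with "X:".
        if (path.Length >= 2 && path[1] == ':' && char.IsLetter(path[0]))
            chars[1] = ':';

        return new string(chars);
    }
}
```

For instance, `PathSanitizer.SanitizeWholePath(@"C:\Music\Artist: Live?.mp3", '_')` keeps the drive prefix and replaces the colon and question mark inside the file name.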

故事还在继续 2024-07-15 23:00:32

Only two objects are allocated on the heap.

public static class FilePathHelper
{
  public static string SanitizeFileName(string fileName, char replaceSymbol = '_')
  {
    var sb = fileName.ToCharArray();

    for (int i = 0; i < sb.Length; i++)
    {
      foreach (var invalidChar in InvalidFileNameCharsArray)
      {
        if (sb[i] == invalidChar)
          sb[i] = replaceSymbol;
      }
    }

    return new string(sb);
  }

  private readonly static char[] InvalidFileNameCharsArray = Path.GetInvalidFileNameChars();
}
情何以堪。 2024-07-15 23:00:32

For .NET 7+ projects it's also possible to use extension methods with generated regexes like this:

// The class must be partial so the source generator can supply the regex method body.
public static partial class IOExtensions {
    [GeneratedRegex("^CON$|^PRN$|^AUX$|^CLOCK\\$|^NUL$|^COM0$|^COM1$|^COM2$|^COM3$|^COM4$|^COM5$|^COM6$|^COM7$|^COM8$|^COM9$|^LPT0$|^LPT1$|^LPT2$|^LPT3$|^LPT4$|^LPT5$|^LPT6$|^LPT7$|^LPT8$|^LPT9$", RegexOptions.Compiled | RegexOptions.IgnoreCase)]
    private static partial Regex GetReservedFilenamesRegex();

    public static string ToEscapedFilename(this string name, string replacer = "_") {
        return GetReservedFilenamesRegex().Replace(
            string.Join(
                replacer,
                name.Split(
                    Path.GetInvalidFileNameChars(),
                    StringSplitOptions.RemoveEmptyEntries
                )
            ),
            replacer
        );
    }
}

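For example,

"Order * for AUX at 12/03/2023.csv".ToEscapedFilename()

will return (on Windows, where '*' and '/' are invalid file name characters)

Order _ for AUX at 12_03_2023.csv

Every alternative in the pattern is anchored with ^ and $, so a reserved name such as AUX is replaced only when it is the entire file name.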
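Windows also treats a reserved device name followed by an extension (for example NUL.txt) as reserved. A minimal sketch of a check that ignores the extension, assuming the same list of names as the regex above; `ReservedNames` and `IsReservedDeviceName` are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class ReservedNames
{
    // Same names as in the pattern above, compared case-insensitively.
    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "CON", "PRN", "AUX", "CLOCK$", "NUL",
            "COM0", "COM1", "COM2", "COM3", "COM4",
            "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
            "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
        };

    public static bool IsReservedDeviceName(string fileName)
    {
        if (string.IsNullOrEmpty(fileName)) return false;

        // Look only at the part before the first dot, so "aux.tar.gz" is checked as "aux".
        int dot = fileName.IndexOf('.');
        string stem = dot < 0 ? fileName : fileName.Substring(0, dot);
        return Reserved.Contains(stem);
    }
}
```

A caller could prefix or replace the name when this returns true; which of the two is appropriate depends on whether the original name needs to stay recognizable.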

放我走吧 2024-07-15 23:00:32

Based on Valamas's answer, phuclv's comment on Valamas's answer and data's answer, I came up with the following solution.

using System.Collections.Generic;
using System.IO;

public class FileUtils
{
    private static readonly char[] _illegalChars = Path.GetInvalidFileNameChars();
    // ASCII punctuation mapped to look-alike characters; the fullwidth
    // forms (U+FF01 to U+FF5E) are assumed here.
    private static readonly Dictionary<char, char> _characterMap = new()
    {
        {'!', '！'}, {'"', '＂'}, {'#', '＃'}, {'$', '＄'},
        {'%', '％'}, {'&', '＆'}, {'\'', '＇'}, {'(', '（'},
        {')', '）'}, {'*', '＊'}, {'+', '＋'}, {',', '，'},
        {'-', '－'}, {'/', '／'}, {':', '：'}, {';', '；'},
        {'<', '＜'}, {'=', '＝'}, {'>', '＞'}, {'?', '？'},
        {'@', '＠'}, {'{', '｛'}, {'|', '｜'}, {'}', '｝'},
        {'~', '～'},
    };
    
    public static string SanitizeFileName(string name)
    {
        name = name.TrimEnd('.');
        var lastDotIdx = name.LastIndexOf('.');
        
        var filename = lastDotIdx == -1 ? name : name[..lastDotIdx];
        var ext = lastDotIdx == -1 ? "" : name[lastDotIdx..];
        
        foreach (var (search, replacement) in _characterMap)
        {
            filename = filename.Replace(search, replacement);
            ext = ext.Replace(search, replacement);
        }
        
        filename = string.Concat(filename.Split(_illegalChars));
        ext = string.Concat(ext.Split(_illegalChars));

        return filename + ext;
    }
}
魔法少女 2024-07-15 23:00:32
using System;
using System.IO;
using System.Linq;
using System.Text;

public class Program
{
    public static void Main()
    {
        try
        {
            var badString = "ABC\\DEF/GHI<JKL>MNO:PQR\"STU\tVWX|YZA*BCD?EFG";
            Console.WriteLine(badString);
            Console.WriteLine(SanitizeFileName(badString, '.'));
            Console.WriteLine(SanitizeFileName(badString));
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }

    private static string SanitizeFileName(string fileName, char? replacement = null)
    {
        if (fileName == null) { return null; }
        if (fileName.Length == 0) { return ""; }

        var sb = new StringBuilder();
        var badChars = Path.GetInvalidFileNameChars().ToList();

        foreach (var @char in fileName)
        {
            if (badChars.Contains(@char)) 
            {
                if (replacement.HasValue)
                {
                    sb.Append(replacement.Value);
                }
                continue; 
            }
            sb.Append(@char);
        }
        return sb.ToString();
    }
}
独自唱情﹋歌 2024-07-15 23:00:32

Based on @fiat's and @Andre's approach, I'd like to share my solution too.
Main differences:

  • it's an extension method
  • the regex is compiled on first use, which saves time over many executions
  • reserved words are preserved
public static class StringPathExtensions
{
    private static Regex _invalidPathPartsRegex;
    
    static StringPathExtensions()
    {
        var invalidReg = System.Text.RegularExpressions.Regex.Escape(new string(Path.GetInvalidFileNameChars()));
        _invalidPathPartsRegex = new Regex($"(?<reserved>^(CON|PRN|AUX|CLOCK\\$|NUL|COM0|COM1|COM2|COM3|COM4|COM5|COM6|COM7|COM8|COM9|LPT0|LPT1|LPT2|LPT3|LPT4|LPT5|LPT6|LPT7|LPT8|LPT9))|(?<invalid>[{invalidReg}:]+|\\.$)", RegexOptions.Compiled);
    }

    public static string SanitizeFileName(this string path)
    {
        return _invalidPathPartsRegex.Replace(path, m =>
        {
            if (!string.IsNullOrWhiteSpace(m.Groups["reserved"].Value))
                return string.Concat("_", m.Groups["reserved"].Value);
            return "_";
        });
    }
}
猫七 2024-07-15 23:00:31

To clean up a file name, you could do this:

private static string MakeValidFileName( string name )
{
   string invalidChars = System.Text.RegularExpressions.Regex.Escape( new string( System.IO.Path.GetInvalidFileNameChars() ) );
   string invalidRegStr = string.Format( @"([{0}]*\.+$)|([{0}]+)", invalidChars );

   return System.Text.RegularExpressions.Regex.Replace( name, invalidRegStr, "_" );
}
葬心 2024-07-15 23:00:31

A shorter solution:

var invalids = System.IO.Path.GetInvalidFileNameChars();
var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries) ).TrimEnd('.');
醉城メ夜风 2024-07-15 23:00:31

Based on Andre's excellent answer but taking into account Spud's comment on reserved words, I made this version:

/// <summary>
/// Strip illegal chars and reserved words from a candidate filename (should not include the directory path)
/// </summary>
/// <remarks>
/// http://stackoverflow.com/questions/309485/c-sharp-sanitize-file-name
/// </remarks>
public static string CoerceValidFileName(string filename)
{
    var invalidChars = Regex.Escape(new string(Path.GetInvalidFileNameChars()));
    var invalidReStr = string.Format(@"[{0}]+", invalidChars);

    var reservedWords = new []
    {
        "CON", "PRN", "AUX", "CLOCK$", "NUL", "COM0", "COM1", "COM2", "COM3", "COM4",
        "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
        "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };

    var sanitisedNamePart = Regex.Replace(filename, invalidReStr, "_");
    foreach (var reservedWord in reservedWords)
    {
        // Escape the word so the '$' in "CLOCK$" is matched literally.
        var reservedWordPattern = string.Format("^{0}\\.", Regex.Escape(reservedWord));
        sanitisedNamePart = Regex.Replace(sanitisedNamePart, reservedWordPattern, "_reservedWord_.", RegexOptions.IgnoreCase);
    }

    return sanitisedNamePart;
}

And these are my unit tests

[Test]
public void CoerceValidFileName_SimpleValid()
{
    var filename = @"thisIsValid.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual(filename, result);
}

[Test]
public void CoerceValidFileName_SimpleInvalid()
{
    var filename = @"thisIsNotValid\3\\_3.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("thisIsNotValid_3__3.txt", result);
}

[Test]
public void CoerceValidFileName_InvalidExtension()
{
    var filename = @"thisIsNotValid.t\xt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("thisIsNotValid.t_xt", result);
}

[Test]
public void CoerceValidFileName_KeywordInvalid()
{
    var filename = "aUx.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("_reservedWord_.txt", result);
}

[Test]
public void CoerceValidFileName_KeywordValid()
{
    var filename = "auxillary.txt";
    var result = PathHelper.CoerceValidFileName(filename);
    Assert.AreEqual("auxillary.txt", result);
}
为人所爱 2024-07-15 23:00:31
string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars()));
未央 2024-07-15 23:00:31

There are a lot of working solutions here. Just for the sake of completeness, here's an approach that doesn't use regex, but uses LINQ:

var invalids = Path.GetInvalidFileNameChars();
filename = invalids.Aggregate(filename, (current, c) => current.Replace(c, '_'));

Also, it's a very short solution ;)

三五鸿雁 2024-07-15 23:00:31

I wanted to retain the characters in some way, not just simply replace the character with an underscore.

One way I thought of was to replace the characters with similar-looking characters which are (in my situation) unlikely to be used as regular characters. So I took the list of invalid characters and found look-alikes.

The following are functions to encode and decode with the look-alikes.

This code does not include a complete listing for all System.IO.Path.GetInvalidFileNameChars() characters. So it is up to you to extend it or use the underscore replacement for any remaining characters.

private static Dictionary<string, string> EncodeMapping()
{
    //-- Following characters are invalid for windows file and folder names.
    //-- \/:*?"<>|
    Dictionary<string, string> dic = new Dictionary<string, string>();
    dic.Add(@"\", "Ì"); // U+00CC
    dic.Add("/", "Í"); // U+00CD
    dic.Add(":", "¦"); // U+00A6
    dic.Add("*", "¤"); // U+00A4
    dic.Add("?", "¿"); // U+00BF
    dic.Add(@"""", "ˮ"); // U+02EE
    dic.Add("<", "«"); // U+00AB
    dic.Add(">", "»"); // U+00BB
    dic.Add("|", "│"); // U+2502
    return dic;
}

public static string Escape(string name)
{
    foreach (KeyValuePair<string, string> replace in EncodeMapping())
    {
        name = name.Replace(replace.Key, replace.Value);
    }

    //-- handle dot at the end
    if (name.EndsWith(".")) name = name.Substring(0, name.Length - 1) + "°";

    return name;
}

public static string UnEscape(string name)
{
    foreach (KeyValuePair<string, string> replace in EncodeMapping())
    {
        name = name.Replace(replace.Value, replace.Key);
    }

    //-- handle dot at the end
    if (name.EndsWith("°")) name = name.Substring(0, name.Length - 1) + ".";

    return name;
}

You can select your own look-alikes. I used the Character Map app in Windows (%windir%\system32\charmap.exe) to select mine.

As I make adjustments through discovery, I will update this code.

朕就是辣么酷 2024-07-15 23:00:31

I'm using the System.IO.Path.GetInvalidFileNameChars() method to check invalid characters and I've got no problems.

I'm using the following code:

foreach( char invalidchar in System.IO.Path.GetInvalidFileNameChars())
{
    filename = filename.Replace(invalidchar, '_');
}
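
One caveat with this loop: Path.GetInvalidFileNameChars() returns a platform-specific set, so the same code replaces fewer characters on Linux or macOS than on Windows. A sketch that pins the Windows set explicitly, assuming the files may end up on a Windows volume; `WindowsFileNames` and `Sanitize` are hypothetical names, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class WindowsFileNames
{
    // Printable characters Windows rejects in file names; control characters
    // (below U+0020) are handled separately in Sanitize.
    private static readonly HashSet<char> Invalid =
        new HashSet<char> { '\\', '/', ':', '*', '?', '"', '<', '>', '|' };

    public static string Sanitize(string fileName, char replacement = '_')
    {
        if (fileName == null) throw new ArgumentNullException(nameof(fileName));

        var sb = new StringBuilder(fileName.Length);
        foreach (char c in fileName)
            sb.Append(c < ' ' || Invalid.Contains(c) ? replacement : c);

        // Trailing dots and spaces are trimmed as well, as other answers here do for dots.
        return sb.ToString().TrimEnd('.', ' ');
    }
}
```

The hash set lookup keeps this to a single pass over the name regardless of how many characters are blacklisted.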