C#: matching two text files, case sensitivity problem

Posted on 2024-08-30 10:50:57

I have two files, sourcecolumns.txt and destcolumns.txt. I need to compare source to dest and, if dest doesn't contain a source value, write that value out to a new file. The code below works, except that I have case-sensitivity issues like this:

source: CPI
dest: Cpi

These don't match because of the capital letters, so I get incorrect output. Any help is always welcome!

string[] sourcelinestotal =
    File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
string[] destlinestotal =
    File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt");

foreach (string sline in sourcelinestotal)
{
    if (destlinestotal.Contains(sline))
    {
    }
    else
    {
        File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline);
    }
}

Comments (3)

向地狱狂奔 2024-09-06 10:50:57

You could do this using an extension method for IEnumerable<string> like:

public static class EnumerableExtensions
{
    public static bool Contains( this IEnumerable<string> source, string value, StringComparison comparison )
    {
         if (source == null)
         {
             return false; // nothing is a member of the empty set
         }
         return source.Any( s => string.Equals( s, value, comparison ) );
    }
}

then change

if (destlinestotal.Contains( sline ))

to

if (destlinestotal.Contains( sline, StringComparison.OrdinalIgnoreCase ))
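
As an aside, LINQ already ships an overload of Contains that takes an IEqualityComparer<T>, so the same check can probably be written without a custom extension method. A minimal sketch, assuming System.Linq is imported; the names ContainsDemo and ContainsIgnoreCase are hypothetical and used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsDemo
{
    // Hypothetical helper: true if any line equals value, ignoring case.
    public static bool ContainsIgnoreCase(IEnumerable<string> lines, string value)
    {
        // Enumerable.Contains(source, value, comparer) is the built-in LINQ overload.
        return lines.Contains(value, StringComparer.OrdinalIgnoreCase);
    }
}
```

Note that this overload takes a StringComparer (an IEqualityComparer<string>), whereas the extension method above takes a StringComparison value.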

However, if the sets are large and/or you are going to do this very often, the way you're going about it is very inefficient. Essentially, you're doing an O(n^2) operation: for each line in the source, you compare it with, potentially, all lines in the destination. It would be better to create a HashSet from the destination columns with a case-insensitive comparer and then iterate through your source columns, checking whether each one exists in the HashSet of the destination columns. This would be an O(n) algorithm, since a HashSet lookup takes constant time on average. Note that Contains on the HashSet will use the comparer you provide in the constructor.

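// Requires: using System; using System.Collections.Generic; using System.IO;
string[] sourcelinestotal =
    File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");

// Build the lookup once; Contains then ignores case.
HashSet<string> destlinestotal =
    new HashSet<string>(
        File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt"),
        StringComparer.OrdinalIgnoreCase
    );

foreach (string sline in sourcelinestotal)
{
    if (!destlinestotal.Contains(sline))
    {
        // Append a line break so the missing entries do not run together.
        File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline + Environment.NewLine);
    }
}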

In retrospect, I actually prefer this solution over simply writing your own case-insensitive Contains for IEnumerable<string>, unless you need the method for something else. There's actually less code (of your own) to maintain by using the HashSet implementation.
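
One possible way to package the HashSet approach is to collect the missing lines first and write them in a single call. This is only a sketch, not the answerer's code: the names ColumnDiff, FindMissing and WriteMissing are made up for illustration, and File.WriteAllLines is swapped in for File.AppendAllText, which means each entry lands on its own line and the output file is overwritten on every run.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ColumnDiff
{
    // Returns the source lines that have no case-insensitive match in dest.
    public static List<string> FindMissing(IEnumerable<string> source, IEnumerable<string> dest)
    {
        var destSet = new HashSet<string>(dest, StringComparer.OrdinalIgnoreCase);
        return source.Where(line => !destSet.Contains(line)).ToList();
    }

    // Reads both files, then writes the missing lines, one per line.
    public static void WriteMissing(string sourcePath, string destPath, string outputPath)
    {
        List<string> missing = FindMissing(File.ReadAllLines(sourcePath), File.ReadAllLines(destPath));
        File.WriteAllLines(outputPath, missing);
    }
}
```

Keeping the comparison in FindMissing, separate from the file I/O, makes it easy to check the logic with in-memory arrays before pointing it at the real files.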

霊感 2024-09-06 10:50:57

Use an extension method for your Contains. A brilliant example was found here on Stack Overflow. The code isn't mine, but I'll post it below.

public static bool Contains(this string source, string toCheck, StringComparison comp) 
{
    return source.IndexOf(toCheck, comp) >= 0;
}

string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);

往事随风而去 2024-09-06 10:50:57

If you do not need case sensitivity, convert your lines to upper case using string.ToUpper before comparison.
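
A minimal sketch of this normalize-then-compare idea. It uses ToUpperInvariant rather than ToUpper, which is a substitution and not part of the answer, so that the result does not depend on the current culture; UpperCaseCompare and FindMissing are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UpperCaseCompare
{
    public static List<string> FindMissing(string[] source, string[] dest)
    {
        // Normalize the destination lines once, up front.
        var upperDest = new HashSet<string>(dest.Select(d => d.ToUpperInvariant()));

        // Compare upper-cased copies, but keep each source line's original casing in the result.
        return source.Where(s => !upperDest.Contains(s.ToUpperInvariant())).ToList();
    }
}
```

This allocates an upper-cased copy of every line, which a case-insensitive comparer such as StringComparer.OrdinalIgnoreCase avoids.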
