C#: matching two text files, case sensitivity issue
I have two files, sourcecolumns.txt and destcolumns.txt. I need to compare source to dest and, if dest doesn't contain a source value, write that value out to a new file. The code below works, except that I have case-sensitivity issues like this:
source: CPI
dest: Cpi
These don't match because of the capital letters, so I get incorrect output. Any help is always welcome!
    string[] sourcelinestotal =
        File.ReadAllLines("C:\\testdirectory\\" + "sourcecolumns.txt");
    string[] destlinestotal =
        File.ReadAllLines("C:\\testdirectory\\" + "destcolumns.txt");
    foreach (string sline in sourcelinestotal)
    {
        if (destlinestotal.Contains(sline))
        {
        }
        else
        {
            File.AppendAllText("C:\\testdirectory\\" + "missingcolumns.txt", sline);
        }
    }
Comments (3)
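You could do this using an extension method for IEnumerable<string> that performs a case-insensitive comparison, then change the destlinestotal.Contains(sline) check in your loop to call that method instead.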
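The code for this answer is missing from this copy, so what follows is only a minimal sketch of what such an extension method could look like. The class name EnumerableExtensions, the parameter names, and the sample data are assumptions, not the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Hypothetical helper: true if any element equals value under the given comparison.
    public static bool Contains(this IEnumerable<string> source, string value, StringComparison comparison)
    {
        if (source == null)
        {
            return false; // treat a null sequence as empty
        }
        return source.Any(s => string.Equals(s, value, comparison));
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string[] destlinestotal = { "Cpi", "Gdp" }; // made-up sample data

        // The plain destlinestotal.Contains("CPI") is case-sensitive and finds nothing here.
        // Passing a StringComparison selects the extension method above instead.
        bool found = destlinestotal.Contains("CPI", StringComparison.OrdinalIgnoreCase);
        Console.WriteLine(found); // prints "True"
    }
}
```

In the question's loop, the equivalent change would be from destlinestotal.Contains(sline) to destlinestotal.Contains(sline, StringComparison.OrdinalIgnoreCase).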
However, if the sets are large and/or you are going to do this very often, the way you're going about it is very inefficient. Essentially, you're doing an O(n^2) operation: for each line in the source you compare it with, potentially, all lines in the destination. It would be better to create a HashSet from the destination columns with a case-insensitive comparer and then iterate through your source columns, checking whether each one exists in the HashSet of destination columns. This would be an O(n) algorithm. Note that Contains on the HashSet will use the comparer you provide in the constructor.
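The HashSet approach described above can be sketched roughly as follows. The class and method names (MissingColumns, FindMissing) are made up for illustration, and StringComparer.OrdinalIgnoreCase is one reasonable choice of comparer, not necessarily the one the answer had in mind.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MissingColumns
{
    // Returns the source lines that have no case-insensitive match in dest.
    public static List<string> FindMissing(IEnumerable<string> source, IEnumerable<string> dest)
    {
        // Building the set is one pass over dest; each lookup is O(1) on average.
        var destSet = new HashSet<string>(dest, StringComparer.OrdinalIgnoreCase);
        var missing = new List<string>();
        foreach (string sline in source)
        {
            // Contains uses the comparer passed to the constructor.
            if (!destSet.Contains(sline))
            {
                missing.Add(sline);
            }
        }
        return missing;
    }

    public static void Main()
    {
        string[] sourcelinestotal = File.ReadAllLines("C:\\testdirectory\\sourcecolumns.txt");
        string[] destlinestotal = File.ReadAllLines("C:\\testdirectory\\destcolumns.txt");

        List<string> missing = FindMissing(sourcelinestotal, destlinestotal);
        // Writes one entry per line in a single call.
        File.WriteAllLines("C:\\testdirectory\\missingcolumns.txt", missing);
    }
}
```

Note that File.WriteAllLines overwrites missingcolumns.txt on each run and puts every value on its own line, whereas the question's File.AppendAllText call appends without a line break. Whether overwriting is the desired behavior is an assumption here.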
In retrospect, I actually prefer this solution over simply writing your own case-insensitive Contains for IEnumerable<string>, unless you need the method for something else. There's actually less code (of your own) to maintain by using the HashSet implementation.
Use an extension method for your Contains. A brilliant example was found here on Stack Overflow. The code isn't mine, but I'll post it below.
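The code this answer refers to is not preserved in this copy. As a stand-in, and not the code the answer quotes, it may be worth noting that LINQ's Enumerable.Contains already has an overload that accepts an IEqualityComparer<T>, so a custom extension is not strictly required. The sample values below are made up.

```csharp
using System;
using System.Linq;

public static class LinqContainsDemo
{
    public static void Main()
    {
        string[] destlinestotal = { "Cpi", "Gdp" }; // made-up sample data

        // StringComparer.OrdinalIgnoreCase implements IEqualityComparer<string>,
        // so it can be passed to the built-in Contains overload.
        bool found = destlinestotal.Contains("CPI", StringComparer.OrdinalIgnoreCase);
        Console.WriteLine(found); // prints "True"
    }
}
```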
If you do not need case sensitivity, convert your lines to upper case using string.ToUpper before comparison.
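A minimal sketch of the upper-casing approach, assuming the lines are already loaded into arrays. It uses ToUpperInvariant rather than plain ToUpper so the result does not depend on the current culture; that substitution, and the names UpperCaseCompare and FindMissing, are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UpperCaseCompare
{
    public static List<string> FindMissing(string[] source, string[] dest)
    {
        // Normalize the destination lines once, up front.
        string[] upperDest = dest.Select(d => d.ToUpperInvariant()).ToArray();

        // Only the comparison is upper-cased; the returned values keep their original spelling.
        return source.Where(s => !upperDest.Contains(s.ToUpperInvariant())).ToList();
    }

    public static void Main()
    {
        string[] sourcelinestotal = { "CPI", "Rate" }; // made-up sample data
        string[] destlinestotal = { "Cpi", "Gdp" };

        foreach (string line in FindMissing(sourcelinestotal, destlinestotal))
        {
            Console.WriteLine(line); // prints "Rate"
        }
    }
}
```

This still scans the whole destination array for every source line, so it fixes the case problem but not the quadratic cost.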