C# comparing strings - different code pages
I have two strings read in from text files to compare. When I compare the files with WinMerge or PSPad, both tools show the same text. If I compare the strings with the following code, the comparison fails:

    string string1 = File.ReadAllText(@"c:\file1.txt");
    string string2 = File.ReadAllText(@"c:\file2.txt");
    bool stringMatch = false;
    if (string1.Equals(string2, StringComparison.InvariantCulture))
    {
        stringMatch = true;
    }
    // stringMatch is false here

After some searching, it seems that the " and ' characters differ between the two files:

Content of file1.txt: é"'(§è!çà)-
Content of file2.txt: é”’(§è!çà)-

Is there any way to compare these two strings properly so that those " and ' characters match?
Comments (4)
You could convert them both to byte[] using the methods under System.Text.Encoding and then compare the byte[] arrays.
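A minimal sketch of that byte-level comparison, assuming UTF-8 as the encoding (the answer does not name one). The class and method names are hypothetical. Note that this approach exposes the difference between the two strings rather than making them match.

```csharp
using System;
using System.Linq;
using System.Text;

public static class ByteComparison
{
    // Encodes both strings as UTF-8 (an assumed choice) and compares the byte sequences.
    public static bool SameUtf8Bytes(string a, string b)
    {
        byte[] bytesA = Encoding.UTF8.GetBytes(a);
        byte[] bytesB = Encoding.UTF8.GetBytes(b);
        return bytesA.SequenceEqual(bytesB);
    }

    // Renders the UTF-8 bytes as dash-separated hex so differing characters are easy to spot.
    public static string ToHex(string s)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(s));
    }
}
```

For example, ByteComparison.ToHex("\"") yields 22, while ByteComparison.ToHex("\u201D") (the curly quote from file2.txt) yields E2-80-9D, which shows the two quotes are different characters even though they look alike.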
It looks like you want to use the overload which takes StringComparison.
I'd guess that, given the current scenario, you want the "Ordinal" value, but you may want one of the others depending on what you are doing.
http://msdn.microsoft.com/en-us/library/system.stringcomparison.aspx
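An ordinal comparison works on the raw UTF-16 code units, so a straight quote (U+0022) and a curly quote (U+201D) will still compare as different. If the goal is to treat them as the same, one possible approach is to map typographic quotes to their ASCII counterparts before comparing. The following is a sketch under that assumption; NormalizeQuotes and EqualsIgnoringQuoteStyle are hypothetical helpers, not framework APIs, and the set of mapped characters is only the four most common ones.

```csharp
using System;

public static class QuoteComparison
{
    // Hypothetical helper: maps common typographic quotes to ASCII quotes.
    public static string NormalizeQuotes(string s)
    {
        return s
            .Replace('\u2018', '\'')  // left single quotation mark
            .Replace('\u2019', '\'')  // right single quotation mark
            .Replace('\u201C', '"')   // left double quotation mark
            .Replace('\u201D', '"');  // right double quotation mark
    }

    // Compares the normalized strings code unit by code unit.
    public static bool EqualsIgnoringQuoteStyle(string a, string b)
    {
        return string.Equals(NormalizeQuotes(a), NormalizeQuotes(b), StringComparison.Ordinal);
    }
}
```

With the contents from the question, string.Equals(string1, string2, StringComparison.Ordinal) is false, whereas QuoteComparison.EqualsIgnoringQuoteStyle(string1, string2) is true. Whether collapsing quote styles is acceptable depends on what the comparison is for.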
Well, you don't have the .NET strings in WinMerge or pspad, so something could well be going wrong while decoding. You need to explain your exact scenario:
EDIT: Okay, based on the comment - what is the encoding of the file meant to be? Are you specifying it in WinMerge anywhere? .NET will be using UTF-8 (because you haven't specified any other encoding).
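To rule out a decoding problem, it may help to pass the encoding explicitly and to see how the same bytes decode under different encodings. This is only a sketch: the EncodingCheck class is hypothetical, and ISO-8859-1 is used purely as an example of a non-UTF-8 encoding, since the actual encoding of the files is not stated.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingCheck
{
    // Decodes the given bytes with a caller-chosen encoding.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }

    // Reads a file with an explicit encoding instead of relying on the default.
    public static string ReadWith(string path, Encoding encoding)
    {
        return File.ReadAllText(path, encoding);
    }
}
```

For instance, the two bytes 0xC3 0xA9 decode to "é" with Encoding.UTF8 but to "Ã©" with Encoding.GetEncoding("iso-8859-1"). If a file was saved in one encoding and read in another, the resulting strings can differ even though an editor displays them identically.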
After reading "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" you should be well equipped to solve your problem yourself.