Replace string that contains #0?

Posted on 2024-09-07 23:12:07

I use this function to read a file into a string:

function LoadFile(const FileName: TFileName): string;
begin
  with TFileStream.Create(FileName,
      fmOpenRead or fmShareDenyWrite) do begin
    try
      SetLength(Result, Size);
      Read(Pointer(Result)^, Size);
    except
      Result := '';  
      Free;
      raise;
    end;
    Free;
  end;
end;

Here's the text of the file:

version  

Here's the return value of LoadFile:

'ÿþv'#0'e'#0'r'#0's'#0'i'#0'o'#0'n'#0

I want to create a new file that contains "verabc". The problem is that I still can't replace "sion" with "abc". I am using D2007. If I remove all the #0 characters, the result turns into Chinese characters.

Comments (2)

一人独醉 2024-09-14 23:12:07

What you think is the text of the file isn't really the text of the file. What you've read into your string variable is accurate. You have a Unicode text file encoded as little-endian UTF-16. The first two bytes are the byte-order mark, and each pair of bytes after that is one character of the string.

If you're reading a Unicode file, you should use a Unicode data type, such as WideString. You'll want to divide the file size by two when setting the length of the string, and you'll want to discard the first two bytes.
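As a rough illustration of that advice, here is a minimal sketch of a loader that skips the two-byte mark and sizes the WideString at half the remaining byte count. The function name LoadUtf16LeFile, the sample file name, and the decision to raise an exception on an unexpected mark are assumptions made for this example; they are not part of the original question.

```pascal
program LoadUtf16Demo;
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

const
  // Hypothetical file name, used only for this demo.
  SampleName = 'sample_utf16le.txt';
  // $FF $FE byte-order mark followed by "version" in little-endian UTF-16.
  SampleBytes: array[0..15] of Byte =
    ($FF, $FE, $76, 0, $65, 0, $72, 0, $73, 0, $69, 0, $6F, 0, $6E, 0);

// Hypothetical helper: reads a little-endian UTF-16 file that starts with a BOM.
function LoadUtf16LeFile(const FileName: TFileName): WideString;
var
  Stream: TFileStream;
  Bom: Word;
  CharCount: Integer;
begin
  Result := '';
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    if Stream.Size < SizeOf(Bom) then
      Exit; // too short to hold a byte-order mark
    Stream.ReadBuffer(Bom, SizeOf(Bom));
    // The bytes $FF $FE read as a little-endian Word give $FEFF.
    if Bom <> $FEFF then
      raise Exception.Create('Not a little-endian UTF-16 file');
    // Two bytes per character, not counting the mark already consumed.
    CharCount := (Stream.Size - SizeOf(Bom)) div SizeOf(WideChar);
    SetLength(Result, CharCount);
    if CharCount > 0 then
      Stream.ReadBuffer(Pointer(Result)^, CharCount * SizeOf(WideChar));
  finally
    Stream.Free;
  end;
end;

var
  Sample: TFileStream;
begin
  // Write the sample bytes so the demo does not depend on an existing file.
  Sample := TFileStream.Create(SampleName, fmCreate);
  try
    Sample.WriteBuffer(SampleBytes, SizeOf(SampleBytes));
  finally
    Sample.Free;
  end;
  Writeln(LoadUtf16LeFile(SampleName));
end.
```

Using try..finally instead of the with/except pattern from the question keeps the stream release in one place; that is a style choice, not a requirement of the technique.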

If you don't know what kind of file you're reading, then you need to read the first two or three bytes first. If the first two bytes are $ff $fe, as above, then you might have a little-endian UTF-16 file; read the rest of the file into a WideString, or UnicodeString if you have that type. If they're $fe $ff, then it might be big-endian; read the remainder of the file into a WideString and then swap the order of each pair of bytes. If the first two bytes are $ef $bb, then check the third byte. If it's $bf, then they are probably the UTF-8 byte-order mark. Discard all three and read the rest of the file into an AnsiString or an array of bytes, and then use a function like UTF8Decode to convert it into a WideString.
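The checks described above could be sketched roughly as follows. The names TBomKind, DetectBom and SwapByteOrder are hypothetical and were introduced only for this example; a file without any mark is reported as bkNone here and left for the caller to treat as ANSI or something else.

```pascal
program DetectBomDemo;
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

type
  // Hypothetical enumeration for this sketch.
  TBomKind = (bkNone, bkUtf8, bkUtf16LE, bkUtf16BE);

// Inspects the first two or three bytes of the stream.
// The stream position afterwards is just past the bytes that were read,
// so the caller should reposition it to the end of the detected mark.
function DetectBom(Stream: TStream): TBomKind;
var
  Buf: array[0..2] of Byte;
  Count: Integer;
begin
  Result := bkNone;
  Stream.Position := 0;
  Count := Stream.Read(Buf, SizeOf(Buf));
  if (Count >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE) then
    Result := bkUtf16LE
  else if (Count >= 2) and (Buf[0] = $FE) and (Buf[1] = $FF) then
    Result := bkUtf16BE
  else if (Count = 3) and (Buf[0] = $EF) and (Buf[1] = $BB) and (Buf[2] = $BF) then
    Result := bkUtf8;
end;

// For big-endian data read raw into a WideString: swap each pair of bytes.
procedure SwapByteOrder(var S: WideString);
var
  I: Integer;
begin
  for I := 1 to Length(S) do
    S[I] := WideChar(Swap(Word(S[I])));
end;

const
  // $FF $FE followed by "ve" in little-endian UTF-16.
  Sample: array[0..5] of Byte = ($FF, $FE, $76, $00, $65, $00);
var
  Mem: TMemoryStream;
begin
  Mem := TMemoryStream.Create;
  try
    Mem.WriteBuffer(Sample, SizeOf(Sample));
    case DetectBom(Mem) of
      bkUtf16LE: Writeln('UTF-16 little-endian');
      bkUtf16BE: Writeln('UTF-16 big-endian');
      bkUtf8:    Writeln('UTF-8');
    else
      Writeln('No byte-order mark');
    end;
  finally
    Mem.Free;
  end;
end.
```

For the UTF-8 case, the bytes after the three-byte mark would be read into an AnsiString and passed to UTF8Decode, as the answer describes.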

Once you have your data in a WideString, the debugger will show that it contains version, and you should have no trouble using a Unicode-enabled version of StringReplace to do your replacement.
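A minimal sketch of the replacement and of writing the result back as little-endian UTF-16 might look like this. It assumes that the WideStrUtils unit and its WideStringReplace function are available in your Delphi installation; SaveUtf16LeFile and the output file name are hypothetical names chosen for the example.

```pascal
program ReplaceWideDemo;
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils, WideStrUtils;

// Hypothetical helper: writes the $FF $FE mark followed by the raw UTF-16 data.
procedure SaveUtf16LeFile(const FileName: TFileName; const Text: WideString);
const
  Bom: Word = $FEFF; // stored in memory as $FF $FE on little-endian machines
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    Stream.WriteBuffer(Bom, SizeOf(Bom));
    if Text <> '' then
      Stream.WriteBuffer(Text[1], Length(Text) * SizeOf(WideChar));
  finally
    Stream.Free;
  end;
end;

var
  Text: WideString;
begin
  // In real code this value would come from the file, read as a WideString.
  Text := 'version';
  // Assumption: WideStringReplace comes from WideStrUtils.
  Text := WideStringReplace(Text, 'sion', 'abc', [rfReplaceAll]);
  SaveUtf16LeFile('verabc.txt', Text); // hypothetical output file name
  Writeln(Text);
end.
```

Because the replacement works on whole characters rather than on individual bytes, the #0 bytes never have to be handled by hand.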

我纯我任性 2024-09-14 23:12:07

It seems that you loaded a Unicode-encoded text file. Each #0 is the high byte of a two-byte character, and that byte is zero for Latin characters.

If you don't want to deal with unicode text, choose ANSI encoding in your editor when you save the file.

If you need Unicode encoding, use WideCharToString to convert it to an ANSI string, or just remove the 0s yourself, though the latter isn't the best solution. Also remove the 2 leading characters, ÿþ. The editor wrote those bytes to mark the file as Unicode.
