Does StringOf copy the data passed to it?
I am reading in a file and trying to decide whether it is binary by checking the first n bytes for a NUL byte; if that check does not flag it as binary, the file is then handled as a string. I first tried looping over a string and checking the first n indices for a NUL, but that gave false positives that checking a TBytes does not.
I use TFile.ReadAllBytes, which returns a TBytes, and perform the NUL check on that. If no NUL is found, I call StringOf on the TBytes to get a string. Does StringOf have to make a copy of the data to produce the string (these are large files, so I want to avoid that)? If so, what is a better way to do what I am trying to do?
Comments (3)
Yes, according to the docs:
'Converts a byte array into a Unicode string using the default system locale.'
If you just want to access the TBytes as a string, why not cast it to a PChar (if the data is UTF-16) or to a PAnsiChar (if the data is ANSI text)?
Example code:
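A minimal sketch of such a cast, assuming the bytes hold single-byte ANSI text and a Delphi version with namespaced units (XE2 or later). This is an illustration, not the answerer's original snippet, and the sample data and counting loop are made up. The buffer is not guaranteed to be NUL-terminated, so the length has to be carried alongside the pointer:

```delphi
program CastSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  Data: TBytes;
  P: PAnsiChar;
  I, Count: Integer;
begin
  // Made-up sample data standing in for the bytes read from the file.
  Data := TBytes.Create(Ord('a'), Ord('b'), Ord('c'), Ord('b'));
  Count := 0;
  if Length(Data) > 0 then
  begin
    // No copy is made: P points straight into the dynamic array.
    P := PAnsiChar(@Data[0]);
    // Walk by index up to Length(Data); do not rely on a terminating #0.
    for I := 0 to Length(Data) - 1 do
      if P[I] = 'b' then
        Inc(Count);
  end;
  Writeln('Occurrences of b: ', Count);
end.
```

The pointer stays valid only while Data is alive and is not resized.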
EDIT
I'm a bit puzzled why you're not just using TFile.OpenRead to get a FileStream. Let's assume you've got gigabytes of data and you're in a hurry.
The FileStream will let you read just a small chunk of the data, which speeds things up.
This example code reads the whole file, but can easily be modified to only get a small part:
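A hedged sketch of what such a read might look like, assuming Delphi XE2 or later on a compiler that supports AnsiString. LoadAsAnsi is a hypothetical helper name, not part of the RTL and not the answerer's original code:

```delphi
program ReadViaStreamSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IOUtils;

// Hypothetical helper: loads a file into an AnsiString through a TFileStream,
// so the bytes go from disk into the string buffer without a TBytes in between.
function LoadAsAnsi(const FileName: string): AnsiString;
var
  Stream: TFileStream;
  Count: Integer;
begin
  Result := '';
  Stream := TFile.OpenRead(FileName);
  try
    if Stream.Size > MaxInt then
      raise Exception.Create('File too large for a single string');
    Count := Integer(Stream.Size);
    // To fetch only a prefix, cap Count here (for example at 4096).
    SetLength(Result, Count);
    if Count > 0 then
      Stream.ReadBuffer(Result[1], Count);
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount >= 1 then
    Writeln(Length(LoadAsAnsi(ParamStr(1))), ' bytes read');
end.
```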
Note that the last example still reads all data from disk first (which may or may not be what you want).
However, you can also read the data in chunks.
All of this is simpler with AnsiStrings because 1 char = 1 byte there :-).
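Reading in chunks fits the binary check from the question: only the first n bytes need to come off disk before deciding whether to load the rest. A minimal sketch under that assumption; HeadHasNul and the 4096-byte limit are hypothetical choices for illustration:

```delphi
program HeadCheckSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IOUtils;

// Hypothetical helper: True if a NUL byte occurs in the first MaxBytes bytes.
function HeadHasNul(const FileName: string; MaxBytes: Integer): Boolean;
var
  Stream: TFileStream;
  Buf: TBytes;
  Got, I: Integer;
begin
  Result := False;
  if MaxBytes <= 0 then
    Exit;
  Stream := TFile.OpenRead(FileName);
  try
    SetLength(Buf, MaxBytes);
    // Read returns the number of bytes actually read, which may be
    // smaller than MaxBytes for short files.
    Got := Stream.Read(Buf[0], MaxBytes);
    for I := 0 to Got - 1 do
      if Buf[I] = 0 then
      begin
        Result := True;
        Break;
      end;
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount >= 1 then
  begin
    if HeadHasNul(ParamStr(1), 4096) then
      Writeln('NUL found, treating the file as binary')
    else
      Writeln('No NUL in the first 4096 bytes');
  end;
end.
```

Because the check runs on raw bytes, it avoids the false positives that appear once the data has been converted to a UTF-16 string.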
If you think that StringOf is just an in-place typecast, you are wrong. StringOf treats its argument as an array of characters in the default system ANSI code page and converts it to UTF-16. Naturally, you will find a lot of zero bytes in the resulting string (the upper bytes of the WideChars).
You could find the encoding by looking at the BOM. This depends, of course, on how your input files are encoded.
However, SetLength may make a copy of the data.
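One way to do the BOM lookup mentioned above is TEncoding.GetBufferEncoding, sketched here under the assumption of Delphi XE2 or later. BytesToText is a hypothetical helper name. The conversion still allocates a new string, so it does not avoid the copy; it only picks the encoding from the data instead of always assuming the ANSI code page:

```delphi
program BomSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils;

// Hypothetical helper: decodes Data using the encoding indicated by its BOM.
function BytesToText(const Data: TBytes): string;
var
  Encoding: TEncoding;
  BomLen: Integer;
begin
  Result := '';
  if Length(Data) = 0 then
    Exit;
  // Passing nil asks GetBufferEncoding to detect the encoding from the BOM;
  // without a recognised BOM it falls back to the default encoding.
  Encoding := nil;
  BomLen := TEncoding.GetBufferEncoding(Data, Encoding);
  // Skip the BOM bytes and decode the remainder.
  Result := Encoding.GetString(Data, BomLen, Length(Data) - BomLen);
end;

begin
  if ParamCount >= 1 then
    Writeln(Length(BytesToText(TFile.ReadAllBytes(ParamStr(1)))), ' characters decoded');
end.
```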