Read binary file into a struct
I'm trying to read binary data using C#. I have all the information about the layout of the data in the files I want to read. I'm able to read the data "chunk by chunk", i.e. reading the first 40 bytes and converting them to a string, then reading the next 40 bytes.
Since there are at least three slightly different versions of the data, I would like to read the data directly into a struct. It feels much more correct than reading it "line by line".
I have tried the following approach, but to no avail:
StructType aStruct;
int count = Marshal.SizeOf(typeof(StructType));
byte[] readBuffer = new byte[count];
BinaryReader reader = new BinaryReader(stream);
readBuffer = reader.ReadBytes(count);
GCHandle handle = GCHandle.Alloc(readBuffer, GCHandleType.Pinned);
aStruct = (StructType) Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(StructType));
handle.Free();
The stream is an open FileStream that I have already begun reading from. I get an AccessViolationException when calling Marshal.PtrToStructure.
The stream contains more information than I'm trying to read since I'm not interested in data at the end of the file.
The struct is defined like:
[StructLayout(LayoutKind.Explicit)]
struct StructType
{
[FieldOffset(0)]
public string FileDate;
[FieldOffset(8)]
public string FileTime;
[FieldOffset(16)]
public int Id1;
[FieldOffset(20)]
public string Id2;
}
The example code has been changed from the original to keep this question short.
How would I read binary data from a file into a struct?
Comments (8)
Here is what I am using.
This worked successfully for me when reading the Portable Executable format.
It's a generic function, so T is your struct type.
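The code for the generic function mentioned in the first comment is not included on this page. A minimal sketch of what such a function could look like follows. The names `StructReader`, `ReadStruct` and `SampleHeader` are made up for illustration, and the sketch assumes `T` contains only blittable fields (no strings or other references) and that the file is little-endian like the host.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class StructReader
{
    // Reads Marshal.SizeOf<T>() bytes from the reader and marshals them into a T.
    // Hypothetical helper, not the commenter's original code.
    public static T ReadStruct<T>(BinaryReader reader) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        byte[] bytes = reader.ReadBytes(size);
        if (bytes.Length != size)
            throw new EndOfStreamException($"Expected {size} bytes, got {bytes.Length}.");

        // Pin the buffer so the GC cannot move it while the marshaler reads from it.
        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return Marshal.PtrToStructure<T>(handle.AddrOfPinnedObject());
        }
        finally
        {
            // Always release the pin, even if marshaling throws.
            handle.Free();
        }
    }
}

// Hypothetical header type used only to demonstrate the helper.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SampleHeader
{
    public int Id;        // 4 bytes
    public short Version; // 2 bytes
}
```

Usage would be along the lines of `SampleHeader h = StructReader.ReadStruct<SampleHeader>(reader);`. The `try`/`finally` around the pinned handle is a design choice so that the handle is freed even when marshaling fails.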
I don't see any problem with your code.
Off the top of my head, what if you try to do it manually? Does it work?
Also try
then use buffer[] in your BinaryReader instead of reading data from the FileStream, to see whether you still get the AccessViolationException.
That makes sense; BinaryFormatter has its own data format, completely incompatible with yours.
Reading straight into structs is evil. Many a C program has fallen over because of different byte orderings, different compiler implementations of fields, packing, word size, and so on.
You are best off serialising and deserialising byte by byte. Use the built-in stuff if you want, or just get used to BinaryReader.
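For what it's worth, reading field by field with BinaryReader could look roughly like the sketch below. It reuses the offsets from the question (8-byte date, 8-byte time, 4-byte integer). The 20-byte length of `Id2`, the ASCII encoding, the padding characters and the names `FileRecord` and `RecordReader` are all assumptions, since the question does not state them.

```csharp
using System;
using System.IO;
using System.Text;

// Plain managed type; no layout attributes are needed when reading field by field.
public sealed class FileRecord
{
    public string FileDate = "";
    public string FileTime = "";
    public int Id1;
    public string Id2 = "";
}

public static class RecordReader
{
    // Assumption: the question does not give the length of Id2.
    private const int Id2Length = 20;

    public static FileRecord Read(BinaryReader reader)
    {
        // Object initializers run in textual order, which matches the file layout.
        return new FileRecord
        {
            FileDate = ReadFixedString(reader, 8),
            FileTime = ReadFixedString(reader, 8),
            Id1 = reader.ReadInt32(), // BinaryReader reads integers as little-endian
            Id2 = ReadFixedString(reader, Id2Length),
        };
    }

    // Reads exactly 'length' bytes and decodes them as ASCII (assumed encoding),
    // trimming trailing NUL and space padding.
    private static string ReadFixedString(BinaryReader reader, int length)
    {
        byte[] bytes = reader.ReadBytes(length);
        if (bytes.Length != length)
            throw new EndOfStreamException();
        return Encoding.ASCII.GetString(bytes).TrimEnd('\0', ' ');
    }
}
```

Because each field is read explicitly, supporting another version of the file format only means writing another `Read` variant, and byte order and padding are under your control rather than the marshaler's.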
I had this structure:
and I received "incorrectly aligned or overlapped by non-object".
Based on that I found:
https://social.msdn.microsoft.com/Forums/vstudio/en-US/2f9ffce5-4c64-4ea7-a994-06b372b28c39/strange-issue-with-layoutkindexplicit?forum=clr
So my struct was defined as explicit with:
and thus my fields had specified
but when you change your struct to Sequential, you can get rid of those offsets and the error will disappear. Something like:
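The commenter's struct is not shown on this page, so here is a hypothetical one to illustrate the idea. With `LayoutKind.Sequential` and `Pack = 1` the marshaler places the fields back to back in declaration order, so no `FieldOffset` attributes are needed. `SequentialRecord`, its fields and `LayoutInspector` are invented for this sketch.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record: fields are laid out in declaration order without padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SequentialRecord
{
    public int Id;         // 4 bytes
    public short Flags;    // 2 bytes
    public long Timestamp; // 8 bytes
}

public static class LayoutInspector
{
    // Returns the unmanaged byte offset the marshaler assigns to a field,
    // which is handy for checking the struct against the file specification.
    public static int OffsetOf<T>(string fieldName)
    {
        return (int)Marshal.OffsetOf<T>(fieldName);
    }
}
```

Calling `LayoutInspector.OffsetOf<SequentialRecord>("Timestamp")` and `Marshal.SizeOf<SequentialRecord>()` lets you confirm the computed offsets and total size match the file format before you read any data.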
I had no luck using the BinaryFormatter; I guess I would need a complete struct that matches the content of the file exactly. I realised that in the end I wasn't interested in very much of the file content anyway, so I went with the solution of reading part of the stream into a byte buffer and then converting it using
for strings and
for the integers.
I will need to be able to parse more of the file later on, but for this version I got away with just a couple of lines of code.
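The conversion calls used in that comment were lost on this page. One plausible way to do the same thing is sketched below with `Encoding.ASCII.GetString` and `BitConverter.ToInt32`; both the choice of these calls and the ASCII encoding are assumptions, and `BufferParsing` is a made-up name. The offsets are the ones from the question.

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferParsing
{
    // Reads the first 20 bytes (date, time, Id1) and converts them piece by piece.
    public static (string Date, string Time, int Id1) ParseHeader(Stream stream)
    {
        byte[] buffer = new byte[20];

        // Stream.Read may return fewer bytes than requested, so loop until full.
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0)
                throw new EndOfStreamException();
            read += n;
        }

        string date = Encoding.ASCII.GetString(buffer, 0, 8);
        string time = Encoding.ASCII.GetString(buffer, 8, 8);
        // BitConverter uses the host byte order (little-endian on most platforms).
        int id1 = BitConverter.ToInt32(buffer, 16);
        return (date, time, id1);
    }
}
```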
Try this:
The problem is the strings in your struct. I found that marshaling types like byte/short/int is not a problem, but when you need to marshal into a complex type such as a string, your struct has to explicitly mimic an unmanaged type. You can do this with the MarshalAs attribute.
For your example, the following should work:
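The struct from this comment is missing on this page, so the following is only a sketch of the idea and not the commenter's code. It swaps in `UnmanagedType.ByValArray` byte fields rather than inline strings, because `UnmanagedType.ByValTStr` is designed for null-terminated character arrays and the question does not say whether the file's text fields are terminated. The 20-byte length of `Id2`, the ASCII encoding and the names `RawRecord` and `RawRecordParser` are assumptions.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Each MarshalAs attribute tells the marshaler the field is an inline fixed-size
// array, not a reference, so the unmanaged layout matches the bytes in the file.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RawRecord
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)]
    public byte[] FileDate;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)]
    public byte[] FileTime;
    public int Id1;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 20)] // assumed length
    public byte[] Id2;
}

public static class RawRecordParser
{
    public static RawRecord FromBytes(byte[] buffer)
    {
        if (buffer.Length < Marshal.SizeOf<RawRecord>())
            throw new ArgumentException("Buffer is smaller than the record.", nameof(buffer));

        // Pin the buffer for the duration of the marshaling call.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return Marshal.PtrToStructure<RawRecord>(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free();
        }
    }

    // Decodes a fixed-size field as ASCII (assumed) and trims NUL/space padding.
    public static string Text(byte[] field)
    {
        return Encoding.ASCII.GetString(field).TrimEnd('\0', ' ');
    }
}
```

The struct in the question fails because a plain `string` field is a managed reference, so the marshaler does not treat the file's raw bytes as inline text unless a `MarshalAs` attribute says so.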
As Ronnie said, I'd use BinaryReader and read each field individually. I can't find the link to the article with this info, but it's been observed that using BinaryReader to read each individual field can be faster than Marshal.PtrToStructure if the struct contains fewer than 30-40 or so fields. I'll post the link to the article when I find it.
The article's link is at: http://www.codeproject.com/Articles/10750/Fast-Binary-File-Reading-with-C
When marshaling an array of structs, PtrToStructure gains the upper hand more quickly, because you can think of the field count as fields * array length.