Reading a binary file into a struct

Posted on 2024-07-04 01:00:54

I'm trying to read binary data using C#. I have all the information about the layout of the data in the files I want to read. I'm able to read the data "chunk by chunk", i.e. getting the first 40 bytes of data, converting them to a string, then getting the next 40 bytes.

Since there are at least three slightly different versions of the data, I would like to read the data directly into a struct. It just feels so much more right than reading it "line by line".

I have tried the following approach but to no avail:

StructType aStruct;
int count = Marshal.SizeOf(typeof(StructType));
byte[] readBuffer = new byte[count];
BinaryReader reader = new BinaryReader(stream);
readBuffer = reader.ReadBytes(count);
GCHandle handle = GCHandle.Alloc(readBuffer, GCHandleType.Pinned);
aStruct = (StructType) Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(StructType));
handle.Free();

The stream is an open FileStream that I have already begun reading from. I get an AccessViolationException when calling Marshal.PtrToStructure.

The stream contains more information than I'm trying to read since I'm not interested in data at the end of the file.

The struct is defined like:

[StructLayout(LayoutKind.Explicit)]
struct StructType
{
    [FieldOffset(0)]
    public string FileDate;
    [FieldOffset(8)]
    public string FileTime;
    [FieldOffset(16)]
    public int Id1;
    [FieldOffset(20)]
    public string Id2;
}

The example code has been changed from the original to make this question shorter.

How would I read binary data from a file into a struct?

Comments (8)

暮年 2024-07-11 01:00:54

Here is what I am using.
It worked for me when reading the Portable Executable format.
It's a generic function, so T is your struct type.

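// Requires: using System.IO; using System.Runtime.InteropServices;
public static T ByteToType<T>(BinaryReader reader)
{
    int size = Marshal.SizeOf(typeof(T));
    byte[] bytes = reader.ReadBytes(size);
    if (bytes.Length < size)
        throw new EndOfStreamException(); // stream ended before a full struct was read

    // Pin the buffer so the GC cannot move it while the marshaler reads from it.
    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    try
    {
        return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
    }
    finally
    {
        handle.Free(); // release the pin even if marshaling throws
    }
}
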
我早已燃尽 2024-07-11 01:00:54

I don't see any problem with your code.

Off the top of my head: what if you try to do it manually? Does that work?

BinaryReader reader = new BinaryReader(stream);
StructType o = new StructType();
o.FileDate = Encoding.ASCII.GetString(reader.ReadBytes(8));
o.FileTime = Encoding.ASCII.GetString(reader.ReadBytes(8));
...
...
...

Also try:

StructType o = new StructType();
byte[] buffer = new byte[Marshal.SizeOf(typeof(StructType))];
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
Marshal.StructureToPtr(o, handle.AddrOfPinnedObject(), false);
handle.Free();

Then use buffer[] in your BinaryReader instead of reading data from the FileStream, to see whether you still get the AccessViolationException.

"I had no luck using the BinaryFormatter, I guess I have to have a complete struct that matches the content of the file exactly."

That makes sense: BinaryFormatter has its own data format, completely incompatible with yours.

像极了他 2024-07-11 01:00:54

Reading straight into structs is evil: many a C program has fallen over because of different byte orderings, different compiler implementations of fields, packing, word size...

You are best off serialising and deserialising byte by byte. Use the built-in stuff if you want, or just get used to BinaryReader.
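
A minimal sketch of the field-by-field approach, assuming the question's layout (two 8-byte ASCII fields followed by a 4-byte integer). The names FileHeader, HeaderReader and ReadFixedAscii are made up for illustration, and the padding characters being trimmed are a guess about the file format:

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical record type. Field names and widths follow the question's
// layout (8 + 8 + 4 bytes); Id2 is left out because its width is not given.
public sealed class FileHeader
{
    public string FileDate;
    public string FileTime;
    public int Id1;
}

public static class HeaderReader
{
    // Reads one header field by field from the current stream position.
    public static FileHeader Read(Stream stream)
    {
        // leaveOpen: true, so disposing the reader does not close the caller's stream.
        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            var header = new FileHeader();
            header.FileDate = ReadFixedAscii(reader, 8);
            header.FileTime = ReadFixedAscii(reader, 8);
            // BinaryReader.ReadInt32 decodes the 4 bytes as little-endian.
            header.Id1 = reader.ReadInt32();
            return header;
        }
    }

    private static string ReadFixedAscii(BinaryReader reader, int length)
    {
        byte[] bytes = reader.ReadBytes(length);
        if (bytes.Length != length)
            throw new EndOfStreamException(
                "Expected " + length + " bytes, got " + bytes.Length + ".");
        // Assumption: fields are padded with NULs or spaces; adjust to the real format.
        return Encoding.ASCII.GetString(bytes).TrimEnd('\0', ' ');
    }
}
```

Calling HeaderReader.Read(stream) on an open FileStream consumes exactly 20 bytes and leaves the rest of the file unread. Supporting a second file version then means adding another Read method rather than another struct layout.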

ι不睡觉的鱼゛ 2024-07-11 01:00:54

I had this structure:

[StructLayout(LayoutKind.Explicit, Size = 21)]
public struct RecordStruct
{
    [FieldOffset(0)]
    public double Var1;

    [FieldOffset(8)]
    public byte var2;

    [FieldOffset(9)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1;
}

and I received the error "incorrectly aligned or overlapped by non-object".
Based on that I found:
https://social.msdn.microsoft.com/Forums/vstudio/en-US/2f9ffce5-4c64-4ea7-a994-06b372b28c39/strange-issue-with-layoutkindexplicit?forum=clr

"OK. I think I understand what's going on here. It seems like the problem is related to the fact that the array type (which is an object type) must be stored at a 4-byte boundary in memory. However, what you're really trying to do is serialize the 6 bytes separately.

"I think the problem is the mix between FieldOffset and serialization rules. I'm thinking that structlayout.sequential may work for you, since it doesn't actually modify the in-memory representation of the structure. I think FieldOffset is actually modifying the in-memory layout of the type. This causes problems because the .NET framework requires object references to be aligned on appropriate boundaries (it seems)."

So my struct was defined as explicit with:

[StructLayout(LayoutKind.Explicit, Size = 21)]

and my fields therefore specified:

[FieldOffset(<offset_number>)]

But when you change your struct to Sequential, you can get rid of those offsets and the error will disappear. Something like:

[StructLayout(LayoutKind.Sequential, Size = 21)]
public struct RecordStruct
{
    public double Var1;

    public byte var2;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1;
}

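
With the offsets gone, it can help to check where the marshaler actually puts each field before reading a file. The sketch below is one way to do that with Marshal.SizeOf and Marshal.OffsetOf. PackedRecord and LayoutCheck are hypothetical names, and Pack = 1 and CharSet.Ansi are assumptions added here so that no padding is inserted and each character takes one byte:

```csharp
using System;
using System.Runtime.InteropServices;

// Same fields as RecordStruct above. Pack = 1 and CharSet.Ansi are
// assumptions made for this sketch, not part of the original answer.
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
public struct PackedRecord
{
    public double Var1;

    public byte Var2;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1;
}

public static class LayoutCheck
{
    // Reports the marshaled size and the marshaled offset of every field.
    public static string Describe()
    {
        return string.Format(
            "size={0} Var1@{1} Var2@{2} String1@{3}",
            Marshal.SizeOf<PackedRecord>(),
            Marshal.OffsetOf<PackedRecord>("Var1"),
            Marshal.OffsetOf<PackedRecord>("Var2"),
            Marshal.OffsetOf<PackedRecord>("String1"));
    }
}
```

Under these assumptions LayoutCheck.Describe() should report a size of 21 with the fields at offsets 0, 8 and 9, matching the explicit layout. If the numbers differ from the file format, the struct needs adjusting before any data is read into it.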
落墨 2024-07-11 01:00:54

I had no luck using the BinaryFormatter; I guess I would need a complete struct that matches the content of the file exactly. I realised that in the end I wasn't interested in very much of the file content anyway, so I went with the solution of reading part of the stream into a byte buffer and then converting it using

Encoding.ASCII.GetString()

for strings and

BitConverter.ToInt32()

for the integers.

I will need to be able to parse more of the file later on, but for this version I got away with just a couple of lines of code.
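
Roughly, that approach could look like the sketch below. The offsets (date at 0, integer at 16) and the 20-byte header size are taken from the struct in the question; BufferParser and its method names are hypothetical:

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferParser
{
    // Assumed from the question's layout: 8 + 8 + 4 bytes.
    public const int HeaderSize = 20;

    // Fills a buffer completely; a single Stream.Read call may return
    // fewer bytes than requested.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended after " + offset + " bytes.");
            offset += read;
        }
        return buffer;
    }

    // Decodes the 8 ASCII bytes at offset 0.
    public static string GetFileDate(byte[] header)
    {
        return Encoding.ASCII.GetString(header, 0, 8);
    }

    // Decodes the 4-byte integer at offset 16. BitConverter uses the byte
    // order of the machine it runs on (see BitConverter.IsLittleEndian).
    public static int GetId1(byte[] header)
    {
        return BitConverter.ToInt32(header, 16);
    }
}
```

A caller would read the header once with BufferParser.ReadExactly(stream, BufferParser.HeaderSize) and then pull out only the fields it cares about, which fits the case where most of the file is of no interest.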

古镇旧梦 2024-07-11 01:00:54

Try this:

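// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
// Only reads data that was previously written with BinaryFormatter.Serialize.
using (FileStream stream = new FileStream(fileName, FileMode.Open))
{
    BinaryFormatter formatter = new BinaryFormatter();
    StructType aStruct = (StructType)formatter.Deserialize(stream);
}
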
喜爱皱眉﹌ 2024-07-11 01:00:54

The problem is the strings in your struct. I found that marshaling types like byte/short/int is not a problem, but when you need to marshal into a complex type such as a string, your struct has to explicitly mimic an unmanaged type. You can do this with the MarshalAs attribute.

For your example, the following should work:

[StructLayout(LayoutKind.Explicit)]
struct StructType
{
    [FieldOffset(0)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileDate;

    [FieldOffset(8)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileTime;

    [FieldOffset(16)]
    public int Id1;

    [FieldOffset(20)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 66)] //Or however long Id2 is.
    public string Id2;
}
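
For comparison, here is a sketch of the same idea without explicit offsets. As far as I understand, a string field without MarshalAs is marshaled as a pointer to a string, so raw file bytes get treated as an address, which would explain the access violation. The variant below uses LayoutKind.Sequential with Pack = 1, which puts the fields at 0, 8, 16 and 20, and ByValArray byte fields instead of ByValTStr so that nothing depends on the text being null-terminated. RawHeader, StructReader and the 4-byte width of Id2 are placeholders, not taken from the question:

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical layout: the Id2 width (4) is a placeholder.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RawHeader
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)]
    public byte[] FileDate;

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 8)]
    public byte[] FileTime;

    public int Id1;

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 4)]
    public byte[] Id2;
}

public static class StructReader
{
    // Reads Marshal.SizeOf<T>() bytes from the stream and marshals them into T.
    public static T Read<T>(Stream stream) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        byte[] buffer = new byte[size];
        int offset = 0;
        while (offset < size)
        {
            int read = stream.Read(buffer, offset, size - offset);
            if (read == 0)
                throw new EndOfStreamException("Stream ended after " + offset + " bytes.");
            offset += read;
        }

        // Pin the buffer so its address stays valid while it is marshaled.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return Marshal.PtrToStructure<T>(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free();
        }
    }

    // Decodes a fixed-width ASCII field, trimming NUL or space padding
    // (the padding characters are an assumption about the file format).
    public static string Ascii(byte[] field)
    {
        return Encoding.ASCII.GetString(field).TrimEnd('\0', ' ');
    }
}
```

Usage would be along the lines of RawHeader h = StructReader.Read<RawHeader>(stream); followed by StructReader.Ascii(h.FileDate). The integer is interpreted in the machine's byte order, so this only matches files written with the same endianness.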
聊慰 2024-07-11 01:00:54

As Ronnie said, I'd use BinaryReader and read each field individually. I can't find the link to the article with this info, but it's been observed that using BinaryReader to read each individual field can be faster than Marshal.PtrToStructure if the struct contains fewer than 30-40 or so fields. I'll post the link to the article when I find it.

The article's link is at: http://www.codeproject.com/Articles/10750/Fast-Binary-File-Reading-with-C

When marshaling an array of structs, Marshal.PtrToStructure gains the upper hand more quickly, because you can think of the field count as fields * array length.
