Marshaling an array of structs from C++ to C#?

Posted on 2024-08-11 15:43:27

In my C# code I'm trying to fetch an array of structures from a legacy C++ DLL (the code I cannot change).

In that C++ code, the structure is defined like this:

struct MyStruct
{
    char* id;
    char* description;
};

The method that I'm calling (get_my_structures) returns a pointer to an array of MyStruct structures:

MyStruct* get_my_structures()
{
    ...
}

There is another method that returns the number of structures, so I do know how many structures get returned.

In my C# code, I have defined MyStruct like this:

[StructLayout(LayoutKind.Sequential)]  
public class MyStruct
{
  [MarshalAsAttribute(UnmanagedType.LPStr)]    // <-- also tried without this
  private string _id;
  [MarshalAsAttribute(UnmanagedType.LPStr)]
  private string _description;
}

The interop signature looks like this:

[DllImport("legacy.dll", EntryPoint="get_my_structures")]
public static extern IntPtr GetMyStructures();

Finally, the code that fetches the array of MyStruct structures looks like this:

int structuresCount = ...;
IntPtr myStructs = GetMyStructures();
int structSize = Marshal.SizeOf(typeof(MyStruct));    // <- returns 8 in my case
for (int i = 0; i < structuresCount; i++)
{
    IntPtr data = new IntPtr(myStructs.ToInt64() + structSize * i);
    MyStruct ms = (MyStruct) Marshal.PtrToStructure(data, typeof(MyStruct));
    ...
}

The trouble is, only the very first structure (one at the offset zero) gets marshaled correctly. Subsequent ones have bogus values in _id and _description members. The values are not completely trashed, or so it seems: they are strings from some other memory locations. The code itself does not crash.

I have verified that the C++ code in get_my_structures() does return correct data. The data is not accidentally deleted or modified during or after the call.

Viewed in a debugger, the C++ memory layout of the returned data looks like this (byte offsets, with 4-byte pointers):

0: id (char*)           <---- [MyStruct 1]
4: description (char*)
8: id (char*)           <---- [MyStruct 2]
12: description (char*)
16: id (char*)          <---- [MyStruct 3]
...
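
The layout above can also be checked from the native side. The following is a minimal C++ sketch and not part of the legacy DLL; `element_address` is a hypothetical helper name introduced only for illustration. It asserts that each element is exactly two pointers wide and computes the address of element i by byte offset, which is the same arithmetic the C# loop performs with Marshal.SizeOf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct MyStruct
{
    char* id;
    char* description;
};

// Two pointers with no padding: 8 bytes on a 32-bit build, 16 on a 64-bit build.
static_assert(sizeof(MyStruct) == 2 * sizeof(char*), "unexpected padding in MyStruct");
static_assert(offsetof(MyStruct, id) == 0, "id must be the first member");
static_assert(offsetof(MyStruct, description) == sizeof(char*), "description follows id");

// Hypothetical helper: address of element i, computed by byte offset,
// mirroring `myStructs.ToInt64() + structSize * i` on the C# side.
inline const MyStruct* element_address(const MyStruct* base, std::size_t i)
{
    assert(base != nullptr);
    const auto raw = reinterpret_cast<std::uintptr_t>(base) + i * sizeof(MyStruct);
    return reinterpret_cast<const MyStruct*>(raw);
}
```

If the managed stride (Marshal.SizeOf) and the native stride (sizeof) agree, as they appear to here, the array walk itself is unlikely to be the source of the bogus values.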

[Update 18/11/2009]

Here is how the C++ code prepares these structures (the actual code is much uglier, but this is a close enough approximation):

static char buffer[12345] = {0};
MyStruct* myStructs = (MyStruct*) &buffer;
for (int i = 0; i < structuresCount; i++)
{
    MyStruct* ms = <some other permanent address where the struct is>;
    myStructs[i].id = (char*) ms->id;
    myStructs[i].description = (char*) ms->description;
}
return myStructs;

Admittedly, the code above does some ugly casting and copies raw pointers around, but it still does seem to do that correctly. At least that's what I see in the debugger: the above (static) buffer does contain all these naked char* pointers stored one after another, and they point to valid (non-local) locations in memory.
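
For comparison, here is a hedged sketch of the same pointer-copying step written without the char-buffer cast. It is not the legacy code: `fill_structs`, `kMaxStructs`, and the `sources` parameter are hypothetical names, and `const` is added to the members only so that string literals can be used. The static array is correctly aligned for MyStruct by construction, and the count is checked against the capacity.

```cpp
#include <cstddef>

struct MyStruct
{
    const char* id;
    const char* description;
};

// Hypothetical capacity; the code in the question uses a 12345-byte char buffer.
constexpr std::size_t kMaxStructs = 64;

// Fill a static, correctly aligned array by copying raw pointers from
// `sources` (stand-ins for the "permanent" structs mentioned in the question).
// Returns nullptr if `count` exceeds the capacity.
inline const MyStruct* fill_structs(const MyStruct* const* sources, std::size_t count)
{
    static MyStruct table[kMaxStructs] = {};
    if (count > kMaxStructs)
        return nullptr;
    for (std::size_t i = 0; i < count; ++i)
    {
        // Only the pointers are copied; the strings stay where they are,
        // so they must outlive every caller that reads the table.
        table[i].id = sources[i]->id;
        table[i].description = sources[i]->description;
    }
    return table;
}
```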

Pavel's example shows that this is really the only place where things can go wrong. I will try to analyze what happens with those 'end' locations where the strings really are, not the locations where the pointers get stored.

Comments (5)

街道布景 2024-08-18 15:43:27

I cannot reproduce your problem, which leads me to suspect that it's really on the C++ side of things. Here's the complete source code for my attempt.

dll.cpp - compile with cl.exe /LD:

extern "C" {

struct MyStruct
{
    char* id;
    char* description;
};

__declspec(dllexport)
MyStruct* __stdcall get_my_structures()
{
    static MyStruct a[] =
    {
        { "id1", "desc1" },
        { "id2", "desc2" },
        { "id3", "desc3" }
    };
    return a;

}

}

test.cs - compile with csc.exe /platform:x86:

using System;
using System.Runtime.InteropServices;


[StructLayout(LayoutKind.Sequential)]  
public class MyStruct
{
  [MarshalAsAttribute(UnmanagedType.LPStr)]
  public string _id;
  [MarshalAsAttribute(UnmanagedType.LPStr)]
  public string _description;
}


class Program
{
    [DllImport("dll")]
    static extern IntPtr get_my_structures();

    static void Main()
    {
        int structSize = Marshal.SizeOf(typeof(MyStruct));
        Console.WriteLine(structSize);

        IntPtr myStructs = get_my_structures();
        for (int i = 0; i < 3; ++i)
        {
            IntPtr data = new IntPtr(myStructs.ToInt64() + structSize * i);
            MyStruct ms = (MyStruct) Marshal.PtrToStructure(data, typeof(MyStruct));

            Console.WriteLine();
            Console.WriteLine(ms._id);
            Console.WriteLine(ms._description);
        }
    }
}

This correctly prints out all 3 structs.

Can you show your C++ code that fills the structs? The fact that you can call it from C++ directly and get correct results does not necessarily mean it's correct. For example, you could be returning a pointer to a stack-allocated struct. When doing a direct call, then, you'd get a technically invalid pointer, but the data would likely remain preserved. When doing P/Invoke marshalling, the stack could be overwritten by P/Invoke data structures by the point it tries to read values from there.

坦然微笑 2024-08-18 15:43:27

I would change the structure. Instead of strings, use IntPtr:

[StructLayout(LayoutKind.Sequential)]  
public class MyStruct
{
  private IntPtr _id;
  private IntPtr _description;
}

Then each value of the C# array could be manually marshalled to a string using one of the Marshal.PtrToString* methods (for example Marshal.PtrToStringAnsi for char*), taking the character set into account.
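
The two-step idea (read the raw pointers first, then follow each one to its string) can be sketched natively as well. This is a C++ illustration of the concept rather than the C# code itself; `read_structs` is a hypothetical helper, and `const` is added to the members only so string literals can be used in the sketch.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct MyStruct
{
    const char* id;
    const char* description;
};

// Hypothetical helper: copy `count` elements into owned strings.
// Step 1 reads the raw pointers (the IntPtr fields on the C# side);
// step 2 follows each pointer to a NUL-terminated narrow string.
inline std::vector<std::pair<std::string, std::string>>
read_structs(const MyStruct* base, std::size_t count)
{
    std::vector<std::pair<std::string, std::string>> result;
    result.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        const char* id = base[i].id;
        const char* description = base[i].description;
        // A null pointer becomes an empty string instead of a crash.
        result.emplace_back(id ? id : "", description ? description : "");
    }
    return result;
}
```

Separating the two steps makes it possible to inspect the raw pointer values before dereferencing them, which helps tell a bad array walk apart from bad string pointers.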

睫毛上残留的泪 2024-08-18 15:43:27

I usually end up working these things out by trial and error. Make sure you have the CharSet property set on your StructLayout. I would also try UnmanagedType.LPTStr, which seems to work better for char*, though I am not sure why.

[StructLayout(LayoutKind.Sequential, CharSet=CharSet.Auto)]  
public class MyStruct
{
    [MarshalAsAttribute(UnmanagedType.LPTStr)]
    private string _id;
    [MarshalAsAttribute(UnmanagedType.LPTStr)]
    private string _description;
}

初见终念 2024-08-18 15:43:27

You have to use UnmanagedType.LPTStr for char*. A StringBuilder is also recommended for a non-const char*, together with a CharSet specification:

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
public class MyStruct
{
  [MarshalAsAttribute(UnmanagedType.LPTStr)]
  private StringBuilder _id;
  [MarshalAsAttribute(UnmanagedType.LPTStr)]
  private StringBuilder _description;
}

As for the DllImport declaration, have you tried

[DllImport("legacy.dll", EntryPoint="get_my_structures")]
[return: MarshalAs(UnmanagedType.LPArray)]
public static extern MyStruct[] GetMyStructures();

?

Also, if the previous doesn't work, leave it at IntPtr and try to marshal the returned structs like this:

for (int i = 0; i < structuresCount; i++)
{
    MyStruct ms = (MyStruct) Marshal.PtrToStructure(myStructs, typeof(MyStruct));
    ...
    myStructs += Marshal.SizeOf(ms);
}

夜还是长夜 2024-08-18 15:43:27

In addition to the answers given, I think you also need to supply the length, i.e.
[MarshalAsAttribute(UnmanagedType.LPTStr, SizeConst = , ArraySubType = System.Runtime.InteropServices.UnmanagedType.AnsiBStr)]

It takes some trial and error to get this right. Another thing to consider: in some WinAPI calls that expect a string parameter (usually a ref parameter), it might be worth your while to try the StringBuilder class as well. Nothing else comes to mind beyond the points I have mentioned here. Hope this helps, Tom
