Explanation of copying a string from a native structure

Posted 2024-12-25 22:49:08


I'm using the PInvoke stuff in order to make use of the SetupAPI functions from C++. I'm using this to get paths to USB devices conforming to the HID spec. I've got everything working but something I don't understand has me puzzled. Using this structure from the SetupAPI:

typedef struct _SP_DEVICE_INTERFACE_DETAIL_DATA {
    DWORD cbSize;
    TCHAR DevicePath[ANYSIZE_ARRAY];
} SP_DEVICE_INTERFACE_DETAIL_DATA, *PSP_DEVICE_INTERFACE_DETAIL_DATA;

I don't get the same results as the example code I'm using. First off, I'm using an IntPtr and allocating memory using Marshal.AllocHGlobal() to pass this back and forth. I call SetupDiGetDeviceInterfaceDetail() twice, first to get the size of the buffer I need, and second to actually get the data I'm interested in. I'm looking to get the Path to this device, which is stored in this struct.

The code I'm going off of does this:

IntPtr pDevPath = new IntPtr(pDevInfoDetail.ToInt32() + 4);
string path = Marshal.PtrToStringAuto(pDevPath);

Which works just fine for them. I did the same, but the string I got was gibberish. I had to change it to

IntPtr pDevPath = new IntPtr(pDevInfoDetail.ToInt32() + 4);
string path = Marshal.PtrToStringAnsi(pDevPath);

to make it work. Why is this? Am I missing some setting for the project/solution that informs this beast how to treat strings and chars? So far, the MSDN article for PtrToStringAuto() doesn't tell me much about it. In fact, it looks like this method should have made the appropriate decision, called either the Unicode or Ansi version for my needs, and all would be well.

Please explain.


谜泪 answered 2025-01-01 22:49:08


First of all, +10000 on using a real P/Invoke interop type and not marshalling data by hand. But since you asked, here's what's going on with your strings.

The runtime decides how to treat strings and chars on a per-case basis, based on the attributes you apply to your interop declarations, the context in which you use interop, the methods you call, etc. Every type of P/Invoke declaration (extern method, delegate, or structure) allows you to specify the default character size for the scope of that definition. There are three options:

  • Use CharSet.Ansi, which converts the managed Unicode strings to 8-bit characters
  • Use CharSet.Unicode, which passes the string data as 16-bit characters
  • Use CharSet.Auto, which decides at runtime, based on the host OS, which one to use.

In general, I hate CharSet.Auto because it's mostly pointless. Since the Framework doesn't even support Windows 95, the only time "Auto" doesn't mean "Unicode" is when running on Windows 98. But there's a bigger problem here, which is that the runtime's decision on how to marshal strings happens at the "wrong time".

The unmanaged code you are calling made that decision at compile time, since the compiler had to decide if TCHAR meant char or wchar -- that decision is based on the presence of the _UNICODE preprocessor macro. That means that, for most libraries, it's going to always use one or the other, and there's no point in letting the CLR "pick one".

For Windows system components, things are a bit better, because the Unicode-aware builds actually include two versions of most system functions. The Setup API, for example, has two entry points: SetupDiGetDeviceInterfaceDetailA and SetupDiGetDeviceInterfaceDetailW. The *A version uses 8-bit "ANSI" strings and the *W version uses 16-bit wide "Unicode" strings. It similarly has ANSI and Wide versions of any structure that contains a string.

This is the kind of situation where CharSet.Auto shines, assuming you use it properly. When you apply a DllImport to a function, you can specify the character set. If you specify Ansi and the runtime doesn't find an exact match for your function name, it appends an A and tries again. (Oddly, if you specify Unicode, it calls the *W function first, and only tries an exact match if that fails.)

Here's the catch: if you don't specify a character set on your DllImport, the default is CharSet.Ansi. This means you are going to get the ANSI version of the function, unless you specifically override the charset. That's most likely what is happening here: you are calling the ANSI version of SetupDiGetDeviceInterfaceDetail by default, and thus getting an ANSI string back, but PtrToStringAuto wants to use Unicode because you're probably running at least Windows XP.

The BEST option, assuming we can ignore Windows 98, would be to specify CharSet.Unicode all over the place, since SetupAPI supports it, but at the very least, you need to specify the same CharSet value everywhere.
