How to efficiently copy a BSTR to a wchar_t[]?
I have a BSTR object that I would like to copy into a wchar_t array. The tricky thing is that the length of the BSTR could be anywhere from a few kilobytes to a few hundred kilobytes. Is there an efficient way of copying the data across? I know I could just declare a wchar_t array and always allocate the maximum size it would ever need to hold. However, this would mean allocating hundreds of kilobytes for something that might only require a few kilobytes. Any suggestions?
Comments (5)
There is never any need for conversion. A BSTR pointer points to the first character of the string, and the string is null-terminated. The length is stored in memory immediately before the first character.
BSTRs are always Unicode (UTF-16/UCS-2). At one stage there was something called an 'ANSI BSTR' (there are some references to it in legacy APIs), but you can ignore it in current development.
This means you can safely pass a BSTR to any function expecting a wchar_t*.
In Visual Studio 2008 you may get a compiler error, because BSTR is defined as a pointer to unsigned short, while wchar_t is a native type. You can either cast or turn off wchar_t compliance with /Zc:wchar_t-.
BSTR objects contain a length prefix, so finding out the length is cheap. Find out the length, allocate a new array big enough to hold the result, process into that, and remember to free it when you're done.
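One way to sketch those steps without a manual free is to let std::wstring own the buffer. The helper name copy_to_wstring is hypothetical, and the sketch takes a plain pointer and length so that it compiles without the COM headers; on Windows the length argument would come from SysStringLen(bstr), which reads the length prefix rather than scanning the string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy `len` wide characters starting at `src` into a
// std::wstring, which allocates exactly what it needs and frees it for us.
// On Windows, call it as copy_to_wstring(bstr, SysStringLen(bstr)).
std::wstring copy_to_wstring(const wchar_t* src, std::size_t len) {
    assert(src != nullptr || len == 0);  // a NULL BSTR means the empty string
    if (len == 0) return std::wstring();
    return std::wstring(src, len);       // (pointer, count) keeps embedded nulls
}
```

The result's c_str() is null-terminated, so it can be handed to any function that expects a const wchar_t*.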
First, you might not actually have to do anything at all, if all you need to do is read the contents. A BSTR is already a pointer to a null-terminated wchar_t array. In fact, if you check the headers, you will find that BSTR is essentially defined as a typedef for wchar_t*.
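For reference, the typedef chain looks roughly like the sketch below. It is simplified from the Windows SDK headers, and the exact spelling varies between SDK versions, so treat it as an illustration rather than a verbatim copy. The static_assert only confirms that the result is a plain wchar_t pointer.

```cpp
#include <type_traits>

// Simplified from the Windows SDK headers (winnt.h / wtypes.h).
typedef wchar_t  WCHAR;    // 16-bit wide character on Windows
typedef WCHAR    OLECHAR;  // character type used by OLE/COM
typedef OLECHAR* BSTR;     // a BSTR is just a pointer to OLECHAR

// To the compiler, BSTR and wchar_t* are the same type.
static_assert(std::is_same<BSTR, wchar_t*>::value, "BSTR is wchar_t*");
```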
So, the compiler can't distinguish between them, even though they have different semantics.
There are two important caveats.
1. BSTRs are supposed to be immutable. You should never change the contents of a BSTR after it has been initialized. If you "change it", you have to create a new one, assign the new pointer, and release the old one (if you own it). [UPDATE: this is not true; sorry! You can modify BSTRs in place; I have very rarely had the need.]
2. BSTRs are allowed to contain embedded null characters, whereas traditional C/C++ strings are not.
If you have a fair amount of control over the source of the BSTR, and can guarantee that it does not contain embedded nulls, you can read from the BSTR as if it were a wchar_t* and use conventional string functions (wcscpy, etc.) to access it. If not, your life gets harder. You will always have to manipulate your data either as more BSTRs or as dynamically allocated arrays of wchar_t. Most string-related functions will not work correctly, because they treat the first null as the end of the string.
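To make the second caveat concrete, the sketch below uses an ordinary wide-character array as a stand-in for BSTR data (an assumption, made so that it compiles without the COM headers). The null-scanning length stops at the first embedded null, while the stored length, which is what SysStringLen() would report for a real BSTR, covers the whole string.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Stand-in for BSTR contents: five characters with an embedded null.
static const wchar_t kData[] = L"ab\0cd";

// Number of characters, excluding the terminator the compiler appends.
// This plays the role of the BSTR length prefix.
static const std::size_t kStoredLen = sizeof(kData) / sizeof(kData[0]) - 1;

// Length as seen by conventional C string functions.
std::size_t scanned_length() {
    return std::wcslen(kData);  // stops at the first null
}

void demo() {
    // wcslen undercounts whenever the data contains an embedded null.
    assert(scanned_length() < kStoredLen);
}
```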
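Let's assume you control your data, or don't have to worry about embedded nulls. Let's also assume that you really need to make a copy and can't just read the existing BSTR directly. In that case, ask SysStringLen() for the length, allocate a wchar_t array of length + 1 elements (SysStringLen() does not count the terminating null), copy the characters with wcscpy() or your favorite safer string function, and delete[] the array when you are done.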
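A minimal sketch of that copy, under a few assumptions: the helper names copy_wide and copy_bstr are hypothetical, std::unique_ptr stands in for the manual delete[], and memcpy with the known length is used instead of wcscpy so that embedded nulls are copied too. Only the wrapper at the bottom touches the COM API, so the rest compiles on any platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>  // BSTR, SysStringLen
#endif

// Hypothetical helper: allocate exactly len + 1 wide characters, copy `len`
// of them from `src`, and append the terminating null ourselves.
std::unique_ptr<wchar_t[]> copy_wide(const wchar_t* src, std::size_t len) {
    assert(src != nullptr || len == 0);
    std::unique_ptr<wchar_t[]> dst(new wchar_t[len + 1]);
    if (len > 0) {
        std::memcpy(dst.get(), src, len * sizeof(wchar_t));
    }
    dst[len] = L'\0';
    return dst;  // delete[] runs automatically when the owner goes away
}

#ifdef _WIN32
// Windows-only wrapper: SysStringLen reads the length prefix (0 for NULL)
// and does not count the terminating null, hence the + 1 above.
inline std::unique_ptr<wchar_t[]> copy_bstr(BSTR bstr) {
    return copy_wide(bstr, SysStringLen(bstr));
}
#endif
```

The allocation is sized from the actual string, so a BSTR of a few kilobytes costs a few kilobytes, and get() on the result yields the wchar_t* to pass along.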
If you are using a class wrapper for your BSTR, the wrapper should have a way to call SysStringLen() for you. For example, ATL's CComBSTR provides Length(), and _bstr_t provides length().
UPDATE: This is a good article on the subject by someone far more knowledgeable than me:
"Eric [Lippert]'s Complete Guide To BSTR Semantics"
UPDATE: Replaced strcpy() with wcscpy() in the example.
One thing to keep in mind is that BSTR strings can, and often do, contain embedded nulls. A null does not mean the end of the string.
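Use ATL and CStringT; then you can just use the assignment operator. Alternatively, you can use the USES_CONVERSION macros. These allocate their temporary buffers on the stack (via _alloca) rather than on the heap, so there is nothing to free and nothing to leak, but that makes them risky for strings as large as several hundred kilobytes.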