Why doesn't COM use a static empty BSTR?
When allocating an empty BSTR, either with SysAllocString(L"") or with SysAllocStringLen(str, 0), you always get a new BSTR (at least according to the test I ran). BSTRs aren't typically shared (the way interned strings are in Java/.NET) since they are mutable, but an empty string is, for all intents and purposes, immutable.
My question (at long last) is: why doesn't COM use the trivial optimization of always returning the same string when creating an empty BSTR (and ignoring it in SysFreeString)? Is there a compelling reason not to do so (because my reasoning is flawed), or was it just not thought to be important enough?
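For concreteness, the optimization being asked about could look roughly like the sketch below. It is a portable toy model, not the real oleaut32 implementation: ToyStr, toyAlloc, toyFree and toyRealloc are hypothetical names standing in for BSTR, SysAllocStringLen, SysFreeString and SysReAllocStringLen, and the length prefix that real BSTRs carry is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy model only: a "string handle" is a heap array of UTF-16 code units.
// Real BSTRs also carry a length prefix, which this sketch leaves out.
using ToyStr = char16_t*;

// The single shared, never-freed empty string (hypothetical).
static char16_t g_sharedEmpty[1] = {u'\0'};

// Copy the first `len` code units of `src` into a new string.
// Every zero-length request gets the same shared instance.
ToyStr toyAlloc(const char16_t* src, std::size_t len) {
    if (len == 0) return g_sharedEmpty;
    assert(src != nullptr);  // src must point to at least `len` code units
    ToyStr s = new char16_t[len + 1];
    std::memcpy(s, src, len * sizeof(char16_t));
    s[len] = u'\0';
    return s;
}

// Free a string; nullptr and the shared empty instance are ignored.
void toyFree(ToyStr s) {
    if (s == nullptr || s == g_sharedEmpty) return;
    delete[] s;
}

// Resize by building a fresh string and releasing the old one, so the
// shared empty instance itself is never written to.
ToyStr toyRealloc(ToyStr old, const char16_t* src, std::size_t len) {
    ToyStr fresh = toyAlloc(src, len);
    toyFree(old);
    return fresh;
}
```

In this model, two independently requested empty strings compare equal by address, and freeing either of them is a no-op.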
Comments (3)
I can't speak about what's idiomatic in COM, but in C++, Java, etc., there's an expectation that if you create an object with new, it will not compare equal (as far as address/object identity is concerned) to any other object. This property is useful when you use identity-based mapping, e.g., as keys in an IdentityHashMap in Java. For that reason, I don't think empty strings/objects should be an exception to this rule.
Well-written COM objects will allow you to pass NULL to a BSTR parameter and treat it as equivalent to an empty string. (This will not work with MSXML though, as I've learnt the hard way. :-P)
I'd guess (and yes, it's only a guess) that this optimization wasn't deemed important enough to perform.
While memory consumption was a major factor in the design of many older Windows APIs (cf. Raymond Chen's articles), the benefits here are rather small compared with Java's or .NET's string interning, since they apply to only a single string, and that string is only six bytes long (a four-byte length prefix plus a two-byte null terminator). And how many empty strings does a program have to keep in memory at any single point in time? Is that number large enough to warrant the optimization, or is it actually negligible?
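The six-byte figure follows from the documented BSTR layout. A small back-of-the-envelope sketch makes the arithmetic explicit. The function names bstrBlockBytes and bytesSavedBySharing are made up for illustration, and heap bookkeeping and alignment padding are deliberately ignored, so real allocations will be somewhat larger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes occupied by a BSTR-style block holding `chars` UTF-16 code units:
// 4-byte length prefix + payload + 2-byte null terminator.
// Allocator overhead and padding are not modeled.
constexpr std::size_t bstrBlockBytes(std::size_t chars) {
    return sizeof(std::uint32_t) + chars * sizeof(char16_t) + sizeof(char16_t);
}

// Rough upper bound on what sharing one empty instance could save when
// `liveEmpties` empty strings are alive at the same time.
constexpr std::size_t bytesSavedBySharing(std::size_t liveEmpties) {
    return liveEmpties == 0 ? 0 : (liveEmpties - 1) * bstrBlockBytes(0);
}

static_assert(bstrBlockBytes(0) == 6, "an empty string needs six bytes");
```

Under these assumptions, even a thousand simultaneously live empty strings would save just under 6 KB, which supports the guess that the saving is negligible.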
It isn't so much COM that allocates the BSTR as the Windows subsystem that provides it.
Empty BSTRs cannot share a static instance because there are functions that can reallocate/resize BSTRs. See SysReAllocString. Although no optimistic allocation behavior is mentioned, it cannot be assumed that the caller will never receive the original BSTR after a call.
SysReAllocString @ MSDN
Edit:
Upon some reflection, I realize that even when accounting for SysReAllocString, one could start with a shared empty BSTR, call SysReAllocString, and receive a new BSTR without any breaking behavior. So that objection can be discounted for the sake of argument. My fault.
However, I figure that the idea of an empty BSTR carries more baggage than one might think. I wrote some test programs to see if I could get some conflicting or interesting results. After running my tests and tallying up the results, I think the best answer to your question is that it is simply safest for everyone involved if every request gets its own BSTR. There are lots of funky ways to get BSTRs that report different flavors of zero length, both string-oriented and byte-oriented. Even if some optimization returned shared instances in some places, there would be plenty of room for confusion when describing an empty BSTR versus a BSTR that has an empty string length but a real allocation length. For example, a statement such as "a BSTR that has no string-allocated length may be forgotten" could easily lead to some aggravating memory leaks (see the tests below regarding byte-allocated BSTRs).
Also, although some COM components allow NULL-pointer (0-valued) BSTRs as arguments, it is unsafe to assume that all COM components support them. Passing NULL is only safe if both the caller and the callee agree to allow it. The safest behavior for everyone is to assume that a BSTR being handed over may have a defined length of zero, to handle that case, and to require a value that isn't a NULL pointer. At the very least, this makes it much easier to write proxy/stub code and other tricky tasks.
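A callee that wants to be tolerant can normalize a NULL argument to an empty string at its boundary, as in the sketch below. This is a portable illustration rather than COM code: textOrEmpty and lengthOrZero are hypothetical helpers, and the input is modeled as plain null-terminated UTF-16, whereas a real BSTR is length-prefixed and may contain embedded nulls.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical callee-side helper: map a possibly-null pointer to
// null-terminated UTF-16 text onto a std::u16string; nullptr becomes "".
std::u16string textOrEmpty(const char16_t* maybeNull) {
    return maybeNull ? std::u16string(maybeNull) : std::u16string();
}

// Hypothetical length query that reports 0 for nullptr instead of crashing.
std::size_t lengthOrZero(const char16_t* maybeNull) {
    return textOrEmpty(maybeNull).size();
}
```

A callee written this way sees no difference between NULL and an empty string, but a caller cannot rely on every component behaving like this.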
My first test program attempted some uncommon allocation methods. Note that you can get BSTRs whose length as reported by SysStringLen is 0 but which have real byte allocations. Also, I admit beforehand that bstr5 and bstr6 are not clean methods of allocation.
Here's the source:
Here are the results I received.
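Separately from the author's test program, the distinction between character length and byte length noted above can be illustrated with a small portable model. It assumes, as the BSTR documentation describes, that the length prefix counts payload bytes and that the character length is that count divided by the size of a wide character. ToyPrefixed, allocBytes, byteLength and charLength are hypothetical names, loosely standing in for SysAllocStringByteLen, SysStringByteLen and SysStringLen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a length-prefixed string whose prefix counts payload BYTES.
struct ToyPrefixed {
    std::uint32_t byteLen;            // value a byte-oriented length query reports
    std::vector<unsigned char> data;  // payload followed by two zero bytes
};

// Allocate a zero-filled payload of `byteLen` bytes plus a 2-byte terminator.
ToyPrefixed allocBytes(std::uint32_t byteLen) {
    return ToyPrefixed{byteLen, std::vector<unsigned char>(byteLen + 2u, 0)};
}

std::uint32_t byteLength(const ToyPrefixed& s) { return s.byteLen; }

// Character length rounds down, so an odd trailing byte is not counted.
std::uint32_t charLength(const ToyPrefixed& s) {
    return static_cast<std::uint32_t>(s.byteLen / sizeof(char16_t));
}
```

In this model a one-byte allocation reports a character length of 0 while still owning real storage, which is why "length 0" alone does not mean there is nothing to free.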
My next test program revealed that reducing a BSTR's size may return the same BSTR. Here is a short snippet that demonstrates this, along with the output I received. I also grew it beyond its original length and still received the same BSTR back. This suggests, at the very least, that one cannot assume that a BSTR with no length cannot be increased in size.
Running this program on my workstation (Windows XP) returned the following results. I'd be interested in knowing whether anyone else gets a new BSTR anywhere along the way.
I tried this program again, but this time starting with an empty wide-character string (L""). This should cover the case of starting with a BSTR with no defined string length and seeing whether it actually has an implicit size. When I ran it, I found that I still received the same BSTR back. I expect, though, that results may vary here.
Here's the source:
The results: