"0x80010100: System call failed" exception, ContextSwitchDeadlock
Long story short: in a C# application that works with a COM in-proc server (DLL), I encounter a "0x80010100: System call failed" exception, and in debug mode also a ContextSwitchDeadlock exception.
Now in more detail:
1) The C# app initializes an STA and creates a COM object (registered as "Apartment"); it then subscribes to its connection point and begins working with the object.
2) At some stage the COM object generates a lot of events, passing as an argument a very big collection of COM objects, which are created in the same apartment.
3) The event handler on the C# side processes the above collection, occasionally calling some methods of the objects. At some stage the latter calls begin to fail with the above exceptions.
On the COM side the apartment uses a hidden window whose window procedure looks like this:
typedef std::function<void(void)> Functor;
LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch(msg)
    {
    case AM_FUNCTOR:
        {
            Functor *f = reinterpret_cast<Functor *>(lParam);
            (*f)();
            delete f;
        }
        break;
    case WM_CLOSE:
        DestroyWindow(hwnd);
        break;
    default:
        return DefWindowProc(hwnd, msg, wParam, lParam);
    }
    return 0;
}
The events are posted to this window from other parts of the COM server:
void post(const Functor &func)
{
    Functor *f = new Functor(func);
    PostMessage(hWind_, AM_FUNCTOR, 0, reinterpret_cast<LPARAM>(f));
}
The events are standard ATL CP implementations bound with the actual params, and they boil down to something like this:
pConnection->Invoke(id, IID_NULL, LOCALE_USER_DEFAULT, DISPATCH_METHOD, &params, &varResult, NULL, NULL);
In C# the handler looks like this:
private void onEvent(IMyCollection objs)
{
    int len = objs.Count; // usually 10000 - 25000
    foreach (IMyObj obj in objs)
    {
        // some of the following calls fail with 0x80010100
        int id = obj.id;
        string name = obj.name;
        // etc...
    }
}
==================
So, can the above problem happen just because the message queue of the apartment is overloaded with the events it is trying to deliver? Or does the message loop have to be totally blocked to cause such behaviour?
Let's assume that the message queue holds 2 sequential events that each evaluate to an "onEvent" call. The first one enters C# managed code, which attempts to re-enter the unmanaged code in the same apartment. Usually this is allowed, and we do it a lot. When, and under what circumstances, can it fail?
Thanks.
Comments (1)
This ought to work even with multiple apartments provided that:
AND
Firstly:
It looks like some objects are not in the same apartment as the others. Are you sure that all objects are being created in the STA?
What you are describing is a classic deadlock: two independent threads, each waiting on the other. That is what I would expect from this design if the C# and COM sides were running on different threads.
You should be OK if all the objects are on the same thread and the hidden window is on that thread too, so I think you need to check that. (Obviously this includes any other objects which are created by the COM side and passed over to the C# side.)
You could try debugging this by pressing "pause" in the debugger and checking what code each thread is running (if you see RPCRT*.DLL on the stack, you are looking at a proxy). Alternatively, you could DebugPrint the current thread ID from various critical points on both the C# and COM sides and in your WndProc; they should all be the same.
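The thread-ID check suggested above could look roughly like the sketch below. It is only an illustration under some assumptions: `ThreadAffinityCheck` and `checkpoint` are made-up names, `std::this_thread::get_id()` stands in for the Win32 `GetCurrentThreadId()`, and `std::cerr` stands in for whatever debug-print facility the server really uses.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper (not part of the original code): remembers an "owner"
// thread and reports whether later checkpoints run on that same thread.
class ThreadAffinityCheck {
public:
    // By default the owner is the thread that constructs the object,
    // e.g. the thread that creates the hidden window.
    explicit ThreadAffinityCheck(
        std::thread::id owner = std::this_thread::get_id())
        : owner_(owner) {}

    // Prints the checkpoint name and the current thread ID, and returns
    // true only when the caller is running on the owner thread.
    bool checkpoint(const std::string& where) const {
        const std::thread::id current = std::this_thread::get_id();
        const bool same = (current == owner_);
        std::ostringstream line;
        line << where << ": thread " << current
             << (same ? " (owner)" : " (NOT owner)") << '\n';
        std::cerr << line.str();
        return same;
    }

private:
    std::thread::id owner_;
};
```

One such object could be created where the hidden window is created, with `checkpoint("WndProc")`, `checkpoint("post")` and so on called at the points of interest. Any line marked "NOT owner" would point at a call that crosses threads and therefore goes through a proxy.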
Secondly: it ought to work with multiple threads, provided that only one of the threads generates work items and the other does nothing but host COM objects which respond to calls (i.e. it doesn't generate calls from timers, network traffic, posted messages, etc.). In this case it may be that the thread's message queue is full and COM cannot reply to a call.
Instead of using the thread's message queue, you should use a deque protected by a critical section.
"There is a limit of 10,000 posted messages per message queue. This limit should be sufficiently large. If your application exceeds the limit, it should be redesigned to avoid consuming so many system resources."
You might maintain a counter of items on/off the queue to see if this is the issue.
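A minimal sketch of such a queue, with the on/off counters included, might look like this. It rests on a few assumptions: `WorkQueue` and its member names are hypothetical, and `std::mutex` is used as a portable stand-in for a Win32 `CRITICAL_SECTION`. The idea is that `push` tells the caller when the queue went from empty to non-empty, so that only one wake-up message needs to be posted per batch instead of one message per functor.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

using Functor = std::function<void()>;

// Hypothetical work queue: a deque guarded by a lock, with counters of
// items put on and taken off the queue.
class WorkQueue {
public:
    // May be called from any thread. Returns true when the queue was empty
    // before this push, i.e. when the caller should post ONE wake-up
    // message to the hidden window.
    bool push(Functor f) {
        std::lock_guard<std::mutex> lock(mutex_);
        const bool wasEmpty = items_.empty();
        items_.push_back(std::move(f));
        ++pushed_;
        return wasEmpty;
    }

    // Called on the apartment thread when the wake-up message arrives.
    // Takes everything queued so far, runs it outside the lock, and
    // returns the number of functors that were run.
    std::size_t drain() {
        std::deque<Functor> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(items_);
            popped_ += batch.size();
        }
        for (Functor& f : batch) {
            f();  // may call push() again; the lock is not held here
        }
        return batch.size();
    }

    std::size_t pending() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }
    std::size_t pushed() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pushed_;
    }
    std::size_t popped() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return popped_;
    }

private:
    mutable std::mutex mutex_;
    std::deque<Functor> items_;
    std::size_t pushed_ = 0;
    std::size_t popped_ = 0;
};
```

In the server, `post` would then call `push` and only call `PostMessage` when it returns true, and the window procedure would call `drain` in response. Logging `pushed()` minus `popped()`, or `pending()`, should show whether the backlog ever gets anywhere near the 10,000 mark.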