How do I use MODI in an ASP.Net Web application?
I've written an OCR wrapper library around the Microsoft Office Document Imaging COM API, and in a Console App running locally, it works flawlessly, with every test.
Sadly, things start going badly when we attempt to integrate it with a WCF service running as an ASP.Net Web Application, under IIS6. We had issues around trying to free up the MODI COM Objects, and there were plenty of examples on the web that helped us.
However, problems still remain. If I restart IIS, and do a fresh deployment of the web app, the first few OCR attempts work great. If I leave it for 30 minutes or so, and then do another request, I get server failure errors like this:
The server threw an exception. (Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT)): at MODI.DocumentClass.Create(String FileOpen)
From this point on, every request will fail to do the OCR, until I reset IIS, and the cycle begins again.
We run this application in its own App Pool, and it runs under an identity with Local Admin rights.
UPDATE: This issue can be solved by doing the OCR stuff out of process. It appears as though the MODI library doesn't play well with managed code, when it comes to cleaning up after itself, so spawning new processes for each OCR request worked well in my situation.
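A minimal sketch of the "one process per OCR request" approach described in the update. The post does not show how the process was spawned, so everything here is an assumption: `OcrProcessRunner` is a hypothetical helper, and `workerExe` stands for a hypothetical console program that runs the MODI code and prints the recognized text to standard output. It also assumes .NET 4.5 or later for `ReadToEndAsync`.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class OcrProcessRunner
{
    // Starts the (hypothetical) worker executable for one image and returns
    // whatever the worker wrote to standard output.
    public static string Run(string workerExe, string imagePath, TimeSpan timeout)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = workerExe,
            Arguments = "\"" + imagePath + "\"", // quote so paths with spaces survive
            UseShellExecute = false,             // required for stream redirection
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            // Read asynchronously so a full output pipe cannot block the worker.
            Task<string> output = process.StandardOutput.ReadToEndAsync();

            if (!process.WaitForExit((int)timeout.TotalMilliseconds))
            {
                process.Kill();
                throw new TimeoutException("OCR worker did not finish in time.");
            }
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "OCR worker exited with code " + process.ExitCode + ".");
            }
            return output.Result;
        }
    }
}
```

Because every COM object lives and dies inside the worker process, any state MODI leaks is discarded when that process exits, which is presumably why this approach sidesteps the RPC_E_SERVERFAULT cycle.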
Here is the function that performs the OCR:
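    using System;
    using System.Runtime.CompilerServices;
    using System.Runtime.InteropServices;
    using System.Threading;

    public class ImageReader : IDisposable
    {
        private MODI.Document _document;
        private MODI.Images _images;
        private MODI.Image _image;
        private MODI.Layout _layout;
        private ManualResetEvent _completedOCR = new ManualResetEvent(false);

        // SNIP - Code removed for clarity

        private string PerformMODI(string fileName)
        {
            _document = new MODI.Document();
            _document.OnOCRProgress += new MODI._IDocumentEvents_OnOCRProgressEventHandler(_document_OnOCRProgress);
            _document.Create(fileName);
            _document.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);
            // Wait up to 5 seconds for the progress event to report 100%.
            // The return value is ignored, so a timeout goes unnoticed here.
            _completedOCR.WaitOne(5000);
            _document.Save();
            _images = _document.Images;
            _image = (MODI.Image)_images[0];
            _layout = _image.Layout;
            string text = _layout.Text;
            _document.Close(false);
            return text;
        }

        // Signals the wait handle once MODI reports that OCR is complete.
        void _document_OnOCRProgress(int Progress, ref bool Cancel)
        {
            if (Progress == 100)
            {
                _completedOCR.Set();
            }
        }

        // Releases the runtime callable wrapper of every non-null COM object.
        // Assigning null to a local copy here would not clear the caller's
        // fields, so the fields are cleared in Dispose instead.
        private static void SetComObjectToNull(params object[] objects)
        {
            for (int i = 0; i < objects.Length; i++)
            {
                object o = objects[i];
                if (o != null)
                {
                    Marshal.FinalReleaseComObject(o);
                }
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public void Dispose()
        {
            // Release children before the document that owns them.
            SetComObjectToNull(_layout, _image, _images, _document);
            _layout = null;
            _image = null;
            _images = null;
            _document = null;
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }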
I then instantiate an instance of ImageReader inside a using block (which will call IDisposable.Dispose on exit).
Calling Marshal.FinalReleaseComObject should instruct the CLR to release the COM objects, and so I'm at a loss to figure out what would be causing the symptoms we have.
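One detail worth checking when releasing COM objects by hand: setting a local variable to null does not clear the field it was copied from. The sketch below is a hypothetical helper (`ComReleaser` is not from the post) that takes the field by `ref`, so the release and the null assignment happen together. It is not claimed to fix the RPC_E_SERVERFAULT symptom; it only makes the cleanup do what it appears to do.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComReleaser
{
    // Releases the COM wrapper (if the object is one) and clears the
    // caller's variable, so no stale reference to a released RCW remains.
    public static void Release<T>(ref T comObject) where T : class
    {
        T local = comObject;
        comObject = null; // clears the caller's field or variable

        // Guard so the helper is harmless for null and non-COM objects.
        if (local != null && Marshal.IsComObject(local))
        {
            Marshal.FinalReleaseComObject(local);
        }
    }
}
```

Inside a class like ImageReader, usage would look like `ComReleaser.Release(ref _layout);` followed by the same call for `_image`, `_images`, and finally `_document`, so children are released before the document that owns them.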
For what it's worth, running this code outside of IIS, in say a Console App, everything seems bulletproof. It works every time.
Any tips that help me diagnose and solve this issue would be an immense help and I'll upvote like crazy! ;-)
Thanks!
Comments (4)
Have you thought of hosting the OCR portion of your app out-of-process?
Having a service can give you tons of flexibility:
Personally, I have found in the past that COM interop + IIS = grief.
MODI is incredibly wonky when it comes to getting rid of itself, especially running in IIS. In my experience, I've found that although it slows everything down, the only way to get rid of these errors is to add a GC.WaitForPendingFinalizers() after your GC.Collect() call. If you're interested, I wrote an article about this.
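As a rough illustration of the collect-then-wait sequence mentioned above: `GC.Collect()` only queues finalizable objects, and `GC.WaitForPendingFinalizers()` blocks until the finalizer thread has run them. The names `GcFlush` and `FinalizerProbe` are made up for this sketch, and the trailing second `GC.Collect()` is a commonly used addition, not something the answer itself prescribes.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Stand-in for an object with a finalizer (a COM wrapper behaves similarly).
public sealed class FinalizerProbe
{
    public static int FinalizedCount;

    ~FinalizerProbe()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class GcFlush
{
    // Collect, let the finalizer thread drain its queue, then collect again
    // to reclaim the objects whose finalizers have just run.
    public static void CollectAndWait()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }

    // Allocates in a separate, non-inlined method so no reference to the
    // probe stays alive in the caller's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void AllocateProbe()
    {
        new FinalizerProbe();
    }
}
```

Calling `GcFlush.AllocateProbe()` and then `GcFlush.CollectAndWait()` gives the probe's finalizer a chance to run before the next line executes. The cost is a full blocking collection on every call, which matches the answer's warning that this slows everything down.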
Can you replicate the problem in a small console application? Perhaps let it sleep for 30 minutes and then come back to it?
The best way to solve things like this is to isolate the problem completely. I'd be interested to see how that works.
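A small harness along these lines could drive such an isolation test. It is only a sketch: `IdleRepro` is a hypothetical name, and `ocrCall` is a placeholder for whatever code invokes the OCR wrapper. It runs the call, idles, runs it again, and records whether each round succeeded or which exception it threw.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class IdleRepro
{
    // Invokes ocrCall `rounds` times with an idle pause between rounds and
    // returns one log line per round instead of stopping at the first failure.
    public static List<string> Run(Func<string> ocrCall, TimeSpan idle, int rounds)
    {
        var log = new List<string>();
        for (int round = 1; round <= rounds; round++)
        {
            try
            {
                string text = ocrCall() ?? string.Empty;
                log.Add("round " + round + ": ok, " + text.Length + " chars");
            }
            catch (Exception ex)
            {
                log.Add("round " + round + ": " + ex.GetType().Name + ": " + ex.Message);
            }

            if (round < rounds)
            {
                Thread.Sleep(idle); // e.g. TimeSpan.FromMinutes(30) for the reported case
            }
        }
        return log;
    }
}
```

If the console run stays clean across a 30-minute idle period while the IIS-hosted run does not, that points at the hosting environment (for example worker-process recycling or threading) rather than at the wrapper code itself.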
I had to deal with this error a week ago, and after testing some of the solutions given here, I finally resolved the problem. I'll explain here how I did it.
In my case I have a Windows service running and processing documents from a folder. The problem occurs when there are more than 20 documents, throwing the error: Exception from HRESULT: 0x80010105 (RPC_E_SERVERFAULT).
In my code I was calling a method each time I detected a document in the folder. In that method I created an instance of the MODI document (MODI.Document _document = new MODI.Document();) and processed the file, and that was what caused the error!
The solution was to have just one global instance of MODI.Document and process all documents with it. This way my service only ever has one instance running.
I hope that will help those who are facing the same problem.
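The single-instance idea can be sketched generically as below. `SharedInstance<T>` is a hypothetical helper, not code from the answer: it creates the object once on first use and serializes all access through a lock. Whether one MODI.Document can safely be reused across threads depends on its COM threading model, which the answer does not discuss, so treat that part as an open assumption.

```csharp
using System;

public sealed class SharedInstance<T> where T : class
{
    private readonly object _gate = new object();
    private readonly Func<T> _factory;
    private T _instance;

    public SharedInstance(Func<T> factory)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        _factory = factory;
    }

    // Runs `work` against the single shared instance, creating it on first
    // use. The lock ensures only one caller touches the instance at a time.
    public TResult Use<TResult>(Func<T, TResult> work)
    {
        lock (_gate)
        {
            if (_instance == null)
            {
                _instance = _factory();
            }
            return work(_instance);
        }
    }
}
```

In the service described above, this might be held in a static field such as `new SharedInstance<MODI.Document>(() => new MODI.Document())`, with each file processed inside a `Use` call.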