Delphi / Tesseract OCR: Can someone help me get this new DLL working in Delphi?

Posted 2024-09-30 06:59:42

There is this great open-source OCR component that Google has been developing:
http://code.google.com/p/tesseract-ocr/

They released a new version (version 3) at the beginning of October 2010.

But this new version no longer has a working C wrapper, and it's up to somebody in the Delphi community to get it working from inside Delphi. I'm trying to do it because I badly need it and nobody else is in a hurry to do it, but I have no idea what I'm doing when it comes to DLLs and converting C to Delphi. That's where I could use your help.

The clues I have picked up are that I need Dependency Walker to somehow get around 'name mangling' (no idea what that means).
The actual DLL API functions are declared in the C files, and presumably the function names you see in Dependency Walker will match the functions in the API file.

Here's everything you'll need to help:
You will need a folder containing tessdll.dll; leptonlib.dll just needs to be there as well. You'll also need a subfolder called 'tessdata', which holds your 'language data files' - [check the downloads page on the site]

Here's the Windows installer just so you can see the DLL in action:
[check the downloads page on the site]

To get this working for Delphi, you'll have your executable in the same folder as the DLL.
You'll then need to know what to call in the DLL, and for that you can look in the C Source files:
[check the source files on the downloads page on the site]

Thanks for any assistance.

Comments (3)

dawn曙光 2024-10-07 06:59:42

At first glance this could be difficult. Since the API is apparently encapsulated in a C++ class, the only clean way to do it would be:

Implement a wrapper DLL in C++ that exposes a flattened C interface of the class, so that you can write a Delphi unit to use it.

The principle is outlined here:

http://rvelthuis.de/articles/articles-cppobjs.html

Directly using the C++ API would require some clever assembler hacking. Name mangling is not the only problem here; there is also the calling convention of the C++ compiler that was used to create the DLL (Visual C++ 2008 Express).

So somebody has to write a DLL with a C API using Visual C++ 2008 Express first.

Some clarification concerning your comments:

When you want to use an external library in your application, you need to know which symbols to import.

A normal symbol would be 'SetDllDirectoryW' in kernel32.dll. Importing that in Delphi is no problem, but C++ normally uses a more contrived way to name its symbols. An example would be '_ZN9wikipedia7article6formatE' (taken from this article: http://en.wikipedia.org/wiki/Name_mangling)

While it is possible to import a mangled symbol, that's only a minor part of the problem.

You can tell the C++ compiler not to mangle names by wrapping the declarations in an extern "C" { ... } block.

There are still at least two additional problems:

  • You have no way to determine the size of a C++ object instance from Delphi
  • All methods of a C++ object take a hidden this argument (like Self in Delphi)

These problems can be circumvented by writing a wrapper, as explained in Rudy's article.
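To make the hidden this argument concrete, here is a minimal sketch. `Counter`, `Counter_Add` and `DemoHiddenThis` are hypothetical names used only for illustration and have nothing to do with Tesseract. The flattened function does the same work as the method, but takes the object address as an explicit first parameter, which is the shape Delphi can call.

```cpp
#include <cassert>

// Hypothetical class standing in for a C++ API class.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    // Implicitly receives 'this', the address of the object being used.
    int Add(int n) { value_ += n; return value_; }
private:
    int value_;
};

// Flattened equivalent: the hidden 'this' becomes an explicit parameter,
// and extern "C" keeps the exported name unmangled.
extern "C" int Counter_Add(Counter* self, int n) {
    return self->Add(n);
}

// Calls the method both ways on the same object.
int DemoHiddenThis() {
    Counter c(10);
    int viaMethod = c.Add(5);          // method call, 'this' passed implicitly
    int viaFlat = Counter_Add(&c, 7);  // flattened call, pointer passed explicitly
    assert(viaFlat == viaMethod + 7);
    return viaFlat;
}
```

Because the caller only ever holds a pointer, it never needs to know how large the object is; allocation stays on the C++ side.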

You have to write a simple C++ DLL that exports a plain C API (no mangling, only ordinary C functions). In pseudo-code it looks like this:

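extern "C" {

// Create the object and hand it out as an opaque pointer.
void* MakeAnInstanceOfDesiredClass(void)
{
    return new DesiredClass();
}

// Cast back before deleting; deleting through a void* is undefined behavior.
void DestroyInstanceOfDesiredClass(void* instance)
{
    delete static_cast<DesiredClass*>(instance);
}

// The instance pointer plays the role of the hidden 'this' argument.
int SomeMethodOfDesiredClass(void* instance)
{
    return static_cast<DesiredClass*>(instance)->SomeMethod();
}

}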
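For anyone who wants to try the pattern before touching Tesseract, the following is a self-contained sketch under stated assumptions. `DesiredClass` here is a stub, the `Desired_*` names are hypothetical, and the convention of returning -1 on failure is my own choice, not something the library defines. The try/catch guards are an addition to the pseudo-code above: a C++ exception should not be allowed to escape through a C interface into a Delphi caller.

```cpp
#include <cassert>
#include <stdexcept>

// Stub standing in for the real C++ class (hypothetical).
class DesiredClass {
public:
    int SomeMethod() { return ++calls_; }
    int FailingMethod() { throw std::runtime_error("simulated failure"); }
private:
    int calls_ = 0;
};

extern "C" {

// Returns an opaque handle, or a null pointer if construction fails.
void* Desired_Create(void) {
    try { return new DesiredClass(); }
    catch (...) { return nullptr; }
}

// Safe to call with a null handle; deleting a null pointer does nothing.
void Desired_Destroy(void* instance) {
    delete static_cast<DesiredClass*>(instance);
}

// Returns the method result, or -1 on a null handle or an exception.
int Desired_SomeMethod(void* instance) {
    if (!instance) return -1;
    try { return static_cast<DesiredClass*>(instance)->SomeMethod(); }
    catch (...) { return -1; }
}

// Same pattern for a method that throws: the exception stays inside.
int Desired_FailingMethod(void* instance) {
    if (!instance) return -1;
    try { return static_cast<DesiredClass*>(instance)->FailingMethod(); }
    catch (...) { return -1; }
}

}

// Walks through the create / call / destroy cycle a Delphi unit would follow.
int DemoWrapperLifecycle() {
    void* handle = Desired_Create();
    assert(handle != nullptr);
    Desired_SomeMethod(handle);
    int second = Desired_SomeMethod(handle);
    Desired_Destroy(handle);
    return second;
}
```

On the Delphi side, each of these functions would be declared with the matching calling convention (cdecl is the default for extern "C" functions in Visual C++) and the handle would be held as a plain Pointer.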

I would give it a try, but my internet connection is quite slow and I don't have Visual Studio here, sorry.

森末i 2024-10-07 06:59:42

Actually, after taking a closer look at the documentation, there might be a subset of functions that still form a C API and are thus directly accessible from Delphi:

BOOL APIENTRY  DllMain (HANDLE hModule, DWORD ul_reason_for_call, LPVOID lpReserved) 
TESSDLL_API void __cdecl  TessDllRelease () 
TESSDLL_API void *__cdecl  TessDllInit (const char *lang) 
TESSDLL_API int __cdecl  TessDllBeginPageBPP (uinT32 xsize, uinT32 ysize, unsigned char *buf, uinT8 bpp) 
TESSDLL_API int __cdecl  TessDllBeginPageLangBPP (uinT32 xsize, uinT32 ysize, unsigned char *buf, const char *lang, uinT8 bpp) 
TESSDLL_API int __cdecl  TessDllBeginPageUprightBPP (uinT32 xsize, uinT32 ysize, unsigned char *buf, const char *lang, uinT8 bpp) 
TESSDLL_API int __cdecl  TessDllBeginPage (uinT32 xsize, uinT32 ysize, unsigned char *buf) 
TESSDLL_API int __cdecl  TessDllBeginPageLang (uinT32 xsize, uinT32 ysize, unsigned char *buf, const char *lang) 
TESSDLL_API int __cdecl  TessDllBeginPageUpright (uinT32 xsize, uinT32 ysize, unsigned char *buf, const char *lang) 
TESSDLL_API void __cdecl  TessDllEndPage (void) 
TESSDLL_API ETEXT_DESC *__cdecl  TessDllRecognize_a_Block (uinT32 left, uinT32 right, uinT32 top, uinT32 bottom) 
TESSDLL_API ETEXT_DESC *__cdecl  TessDllRecognize_all_Words (void) 
TESSDLL_API void __cdecl  ReleaseRecognize () 
TESSDLL_API void *__cdecl  InitRecognize () 
TESSDLL_API int __cdecl  CreateRecognize (uinT32 xsize, uinT32 ysize, unsigned char *buf) 
TESSDLL_API ETEXT_DESC *__cdecl  reconize_a_word (uinT32 left, uinT32 right, uinT32 top, uinT32 bottom) 

I don't know if these functions are enough, but they are directly accessible.
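If these exports are usable, binding to them at run time could look roughly like the sketch below. It is written in C++ to match the declarations above; in Delphi the equivalent would be function declarations marked cdecl and external 'tessdll.dll'. Several things are assumptions: that uinT32 is a 32-bit unsigned integer, that the exported names match the declared names exactly (Dependency Walker would confirm this), and that ETEXT_DESC can be treated as opaque until its layout is translated. `TessApi`, `BindTessApi`, `SymbolResolver` and the `Fake*` stand-ins are hypothetical helpers, not part of Tesseract. On Windows the resolver would wrap GetProcAddress on a module handle obtained from LoadLibrary.

```cpp
#include <cassert>
#include <cstring>

#if defined(_WIN32)
#define TESS_CDECL __cdecl
#else
#define TESS_CDECL  // calling-convention keyword only exists on Windows compilers
#endif

struct ETEXT_DESC;            // opaque here; real layout is in the Tesseract headers
typedef unsigned int uinT32;  // assumption: 32-bit unsigned, as the name suggests

// Function-pointer types mirroring three of the listed declarations.
typedef int (TESS_CDECL *TessDllBeginPageFn)(uinT32 xsize, uinT32 ysize, unsigned char* buf);
typedef ETEXT_DESC* (TESS_CDECL *TessDllRecognizeAllWordsFn)(void);
typedef void (TESS_CDECL *TessDllEndPageFn)(void);

struct TessApi {
    TessDllBeginPageFn beginPage;
    TessDllRecognizeAllWordsFn recognizeAllWords;
    TessDllEndPageFn endPage;
};

// Maps an export name to its address, or null if the export is missing.
typedef void (*GenericFn)(void);
typedef GenericFn (*SymbolResolver)(const char* name);

// Fills the table; returns false if any export could not be resolved.
bool BindTessApi(SymbolResolver resolve, TessApi* api) {
    api->beginPage = reinterpret_cast<TessDllBeginPageFn>(resolve("TessDllBeginPage"));
    api->recognizeAllWords =
        reinterpret_cast<TessDllRecognizeAllWordsFn>(resolve("TessDllRecognize_all_Words"));
    api->endPage = reinterpret_cast<TessDllEndPageFn>(resolve("TessDllEndPage"));
    return api->beginPage && api->recognizeAllWords && api->endPage;
}

// Stand-ins so the binding logic can be exercised without the real DLL.
static int TESS_CDECL FakeBeginPage(uinT32, uinT32, unsigned char*) { return 1; }
static ETEXT_DESC* TESS_CDECL FakeRecognizeAllWords(void) { return nullptr; }
static void TESS_CDECL FakeEndPage(void) {}

static GenericFn FakeResolve(const char* name) {
    if (std::strcmp(name, "TessDllBeginPage") == 0)
        return reinterpret_cast<GenericFn>(&FakeBeginPage);
    if (std::strcmp(name, "TessDllRecognize_all_Words") == 0)
        return reinterpret_cast<GenericFn>(&FakeRecognizeAllWords);
    if (std::strcmp(name, "TessDllEndPage") == 0)
        return reinterpret_cast<GenericFn>(&FakeEndPage);
    return nullptr;
}

// Simulates a DLL in which none of the expected exports are present.
static GenericFn MissingResolve(const char*) { return nullptr; }

bool DemoBind() {
    TessApi api;
    bool ok = BindTessApi(&FakeResolve, &api);
    assert(ok);
    return ok && api.beginPage(640, 480, nullptr) == 1;
}
```

Checking every pointer before use means a mismatch between the header and the actual exports shows up as a failed bind rather than a crash on the first call.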

梦晓ヶ微光ヅ倾城 2024-10-07 06:59:42

Converting the code that Jens quoted should not be too hard. You can try a C-to-Delphi converter (http://www.drbob42.com/delphi/headconv.htm, http://cc.embarcadero.com/item/26951). Beware that they can only convert 60-80% of the code, so manual work follows. If you are still stuck after all this, search for an existing VB conversion of the header. Converting from that will be much easier than converting from C, especially since the VB2Delphi converter can probably do it without manual work afterwards (http://www.marcocantu.com/tools/vb2delphi.htm).
