Load a dll from a dll?
What's the best way to load a dll from a dll?
My problem is that I can't load a dll on DLL_PROCESS_ATTACH, and I can't load it from the main program, because I don't control the main program's source. For the same reason, I can't call a non-DllMain function either.
Comments (4)
After all the debate that went on in the comments, I think that it's better to summarize my positions in a "real" answer.
First of all, it's still not clear why you need to load a dll in DllMain with LoadLibrary. This is definitely a bad idea, since your DllMain is running inside another call to LoadLibrary, which holds the loader lock, as explained by the documentation of DllMain:
(emphasis added)
So much for why it is forbidden. For a clearer, more in-depth explanation, see this and this; for other examples of what can happen if you don't stick to these rules in DllMain, see also some posts on Raymond Chen's blog.
Now, on Rakis' answer.
As I have already repeated several times, what you think of as DllMain isn't the real entry point of the dll; it's just a function called by the real entry point. That entry point, in turn, is automatically taken over by the CRT to perform its additional initialization/cleanup tasks, among which is the construction of global objects and of the static fields of classes (from the compiler's perspective these are almost the same thing). After it completes such tasks (or before, for the cleanup), it calls your DllMain.
It goes roughly like this (obviously I didn't write all the error-checking logic; it's just to show how it works):
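A minimal, portable sketch of that flow. It deliberately avoids the real Windows types: real_entry_point, user_dll_main and the Reason enum are hypothetical stand-ins for the CRT's actual entry point, your DllMain, and the DLL_PROCESS_ATTACH / DLL_PROCESS_DETACH codes, and the trace vector exists only to make the ordering visible.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for DLL_PROCESS_ATTACH / DLL_PROCESS_DETACH.
enum class Reason { ProcessAttach, ProcessDetach };

// Records the order in which things run, for illustration only.
std::vector<std::string> trace;

// Stand-ins for the CRT's own work: constructing and destroying globals.
void run_global_constructors() { trace.push_back("global constructors"); }
void run_global_destructors()  { trace.push_back("global destructors"); }

// Stand-in for the function you write and call "DllMain".
bool user_dll_main(Reason) {
    trace.push_back("user DllMain");
    return true;
}

// Stand-in for the real entry point that the loader calls.
bool real_entry_point(Reason reason) {
    if (reason == Reason::ProcessAttach) {
        run_global_constructors();      // CRT initialization comes first
        return user_dll_main(reason);   // only then is "your" DllMain called
    }
    bool ok = user_dll_main(reason);    // on detach, your code runs first
    run_global_destructors();           // CRT cleanup comes last
    return ok;
}
```

Both the global constructors and the user function run inside the same call from the loader, which is why moving work from one to the other changes nothing with respect to the loader lock.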
There isn't anything special about this: the same happens with normal executables, where your main is called by the real entry point, which the CRT reserves for exactly the same purposes.
From this it should be clear why Rakis' solution isn't going to work: the constructors of global objects are called by the real DllMain (i.e. the actual entry point of the dll, which is the one the MSDN page on DllMain talks about), so calling LoadLibrary from there has exactly the same effect as calling it from your fake DllMain.
Thus, by following that advice you get the same negative effects as calling LoadLibrary directly in DllMain, and you also hide the problem in a seemingly unrelated place, which will make the next maintainer work hard to find where the bug is.
As for delayload: it may be an idea, but you must be really careful not to call any function of the referenced dll in your DllMain. If you did, you would trigger a hidden call to LoadLibrary, which would have the same negative effects as calling it directly.
Anyhow, in my opinion, if you need to refer to some functions in a dll, the best option is to link statically against its import library, so that the loader loads it automatically without giving you any trouble and automatically resolves any strange dependency chain that may arise.
Even in this case you mustn't call any function of this dll in DllMain, since it's not guaranteed that it has already been initialized. In DllMain you can rely only on kernel32 being loaded, and maybe on dlls that you are absolutely sure your caller had already loaded before it issued the LoadLibrary that is loading your dll. Even then you shouldn't rely on this, because your dll may also be loaded by applications that don't match these assumptions and just want to, e.g., load a resource from your dll without calling your code.
As pointed out by the article I linked before,
(again, emphasis added)
By the way, on the Linux vs Windows question: I'm not a Linux system programming expert, but I don't think things are so different there in this respect.
There are still some equivalents of DllMain (the _init and _fini functions), which are - what a coincidence! - automatically taken over by the CRT, which in turn, from _init, calls all the constructors of the global objects and the functions marked with __attribute__((constructor)) (which are roughly the equivalent of the "fake" DllMain provided to the programmer in Win32). A similar process goes on with destructors in _fini.
Since _init too is called while the loading of the library is still taking place (dlopen hasn't returned yet), I think you're subject to similar limitations on what you can do in there. Still, in my opinion the problem is felt less on Linux, because (1) you have to explicitly opt in to a DllMain-like function, so you aren't immediately tempted to abuse it, and (2) Linux applications, as far as I have seen, tend to make less use of dynamic loading of libraries.
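A small sketch of the Linux side, assuming GCC or Clang (the __attribute__((constructor)) syntax is a compiler extension and MSVC does not accept it). The names init_calls, on_load and Global are hypothetical. In a shared object both hooks would run inside dlopen, before it returns; in a plain executable, as here, they run before main.

```cpp
// Counter with constant initialization: it is already 0 before any of the
// hooks below run.
int init_calls = 0;

// Runs at load time, before main() (or before dlopen() returns when this
// code is built into a shared object).
__attribute__((constructor))
void on_load() { ++init_calls; }

// The constructor of a global object is run by the same startup machinery.
struct Global {
    Global() { ++init_calls; }
};
Global global_instance;

// By the time ordinary code runs, both hooks have already fired.
bool both_hooks_ran() { return init_calls == 2; }
```

The relative order of the two hooks is left unspecified here on purpose; the point is only that neither of them is called explicitly by the programmer.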
In a nutshell
No "correct" method will allow you to reference any dll other than kernel32.dll in DllMain.
Thus, don't do anything important from DllMain, neither directly (i.e. in "your" DllMain called by the CRT) nor indirectly (in the constructors of global objects or static fields). Above all, don't load other dlls, again neither directly (via LoadLibrary) nor indirectly (by calling functions in delay-loaded dlls, which triggers a LoadLibrary call).
The right way to have another dll loaded as a dependency is to - doh! - mark it as a static dependency. Just link against its import library and reference at least one of its functions: the linker will add it to the dependency table of the executable image, and the loader will load it automatically (initializing it before or after the call to your DllMain; you don't need to know which, because you mustn't call it from DllMain).
If this isn't viable for some reason, there's still the delayload option (with the limits I described before).
If you still, for some unknown reason, have the inexplicable need to call LoadLibrary in DllMain, well, go ahead and shoot yourself in the foot; it's your prerogative. But don't tell me I didn't warn you.
I was forgetting: another fundamental source of information on the topic is Microsoft's Best Practices for Creating DLLs document, which talks almost exclusively about the loader, DllMain, the loader lock and their interactions; have a look at it for additional information.
Addendum
Which *is* an answer to your question: under the conditions you imposed, you can't do what you want. In a nutshell of a nutshell, from DllMain you can't call *anything other than kernel32 functions*. Period.
You should, instead, because understanding why the rules are made in that way makes you avoid big mistakes.
No, my dear, the loader does its job correctly, because *after* LoadLibrary has returned, all the dependencies are loaded and everything is ready to be used. The loader tries to call the DllMain in dependency order (to avoid problems with broken dlls which rely on other dlls in DllMain), but there are cases in which this is simply impossible.
For example, there may be two dlls (say, A.dll and B.dll) that depend on each other: whose DllMain should be called first? If the loader initialized A.dll first, and A.dll called a function in B.dll from its DllMain, anything could happen, since B.dll isn't initialized yet (its DllMain hasn't been called). The same applies if we reverse the situation.
Similar problems may arise in other cases, so the simple rule is: don't call any external functions in DllMain. DllMain is just for initializing the internal state of your dll.
This discussion is going like this: you say "I want to solve an equation like x^2+1=0 in the real domain". Everybody tells you that it's not possible; you say that this is not an answer, and blame the math.
Someone tells you: hey, you can, here's a trick, the solution is just +/-sqrt(-1). Everybody downvotes this answer (because it's wrong for your question: we're going outside the real domain), and you blame the downvoters. I explain to you why that solution is not correct for your question and why this problem can't be solved in the real domain. You say that you don't care why it can't be done, that you can only work in the real domain, and again blame the math.
Now, since, as explained and restated a million times, your question has no solution under your conditions, can you explain to us why on earth you "have" to do such an idiotic thing as loading a dll in DllMain? Often "impossible" problems arise because we've chosen a strange route to solve another problem, and that route leads to a dead end. If you explained the bigger picture, we could suggest a better solution that doesn't involve loading dlls in DllMain.
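The ordering problem can be illustrated with a small, portable sketch. It is not how the Windows loader is implemented; it only shows that "initialize every dependency first" has an answer for a chain of dlls and has none for two dlls that depend on each other. Graph, visit and init_order are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each module to the modules it depends on.
using Graph = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: place every dependency before the module itself.
// Returns false as soon as a dependency cycle is found.
bool visit(const Graph& deps, const std::string& name,
           std::set<std::string>& in_progress, std::set<std::string>& done,
           std::vector<std::string>& order) {
    if (done.count(name)) return true;
    if (in_progress.count(name)) return false;  // we came back to ourselves
    in_progress.insert(name);
    auto it = deps.find(name);
    if (it != deps.end()) {
        for (const auto& dep : it->second) {
            if (!visit(deps, dep, in_progress, done, order)) return false;
        }
    }
    in_progress.erase(name);
    done.insert(name);
    order.push_back(name);
    return true;
}

// Fills `order` with an initialization order in which dependencies come
// first. Returns false if no such order exists; `order` is then incomplete.
bool init_order(const Graph& deps, std::vector<std::string>& order) {
    order.clear();
    std::set<std::string> in_progress, done;
    for (const auto& entry : deps) {
        if (!visit(deps, entry.first, in_progress, done, order)) return false;
    }
    return true;
}
```

For a chain where A.dll needs B.dll and B.dll needs C.dll, the function yields C.dll, B.dll, A.dll. For A.dll and B.dll depending on each other it returns false: whichever one goes first, its DllMain would run while the other is still uninitialized.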
The one that is present (obviously I'm assuming you're compiling for 32 bit); if an exported function needed by your application isn't present in the found dll, your dll is simply not loaded (LoadLibrary fails).
Addendum (2)
Adding the dll as a static dependency (which has been suggested since the beginning) makes the loader load it exactly as Linux/Mac do, but the problem is still there, since, as I explained, in DllMain you still cannot rely on anything other than kernel32.dll (even if the loader is in general intelligent enough to initialize the dependencies first).
Still, the problem can be solved. Create the thread (the one that actually calls LoadLibrary to load your dll) with CreateRemoteThread. In DllMain, use some IPC method (for example named shared memory, whose handle is saved somewhere so that it can be closed in the init function) to pass to the injector program the address of the "real" init function that your dll provides. DllMain then exits without doing anything else. The injector application, meanwhile, waits for the remote thread to end with WaitForSingleObject, using the handle returned by CreateRemoteThread. Once the remote thread has ended (so LoadLibrary has completed and all the dependencies have been initialized), the injector reads the address of the init function in the remote process from the named shared memory created by DllMain, and starts it with CreateRemoteThread.
Problem: on Windows 2000 using named objects from DllMain is prohibited because
So this address may have to be passed in another way. A quite clean solution would be to create a shared data segment in the dll, load the dll both in the injector application and in the target, and have the dll store the required address in that segment. The dll would obviously have to be loaded first in the injector and then in the target, because otherwise the "correct" address would be overwritten.
Another really interesting method is to write a little function (directly in assembly) into the other process's memory that calls LoadLibrary and returns the address of our init function; since we wrote it there, we know where it is and can call it with CreateRemoteThread.
In my opinion, this is the best approach, and also the simplest, since the code is already there, written in this nice article. Have a look at it: it is quite interesting and will probably do the trick for your problem.
The most robust way is to link the first DLL against the import lib of the second. This way, the actual loading of the second DLL will be done by Windows itself. Sounds very trivial, but not everyone knows that DLLs can link against other DLLs. Windows can even deal with cyclic dependencies. If A.DLL loads B.DLL which needs A.DLL, the imports in B.DLL are resolved without loading A.DLL again.
I suggest using the delay-loading mechanism. The DLL will be loaded the first time you call an imported function. You can also customize the load function and the error handling. See Linker Support for Delay-Loaded DLLs for more info.
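Conceptually, a delay-loaded import behaves like a stub that resolves the real function on its first call and caches the pointer. The portable sketch below only models that behavior: load_add is a hypothetical stand-in for the LoadLibrary and GetProcAddress work, and real_add stands in for a function living in the other dll. With MSVC the actual mechanism is enabled with the /DELAYLOAD linker option and the helper in delayimp.lib, not written by hand like this.

```cpp
// Signature of the imported function (hypothetical).
using AddFn = int (*)(int, int);

// Stand-in for the function that lives in the other dll.
int real_add(int a, int b) { return a + b; }

// Counts how many times the "library" was loaded.
int load_count = 0;

// Stand-in for the LoadLibrary + GetProcAddress work done on first use.
AddFn load_add() {
    ++load_count;
    return &real_add;
}

// What callers see: the target is resolved on the first call only, and the
// pointer is reused afterwards.
int add_lazy(int a, int b) {
    static AddFn fn = load_add();  // runs once, on the first call
    return fn(a, b);
}
```

Nothing is loaded until add_lazy is first called, so the load happens wherever that first call happens to be made.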
One possible answer is through the use of LoadLibrary and GetProcAddress to access pointers to functions found/located inside the loaded dll - but your intentions/needs aren't clear enough to determine if this is a suitable answer.