How do I calculate the full buffer size for GetModuleFileName?
GetModuleFileName() takes a buffer and the size of that buffer as input; however, its return value can only tell us how many characters it has copied, and whether the size was insufficient (ERROR_INSUFFICIENT_BUFFER).
How do I determine the real required buffer size to hold the entire file name for GetModuleFileName()?
Most people use MAX_PATH, but I remember the path can exceed that (260 by default definition)...
(The trick of using zero as the size of the buffer does not work for this API. I've already tried it.)
Comments (9)
The usual recipe is to call it with the size set to zero; the call is guaranteed to fail and to report the size needed to allocate a sufficient buffer. Allocate a buffer (don't forget room for nul-termination) and call it a second time.
In a lot of cases MAX_PATH is sufficient because many file systems restrict the total length of a path name. However, it is possible to construct legal and useful file names that exceed MAX_PATH, so it is probably good advice to query for the required buffer.
Don't forget to eventually return the buffer to the allocator that provided it.
Edit: Francis points out in a comment that the usual recipe doesn't work for GetModuleFileName(). Unfortunately, Francis is absolutely right on that point, and my only excuse is that I didn't look it up to verify before providing a "usual" solution.
I don't know what the author of that API was thinking, except that it is possible that when it was introduced, MAX_PATH really was the largest possible path, making the correct recipe easy: simply do all file name manipulation in a buffer of length no less than MAX_PATH characters.
Oh, yeah, don't forget that path names since 1995 or so allow Unicode characters. Because Unicode takes more room, any path name can be preceded by \\?\ to explicitly request that the MAX_PATH restriction on its byte length be dropped for that name. This complicates the question.
MSDN discusses path length in the article titled File Names, Paths, and Namespaces.
So an easy answer would be to allocate a buffer of size MAX_PATH, retrieve the name and check for errors. If it fit, you are done. Otherwise, if it begins with "\\?\", get a buffer of size 64KB or so (the phrase "maximum path of 32,767 characters is approximate" in that article is a tad troubling here, so I'm leaving some details for further study) and try again.
Overflowing MAX_PATH but not beginning with "\\?\" appears to be a "can't happen" case. Again, what to do then is a detail you'll have to deal with.
There may also be some confusion over what the path length limit is for a network name which begins with "\\Server\Share\", not to mention names from the kernel object name space which begin with "\\.\". The above article does not say, and I'm not certain whether this API could return such a path.
Implement some reasonable strategy for growing the buffer: start with MAX_PATH, then make each successive size 1.5 times (or 2 times, for fewer iterations) bigger than the previous one. Iterate until the function succeeds.
Using _pgmptr might work.
From the documentation of GetModuleFileName:
But if I read about _pgmptr:
Does anyone know how _pgmptr is initialized? If SO had support for follow-up questions, I would have posted this as a follow-up.
While the API is proof of bad design, the solution is actually very simple. Simple, yet sad that it has to be this way, because it is somewhat of a performance hog: it might require multiple memory allocations. Here are some key points of the solution:
You can't really rely on the return value across Windows versions, as it can have different semantics on different versions (XP, for example).
If the supplied buffer is too small to hold the string, the return value is the number of characters including the 0-terminator.
If the supplied buffer is large enough to hold the string, the return value is the number of characters excluding the 0-terminator.
This means that if the returned value exactly equals the buffer size, you still don't know whether it succeeded or not. There might be more data. Or not. In the end you can only be certain of success if the buffer length is actually greater than required. Sadly...
So, the solution is to start off with a small buffer. We then call GetModuleFileName, passing the exact buffer length (in TCHARs), and compare the return value with it. If the return value is less than our buffer length, the call succeeded. If the return value is greater than or equal to our buffer length, we have to try again with a larger buffer. Rinse and repeat until done. When done, we make a string copy (strdup/wcsdup/tcsdup) of the buffer, clean up, and return the string copy. This string will have the right allocation size rather than the likely overhead of our temporary buffer. Note that the caller is responsible for freeing the returned string (strdup/wcsdup/tcsdup allocate with malloc).
See below for an implementation and usage code example. I have been using this code for over a decade now, including in enterprise document management software where there can be a lot of quite long paths. The code can of course be optimized in various ways, for example by first loading the returned string into a local buffer (TCHAR buf[256]). If that buffer is too small, you can then start the dynamic allocation loop. Other optimizations are possible, but that's beyond the scope here.
Implementation and usage example:
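A minimal sketch of the loop described above (not the author's original listing) might look like this. To keep it runnable outside Windows, the API call is passed in as a function pointer; the names NameQuery, dup_module_name and query_this_exe are hypothetical, and the 64-character starting size and 65536-character cap are arbitrary assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cwchar>

#ifdef _WIN32
#define NOMINMAX
#include <windows.h>
#endif

// Hypothetical stand-in for GetModuleFileNameW bound to one module:
// fills buf (capacity `size` characters) and returns the character count.
typedef std::size_t (*NameQuery)(wchar_t* buf, std::size_t size);

// Returns a malloc'ed, exactly sized copy of the name, or nullptr on failure.
// The caller must release the result with free().
wchar_t* dup_module_name(NameQuery query) {
    assert(query != nullptr);
    std::size_t size = 64;  // assumed small starting size, in characters
    wchar_t* buf = nullptr;
    for (;;) {
        wchar_t* bigger =
            static_cast<wchar_t*>(std::realloc(buf, size * sizeof(wchar_t)));
        if (bigger == nullptr) { std::free(buf); return nullptr; }
        buf = bigger;

        std::size_t n = query(buf, size);
        if (n == 0) { std::free(buf); return nullptr; }  // the call itself failed

        if (n < size) {
            // Strictly less than the buffer length: the whole name fit.
            wchar_t* exact =
                static_cast<wchar_t*>(std::malloc((n + 1) * sizeof(wchar_t)));
            if (exact != nullptr) {
                std::wmemcpy(exact, buf, n);
                exact[n] = L'\0';
            }
            std::free(buf);
            return exact;
        }

        // n >= size: possibly truncated, so retry with a larger buffer.
        if (size >= 65536) { std::free(buf); return nullptr; }  // assumed cap
        size *= 2;
    }
}

#ifdef _WIN32
// On Windows the query could simply forward to the real API.
static std::size_t query_this_exe(wchar_t* buf, std::size_t size) {
    return GetModuleFileNameW(nullptr, buf, static_cast<DWORD>(size));
}
#endif
```

On Windows, a call such as dup_module_name(query_this_exe) would return the path of the running executable; the result is released with free().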
Having said all that, I'd like to point out that you need to be very aware of various other caveats with GetModuleFileName(Ex). There are varying issues between 32-bit, 64-bit and WOW64. Also, the output is not necessarily a full, long path; it could very well be a short file name or be subject to path aliasing. I expect that when you use such a function, the goal is to provide the caller with a usable, reliable, full, long path. Therefore I suggest making sure that you really do return a usable, reliable, full, long absolute path, in such a way that it is portable between various Windows versions and architectures (again 32-bit, 64-bit and WOW64). How to do that efficiently is beyond the scope here.
While this is one of the worst Win32 APIs in existence, I wish you a lot of coding joy nonetheless.
My example is a concrete implementation of the "if at first you don't succeed, double the length of the buffer" approach. It retrieves the path of the running executable, using a string (actually a wstring, since I want to be able to handle Unicode) as the buffer. To determine when it has successfully retrieved the full path, it checks the value returned from GetModuleFileNameW against the value returned by wstring::length(), then uses that value to resize the final string in order to strip the extra null characters. If it fails, it returns an empty string.
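As a rough sketch of that description (an illustration under assumptions, not the answerer's code): the function below uses a std::wstring as the buffer, doubles it until the reported count is smaller than wstring::length(), and returns an empty string on failure. The API call is injected as a callable so the loop can run on any platform; PathQuery, path_via_wstring, running_executable_path and the 32768-character cap are hypothetical choices.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

#ifdef _WIN32
#define NOMINMAX
#include <windows.h>
#endif

// Hypothetical stand-in for GetModuleFileNameW(nullptr, buf, size).
using PathQuery = std::function<std::size_t(wchar_t* buf, std::size_t size)>;

std::wstring path_via_wstring(const PathQuery& query,
                              std::size_t max_len = 32768) {  // assumed cap
    assert(query != nullptr);
    std::wstring path(260, L'\0');  // 260 is the value of MAX_PATH
    for (;;) {
        std::size_t copied = query(&path[0], path.length());
        if (copied == 0) return std::wstring();  // failure: empty string
        if (copied < path.length()) {
            path.resize(copied);  // strip the extra null characters
            return path;
        }
        if (path.length() >= max_len) return std::wstring();  // give up
        path.resize(path.length() * 2, L'\0');  // double the buffer and retry
    }
}

#ifdef _WIN32
// On Windows the callable can forward straight to the real API.
std::wstring running_executable_path() {
    return path_via_wstring([](wchar_t* buf, std::size_t size) {
        return static_cast<std::size_t>(
            GetModuleFileNameW(nullptr, buf, static_cast<DWORD>(size)));
    });
}
#endif
```

Passing the call in as a parameter is only a convenience for exercising the loop with a fake; in real code the GetModuleFileNameW call could be written inline.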
Here is another solution with std::wstring:
Here's an implementation in Free Pascal (FPC)/Delphi in case anyone needs it:
Windows cannot properly handle paths longer than 260 characters, so just use MAX_PATH.
You cannot run a program whose path is longer than MAX_PATH.
My approach to this is to use argv, assuming you only want to get the file name of the running program. When you try to get the file name of a different module, the only safe way to do this without any other tricks has already been described; an implementation can be found here.
So far I have not had a case where argv didn't contain the file path (Win32 and Win32 console applications). But just in case, there is a fallback to the solution described above. It seems a bit ugly to me, but it still gets the job done.
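A small sketch of that idea, with hypothetical names: program_path takes the arguments of main plus a fallback callable, which stands in for one of the buffer-growing solutions above. Note that argv[0] is whatever the launching process supplied, so it may be a relative path or a bare file name rather than a full path.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Prefer argv[0]; if it is missing or empty, ask the fallback instead.
// `fallback` is assumed to wrap a GetModuleFileName-based routine.
std::string program_path(int argc, const char* const* argv,
                         const std::function<std::string()>& fallback) {
    assert(fallback != nullptr);
    if (argc > 0 && argv != nullptr && argv[0] != nullptr && argv[0][0] != '\0') {
        return std::string(argv[0]);
    }
    return fallback();
}
```

From main(int argc, char** argv) this could be called as program_path(argc, argv, some_fallback), where some_fallback is whichever routine the program already uses to query the module file name.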