How are the __addgs* intrinsics used, and what is GS?
On Microsoft's site can be found some details of the following intrinsic functions:
__addgsbyte ( offset, data )
__addgsword ( offset, data )
__addgsdword ( offset, data )
__addgsqword ( offset, data )
It is stated that offset is the offset from the beginning of GS. I presume that GS refers to the processor register.
How does GS relate to the stack, if at all? Alternatively, how can I calculate an offset with respect to GS?
(And are there any 'gotchas' relating to this and particular calling conventions, such as __fastcall?)
The GS register does not relate to the stack at all, and therefore has no relation to calling conventions. On x64 versions of Windows it is used to point to operating system data:
From Wikipedia:
Note that those intrinsics are only available in kernel mode (e.g. device drivers). To calculate an offset, you would need to know what segment of memory GS is pointing to. So in kernel mode you would need to know the layout of the Processor Control Region.
Personally I don't know what the use of these would be.
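To make the offset calculation mentioned above concrete, here is a minimal sketch in C. It assumes you already know the layout of the structure that GS points to; the struct demo_pcr below and its field names are hypothetical and are not the real Processor Control Region layout. The point is only that the offset argument is a plain byte offset into that structure, which offsetof can compute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL structure standing in for whatever GS points to.
   This is NOT the real Windows layout; it is for illustration only. */
struct demo_pcr {
    uint64_t self;             /* byte offset 0  */
    uint64_t current_thread;   /* byte offset 8  */
    uint32_t interrupt_count;  /* byte offset 16 */
    uint32_t flags;            /* byte offset 20 */
};

/* The value one would pass as the "offset" argument of a dword-sized
   GS-relative access to reach interrupt_count in this made-up layout. */
#define DEMO_INTERRUPT_COUNT_OFFSET \
    ((unsigned long)offsetof(struct demo_pcr, interrupt_count))

/* Compile-time check that the layout is what the comments claim. */
static_assert(offsetof(struct demo_pcr, interrupt_count) == 16,
              "unexpected layout for the hypothetical demo_pcr");
```

In real code the structure definition would have to come from the operating system's own headers or documentation rather than being written by hand.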
These intrinsics, along with their FS counterparts, have no real use except for accessing OS-specific data, so it's most likely that they were added purely to make Windows developers' lives easier. (I've personally used them for inline TLS access.)
The doc you linked says:
That means it's an intrinsic for this asm instruction:
add gs:[offset], data
(a normal memory-destination add with a GS segment override) with your choice of operand-size.
The compiler can presumably pick any addressing mode for the offset part, and data can be a register or immediate, thus either or both of offset and data can be runtime variables or constants.
The actual linear (virtual) address accessed will be gs_base + offset, where gs_base is set via an MSR to any address (by the OS on its own, or by you making a system call).
In user-space at least, Windows normally uses GS for TLS (thread local storage). The answer claiming this intrinsic only works in kernel code is mistaken. It doesn't add to the GS base, it adds to memory at an address relative to the existing GS base.
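As a rough illustration of that last point, the following portable C sketch models what a dword-sized GS-relative add does. It is only a model: the function model_addgsdword and its base/seg_size parameters are hypothetical, with an ordinary byte buffer playing the role of the memory at gs_base. The real intrinsic compiles to a single add instruction and takes no base argument, because the base comes from the GS segment register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of "add dword ptr gs:[offset], data".
   'base' stands in for gs_base (hypothetical parameter; the real GS base
   is set by the OS). The base itself is never modified: only the 4 bytes
   of memory at base + offset change. */
static void model_addgsdword(unsigned char *base, size_t seg_size,
                             size_t offset, uint32_t data)
{
    uint32_t value;

    /* Stay inside the buffer that models the segment. */
    assert(offset <= seg_size && seg_size - offset >= sizeof value);

    memcpy(&value, base + offset, sizeof value);  /* load from base + offset */
    value += data;                                /* wraps modulo 2^32 */
    memcpy(base + offset, &value, sizeof value);  /* store the sum back */
}
```

Calling model_addgsdword(buf, sizeof buf, 8, 2) on a buffer whose dword at byte offset 8 holds 40 leaves 42 at that offset and changes nothing else. Unlike the single instruction, this model is a separate load and store, so it says nothing about atomicity.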
MS only seems to document this intrinsic for x64, but it's a valid instruction in 32-bit mode as well. IDK why they'd bother to restrict it. (Except of course the qword form: 64-bit operand size isn't available in 32-bit mode.)
Perhaps the compiler doesn't know how to generally optimize __readgsdword / (operation on that data) / __writegsdword into a memory-destination instruction with the same gs:offset address. If so, this intrinsic would just be a code-size optimization.
But perhaps it's sometimes relevant to force the compiler to do it as a single instruction to make it atomic wrt. interrupts on this core (but not wrt. accesses by other CPU cores). IDK if that's an intended use case; this answer is just explaining what that one line of documentation means in terms of x86 asm and memory-segmentation.