AIO support on Linux
Does anyone know where I can get up-to-date information about the state of kernel support for AIO in the latest Linux kernel? Google searches bring up web pages that may be hopelessly out of date.
Edit:
More specifically, I am interested in non-file descriptors such as pipes and sockets. Material on the web indicates that these are not supported; is this still the case?
Edit 2:
What I am looking for is something similar to Windows overlapped I/O (OVERLAPPED).
Comments (3)
You don't need POSIX AIO (i.e. man aio) to use sockets and pipes asynchronously. According to man 3 aio it is not even possible. You should use non-blocking file descriptors instead, together with an event notification interface such as select(), poll(), or epoll. epoll is Linux-specific, but scales much better than the other two.
To use file descriptors in non-blocking mode you have to set the O_NONBLOCK flag on every file descriptor, for example with fcntl() and F_SETFL.
After a file descriptor is in non-blocking mode, I/O operations like read() and write() will never block; if the operation cannot be completed immediately, they fail with errno set to EAGAIN or EWOULDBLOCK. Some more specific operations, like connect(), have to be used in a different way in non-blocking mode; see the relevant man pages.
To be able to use non-blocking file descriptors correctly, your application needs to be event-driven. Basically, in main(), you first initialize things, then enter the event loop. The event loop repeatedly waits for events (using an event notification interface, e.g. epoll_wait()), then checks which events happened and responds to them.
Now when you do, say, a read() and it fails with EWOULDBLOCK, you add the file descriptor to the list of descriptors watched for readability; when the event provider indicates readability, you try again.
Similarly, if you try to write() and it fails with EWOULDBLOCK, you might want to buffer the data and try again when writability is indicated.
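As a rough illustration of the non-blocking approach described in this answer, here is a minimal C sketch. It assumes a pipe as the descriptor being watched; the helper names set_nonblocking() and pipe_epoll_demo() are made up for this example and are not part of any library. It shows a single pass through what would normally be a long-running event loop.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Put fd into non-blocking mode, keeping its other status flags. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical demo helper: read from a non-blocking pipe using epoll.
 * Returns the number of bytes read into out, or -1 on error.
 * *saw_eagain is set to 1 if the first read() would have blocked.
 */
ssize_t pipe_epoll_demo(char *out, size_t outlen, int *saw_eagain)
{
    int p[2];
    int epfd;
    ssize_t n = -1;
    struct epoll_event ev, got;

    *saw_eagain = 0;
    if (pipe(p) == -1)
        return -1;

    epfd = epoll_create1(0);
    if (epfd == -1 || set_nonblocking(p[0]) == -1)
        goto done;

    /* Watch the read end of the pipe for readability. */
    ev.events = EPOLLIN;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1)
        goto done;

    /* The pipe is empty: read() fails at once instead of blocking. */
    if (read(p[0], out, outlen) == -1 &&
        (errno == EAGAIN || errno == EWOULDBLOCK))
        *saw_eagain = 1;

    /* Stand-in for whatever produces data elsewhere in the program. */
    if (write(p[1], "hello", 5) != 5)
        goto done;

    /* One event-loop iteration: wait for readiness, then retry. */
    if (epoll_wait(epfd, &got, 1, 1000) == 1 && (got.events & EPOLLIN))
        n = read(got.data.fd, out, outlen);

done:
    if (epfd != -1)
        close(epfd);
    close(p[0]);
    close(p[1]);
    return n;
}
```

In a real program the epoll_wait() call would sit in a loop over many descriptors, and the write side would be handled the same way with EPOLLOUT and a buffer of pending data.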
There are two kinds of AIO under Linux.
One is kernel AIO. It is ugly and sometimes does not behave in accordance with the documentation (for example, under certain conditions it will run synchronously without you being able to do anything about it, and under certain conditions it will not properly cancel in-flight requests). It does not work on pipes.
These are the io_ kind of functions (io_setup(), io_submit(), io_getevents(), and so on). Note that you must link with -laio, and that libaio has to be installed separately on some systems (e.g. Debian/Ubuntu).
The second is a pure userland implementation (glibc) which spawns threads on demand to handle requests. It is well documented, works reasonably well, and according to the documentation it works with pretty much anything that is a file descriptor, including pipes.
These are the aio_ kind of functions (aio_read(), aio_write(), and so on). I would definitely recommend using these, even if they are an "uncool userland implementation"; they work nicely.
Both work with eventfd as a notification mechanism by now, by the way, though the kernel version was still undocumented last time I looked (but the function is in the headers).
Or, as Ambroz Bizjak pointed out, skip AIO altogether; for what you describe it is not strictly necessary.
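For the glibc aio_ functions, a small sketch may help. It assumes glibc's thread-based implementation and uses a pipe, since pipes are what the question asks about; aio_pipe_demo() is a hypothetical helper name invented for this example.

```c
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: queue one aio_read() on the read end of a
 * pipe, wait for it with aio_suspend(), and return the number of bytes
 * read into out (or -1 on error).
 */
ssize_t aio_pipe_demo(char *out, size_t outlen)
{
    int p[2];
    int err;
    ssize_t n = -1, w, r;
    struct aiocb cb;
    const struct aiocb *const list[1] = { &cb };

    if (pipe(p) == -1)
        return -1;

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = p[0];
    cb.aio_buf = out;
    cb.aio_nbytes = outlen;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE; /* we poll, no signal */

    /* Queue the read; this returns before any data is available. */
    if (aio_read(&cb) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* The caller is free to do other work here. We produce the data. */
    w = write(p[1], "hello", 5);
    close(p[1]); /* EOF guarantees the pending read completes */

    /* Wait until the request is no longer in progress. */
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);

    err = aio_error(&cb);
    r = aio_return(&cb); /* must be called exactly once per request */
    if (w == 5 && err == 0)
        n = r;

    close(p[0]);
    return n;
}
```

On older glibc versions the aio_ functions live in librt, so linking with -lrt may be necessary. Completion can also be delivered through a signal or a callback thread via aio_sigevent instead of aio_suspend().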
EDIT:
On a different note, since you used the words "pipes" and "sockets", are you aware of vmsplice and splice? Those are probably the most efficient functions for sending data to and from sockets and pipes. Unluckily, this is another one of those ambiguously documented, hard-to-understand hacks with obscure pitfalls. Proceed at your own risk, you have been warned.
splice lets you transfer data from a socket (or any file descriptor) to a pipe, or the other way around. vmsplice lets you transfer data between application space and a pipe.
Ironically, vmsplice is ideally supposed to do the exact same thing (remap pages, a.k.a. "play with VM") that one particular person used as an argument to claim that all BSD developers are idiots, back in 2006.
So much for the good news. The bad news is that there is a "secret limit" to how much data you can move. As far as I remember it's 64 kB (but configurable somewhere in /proc). If you have more data than that, you must work in several chunks, presumably with several pipe buffers, filling one while the other is read, and reusing old pipe buffers after they are done.
And this is where it gets complicated. If you browse through the discussions on KernelTrap, you find that even the Grand Master is not 100% sure about when it's safe to overwrite an old buffer when juggling several buffers.
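To make the chunking concrete, here is a hedged sketch of moving data with splice() through a single intermediate pipe, filling and draining it in turn rather than juggling several buffers. The helper names pipe_capacity() and splice_copy() are invented for this example. On current kernels the default pipe capacity is 65536 bytes; it can be queried and changed per pipe with fcntl() F_GETPIPE_SZ and F_SETPIPE_SZ (Linux 2.6.35 and later), and the ceiling for unprivileged processes is /proc/sys/fs/pipe-max-size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for splice() and F_GETPIPE_SZ */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Capacity of the pipe in bytes, or -1 on error. */
int pipe_capacity(int pipe_fd)
{
    return fcntl(pipe_fd, F_GETPIPE_SZ);
}

/*
 * Hypothetical helper: move up to len bytes from in_fd to out_fd
 * through a temporary pipe, one pipe-sized chunk at a time.
 * Returns the total number of bytes moved, or -1 on error.
 */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int p[2];
    ssize_t total = 0;

    if (pipe(p) == -1)
        return -1;

    while (len > 0) {
        /* Fill: the kernel moves at most what fits into the pipe. */
        ssize_t in = splice(in_fd, NULL, p[1], NULL, len, 0);
        if (in < 0) {
            total = -1;
            break;
        }
        if (in == 0) /* end of input */
            break;
        len -= (size_t)in;

        /* Drain the pipe completely before refilling it. */
        while (in > 0) {
            ssize_t out = splice(p[0], NULL, out_fd, NULL, (size_t)in, 0);
            if (out <= 0) {
                total = -1;
                goto done;
            }
            in -= out;
            total += out;
        }
    }

done:
    close(p[0]);
    close(p[1]);
    return total;
}
```

This version is strictly sequential, so it sidesteps the question of when an old buffer may be reused. It does not use vmsplice() or SPLICE_F_GIFT and makes no claim about zero-copy behaviour.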
Also, for vmsplice to really work (i.e. remapping pages instead of copying), you need to use the "GIFT" flag (SPLICE_F_GIFT), and at least to me it's not clear from the docs what becomes of that memory afterwards. Following the docs to the letter, you would need to leak memory, since you are never allowed to touch it again. Of course that can't be it. Maybe I'm just stupid.
I eventually gave up on this and settled for using epoll for readiness and non-blocking sockets with plain write(). That combination is maybe not the top performer, but it is well documented and works as documented.
AIO support has been included in the Linux kernel proper. That's why the first hit on Google only offers patches for the 2.4 Linux kernel. In 2.6 and 3.0 it's already in there.
If you check out the Linux kernel source code, it's in fs/aio.c.
There's some documentation in the GNU libc manual, but be advised that AIO is not possible for all types of Linux file descriptors. Most of the general "how to" documentation is dated around 2006, which is appropriate since that's when AIO in Linux was making the headlines.
Note that the POSIX.1b and Unix98 standards haven't changed, so can you be a bit more specific about in what way the examples are out of date?