Are there any platforms where using structure copy on an fd_set (for select() or pselect()) causes problems?

Posted on 2024-08-25 01:37:23

The select() and pselect() system calls modify their arguments (the 'fd_set *' arguments), so the input value tells the system which file descriptors to check and the return values tell the programmer which file descriptors are currently usable.

If you are going to call them repeatedly for the same set of file descriptors, you need to ensure that you have a fresh copy of the descriptors for each call. The obvious way to do that is to use a structure copy:

fd_set ref_set_rd;
fd_set ref_set_wr;
fd_set ref_set_er;

/* ...code to set the reference fd_set_xx values... */

while (!done)
{
    /* select() overwrites its fd_set arguments, so pass fresh copies */
    fd_set act_set_rd = ref_set_rd;
    fd_set act_set_wr = ref_set_wr;
    fd_set act_set_er = ref_set_er;
    int bits_set = select(max_fd, &act_set_rd, &act_set_wr,
                          &act_set_er, &timeout);
    if (bits_set > 0)
    {
        /* ...process the output values of act_set_xx... */
    }
}

(Edited to remove incorrect struct fd_set references - as pointed out by 'R..'.)
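
For anyone who wants to try the pattern, here is a minimal sketch of the loop above, assuming a pipe as the descriptor being watched. The names count_ready_rounds() and pipe_demo() are hypothetical helpers made up for this illustration. The timeout is re-initialised on every pass because some systems modify it as well.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: calls select() `rounds` times on one descriptor,
 * taking a fresh structure copy of the reference set each time, and
 * returns how many rounds reported the descriptor readable. */
int count_ready_rounds(int fd, int rounds)
{
    fd_set ref_set_rd;
    int ready = 0;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&ref_set_rd);
    FD_SET(fd, &ref_set_rd);

    for (int i = 0; i < rounds; i++)
    {
        fd_set act_set_rd = ref_set_rd;     /* the structure copy in question */
        struct timeval timeout = { 0, 0 };  /* do not block; reset every round */

        int bits_set = select(fd + 1, &act_set_rd, NULL, NULL, &timeout);
        if (bits_set > 0 && FD_ISSET(fd, &act_set_rd))
            ready++;
    }
    return ready;
}

/* Hypothetical demo: a pipe holding one unread byte should be reported
 * readable on every round, because nothing ever reads the byte. */
int pipe_demo(void)
{
    int fds[2];
    int ready;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "x", 1) != 1)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ready = count_ready_rounds(fds[0], 3);
    close(fds[0]);
    close(fds[1]);
    return ready;
}
```

Calling pipe_demo() from main() should report the pipe readable on each round, provided the copy really does preserve the reference set.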

My question:

  • Are there any platforms where it is not safe to do a structure copy of the fd_set values as shown?

I'm concerned lest there be hidden memory allocation or anything unexpected like that. (There are macros/functions FD_SET(), FD_CLR(), FD_ZERO() and FD_ISSET() to mask the internals from the application.)

I can see that MacOS X (Darwin) is safe; other BSD-based systems are likely to be safe, therefore. You can help by documenting other systems that you know are safe in your answers.

(I do have minor concerns about how well the fd_set would work with more than 8192 open file descriptors - the default maximum number of open files is only 256, but the maximum number is 'unlimited'. Also, since the structures are 1 KB, the copying code is not dreadfully efficient, but then running through a list of file descriptors to recreate the input mask on each cycle is not necessarily efficient either. Maybe you can't do select() when you have that many file descriptors open, though that is when you are most likely to need the functionality.)


There's a related SO question - asking about 'poll() vs select()' which addresses a different set of issues from this question.


Note that on MacOS X - and presumably BSD more generally - there is an FD_COPY() macro or function, with the effective prototype:

  • extern void FD_COPY(const fd_set *restrict from, fd_set *restrict to);

It might be worth emulating on platforms where it is not already available.

Comments (5)

乖乖兔^ω^ 2024-09-01 01:37:23

Since struct fd_set is just a regular C structure, that should always be fine. I personally don't like doing structure copying via the = operator, since I've worked on plenty of platforms that didn't have access to the normal set of compiler intrinsics. Using memcpy() explicitly rather than having the compiler insert a function call is a better way to go, in my book.

From the C spec, section 6.5.16.1 Simple assignment (edited here for brevity):

One of the following shall hold:

...

  • the left operand has a qualified or unqualified version of a structure or union type compatible with the type of the right;

...

In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.

If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined.

So there you go: as long as struct fd_set is actually a regular C struct, you're guaranteed success. It does depend, however, on your compiler emitting some kind of code to do it, or relying on whatever memcpy() intrinsic it uses for structure assignment. If your platform can't link against the compiler's intrinsic libraries for some reason, it may not work.
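
As a rough way to check this on a given platform, the sketch below copies an fd_set both by assignment and by an explicit memcpy() and compares the results through FD_ISSET() only. The function name copies_agree() and the particular descriptors set are arbitrary choices for this illustration, and passing the check on one system says nothing about others.

```c
#include <string.h>
#include <sys/select.h>

/* Hypothetical check: returns 1 if a copy made with '=' and a copy made
 * with memcpy() both report the same membership as the original for
 * every descriptor number below FD_SETSIZE, 0 otherwise. */
int copies_agree(void)
{
    fd_set orig, by_assign, by_memcpy;

    FD_ZERO(&orig);
    FD_SET(0, &orig);
    FD_SET(5, &orig);
    FD_SET(FD_SETSIZE - 1, &orig);   /* highest descriptor the set can hold */

    by_assign = orig;                             /* structure assignment */
    memcpy(&by_memcpy, &orig, sizeof by_memcpy);  /* explicit byte copy */

    for (int fd = 0; fd < FD_SETSIZE; fd++)
    {
        int in_orig   = FD_ISSET(fd, &orig) != 0;
        int in_assign = FD_ISSET(fd, &by_assign) != 0;
        int in_memcpy = FD_ISSET(fd, &by_memcpy) != 0;

        if (in_orig != in_assign || in_orig != in_memcpy)
            return 0;
    }
    return 1;
}
```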

You will have to play some tricks if you have more open file descriptors than will fit into struct fd_set. The Linux man page says:

An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor.
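
One way to stay clear of that undefined behaviour is to range-check the descriptor before touching the set. The wrapper below is only a sketch; fd_set_checked() is a made-up name, and it does not verify that fd is an open descriptor, which the quoted text says POSIX also requires.

```c
#include <sys/select.h>

/* Hypothetical wrapper around FD_SET(): adds fd to the set only when it is
 * within [0, FD_SETSIZE). Returns 0 on success and -1 if fd is out of range,
 * in which case the set is left untouched. */
int fd_set_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_SET(fd, set);
    return 0;
}
```

A caller that gets -1 back would need to fall back to something other than select() for that descriptor.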

As mentioned below, it might not be worth the effort to prove that your code is safe on all systems. FD_COPY() is provided for just such a use, and is, presumably, always guaranteed:

FD_COPY(&fdset_orig, &fdset_copy) replaces an already allocated &fdset_copy file descriptor set with a copy of &fdset_orig.

妄想挽回 2024-09-01 01:37:23

First of all, there is no struct fd_set. It's simply called fd_set. However, POSIX does require it to be a struct type, so copying is well-defined.

Secondly, there is no way under standard C in which the fd_set object could contain dynamically allocated memory, since there is no requirement to use any function/macro to free it before returning. Even if the compiler has alloca (a pre-vla extension for stack-based allocation), fd_set could not use memory allocated on the stack, because a program might pass a pointer to the fd_set to another function which uses FD_SET, etc., and the allocated memory would cease to be valid as soon as it returns to the caller. Only if the C compiler offered some extension for destructors could fd_set use dynamic allocation.

In conclusion, it seems to be safe just to assign/memcpy fd_set objects, but to be sure, I would do something like:

#ifndef FD_COPY
/* argument order follows the system macro described in the question: FD_COPY(from, to) */
#define FD_COPY(src,dest) memcpy((dest),(src),sizeof *(dest))
#endif

or alternatively just:

#ifndef FD_COPY
/* argument order follows the system macro described in the question: FD_COPY(from, to) */
#define FD_COPY(src,dest) (*(dest)=*(src))
#endif

Then you'll use the system's provided FD_COPY macro if it exists, and only fall back to the theoretically-potentially-unsafe version if it's missing.
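
A short sketch of how such a fallback might be used, assuming the (from, to) argument order given in the question's prototype. The function fd_copy_preserves() is a hypothetical name for this illustration. It checks that changing the copy leaves the reference set alone.

```c
#include <sys/select.h>

/* Fallback only: used when the system headers do not provide FD_COPY.
 * The (from, to) order is an assumption taken from the question. */
#ifndef FD_COPY
#define FD_COPY(from, to) (*(to) = *(from))
#endif

/* Hypothetical check: returns 1 if a set copied with FD_COPY is independent
 * of the reference set, 0 if not, -1 if fd is out of range. */
int fd_copy_preserves(int fd)
{
    fd_set ref, act;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&ref);
    FD_SET(fd, &ref);

    FD_ZERO(&act);
    FD_COPY(&ref, &act);   /* act now holds a copy of ref */
    FD_CLR(fd, &act);      /* modify the copy only */

    return FD_ISSET(fd, &ref) != 0 && !FD_ISSET(fd, &act);
}
```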

如梦初醒的夏天 2024-09-01 01:37:23

You are correct that POSIX doesn't guarantee that copying a fd_set has to "work". I'm not personally aware of anywhere that it doesn't, but then I've never done the experiment.

You can use the poll() alternative (which is also POSIX). It works in a very similar way to select(), except that the input/output parameter is not opaque (and contains no pointers, so a bare memcpy will work), and its design also entirely removes the need to make a copy of the "requested file descriptors" structure (because the "requested events" and "returned events" are stored in different fields).
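
To illustrate the point about separate fields, here is a minimal sketch that calls poll() repeatedly on the same struct pollfd without copying anything between calls. The names poll_rounds() and poll_pipe_demo() are hypothetical, and the pipe is just a convenient descriptor for the demonstration.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: polls one descriptor `rounds` times with a zero
 * timeout and returns how many rounds reported it readable. The request
 * lives in `events` and the result in `revents`, so the same structure
 * is reused on every call. */
int poll_rounds(int fd, int rounds)
{
    struct pollfd pfd;
    int ready = 0;

    pfd.fd = fd;
    pfd.events = POLLIN;   /* requested events: never modified by poll() */
    pfd.revents = 0;       /* returned events: overwritten by each call */

    for (int i = 0; i < rounds; i++)
    {
        int n = poll(&pfd, 1, 0);
        if (n > 0 && (pfd.revents & POLLIN))
            ready++;
    }
    return ready;
}

/* Hypothetical demo: a pipe with one unread byte stays readable. */
int poll_pipe_demo(void)
{
    int fds[2];
    int ready;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "x", 1) != 1)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ready = poll_rounds(fds[0], 3);
    close(fds[0]);
    close(fds[1]);
    return ready;
}
```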

You are also correct to surmise that select() (and poll()) don't scale particularly well to large numbers of file descriptors. This is because every time the function returns, you must loop through every file descriptor to test whether there was activity on it. The solutions to this are various non-standard interfaces (e.g. Linux's epoll(), FreeBSD's kqueue), which you may need to look into if you find you are having latency problems.

凶凌 2024-09-01 01:37:23

I've done a little research on MacOS X, Linux, AIX, Solaris and HP-UX, and there are some interesting results. I used the following program:

#if __STDC_VERSION__ >= 199901L
#define _XOPEN_SOURCE 600
#else
#define _XOPEN_SOURCE 500
#endif /* __STDC_VERSION__ */

#ifdef SET_FD_SETSIZE
#define FD_SETSIZE SET_FD_SETSIZE
#endif

#ifdef USE_SYS_TIME_H
#include <sys/time.h>
#else
#include <sys/select.h>
#endif /* USE_SYS_TIME_H */

#include <stdio.h>

int main(void)
{
    printf("FD_SETSIZE = %d; sizeof(fd_set) = %d\n", (int)FD_SETSIZE, (int)sizeof(fd_set));
    return 0;
}

It was compiled twice on each platform:

cc -o select select.c
cc -o select -DSET_FD_SETSIZE=16384 select.c

(And on one platform, HP-UX 11.11, I had to add -DUSE_SYS_TIME_H to get things to compile at all.) I separately did a visual check on FD_COPY - only MacOS X seemed to include it, and that had to be activated by ensuring that _POSIX_C_SOURCE was not defined or by defining _DARWIN_C_SOURCE.

AIX 5.3

  • Default FD_SETSIZE is 65536
  • The FD_SETSIZE parameter can be resized
  • No FD_COPY

HP-UX 11.11

  • No <sys/select.h> header - use <sys/time.h> instead
  • Default FD_SETSIZE is 2048
  • The FD_SETSIZE parameter can be resized
  • No FD_COPY

HP-UX 11.23

  • Has <sys/select.h>
  • Default FD_SETSIZE is 2048
  • The FD_SETSIZE parameter can be resized
  • No FD_COPY

Linux (kernel 2.6.9, glibc 2.3.4)

  • Default FD_SETSIZE is 1024
  • The FD_SETSIZE parameter cannot be resized
  • No FD_COPY

MacOS X 10.6.2

  • Default FD_SETSIZE is 1024
  • The FD_SETSIZE parameter can be resized
  • FD_COPY is defined if strict POSIX compliance is not requested or if _DARWIN_C_SOURCE is specified

Solaris 10 (SPARC)

  • Default FD_SETSIZE is 1024 for 32-bit, 65536 for 64-bit
  • The FD_SETSIZE parameter can be resized
  • No FD_COPY

Clearly, a trivial modification to the program allows automatic checking of FD_COPY:

#ifdef FD_COPY
    printf("FD_COPY is a macro\n");
#endif

What is not necessarily trivial is finding out how to ensure that it is available; you end up doing the manual scan and working out how to trigger it.

On all these machines, it looks like an fd_set can be copied by a structure copy without running into risk of undefined behaviour.

爱的十字路口 2024-09-01 01:37:23

I don't have enough rep to add this as a comment to caf's answer, but there are libraries to abstract over the non-standard interfaces like epoll() and kqueue. libevent is one, and libev another. I think GLib also has one that ties into its mainloop.
