Cannot assign requested address - possible causes?
I have a program that consists of a master server and distributed slave servers. The slave servers send status updates to the server, and if the server hasn't heard from a specific slave in a fixed period, it marks the slave as down. This is happening consistently.
From inspecting the logs, I have found that the slave is only able to send one status update to the server, and then is never able to send another update, always failing on the call to connect() with "Cannot assign requested address" (99).
Oddly enough, the slave is able to send several other updates to the server, and all of the connections are happening on the same port. It seems that the most common cause of this failure is that connections are left open, but I'm having trouble finding anything left open. Are there other possible explanations?
To clarify, here's how I'm connecting:
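struct sockaddr *sa;   // parameter: address of the remote server
size_t sa_size;        // parameter: size of *sa in bytes
int i = 1;             // option value that enables SO_REUSEADDR
int stream;

stream = socket(AF_INET, SOCK_STREAM, 0);
setsockopt(stream, SOL_SOCKET, SO_REUSEADDR, &i, sizeof(i));
bindresvport(stream, NULL);   // bind to a privileged (reserved) local port
connect(stream, sa, sa_size);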
This code is in a function to obtain a connection to another server, and a failure on any of those 4 calls causes the function to fail.
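Since a failure in any of the four calls makes the function fail, it helps to record which call failed and with what errno (99 is EADDRNOTAVAIL on Linux). The following is only a sketch of how that could look, not the original function: the names get_connection and conn_step, and the use_resvport switch, are made up for illustration.

```c
#define _DEFAULT_SOURCE          /* exposes bindresvport() on glibc */
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical step labels, not part of the original program. */
enum conn_step { STEP_NONE, STEP_SOCKET, STEP_SETSOCKOPT, STEP_BIND, STEP_CONNECT };

/*
 * Returns a connected descriptor, or -1 with *failed set to the step that
 * failed and errno preserved. use_resvport is an assumption added here so the
 * bindresvport() call (which needs privileges) can be switched off.
 */
static int get_connection(const struct sockaddr *sa, socklen_t sa_size,
                          int use_resvport, enum conn_step *failed)
{
    int one = 1;
    int saved;
    int stream;

    assert(sa != NULL && failed != NULL);
    *failed = STEP_NONE;

    stream = socket(AF_INET, SOCK_STREAM, 0);
    if (stream < 0) {
        *failed = STEP_SOCKET;
        return -1;
    }

    if (setsockopt(stream, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0)
        *failed = STEP_SETSOCKOPT;
    else if (use_resvport && bindresvport(stream, NULL) < 0)
        *failed = STEP_BIND;
    else if (connect(stream, sa, sa_size) < 0)
        *failed = STEP_CONNECT;

    if (*failed != STEP_NONE) {
        saved = errno;                      /* fprintf/close may clobber errno */
        fprintf(stderr, "step %d failed: %s (%d)\n",
                (int)*failed, strerror(saved), saved);
        close(stream);                      /* do not leak the descriptor */
        errno = saved;
        return -1;
    }
    return stream;
}
```

Closing the descriptor on every error path matters here, because a leaked socket keeps its local port occupied.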
Comments (5)
It turns out that the problem really was that the address was busy. The busyness was caused by other problems in how we handle network communications. Your input helped me figure this out. Thank you.
EDIT: to be specific, the problems in handling our network communications were that these status updates would be constantly re-sent if the first failed. It was only a matter of time until we had every distributed slave trying to send its status update at the same time, which was over-saturating our network.
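The poster does not say how the retry problem was fixed. One common way to keep many clients from retrying in lockstep is exponential backoff with random jitter. A minimal sketch, assuming hypothetical names (backoff_ceiling_ms, backoff_delay_ms) and millisecond units that are not from the original program:

```c
#define _POSIX_C_SOURCE 200809L  /* for rand_r() */
#include <stdint.h>
#include <stdlib.h>

/* Upper bound of the retry window: base_ms * 2^attempt, capped at cap_ms. */
static uint32_t backoff_ceiling_ms(unsigned attempt, uint32_t base_ms, uint32_t cap_ms)
{
    uint64_t ceiling = base_ms;

    if (base_ms == 0)
        return 0;
    while (attempt-- > 0 && ceiling < cap_ms)
        ceiling *= 2;
    return ceiling > cap_ms ? cap_ms : (uint32_t)ceiling;
}

/*
 * Delay before the next retry: a pseudo-random value in [0, ceiling], so that
 * clients that failed at the same moment do not all retry at the same moment.
 * Each client should start from a different *seed.
 */
static uint32_t backoff_delay_ms(unsigned attempt, uint32_t base_ms,
                                 uint32_t cap_ms, unsigned *seed)
{
    uint64_t ceiling = backoff_ceiling_ms(attempt, base_ms, cap_ms);

    return (uint32_t)((uint64_t)rand_r(seed) % (ceiling + 1));
}
```

A sender would sleep for backoff_delay_ms(attempt, ...) after each failed update and reset attempt to 0 once an update gets through.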
Maybe SO_REUSEADDR helps here?
http://www.unixguide.net/network/socketfaq/4.5.shtml
This is just a shot in the dark: when you call connect() without a bind() first, the system allocates your local port, and if you have multiple threads connecting and disconnecting, it could possibly try to allocate a port already in use. The kernel source file inet_connection_sock.c hints at this condition. As an experiment, try binding to a local port first, making sure each bind/connect uses a different local port number.
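For the experiment suggested here, something like the following could be used to bind a chosen local port before connect() and to log which port was actually used. It is only a sketch; bind_local_port is a made-up helper name, and binding to INADDR_ANY is an assumption.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Bind fd to the given local TCP port on any local IPv4 address.
 * Port 0 lets the kernel pick a free port. Returns the port that was
 * actually bound (host byte order), or -1 with errno set on failure.
 */
static int bind_local_port(int fd, unsigned short port)
{
    struct sockaddr_in local;
    socklen_t len = sizeof(local);

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0)
        return -1;
    /* Ask the kernel which port the socket ended up with. */
    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        return -1;
    return ntohs(local.sin_port);
}
```

Logging the returned port next to each connect() result shows whether the failures line up with reuse of a particular local port.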
Okay, my problem wasn't the port, but the binding address. My server has an internal address (10.0.0.4) and an external address (52.175.223.XX). When I tried connecting with:
It failed because the local socket was 10.0.0.4 and not the external 52.175.223.XX. You can check the locally available interfaces with sudo ifconfig.
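On Linux, bind() normally fails with this same error (EADDRNOTAVAIL) when the requested address is not configured on any local interface, which is typical for an external address that is only mapped to the machine by NAT. A program can check this itself before binding. The sketch below uses getifaddrs(); is_local_ipv4 is a made-up helper name and it only handles IPv4.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Returns 1 if ip is assigned to a local interface, 0 if it is not,
 * and -1 if ip is not a valid IPv4 address or the lookup fails.
 */
static int is_local_ipv4(const char *ip)
{
    struct in_addr wanted;
    struct ifaddrs *list;
    struct ifaddrs *ifa;
    int found = 0;

    if (inet_pton(AF_INET, ip, &wanted) != 1)
        return -1;
    if (getifaddrs(&list) != 0)
        return -1;

    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        const struct sockaddr_in *sin;

        /* Interfaces without an address have a NULL ifa_addr. */
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;
        sin = (const struct sockaddr_in *)ifa->ifa_addr;
        if (sin->sin_addr.s_addr == wanted.s_addr) {
            found = 1;
            break;
        }
    }
    freeifaddrs(list);
    return found;
}
```

In the situation described above, a check like is_local_ipv4("10.0.0.4") would be expected to return 1, while the external address would return 0.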