I've been running Nagios for about two years, but recently this problem started appearing with one of my services.
I'm getting
CRITICAL - Socket timeout after 10 seconds
for a
check_http -H my.host.com -f follow -u /abc/def
check, which used to work fine. No other services report this problem. The remote site is up and healthy, and I can run
wget http://my.host.com/abc/def
from the Nagios server, and it downloads the response just fine. Also, running
check_http -H my.host.com -f follow
works just fine, i.e. things only break when I use the -u argument. I also tried passing a different user agent string, which made no difference, and increasing the timeout, with no luck. I tried -v, but all I get is:
GET /abc/def HTTP/1.0
User-Agent: check_http/v1861 (nagios-plugins 1.4.11)
Connection: close
Host: my.host.com
CRITICAL - Socket timeout after 10 seconds
... which does not tell me what's going wrong.
Any ideas how I could resolve this?
Thanks!
Comments (5)
Try using the -N option of check_http.
I ran into similar problems, and in my case the web server didn't terminate the connection after sending the response (HTTPS was working, HTTP wasn't). check_http tries to read from the open socket until the server closes the connection; if that never happens, the timeout occurs.
The -N option tells check_http to receive only the headers, not the content of the page/document.
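To make the difference concrete, here is a minimal Python sketch of the two reading strategies described above. It is not check_http's actual implementation (the plugin is written in C); the function names fetch_headers_only and fetch_until_close are made up for illustration, and the request simply mirrors what -v printed in the question.

```python
import socket


def send_get(sock, host, path):
    """Send a plain HTTP/1.0 GET, similar to the request check_http -v prints."""
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    sock.sendall(request.encode("ascii"))


def fetch_headers_only(host, port, path, timeout=10.0):
    """Stop reading at the blank line that ends the headers (the idea behind -N)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        send_get(sock, host, path)
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:  # server closed before the headers were complete
                break
            data += chunk
        return data.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")


def fetch_until_close(host, port, path, timeout=10.0):
    """Read until the server closes the socket.

    If the server never closes the connection, recv() raises socket.timeout
    after `timeout` seconds, even though the full response already arrived.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        send_get(sock, host, path)
        data = b""
        while True:
            chunk = sock.recv(4096)  # blocks for up to `timeout` seconds
            if not chunk:  # empty read: the server closed the connection
                return data
            data += chunk
```

Against a server that sends its response but keeps the socket open, fetch_until_close fails with a timeout while fetch_headers_only returns as soon as the headers are in. With the check from the question, the equivalent would be to append -N: check_http -H my.host.com -f follow -u /abc/def -N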
I tracked my issue down to a problem with the security providers configured in the most recent version of openSUSE.
From summaries on other web pages, it appears to be caused by an attempt to use the TLSv2 protocol, which either does not work correctly or is missing something in the default configuration that would allow it to work.
To work around the problem, I commented out the security provider in question in the JRE security configuration file.
The security.provider. value may be different in your configuration, but essentially the SunPKCS11 provider is at issue.
This configuration is normally found in
of the JRE that you are using.
Fixed with this URL in nrpe.cfg (on Debian 6.0 Squeeze, using nagios-nrpe-server):
For anyone who is interested: I ran into this problem too, and the cause turned out to be mod_itk on the web server.
A patch is available, although it does not seem to be included in the current CentOS or Debian packages:
https://lists.err.no/pipermail/mpm-itk/2015-September/000925.html
In my case the
/etc/postfix/main.cf
file was not configured correctly. My mail server relay was not defined, and the configuration was also very restrictive.
I had to add: