How can I have DNS name resolving running while other protocols seem to be down?
We are trying to build software for the Moxa UC-7112-LX embedded computer (uClinux OS). We use a Cinterion MC52i GSM modem (regular GPRS service) and standard pppd to connect to the Internet.
Everything seems fine right after the connection is established. The ping utility works, and the socket functions in my program work normally too. However, after some time the ppp connection breaks in a very peculiar way. These are the symptoms:
- When I call the ping utility with a host name as the parameter, the system is able to resolve its IP and starts sending ICMP packets, but gets no response. I try different web resource names so that the system cannot have their addresses cached. Whatever I choose, the system resolves the IP correctly but never gets a ping response.
- The `connect()` and `write()` functions in my application return no error, but when it comes to `read()`, the function returns with errno set to `ECONNRESET` (Connection reset by peer). The program uses standard socket functions (TCP protocol).
- The ppp link is shown as running (`ifconfig ppp0`).
So the situation is: the link is good enough to maintain DNS resolution (UDP is working?) but NOT good enough to run a TCP connection or receive ping echoes...
The problem does not appear all the time. Sometimes the system works normally for days. Whenever the problem appears, a simple reset solves everything.
I know that the system we use is quite exotic, and the situation described here may be connected with a buggy TCP stack or pppd implementation. Since the system is preconfigured by the manufacturer, I have no option to rebuild or change the OS firmware.
Still, I hope that someone has seen a similar situation on some Linux-like system. Is there any way to test why DNS name resolving works while the other network functions do not? Is it possible to get out of such a connection state with some pppd settings?
Edit:
First of all, I'd like to address the possibility of local caching of the IP addresses. I don't have the `dig` utility and I have no idea how to check which host gives the result to `getaddrinfo()`. Still, I'm sure that the addresses are not cached, because I'm trying to ping totally random URLs. Also, given the slow GPRS response time, no time-measuring utility is needed to see that ping takes 1-2 seconds or more to resolve the IP before it starts sending packets. Furthermore, neither `nscd`, `BIND`, nor any other DNS server runs locally on the machine. I understand that you may not see that as proof, but that's the best I can offer given the utility set available on my system.
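Without `dig`, one rough way to see where lookups are sent is to read the `nameserver` entries in `/etc/resolv.conf`. The sketch below is only an illustration: it assumes the C library's stub resolver honours that file and that a loopback entry would indicate a local (possibly caching) resolver. The function names are made up for this example.

```sh
#!/bin/sh
# Print the address of every active "nameserver" line read from stdin
# (resolv.conf format). Commented-out lines such as "#search ..." are skipped.
list_nameservers() {
    while read -r keyword addr rest; do
        if [ "$keyword" = "nameserver" ] && [ -n "$addr" ]; then
            echo "$addr"
        fi
    done
}

# Succeed if any configured nameserver is a loopback address (127.x.x.x),
# i.e. a local resolver sits in the lookup path.
uses_local_resolver() {
    for ns in $(list_nameservers); do
        case "$ns" in
            127.*) return 0 ;;
        esac
    done
    return 1
}

# Possible use on the device:
#   list_nameservers < /etc/resolv.conf
#   uses_local_resolver < /etc/resolv.conf && echo "local resolver configured"
```

With the `resolv.conf` shown further down in the question (213.87.0.1 and 213.87.1.1), `uses_local_resolver` would fail, which is consistent with lookups going out over the ppp link.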
I'd like to give some additional information concerning the internet connection operation.
Normal connection state
The `rc` script at system load runs another script as a background process:
sh /etc/connect &
The `connect` script is as follows:
#!/bin/sh
echo First connect attempt > /etc/ppp/conn.info
while true
do
date >> /etc/ppp/conn.info
pppd call mts
echo Reconnecting... >> /etc/ppp/conn.info
done
The reason for the loop is simple: the connection persists for several hours and after that it always breaks. Unfortunately, my implementation of `pppd` does not support the `logfile` option (so I can't see why it breaks). `persist` does not seem to work either, so I ended up with the connect script above. The pppd options are:
/dev/ttyM0 115200 crtscts
connect 'chat -f /etc/ppp/peers/mts.chat'
noauth
user mts
password mts
noipdefault
usepeerdns
defaultroute
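Since `logfile` is not available, one low-tech option is to record the exit status of each `pppd` run in the same makeshift log. This is a sketch, not a tested fix: `run_and_log` is a hypothetical helper, and it assumes `pppd` stays in the foreground until the link ends (which the loop above already relies on) and that this build returns a meaningful exit status.

```sh
#!/bin/sh
# Log file; defaults to the file the connect script already writes to.
LOG=${LOG:-/etc/ppp/conn.info}

# Hypothetical helper: run the given command, then append a timestamped
# line with its exit status to $LOG. Returns the command's own status.
run_and_log() {
    "$@"
    status=$?
    echo "$(date) '$*' exited with status $status" >> "$LOG"
    return "$status"
}

# Possible use inside the reconnect loop:
#   while true
#   do
#       run_and_log pppd call mts
#       echo Reconnecting... >> "$LOG"
#   done
```

The pppd man page lists what each exit status means; whether this particular build follows that list is something to verify on the device.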
`ifconfig ppp0` gives:
ppp0 Link encap:Point-Point Protocol
inet addr:172.22.22.109 P-t-P:192.168.254.254 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:34 errors:0 dropped:0 overruns:0 frame:0
TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:3130 (3.0 KiB) TX bytes:2250 (2.1 KiB)
And that's where it starts getting strange. Whenever I connect I get a different `inet addr`, but `P-t-P` is always the same: 192.168.254.254. This is the same address that appears in the default gateway entry, as given by `netstat -rn`:
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
192.168.254.254 0.0.0.0 255.255.255.255 UH 0 0 0 ppp0
192.168.4.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
192.168.15.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.0.0 192.168.15.1 255.255.0.0 UG 0 0 0 eth0
0.0.0.0 192.168.254.254 0.0.0.0 UG 0 0 0 ppp0
`route -Cevn` is unavailable on my system; `route` gives the same info as above.
But I'm never able to ping 192.168.254.254, not even when everything is working as intended (TCP connections, ping, DNS, etc.). Here is the result of traceroute:
traceroute to kernel.org (149.20.4.69), 30 hops max, 40 byte packets
1 172.16.4.210 (172.16.4.210) 528.765 ms 545.269 ms 616.67 ms
2 172.16.4.226 (172.16.4.226) 563.034 ms 526.176 ms 537.07 ms
3 10.250.85.161 (10.250.85.161) 572.805 ms 564.073 ms 556.766 ms
4 172.31.250.9 (172.31.250.9) 556.513 ms 563.383 ms 580.724 ms
5 172.31.250.10 (172.31.250.10) 518.15 ms 526.403 ms 537.574 ms
6 pub2.kernel.org (149.20.4.69) 538.058 ms 514.222 ms 538.575 ms
7 pub2.kernel.org (149.20.4.69) 537.531 ms 538.52 ms 537.556 ms
8 pub2.kernel.org (149.20.4.69) 568.695 ms 523.099 ms 570.983 ms
9 pub2.kernel.org (149.20.4.69) 526.511 ms 534.583 ms 537.994 ms
##### traceroute loops here - why?? #######
So I assume that 172.16.4.210 is the peer's address. That address is pingable in any case (see below). I have no idea why the traceroute output is structured like this (packets go from the ISP's internal network straight to the destination, then 'loop' at the destination address; it just should not be like this).
I would also like to note that I can ping the DNS server, but traceroute does not go all the way to it.
You may notice that there are eth0 and eth1 devices. They are irrelevant to the case: eth1 is not connected, and eth0 is connected to a LAN without internet access.
Bad connection state
So, some time passes and the situation in question appears. I can't ping anything but the DNS server (and the peer, whose address I get from the traceroute to the DNS server), and I can't communicate with a remote host via TCP. DNS resolving is working.
The network utilities give the same output as in the normal state. I have the same unpingable peer (192.168.254.254 from the ifconfig result), and the routing table is the same:
# ifconfig ppp0
ppp0 Link encap:Point-Point Protocol
inet addr:172.22.22.109 P-t-P:192.168.254.254 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:297 errors:0 dropped:0 overruns:0 frame:0
TX packets:424 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:33706 (32.9 KiB) TX bytes:27451 (26.8 KiB)
# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.254.254 * 255.255.255.255 UH 0 0 0 ppp0
192.168.4.0 * 255.255.255.0 U 0 0 0 eth1
192.168.15.0 * 255.255.255.0 U 0 0 0 eth0
192.168.0.0 192.168.15.1 255.255.0.0 UG 0 0 0 eth0
default 192.168.254.254 0.0.0.0 UG 0 0 0 ppp0
Note that the original ppp connection (the one I used to provide the output for the normal state) persisted. My /etc/connect script did not loop (there was no new record in the makeshift log the script writes).
Here is the ping to the DNS server:
# cat /etc/resolv.conf
#search moxa.com
nameserver 213.87.0.1
nameserver 213.87.1.1
# ping 213.87.0.1
PING 213.87.0.1 (213.87.0.1): 56 data bytes
64 bytes from 213.87.0.1: icmp_seq=0 ttl=59 time=559.8 ms
64 bytes from 213.87.0.1: icmp_seq=1 ttl=59 time=509.9 ms
64 bytes from 213.87.0.1: icmp_seq=2 ttl=59 time=559.8 ms
And traceroute:
# traceroute 213.87.0.1
traceroute to 213.87.0.1 (213.87.0.1), 30 hops max, 40 byte packets
1 172.16.4.210 (172.16.4.210) 542.449 ms 572.858 ms 595.681 ms
2 172.16.4.214 (172.16.4.214) 590.392 ms 565.887 ms 676.919 ms
3 * * *
4 217.8.237.62 (217.8.237.62) 603.1 ms 569.078 ms 553.723 ms
5 * * *
6 * * *
## and so on ###
The `* * *` lines may look like trouble, but I get the same traceroute for that DNS server in the normal state.
A ping to 172.16.4.210 works fine as well.
Now to TCP. I've started a simple echo server on my PC and tried to connect to it via telnet (the actual IP address is not shown):
# telnet XXX.XXX.XXX.XXX 9060
Trying XXX.XXX.XXX.XXX(25635)...
Connected to XXX.XXX.XXX.XXX.
Escape character is '^]'.
aaabbbccc
Connection closed by foreign host.
So that's what happened here. A successful `connect()`, just like in my custom application, is followed by `Connection closed...` when telnet calls `read()`. The actual server did not receive any incoming connection. Why `connect()` returned normally (it could not have received the handshake response from the host!) is beyond my knowledge.
Sure enough, the same telnet test works fine in the normal state.
Note:
I did not post this on serverfault because of the embedded nature of my system. As far as I understand, serverfault deals with more conventional systems (like x86 machines running 'normal' Linux). I just hope that stackoverflow has more embedded experts who know systems like my Moxa.
Q: How can I have DNS name resolving running while other protocols seem to be down?
连接。Q: How can I have DNS name resolving running while other protocols seem to be down?
A: Your local DNS resolver (
bind
is another possibility besidesncsd
) might be caching the first response.dig
will tell you where you are getting the response from:If you are getting a very quick (low milliseconds) answer from
127.0.0.1
, then it's very likely that you're getting a locally cached answer from a prior query of the same DNS name (and it's quite common for people to use caching DNS resolvers on appp
connection to reduce connection time, as well as achieving a small load reduction on the ppp link).If you suspect a cached answer, do a
dig
on some other DNS name to see whether it can resolve too.ppp
connection going down.Other diagnostic information
If you find yourself in either of the last situations I described, you need to do some IP and ppp-level debugging before this can be isolated further. As someone mentioned, `tcpdump` is quite valuable at this point, but it sounds like you don't have it available.
I assume you are not making a TCP connection to the same IP address as your DNS server. There are many possibilities at this point... If you can still resolve random DNS names but TCP connections are failing, it is possible that the problem is on the other side of the ppp connection, that the kernel routing cache (which holds a little TCP state information, like `MSS`) is getting messed up, that you have too much packet loss for `tcp`, or any number of other things.
Let's assume your topology is like this:
When you initiate your ppp connection, take note of your IP address and the address of your default gateway:
Similar results can be found if you don't have the `iproute2` package as part of your distro (`iproute2` provides the `ip` utility):
For those with the `iproute2` utilities (which is almost everybody these days), `ifconfig` has been deprecated and replaced by the `ip` commands; however, if you have an older 2.2 or 2.4-based system you may still need to use `ifconfig`.
Troubleshooting steps:
- When you start having the problem, first check whether you can ping the address of `pppX` on your access server.
- If you cannot ping `pppX` on the other side, then it is highly unlikely your DNS is getting resolved by anything other than a cached response on your uClinux machine.
- If you can ping `pppX`, then try to `ping` the IP address of your TCP peer and the IP address of the DNS server (if it is not on `localhost`). Unless there is a firewall involved, you must be able to `ping` it successfully for any of this to work.
- If you can `ping` the IP address of `pppX` but you cannot `ping` your TCP peer's IP address, check your routing table to see whether your default route is still pointing out `ppp0`.
- If your default route points through `ppp0`, check whether you can still ping the IP address of the default route.
- If you can `ping` your default route and you can `ping` the remote host that you're trying to connect to, check the kernel's routing cache for the IP address of the remote TCP host... look for anything odd or suspicious.
- If you can `ping` the remote TCP host (and you need to do about 200 pings to be sure... `tcp` is sensitive to significant packet loss and GPRS is notoriously lossy), try making a successful `telnet <remote_host> <remote_port>`. If both are successful, then it's time to start looking inside your software for clues.
If you still can't untangle what is happening, please include the output of the aforementioned commands when you come back... as well as how you're starting the `ppp` connection.
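The troubleshooting steps from the first answer could be scripted roughly as follows. This is a sketch under assumptions: the helper names are invented, `ping` is assumed to accept `-c`, and the peer address has to be one that actually answers echo requests (in the question that is 172.16.4.210 rather than 192.168.254.254).

```sh
#!/bin/sh
# Succeed if HOST ($1) answers at least one of COUNT ($2, default 3) pings.
# Assumes the local ping supports -c.
can_ping() {
    ping -c "${2:-3}" "$1" > /dev/null 2>&1
}

# Read `netstat -rn` (or `route`) output on stdin and print the interface
# of the default route. Both tools print eight columns, interface last.
default_route_iface() {
    while read -r dest gw mask flags f5 f6 f7 iface; do
        case "$dest" in
            0.0.0.0|default) echo "$iface"; return 0 ;;
        esac
    done
    return 1
}

# Walk the checks in the order given in the answer and report where it stops.
# Arguments: peer address, DNS server address, remote TCP host address.
diagnose() {
    peer=$1; dns=$2; remote=$3
    if ! can_ping "$peer"; then
        echo "peer unreachable: DNS answers are probably cached locally"
        return 1
    fi
    can_ping "$dns" || echo "DNS server unreachable"
    if ! can_ping "$remote"; then
        iface=$(netstat -rn | default_route_iface) || iface=none
        echo "remote unreachable: default route via $iface"
        return 1
    fi
    echo "remote answers ping: next, try telnet to the service port"
}

# Possible use (the last address is a placeholder for the real TCP server):
#   diagnose 172.16.4.210 213.87.0.1 <remote_host>
```

Three pings per host keeps the run short on a slow GPRS link; the answer's suggestion of about 200 pings would be the thing to try (via the second argument of `can_ping`) when packet loss is suspected.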
Pings should never be part of an end-user application (see note), and no program should rely on ping to function. At best, ping might tell us that a part of the TCP/IP stack is running on the remote host. See my argument here.
What the OP describes as a problem doesn't seem to be a problem. All network connections fail, the resolver may or may not use the network, and ping isn't really helpful. I would guess that the OP can check whether the modem is connected and, if it isn't, connect again.
edit: Pseudo code
Note: the exception would be if you are writing a network monitoring application for a networking person.
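A minimal sketch of the check-and-reconnect idea, not the answer's original pseudo code: `link_is_running` is a hypothetical helper that inspects `ifconfig` output, and `pppd call mts` is taken from the question's connect script.

```sh
#!/bin/sh
# Read `ifconfig <iface>` output on stdin; succeed if the flags line
# reports the interface as both UP and RUNNING.
link_is_running() {
    while read -r line; do
        case "$line" in
            "UP "*"RUNNING "*) return 0 ;;
        esac
    done
    return 1
}

# Possible use: bring the link up again only when it is reported down.
#   ifconfig ppp0 2>/dev/null | link_is_running || pppd call mts
```

In the failure described in the question, `ifconfig ppp0` still reported RUNNING, so a check like this would have to be combined with some end-to-end probe to catch that state.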