Rare, weird readings when using fsockopen

Posted 2024-09-09 22:22:30

I'm using fsockopen in a small cronjob to read and parse feeds on different servers. For the most part, this works very well. Yet on some servers, I get very weird lines in the response, like this:

<language>en</language>
 <sy:updatePeriod>hourly</sy:updatePeriod>
 <sy:updateFrequency>1</sy:updateFrequency>

11
 <item>
  <title>
1f
July 8th, 2010</title>
  <link>
32
http://darkencomic.com/?p=2406</link>
  <comments>
3e

But when I open the feed in e.g. notepad++, it works just fine, showing:

<language>en</language>
 <sy:updatePeriod>hourly</sy:updatePeriod>
 <sy:updateFrequency>1</sy:updateFrequency>
   <item>
  <title>July 8th, 2010</title>
  <link>http://darkencomic.com/?p=2406</link>
  <comments>

...just to show an excerpt. So, am I doing anything wrong here, or is this beyond my control? I'd be grateful for any ideas on how to fix this.
Here's part of the code I'm using to retrieve the feeds:

// Open a TCP connection to the feed's host (5 second timeout).
$fp = @fsockopen($url["host"], 80, $errno, $errstr, 5);
if (!$fp) {
    throw new UrlException("($errno) $errstr ~~~ on opening ".$url["host"]);
} else {
    // Hand-build a minimal HTTP/1.1 request.
    $out = "GET ".$path." HTTP/1.1\r\n"
         ."Host: ".$url["host"]."\r\n"
         ."Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    // Read the raw response (status line, headers and body) until EOF.
    $contents = '';
    while (!feof($fp)) {
        $contents .= stream_get_contents($fp, 128);
    }
    fclose($fp);
}


Comments (2)

千紇 2024-09-16 22:22:30

This looks like HTTP chunked transfer encoding -- which is a way HTTP has of segmenting a response into several small parts; quoting:

Each non-empty chunk starts with the number of octets of the data it embeds (size written in hexadecimal) followed by a CRLF (carriage return and line feed), and the data itself. The chunk is then closed with a CRLF. In some implementations, white space characters (0x20) are padded between chunk-size and the CRLF.
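
The stray "11", "1f", "32" and "3e" lines in your output are exactly those hexadecimal chunk sizes. As an illustration (a sketch, not part of the original answer), un-chunking such a body by hand in PHP could look like the following; the helper name decode_chunked is made up, and it assumes the status line and headers have already been stripped from the response:

// Minimal sketch (assumption: $body holds only the entity body,
// with the status line and headers already stripped off).
function decode_chunked($body) {
    $decoded = '';
    $offset  = 0;
    while (true) {
        // Each chunk starts with its size in hex, terminated by CRLF
        // (a ';' may introduce chunk extensions after the size).
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            break; // malformed or truncated input
        }
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        $size = hexdec(trim(explode(';', $sizeLine)[0]));
        if ($size === 0) {
            break; // the "0" chunk marks the end of the body
        }
        // Take $size bytes of data, then skip the CRLF closing the chunk.
        $decoded .= substr($body, $lineEnd + 2, $size);
        $offset = $lineEnd + 2 + $size + 2;
    }
    return $decoded;
}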

When working with `fsockopen` and the like, you have to deal with the HTTP Protocol yourself... Which is not always as easy as one might think ;-)
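
As a rough sketch of that protocol handling (assumed code, not from the original answer): split the raw $contents from your read loop at the blank line that ends the headers, then un-chunk when the server says it chunked the response:

// Naive header check; fine for a sketch, not for production.
list($rawHeaders, $body) = explode("\r\n\r\n", $contents, 2);
if (stripos($rawHeaders, 'Transfer-Encoding: chunked') !== false) {
    $body = decode_chunked($body); // helper sketched above
}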

A solution to avoid having to deal with such stuff would be to use something like curl: it already knows the HTTP protocol -- which means you won't have to re-invent the wheel ;-)
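
For instance, a rough cURL equivalent of the fetch in the question might look like this (a sketch only; $url, $path and UrlException are carried over from your code):

// Sketch: the same fetch via PHP's cURL extension, which handles the
// HTTP protocol (including chunked transfer decoding) for you.
$ch = curl_init("http://".$url["host"].$path);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // same 5 second connect timeout
$contents = curl_exec($ch);
if ($contents === false) {
    throw new UrlException(curl_error($ch)." ~~~ on opening ".$url["host"]);
}
curl_close($ch);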

醉殇 2024-09-16 22:22:30

I don't see anything strange that could cause that kind of behaviour. Is there any way you can use cURL to do this for you? It might solve the problem altogether :)
