Browsers and wget load a JPEG differently?

Posted 2024-10-28 17:24:53


I'm stumped on this one. Try loading this image in your browser, and then save it to your hard disk.

http://profile.ak.fbcdn.net/hprofile-ak-snc4/41674_660962816_995_n.jpg

It's a valid JPEG file at 11377 bytes.

Now try to download it with wget or curl. Only 11252 bytes show up, and the bottom right part of the image is missing.

What gives?


Comments (2)

司马昭之心 2024-11-04 17:24:53


Here goes…

Taking a packet dump, I see that Facebook returns the same Content-Length to Safari as it does to curl, and that content-length is the incorrect 11252:

GET /hprofile-ak-snc4/41674_660962816_995_n.jpg HTTP/1.1
User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
Host: profile.ak.fbcdn.net
Accept: */*

HTTP/1.1 200 OK
Content-Type: image/jpeg
... snip ....
Content-Length: 11252

And with Safari:

GET /hprofile-ak-snc4/41674_660962816_995_n.jpg HTTP/1.1
Host: profile.ak.fbcdn.net
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27
... snip ...

HTTP/1.1 200 OK
Content-Type: image/jpeg
... snip ...
Content-Length: 11252

So I'm going to guess Facebook is sending an incorrect Content-Length. To test this, I'll use netcat:

$ cat > headers <<EOF
GET /hprofile-ak-snc4/41674_660962816_995_n.jpg HTTP/1.0
Host: profile.ak.fbcdn.net
Accept: */*

EOF
$ nc -vvv profile.ak.fbcdn.net 80 < headers > output
Warning: Inverse name lookup failed for `142.231.1.174'
Notice: Real hostname for profile.ak.fbcdn.net [142.231.1.165] is a142-231-1-165.deploy.akamaitechnologies.com
profile.ak.fbcdn.net [142.231.1.174] 80 (http) open
Total received bytes: 12k (11639)
Total sent bytes: 97
$ head output
HTTP/1.0 200 OK
Content-Type: image/jpeg
... snip ...
Content-Length: 11252

(Note that I used HTTP/1.0 so the Facebook servers wouldn't try to hold the connection open.)

Removing the first 10 lines of output in a text editor and then saving the result as output.jpg, I get the complete image.

So this confirms that Facebook is sending an incorrect Content-Length header (and the image is getting cut off because curl is paying attention to the content length while netcat isn't).
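The difference between the two tools can be sketched in miniature. The snippet below is a simulation, not curl's or netcat's actual code: it builds a response whose Content-Length understates the body (as Facebook's did), then shows that a reader honoring the header truncates the body while a read-to-EOF reader keeps everything.

```python
# Simulation of why curl truncates the image while netcat doesn't.
# The body is 11377 bytes, but the header claims 11252 (the sizes seen above).

def parse_response(raw: bytes, honor_content_length: bool) -> bytes:
    """Split headers from body; optionally truncate body to Content-Length."""
    head, body = raw.split(b"\r\n\r\n", 1)
    if honor_content_length:
        for line in head.split(b"\r\n"):
            if line.lower().startswith(b"content-length:"):
                body = body[: int(line.split(b":", 1)[1])]
    return body

full_body = b"\xff" * 11377                   # stand-in for the real JPEG bytes
response = (b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: image/jpeg\r\n"
            b"Content-Length: 11252\r\n"      # the incorrect (compressed) size
            b"\r\n" + full_body)

curl_like = parse_response(response, honor_content_length=True)    # 11252 bytes
netcat_like = parse_response(response, honor_content_length=False) # 11377 bytes
print(len(curl_like), len(netcat_like))
```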

Digging a little further, it seems like Aleski is correct — the Content-Length is correct when the image is sent gzip-compressed. To confirm this, I added Accept-Encoding: gzip to my headers file. Facebook correctly sends back a gzip'd response which is the expected length, and uncompressing it results in the correct image.
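That behaviour is easy to check in miniature. A sketch, with Python's gzip module standing in for the server's encoder and illustrative bytes rather than the actual JPEG: a correct server must pair each encoding with its own length, and decompressing a gzip'd body restores the original bytes.

```python
import gzip

original = bytes(range(256)) * 50   # stand-in for the 11377-byte JPEG

# A correct server pairs each representation with its own length:
#   Content-Encoding: gzip  -> Content-Length: len(compressed)
#   no Content-Encoding     -> Content-Length: len(original)
compressed = gzip.compress(original)
print(len(original), len(compressed))

# Decompressing the gzip'd response yields the original bytes intact.
assert gzip.decompress(compressed) == original
```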

tl;dr: Facebook's Content-Length is incorrect if the Content-Encoding is not gzip.

嗫嚅 2024-11-04 17:24:53


It seems the server is faulty. When I tested it, the difference between Firefox and wget was that Firefox indicated it accepts gzip- or deflate-compressed answers to its request, whereas wget did not.

The server's response to Firefox was 11252 bytes of compressed data, and its response to wget was 11377 bytes of uncompressed data. However, the Content-Length it sent to both was 11252 (as David already said).

In other words, it seems that the server is caching the compressed version and incorrectly sending the compressed size even when sending the data uncompressed. You get all the data, but since the server advertises less data, wget (and other software that asks for uncompressed data) discards the "extra" data.
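A minimal sketch of that hypothesized bug (hypothetical code, not the server's actual implementation): the length is computed once from the cached gzip'd copy and reused even when the raw bytes are served.

```python
import gzip

class BuggyCache:
    """Hypothetical server-side cache: stores the gzip'd body once and
    (buggily) remembers only the compressed size."""

    def __init__(self, raw_body: bytes):
        self.raw = raw_body
        self.gzipped = gzip.compress(raw_body)
        self.content_length = len(self.gzipped)  # BUG: frozen at compressed size

    def respond(self, accepts_gzip: bool):
        body = self.gzipped if accepts_gzip else self.raw
        # Correct code would use len(body) here instead of the cached value.
        headers = {"Content-Length": self.content_length}
        if accepts_gzip:
            headers["Content-Encoding"] = "gzip"
        return headers, body

cache = BuggyCache(b"\xff" * 11377)
hdrs, body = cache.respond(accepts_gzip=False)
print(hdrs["Content-Length"], len(body))  # header understates the raw body
```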
