Finding whether a response body contains gzipped data

Posted on 2024-10-31 20:58:14

I have a program that searches the reply from a curl request for specific strings. Sometimes the reply is gzipped. Is there a way to tell whether the reply is plain text or gzip-compressed?
The headers sometimes contain a gzip,deflate entry, but not consistently. Is there a way to inspect the body itself and determine whether it is gzipped?

Comments (5)

空气里的味道 2024-11-07 20:58:14

You could try looking at the first two bytes of the data. For gzipped data, they should be 0x1f, 0x8b. From the gzip file format specification (RFC 1952):

Member header and trailer

ID1 (IDentification 1)
ID2 (IDentification 2)
These have the fixed values ID1 = 31 (0x1f, \037), ID2 = 139 (0x8b, \213),
to identify the file as being in gzip format.
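
The two-byte check described above can be sketched in Python as follows. This is only an illustration: the name `looks_gzipped` is made up for this example, and it assumes the whole response body is already in memory as `bytes`.

```python
import gzip

# ID1 and ID2 from the gzip member header.
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(body: bytes) -> bool:
    """Return True if the body starts with the two gzip identification bytes."""
    return body[:2] == GZIP_MAGIC


if __name__ == "__main__":
    print(looks_gzipped(gzip.compress(b"hello")))  # True
    print(looks_gzipped(b"hello"))                 # False
```

A match only says the body starts like a gzip member. Binary data that happens to begin with these two bytes would also pass, so decompression can still fail afterwards.
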
滴情不沾 2024-11-07 20:58:14

You could look at the first bytes of the file. Perhaps they contain a magic number.

清眉祭 2024-11-07 20:58:14

The gzip file format starts with some "magic bytes". You can check whether the body starts with these, and if it does, push the bytes back into the stream and start decompressing it.
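
One way to approximate "check, then push back" in Python is to peek at the stream instead of consuming it. The sketch below is an assumption-laden illustration, not the answerer's code: `open_maybe_gzipped` is a hypothetical helper, and it relies on `io.BufferedReader.peek`, which may return fewer than two bytes on a slow stream such as a socket.

```python
import gzip
import io
from typing import BinaryIO


def open_maybe_gzipped(raw: BinaryIO) -> BinaryIO:
    """Wrap raw so that reads yield decompressed data if it starts with gzip magic."""
    buffered = io.BufferedReader(raw)
    # peek() does not advance the read position, so nothing has to be pushed back.
    head = buffered.peek(2)[:2]
    if head == b"\x1f\x8b":
        return gzip.GzipFile(fileobj=buffered, mode="rb")
    return buffered


if __name__ == "__main__":
    compressed = io.BytesIO(gzip.compress(b"hello world"))
    print(open_maybe_gzipped(compressed).read())
    print(open_maybe_gzipped(io.BytesIO(b"plain text")).read())
```

Either way the caller gets back a readable binary stream, so the string search can run on the result without caring which case applied.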

奢望 2024-11-07 20:58:14

You could pipe it through zcat and, if that fails, use the string as is. Sloppy, I know, but it ought to be reliable; a plain text file would never contain valid gzipped data.
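
The same try-and-fall-back idea can be done in-process rather than by shelling out to zcat. The following is a minimal sketch that swaps in Python's `gzip` module for zcat; `decode_body` is a hypothetical name, and the exception list reflects the errors `gzip.decompress` is known to raise for non-gzip, truncated, or corrupt input.

```python
import gzip
import zlib


def decode_body(body: bytes) -> bytes:
    """Try to gunzip the body; if that fails, return it unchanged."""
    try:
        return gzip.decompress(body)
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (bad magic bytes), EOFError covers
        # truncated streams, zlib.error covers corrupt deflate data.
        return body


if __name__ == "__main__":
    print(decode_body(gzip.compress(b"needle in a haystack")))
    print(decode_body(b"needle in a haystack"))
```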

傲世九天 2024-11-07 20:58:14

Standards-compliant HTTP responses will contain a Content-Encoding: or Transfer-Encoding: header specifying "gzip" for compressed responses, eliminating the need to guess by looking at the magic number. Unfortunately, lots of sites get these headers wrong.
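
Since the header cannot be fully trusted, one option is to treat it as a hint and let the body decide. The sketch below is a hedged illustration: `declared_gzip` and `decode_response` are hypothetical helpers, the headers are assumed to be available as a simple name-to-value mapping, and only Content-Encoding is inspected.

```python
import gzip
import zlib
from typing import Mapping


def declared_gzip(headers: Mapping[str, str]) -> bool:
    """True if a Content-Encoding header lists gzip (header names are case-insensitive)."""
    for name, value in headers.items():
        if name.lower() == "content-encoding":
            tokens = [token.strip().lower() for token in value.split(",")]
            return "gzip" in tokens or "x-gzip" in tokens
    return False


def decode_response(headers: Mapping[str, str], body: bytes) -> bytes:
    """Gunzip the body if the header or the magic bytes suggest gzip, else return it as is."""
    if declared_gzip(headers) or body[:2] == b"\x1f\x8b":
        try:
            return gzip.decompress(body)
        except (OSError, EOFError, zlib.error):
            pass  # mislabeled or corrupt: fall back to the raw body
    return body


if __name__ == "__main__":
    packed = gzip.compress(b"payload")
    print(decode_response({"Content-Encoding": "gzip"}, packed))
    print(decode_response({}, packed))
    print(decode_response({"content-encoding": "gzip"}, b"payload"))
```

This handles both failure modes mentioned in the thread: a gzipped body with no header, and a header claiming gzip on a body that is actually plain text.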
