Is it possible to read only the first N bytes from an HTTP server using a Linux command?

Posted on 2024-11-03 08:37:53 | 703 words | 2 views | 0 comments

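Given a URL such as http://www.example.com, can we read only the first N bytes of the page?

With wget, we can download the whole page.

With curl, there is -r; 0-499 specifies the first 500 bytes. That seems to solve the problem, but note this caveat:

> You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you'll instead get the whole document.

With urllib in Python: there is a similar question here, but according to Konstantin's comment, is this really true?

> Last time I tried this technique it failed because it was actually impossible to read only a specified amount of data from the HTTP server, i.e. you implicitly read the whole HTTP response and only then read the first N bytes out of it. So in the end you downloaded the whole 1 GB malicious response.

So, in practice, how can we read the first N bytes from an HTTP server?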

青衫负雪 2024-11-10 08:37:53

You can do this natively with curl, without downloading the whole document. According to the curl man page:

RANGES

HTTP 1.1 introduced byte-ranges. Using this, a client can request to get only one or more subparts of a specified document. curl supports this with the -r flag.

Get the first 100 bytes of a document:

    curl -r 0-99 http://www.get.this/

Get the last 500 bytes of a document:

    curl -r -500 http://www.get.this/

`curl` also supports simple ranges for FTP files as well. Then you can only specify start and stop position.

Get the first 100 bytes of a document using FTP:

    curl -r 0-99 ftp://www.get.this/README

It works for me even with a Java web app deployed to GigaSpaces.
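
For the urllib case the question mentions, the same range request can be sent from Python. The following is a minimal sketch, not a definitive implementation: `fetch_first_bytes` is a hypothetical helper name, and it assumes Python 3.9+ and an http(s) URL. It sends the same `Range` header that `curl -r` produces and reports whether the server honoured it.

```python
import urllib.request


def fetch_first_bytes(url: str, n: int, timeout: float = 10.0) -> tuple[int, bytes]:
    """Ask for bytes 0..n-1 and return (status, data), with data capped at n bytes."""
    # "bytes=0-<n-1>" is the same inclusive range that `curl -r 0-<n-1>` sends.
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{n - 1}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # 206 means the server honoured the range; 200 means it ignored the
        # header and is sending the whole document.
        status = resp.status
        # read(n) returns at most n bytes, so the cap holds either way.
        data = resp.read(n)
    return status, data
```

Calling `fetch_first_bytes("http://www.example.com/", 500)` would return the status code together with at most 500 bytes; a status of 200 rather than 206 indicates that the server ignored the range.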

柠檬色的秋千 2024-11-10 08:37:53

    curl <url> | head -c 499

or

    curl <url> | dd bs=1 count=499

should do.

There are also simpler utilities with perhaps broader availability, such as

    netcat host 80 <<"HERE" | dd bs=1 count=499 of=output.fragment
    GET /urlpath/query?string=more&bloddy=stuff

    HERE
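
The netcat approach amounts to writing a request to a socket and reading a fixed number of bytes back. Below is a rough Python equivalent, offered as a sketch only: `first_bytes_raw` is a made-up name, it assumes a plain HTTP (not HTTPS) server, and unlike the bare `GET` line above it sends an HTTP/1.0 request with a `Host` header, so the bytes returned begin with the status line and response headers rather than the body.

```python
import socket


def first_bytes_raw(host: str, port: int, path: str, n: int, timeout: float = 10.0) -> bytes:
    """Send a minimal GET and return the first n bytes the server writes back."""
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")
    received = bytearray()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while len(received) < n:
            # Never ask for more than is still missing, so we stop at n bytes.
            chunk = sock.recv(n - len(received))
            if not chunk:  # server closed the connection before n bytes arrived
                break
            received += chunk
    # Leaving the with-block closes the socket, which ends the transfer.
    return bytes(received)
```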

白鸥掠海 2024-11-10 08:37:53

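> You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you'll instead get the whole document.

In that case the server sends you the whole page anyway, so you can fetch the page with curl and pipe it to head.

head
-c, --bytes=[-]N
print the first N bytes of each file; with the leading '-', print all but the last N bytes of each file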
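
Whether truncating on the client side really downloads everything can be checked with a small local experiment. The sketch below is an illustration under stated assumptions, not a benchmark: `BigHandler`, `run_experiment`, `TOTAL` and `CHUNK` are hypothetical names, the server is Python's built-in `http.server` on localhost, and how much the server manages to write before it notices the disconnect depends on OS socket buffers.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

TOTAL = 64 * 1024 * 1024   # size of the pretend "huge" document (hypothetical)
CHUNK = b"x" * 65536


class BigHandler(BaseHTTPRequestHandler):
    sent = 0                   # bytes written before the client went away
    done = threading.Event()   # set once the handler has finished

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(TOTAL))
        self.end_headers()
        try:
            while BigHandler.sent < TOTAL:
                self.wfile.write(CHUNK)
                BigHandler.sent += len(CHUNK)
        except OSError:
            pass               # client closed the connection early
        finally:
            BigHandler.done.set()

    def log_message(self, *args):  # keep the output quiet
        pass


def run_experiment(n: int = 500) -> tuple[int, int]:
    """Read n bytes from the local server; return (bytes received, bytes the server wrote)."""
    BigHandler.sent = 0
    BigHandler.done.clear()
    server = HTTPServer(("127.0.0.1", 0), BigHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/"
        with urllib.request.urlopen(url, timeout=10) as resp:
            data = resp.read(n)   # at most n bytes; leaving the block closes the socket
        BigHandler.done.wait(timeout=10)
    finally:
        server.shutdown()
        server.server_close()
    return len(data), BigHandler.sent
```

On a typical system the second number returned by `run_experiment()` should be well below `TOTAL`, which suggests that reading N bytes and then closing the connection stops the transfer early, much as curl stops once `head` exits and closes the pipe.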

爱的故事 2024-11-10 08:37:53

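I came here looking for a way to time the server's processing, which I thought I could measure by telling curl to stop downloading after 1 byte or so.

For me, the better solution turned out to be a HEAD request, since this usually lets the server process the request as normal but does not return any response body:

    time curl --head <URL>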
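
The same measurement can be sketched in Python. This is only an illustration: `time_head` is a hypothetical helper, it assumes Python 3.9+, and the elapsed time includes DNS lookup, connection setup and any TLS handshake, not just server processing.

```python
import time
import urllib.request


def time_head(url: str, timeout: float = 10.0) -> tuple[int, float]:
    """Send a HEAD request and return (status code, elapsed seconds)."""
    req = urllib.request.Request(url, method="HEAD")
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # By the time urlopen returns, the response headers have arrived;
        # a HEAD response carries no body, so there is nothing more to read.
        status = resp.status
    elapsed = time.perf_counter() - start
    return status, elapsed
```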
