On an HTTP GET, do the headers arrive first and then the body (using Apache's HttpClient library)?

Posted 2024-10-10 08:26:08

What I am doing: Trying to get familiar with the HTTP protocol and its implementation.

My question: Is it possible to get the headers (specifically Content-Length) of an HTTP GET before reading the actual body? From what I understand, I could use a HEAD request for that purpose, but I am trying to see whether it is even required. Specifically, in the Commons HttpClient library (and most others, I guess), there are methods to retrieve the response body as a stream. Is this stream read from the socket as the data comes in, or has it already been buffered?

Thanks!


Comments (5)

鸠魁 2024-10-17 08:26:08

The HTTP server sends the headers first. Whether the particular HTTP client you're using exposes them before the body has been read is a separate matter.

尾戒 2024-10-17 08:26:08

In short, yes, the headers come first. Always.

As you mention, a HEAD request lets the client fetch only the headers and no content. With a GET request, however, the headers are always available and arrive before the actual content. Note that the Content-Length field is optional for dynamic content, so it may never be available.

Depending on your implementation, the stream may or may not be buffered at all. In most cases, though, reading through the stream hands you the content in small buffered units (often a line, if you use a line-oriented reader) rather than all at once.

爱*していゐ 2024-10-17 08:26:08

If the Content-Length header field is sent in the response headers, you can read it before the response body. Sometimes, though (dynamically generated content, unbuffered), it can be omitted (rfc). This is what happens when your browser can't display a progress bar while downloading.
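The progress-bar point can be made concrete with a minimal sketch. The class and method names below (`ContentLengthDemo`, `declaredLength`, `progress`) are hypothetical and not part of any library; the sketch assumes the response headers have already been collected into a plain `Map`, and it treats a response that carries `Transfer-Encoding` as having no known length.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical helper: decides whether a response declares its body length up front. */
final class ContentLengthDemo {

    /** Returns the declared body length, or empty when no usable length was sent. */
    static OptionalLong declaredLength(Map<String, String> headers) {
        // Header names are case-insensitive, so copy them into a case-insensitive map.
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);
        // With Transfer-Encoding (e.g. chunked) the total size is not known in advance.
        if (h.containsKey("Transfer-Encoding")) {
            return OptionalLong.empty();
        }
        String value = h.get("Content-Length");
        if (value == null) {
            return OptionalLong.empty();
        }
        try {
            long n = Long.parseLong(value.trim());
            return n >= 0 ? OptionalLong.of(n) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty(); // malformed value: treat as unknown
        }
    }

    /** Progress text a downloader might show; a percentage needs a declared length. */
    static String progress(long bytesRead, OptionalLong total) {
        if (total.isPresent() && total.getAsLong() > 0) {
            return (bytesRead * 100 / total.getAsLong()) + "%";
        }
        return bytesRead + " bytes (total unknown)";
    }

    public static void main(String[] args) {
        OptionalLong known = declaredLength(Map.of("Content-Length", "2048"));
        OptionalLong unknown = declaredLength(Map.of("Transfer-Encoding", "chunked"));
        System.out.println(progress(512, known));
        System.out.println(progress(512, unknown));
    }
}
```

Returning `OptionalLong` rather than `-1` forces the caller to handle the "length not sent" case explicitly, which is exactly the situation described above.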

牵你手 2024-10-17 08:26:08

The Commons HttpClient classes stream directly from the socket, so no memory is wasted on buffering the whole response. If the Content-Length header is sent, you can read it without reading in the whole response body.

See http://hc.apache.org/httpclient-3.x/features.html
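To illustrate what streaming means for the caller, here is a minimal sketch of consuming a body stream with a fixed-size buffer. It uses only the JDK; the in-memory `ByteArrayInputStream` is a stand-in for whatever stream the client hands back (in HttpClient 3.x that would be the result of `getResponseBodyAsStream()`), and `StreamingCopy` and `copyBody` are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: drains a response body without holding it all in memory. */
final class StreamingCopy {

    /**
     * Copies at most {@code length} bytes from body to sink through a fixed buffer,
     * so memory use stays constant regardless of the body size.
     * Returns the number of bytes actually copied.
     */
    static long copyBody(InputStream body, OutputStream sink, long length) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        while (total < length) {
            int want = (int) Math.min(buffer.length, length - total);
            int n = body.read(buffer, 0, want);
            if (n == -1) {
                break; // stream ended before the declared length was reached
            }
            sink.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a socket-backed body stream of a known Content-Length.
        byte[] fakeBody = new byte[20_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyBody(new ByteArrayInputStream(fakeBody), sink, fakeBody.length);
        System.out.println("copied " + copied + " bytes");
    }
}
```

With a socket-backed stream, each `read` call would return whatever bytes have arrived so far, which is why the loop checks the return value instead of assuming the buffer was filled.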

醉殇 2024-10-17 08:26:08

The headers always come before the body, and the status line before the headers. See section 4.1, "Message Types", of the HTTP specification (RFC 2616).

The format of a generic HTTP message is:

        generic-message = start-line
                          *(message-header CRLF)
                          CRLF
                          [ message-body ]
        start-line      = Request-Line | Status-Line

For a response, start-line is the Status-Line. It must be followed by the headers, and then by the message body, if there is one.
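The grammar above can be followed quite literally. The sketch below reads the status line, then header lines up to the lone CRLF, and stops there, leaving the body unread on the stream. `ResponseHead` is a hypothetical class written for this illustration, not part of HttpClient; it assumes one header per line (no folded continuation lines) and keeps only the last value of a repeated header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: the start-line and headers of a response, read without touching the body. */
final class ResponseHead {
    final String statusLine;
    final Map<String, String> headers; // names stored lower-cased

    private ResponseHead(String statusLine, Map<String, String> headers) {
        this.statusLine = statusLine;
        this.headers = headers;
    }

    /** Reads start-line and headers; the stream is left positioned at the first body byte. */
    static ResponseHead read(InputStream in) throws IOException {
        String statusLine = readLine(in);
        Map<String, String> headers = new LinkedHashMap<>();
        String line;
        while (!(line = readLine(in)).isEmpty()) { // an empty line is the lone CRLF
            int colon = line.indexOf(':');
            if (colon < 0) {
                throw new IOException("Malformed header line: " + line);
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            headers.put(name, line.substring(colon + 1).trim());
        }
        return new ResponseHead(statusLine, headers);
    }

    /** Reads one line byte by byte so that nothing past the line ending is consumed. */
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                buf.write(b);
            }
        }
        if (b == -1 && buf.size() == 0) {
            throw new EOFException("Stream ended before the end of the headers");
        }
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = ("HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/plain\r\n"
                + "Content-Length: 5\r\n"
                + "\r\n"
                + "hello").getBytes(StandardCharsets.ISO_8859_1);
        InputStream in = new ByteArrayInputStream(raw);
        ResponseHead head = ResponseHead.read(in);
        System.out.println(head.statusLine);
        System.out.println("Content-Length: " + head.headers.get("content-length"));
        System.out.println(in.available() + " body bytes still unread");
    }
}
```

Reading byte by byte is slow but keeps the example honest: a buffered reader would pull part of the body into its own buffer, which hides the fact that the headers are complete before any body byte is needed.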
