HttpClient returns 400 for links that contain an anchor

Posted on 2024-10-03 17:17:13


Here is my code:

DefaultHttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(url);
HttpResponse response = client.execute(request);

This works for every URL I have tried so far, except for some URLs that contain an anchor. Some of these anchored URLs return a 400. The weird thing is that it isn't all links containing an anchor; a lot of them work just fine.

Unfortunately, I have to be really general as I can't provide the specific urls here.

The links are completely valid and work just fine in any browser, but the HttpClient returns a 400 when trying the link. If I remove the anchor it will work.

Any ideas what to look for?

For example: http://www.somedomain.com/somedirectory/somepage#someanchor

Sorry again for being so general.

EDIT: I should mention this is for Android.


孤蝉 2024-10-10 17:17:13


Your usage of the anchor in the URL is incorrect.
When we perform a GET, we need to fetch the entire resource (page). The anchor is just a tag marking a location; normally your browser scrolls to the anchor's position once the page has loaded. It does not make sense to GET the page at a specific anchor - the entire page must be fetched.

It is possible your inconsistent results are because some web servers silently ignore the anchor component, while others reject the request as malformed.

The solution is just to remove the #anchor portion of the URL before running your code.
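A minimal sketch of that fix, assuming the same Apache HttpClient classes used in the question and that url is an absolute HTTP/HTTPS URL string:

    // Strip the "#fragment" part before issuing the GET (sketch, not the
    // original poster's code).
    int hashPos = url.indexOf('#');
    String requestUrl = (hashPos >= 0) ? url.substring(0, hashPos) : url;

    DefaultHttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet(requestUrl);
    HttpResponse response = client.execute(request);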

被你宠の有点坏 2024-10-10 17:17:13


As @Greg Sansom says, the URL should not be sent with an anchor / fragment. The fragment part of the URL is not relevant to the server.

Here's the expected URL syntax from the relevant part of the HTTP 1.1 specification:

    http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]] 

Note: there is no fragment part in the syntax.
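For comparison, the generic URI grammar in RFC 3986 does allow an optional fragment, which is why such links are valid in a browser; the fragment is interpreted by the client and is not meant to be sent to the server as part of the request:

    URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]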

What happens if you do send a fragment is clearly server-implementation specific. I expect that you will see a variety of responses:

  • Some servers will silently strip / ignore the fragment part. (This is what you are expecting to happen).
  • Some servers might treat this as a request error and respond with a 400.
  • Some servers might mistakenly treat the fragment as part of the path or query, and give you a 404 or some other response, depending on how "confused" the fragment makes the server.
  • Some servers might actually imbue the fragment with a specific meaning. (This strikes me as a stupid thing to do, but you never know ...)

IMO, the most sensible solution is to strip it from the URL before instantiating the HttpGet object.

FOLLOWUP

The recommended way to remove a fragment from a URL string is to turn it into a java.net.URL or java.net.URI instance, extract the relevant components, use these to create a new java.net.URL or java.net.URI instance (leaving out the fragment of course), and finally turn it back into a String.
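A sketch of that approach, using only the standard java.net.URI API (the helper name stripFragment is illustrative, not from the original answer):

    import java.net.URI;
    import java.net.URISyntaxException;

    /** Rebuild the URI from its components, leaving out the fragment. */
    static String stripFragment(String url) throws URISyntaxException {
        URI original = new URI(url);
        // Passing null as the last argument drops the "#someanchor" part.
        URI withoutFragment = new URI(
                original.getScheme(),
                original.getAuthority(),
                original.getPath(),
                original.getQuery(),
                null);
        return withoutFragment.toString();
    }

For the example URL above, this returns http://www.somedomain.com/somedirectory/somepage.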

But I think that the following should also work, if you can safely assume that your URLs are all valid absolute HTTP or HTTPS URLs.

    int pos = url.indexOf("#");
    String strippedUrl = (pos >= 0) ? url.substring(0, pos) : url;
愿得七秒忆 2024-10-10 17:17:13

String user_url2="uhttp://www.somedomain.com/somedirectory/somepage#someanchor";

    HttpClient client = new DefaultHttpClient();
    HttpGet siteRequest = new HttpGet(user_url2);
    StringBuilder sb = new StringBuilder();

    HttpResponse httpResponse;

    try {
        httpResponse = client.execute(siteRequest);
        HttpEntity entity = httpResponse.getEntity();
        InputStream in = entity.getContent();

        String line = null;
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in));
        while ((line = reader.readLine()) != null)

        {

            sb.append(line);

        }

        result = sb.toString();

结果字符串将显示 url 值

String user_url2="uhttp://www.somedomain.com/somedirectory/somepage#someanchor";

    HttpClient client = new DefaultHttpClient();
    HttpGet siteRequest = new HttpGet(user_url2);
    StringBuilder sb = new StringBuilder();

    HttpResponse httpResponse;

    try {
        httpResponse = client.execute(siteRequest);
        HttpEntity entity = httpResponse.getEntity();
        InputStream in = entity.getContent();

        String line = null;
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in));
        while ((line = reader.readLine()) != null)

        {

            sb.append(line);

        }

        result = sb.toString();

The result string will contain the response body returned for the URL.
