TIdHTTP Get procedure does not work for some URLs

Posted on 2024-08-24 10:55:12

I am encountering a problem with the Delphi TIdHTTP component: the Get procedure cannot fetch one specific URL, although it works for other URLs. For example, the code below returns an empty Response.DataString. Response.DataString is empty only for this error_url; for other URLs it has a value. I need to fetch the content of that error_url.

procedure TForm1.Button1Click(Sender: TObject);
var
  Response : TStringStream;
  error_url: string;
begin
  error_url := 'http://www.chefscatalog.com/international/home.aspx'; //error url
  Response := TStringStream.Create;
  try
    IdHTTP1.Get(error_url, Response);
    Memo1.Text := Response.DataString;
  finally
    FreeAndNil(Response);
  end;
end;

By the way, IdHTTP1's redirect property (HandleRedirects) is set to True, so redirection is not the problem.

These are the exceptions I encountered:
1. HTTP/1.1 302 Found
2. EDecompressionError with message 'ZLib Error (-3)'

You can download the source code of this project (indytest.zip) from this link: http://www.yourfilelink.com/get.php?fid=534933

Please help me guys. Thanks in advance :)

Comments (2)

小姐丶请自重 2024-08-31 10:55:12

The reason is that the website you are trying to hit looks for a cookie. If the cookie is not set, the site tries to set it and then does a 302 redirect back to itself.

Because you haven't hooked up a cookie manager, you end up in a 302 redirect loop: the site keeps checking for the cookie, setting it, and then redirecting.

Handle cookies and it will work just fine with only a single 302.
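A minimal sketch of hooking up a cookie manager in code rather than in the form designer, assuming Indy 10 and a console project. The program name and the error reporting are illustrative and are not taken from the question's project; as the rest of this answer shows, cookie handling alone may not be enough for this particular site.

```delphi
program FetchWithCookies;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdHTTP, IdCookieManager;

var
  Http: TIdHTTP;
  Cookies: TIdCookieManager;
begin
  Http := TIdHTTP.Create(nil);
  Cookies := TIdCookieManager.Create(nil);
  try
    // Follow 302 responses automatically.
    Http.HandleRedirects := True;
    // Store cookies from Set-Cookie headers and send them back on later requests.
    Http.AllowCookies := True;
    Http.CookieManager := Cookies;
    try
      Writeln(Http.Get('http://www.chefscatalog.com/international/home.aspx'));
    except
      // A redirect loop or a decompression failure surfaces here as an exception.
      on E: Exception do
        Writeln(E.ClassName, ': ', E.Message);
    end;
  finally
    Http.Free;
    Cookies.Free;
  end;
end.
```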


However, it seems that for some reason Indy is ignoring the cookies sent by this site. I whipped up some test code; if I hit http://www.google.com, I get

New cookie: PREF
New cookie: NID
Redirecting (1) to: http://www.google.co.nz/
New cookie: PREF
New cookie: NID

These are the headers that Google sends:

Set-Cookie: PREF=ID=3c7e441914b902ae:TM=1268686477:LM=1268686477:S=Z-Gwqx52jK0V1rYR; expires=Wed, 14-Mar-2012 20:54:37 GMT; path=/; domain=.google.com
Set-Cookie: NID=32=vsOZvkr4AOZ7320d_OBPf2zR2jau4E6pupbOe_ZaaX4DNjahTzSV-mSA55naTk-5cXQcn7SNEp7uSxbE_cFrL9ZftGApTGZMPGKzcz3_NZE_2MYpWG5PGbwWFw9t2d_R; expires=Tue, 14-Sep-2010 20:54:37 GMT; path=/; domain=.google.com; HttpOnly

However, for that other site, I get this in my debug output:

Redirecting (1) to: http://www.chefscatalog.com/error.aspx?impsid=0
Redirecting (2) to: http://www.chefscatalog.com/error.aspx?impsid=0

and so on, all the way up to 15 attempts.
If we look at the headers the site sends back:

Set-Cookie: ASP.NET_SessionId=4o0bpi45evee0d45qos1uy55; path=/; HttpOnly
Set-Cookie: ChefsSite=CartID=00000000-0000-0000-0000-000000000000&cst=f4t8YpBpAAkNiRUd9BEf2luKAA%3d%3d&act=c0f2VBCSbv30F4kasnvWS5OfJQ%3d%3d&CookiesEnabled=False; expires=Wed, 14-Apr-2010 20:54:22 GMT; path=/

I note that the site's Set-Cookie headers have no domain attribute at the end, which is weird, but I don't think the RFC requires one. If we look at the AddCookie/2 methods of idCookieManager, they want a host for that parameter, so maybe it won't work on any Set-Cookie that doesn't give the domain.

I have tested this on a couple more sites, and all work fine if the Set-Cookie includes a domain attribute such as domain=.google.com;

It's also interesting to note that in idHttp.OnRedirect, if you look at

idHttp.Response.RawHeaders.Text

for the site that doesn't work, you don't see the Set-Cookie headers, but for the sites that do work, you do see them.
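The kind of debug output shown above can be produced with an OnRedirect handler. The sketch below is not the test code used for this answer; it assumes the Indy 10 OnRedirect signature (the parameter list can differ between Indy versions), and TRedirectLogger is a hypothetical helper class that exists only because an event handler must be a method of an object.

```delphi
program LogRedirects;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdHTTP;

type
  // Hypothetical helper class that owns the event handler.
  TRedirectLogger = class
  public
    procedure HandleRedirect(Sender: TObject; var dest: string;
      var NumRedirect: Integer; var Handled: Boolean;
      var VMethod: TIdHTTPMethod);
  end;

procedure TRedirectLogger.HandleRedirect(Sender: TObject; var dest: string;
  var NumRedirect: Integer; var Handled: Boolean; var VMethod: TIdHTTPMethod);
begin
  Writeln(Format('Redirecting (%d) to: %s', [NumRedirect, dest]));
  // Dump the headers of the response that triggered the redirect,
  // so the presence or absence of Set-Cookie can be checked.
  Writeln((Sender as TIdHTTP).Response.RawHeaders.Text);
  // Handled is left untouched so the component's own redirect setting applies.
end;

var
  Http: TIdHTTP;
  Logger: TRedirectLogger;
begin
  Http := TIdHTTP.Create(nil);
  Logger := TRedirectLogger.Create;
  try
    Http.HandleRedirects := True;
    Http.OnRedirect := Logger.HandleRedirect;
    try
      Http.Get('http://www.chefscatalog.com/international/home.aspx');
    except
      on E: Exception do
        Writeln(E.ClassName, ': ', E.Message);
    end;
  finally
    Http.Free;
    Logger.Free;
  end;
end.
```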

However, if I set the idHttp user agent to

    Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1

(from another answer),

then it seems to pick up the cookies just fine:

    New cookie: ASP.NET_SessionId
    New cookie: ChefsSite
    Redirecting (1) to: http://www.chefscatalog.com/international/home.aspx
    New cookie: ChefsSite

Weird.

不寐倦长更 2024-08-31 10:55:12

Check the OnRedirect event. For some reason, you are being redirected to an error page.

http://www.chefscatalog.com/error.aspx?impsid=0

That page, in turn, redirects you back to itself until you exhaust your RedirectMaximum (15).

Update:

Once you are redirected to the error page, Wizzard's answer explains why it keeps redirecting back to the same error page over and over: cookies.

The reason you're being redirected in the first place is probably that the site doesn't recognize (or like) your user agent string (UserAgent in the Request property). By default, it's "Mozilla/3.0 (compatible; Indy Library)". Change it to a current string used by Firefox, IE, or another recognized browser.

I tried it with "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1", and it seems to work just fine.
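In code, the change might look like the sketch below, assuming a console project that creates its own TIdHTTP instance instead of using the form's IdHTTP1. The program and constant names are illustrative, and whether the site still responds this way is not guaranteed.

```delphi
program FetchWithUserAgent;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdHTTP;

const
  TargetUrl = 'http://www.chefscatalog.com/international/home.aspx';
  // The browser-like user agent string quoted in this answer.
  BrowserUserAgent =
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.1) Gecko/20100122 firefox/3.6.1';

var
  Http: TIdHTTP;
begin
  Http := TIdHTTP.Create(nil);
  try
    Http.HandleRedirects := True;
    // Replace the default Indy user agent before sending the request.
    Http.Request.UserAgent := BrowserUserAgent;
    try
      Writeln(Http.Get(TargetUrl));
    except
      on E: Exception do
        Writeln(E.ClassName, ': ', E.Message);
    end;
  finally
    Http.Free;
  end;
end.
```

With the component dropped on a form, the same effect can be had by setting Request.UserAgent on IdHTTP1 in the Object Inspector or before calling Get.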

You can find more details in the Indy KB PDF.
