powershell httpwebrequest GET 方法 cookiecontainer 问题?

发布于 2024-10-27 06:00:32 字数 1942 浏览 6 评论 0原文

我正在尝试抓取一个具有用户身份验证的网站。我可以通过 POST 发送我的登录信息并存储 cookie。但是,登录后尝试访问受保护页面时出现 403 错误。

$url = "https://some_url"

$CookieContainer = New-Object System.Net.CookieContainer

$postData = "User=UserName&Password=Pass"

$buffer = [text.encoding]::ascii.getbytes($postData)

[net.httpWebRequest] $req = [net.webRequest]::create($url)
$req.method = "POST"
$req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
$req.Headers.Add("Accept-Language: en-US")
$req.Headers.Add("Accept-Encoding: gzip,deflate")
$req.Headers.Add("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7")
$req.AllowAutoRedirect = $false
$req.ContentType = "application/x-www-form-urlencoded"
$req.ContentLength = $buffer.length
$req.TimeOut = 50000
$req.KeepAlive = $true
$req.Headers.Add("Keep-Alive: 300");
$req.CookieContainer = $CookieContainer
$reqst = $req.getRequestStream()
$reqst.write($buffer, 0, $buffer.length)
$reqst.flush()
$reqst.close()
[net.httpWebResponse] $res = $req.getResponse()
$resst = $res.getResponseStream()
$sr = new-object IO.StreamReader($resst)
$result = $sr.ReadToEnd()
$res.close()



$url2 = "https://some_url/protected_page"

[net.httpWebRequest] $req2 = [net.webRequest]::create($url2)
$req2.Method = "GET"
$req2.Accept = "text/html"
$req2.AllowAutoRedirect = $false
$req2.CookieContainer = $CookieContainer
$req2.TimeOut = 50000
[net.httpWebResponse] $res2 = $req2.getResponse()
$resst = $res2.getResponseStream()
$sr = new-object IO.StreamReader($resst)
$result = $sr.ReadToEnd()

解决方法:因此,在尝试了几乎所有方法之后,我最终尝试了一些不同的方法,并且它确实有效。

发布登录信息并获取会话 cookie 后,我使用 webclient 通过将 cookie 字符串添加到标头来访问安全页面。

$web = new-object net.webclient
$web.Headers.add("Cookie", $res.Headers["Set-Cookie"])
$result = $web.DownloadString("https://secure_url")

最酷的事情之一是 webclient 保存 cookie。要访问另一个安全页面,您只需调用 $web.downloadstring("https://another_secure_url") :)

I'm trying to scrape a website that has user authentication. I am able to do a POST to send my login and stores a cookie. However, after the login I get a 403 error when trying to access the protected page.

$url = "https://some_url"

$CookieContainer = New-Object System.Net.CookieContainer

$postData = "User=UserName&Password=Pass"

$buffer = [text.encoding]::ascii.getbytes($postData)

[net.httpWebRequest] $req = [net.webRequest]::create($url)
$req.method = "POST"
$req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
$req.Headers.Add("Accept-Language: en-US")
$req.Headers.Add("Accept-Encoding: gzip,deflate")
$req.Headers.Add("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7")
$req.AllowAutoRedirect = $false
$req.ContentType = "application/x-www-form-urlencoded"
$req.ContentLength = $buffer.length
$req.TimeOut = 50000
$req.KeepAlive = $true
$req.Headers.Add("Keep-Alive: 300");
$req.CookieContainer = $CookieContainer
$reqst = $req.getRequestStream()
$reqst.write($buffer, 0, $buffer.length)
$reqst.flush()
$reqst.close()
[net.httpWebResponse] $res = $req.getResponse()
$resst = $res.getResponseStream()
$sr = new-object IO.StreamReader($resst)
$result = $sr.ReadToEnd()
$res.close()



$url2 = "https://some_url/protected_page"

[net.httpWebRequest] $req2 = [net.webRequest]::create($url2)
$req2.Method = "GET"
$req2.Accept = "text/html"
$req2.AllowAutoRedirect = $false
$req2.CookieContainer = $CookieContainer
$req2.TimeOut = 50000
[net.httpWebResponse] $res2 = $req2.getResponse()
$resst = $res2.getResponseStream()
$sr = new-object IO.StreamReader($resst)
$result = $sr.ReadToEnd()

WORKAROUND: So after trying almost everything I ended up trying something different and it actually works.

After posting the login and getting the session cookie, I use webclient to access the secure page by adding the cookie string to the headers.

$web = new-object net.webclient
$web.Headers.add("Cookie", $res.Headers["Set-Cookie"])
$result = $web.DownloadString("https://secure_url")

One of the cool thing about this is that webclient saves the cookie. To access another secure page, you can just call $web.downloadstring("https://another_secure_url") :)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

傾城如夢未必闌珊 2024-11-03 06:00:33

我会使用 IE 自动化。这样就不必使用 cookie、标头等。更容易。

I would use IE automation. With this don't have to work with cookies, headers etc. Much easier.

黒涩兲箜 2024-11-03 06:00:32

我发现由于 cookie 可以附加附加信息(例如 URL 或仅 HTTP),因此 $res.Headers["Set-Cookie"] 对我不起作用。但是使用 $CookieContainer 变量,您可以轻松地将其更改为使用 GetCookieHeader(url),这将删除额外的信息并为您留下格式正确的 cookie 字符串:

$web = new-object net.webclient
$web.Headers.add("Cookie", $CookieContainer.GetCookieHeader($url))
$result = $web.DownloadString($url)

I found that since cookies can have additional information attached (like the URL or HTTP-only), the $res.Headers["Set-Cookie"] didn't work for me. But using your $CookieContainer variable, you can easily change it to use GetCookieHeader(url), which will strip out the extra information and leave you with a properly formatted cookie string:

$web = new-object net.webclient
$web.Headers.add("Cookie", $CookieContainer.GetCookieHeader($url))
$result = $web.DownloadString($url)
爱情眠于流年 2024-11-03 06:00:32

人们一直在要求完整的申请表,这里有

$url = "https://some_url"

$CookieContainer = New-Object System.Net.CookieContainer

$postData = "User=UserName&Password=Pass"

$buffer = [text.encoding]::ascii.getbytes($postData)

[net.httpWebRequest] $req = [net.webRequest]::create($url)
$req.method = "POST"
$req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
$req.Headers.Add("Accept-Language: en-US")
$req.Headers.Add("Accept-Encoding: gzip,deflate")
$req.Headers.Add("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7")
$req.AllowAutoRedirect = $false
$req.ContentType = "application/x-www-form-urlencoded"
$req.ContentLength = $buffer.length
$req.TimeOut = 50000
$req.KeepAlive = $true
$req.Headers.Add("Keep-Alive: 300");
$req.CookieContainer = $CookieContainer
$reqst = $req.getRequestStream()
$reqst.write($buffer, 0, $buffer.length)
$reqst.flush()
$reqst.close()
[net.httpWebResponse] $res = $req.getResponse()
$resst = $res.getResponseStream()
$sr = new-object IO.StreamReader($resst)
$result = $sr.ReadToEnd()
$res.close()


$web = new-object net.webclient
$web.Headers.add("Cookie", $res.Headers["Set-Cookie"])
$result = $web.DownloadString("https://secure_url")

People have been asking for the complete application, here you have it

$url = "https://some_url"

$CookieContainer = New-Object System.Net.CookieContainer

$postData = "User=UserName&Password=Pass"

$buffer = [text.encoding]::ascii.getbytes($postData)

[net.httpWebRequest] $req = [net.webRequest]::create($url)
$req.method = "POST"
$req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
$req.Headers.Add("Accept-Language: en-US")
$req.Headers.Add("Accept-Encoding: gzip,deflate")
$req.Headers.Add("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7")
$req.AllowAutoRedirect = $false
$req.ContentType = "application/x-www-form-urlencoded"
$req.ContentLength = $buffer.length
$req.TimeOut = 50000
$req.KeepAlive = $true
$req.Headers.Add("Keep-Alive: 300");
$req.CookieContainer = $CookieContainer
$reqst = $req.getRequestStream()
$reqst.write($buffer, 0, $buffer.length)
$reqst.flush()
$reqst.close()
[net.httpWebResponse] $res = $req.getResponse()
$resst = $res.getResponseStream()
$sr = new-object IO.StreamReader($resst)
$result = $sr.ReadToEnd()
$res.close()


$web = new-object net.webclient
$web.Headers.add("Cookie", $res.Headers["Set-Cookie"])
$result = $web.DownloadString("https://secure_url")
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文