HtmlAgilityPack.HtmlDocument Cookie
This pertains to cookies set inside a script (maybe inside a script tag).
System.Windows.Forms.HtmlDocument executes those scripts, and the cookies they set (like document.cookie=etc...) can be retrieved through its Cookie property.
I assume HtmlAgilityPack.HtmlDocument doesn't do this (execution). I wonder if there is an easy way to emulate the System.Windows.Forms.HtmlDocument capabilities (the cookies part).
Anyone?
When I need to use cookies and HtmlAgilityPack together, or just create custom requests (for example, to set the User-Agent property), here is what I do, using a class called WebQuery:
...
What do we need to do inside this method?
Using HttpWebRequest and HttpWebResponse, generate the HTTP request manually (there are several examples of how to do this on the Internet), then create an instance of the HtmlDocument class and load it from a stream. What stream do we have to use? The response stream of the HttpWebResponse.
If you use HttpWebRequest to make the query, you can set its CookieContainer property to the variable you declared earlier every time you access a new page. That way, all cookies set by the sites you access are stored in the CookieContainer variable declared in your WebQuery class, provided you use only one instance of the WebQuery class.
I hope you find this explanation useful. Keep in mind that with this approach you can do whatever you want, whether or not HtmlAgilityPack supports it.
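The approach described above could look roughly like the following. This is only a sketch under assumptions, not the answerer's original code (which is not shown here): the method name `GetSource`, the `Cookies` property, and the User-Agent string are made up for illustration. It relies on `HttpWebRequest.CookieContainer` and HtmlAgilityPack's `HtmlDocument.Load(Stream)`.

```csharp
using System.IO;
using System.Net;
using HtmlAgilityPack;

public class WebQuery
{
    // One container for the lifetime of this instance, so cookies
    // received on one request are sent with the following ones.
    private readonly CookieContainer _cookies = new CookieContainer();

    // Hypothetical accessor, exposed so callers can inspect stored cookies.
    public CookieContainer Cookies => _cookies;

    // Hypothetical method name; fetches a page and parses it.
    public HtmlDocument GetSource(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = _cookies;            // reuse the same container
        request.UserAgent = "Mozilla/5.0 (WebQuery)";  // example custom header

        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        {
            var doc = new HtmlDocument();
            doc.Load(stream);                          // parse the response body
            return doc;
        }
    }
}
```

Note that a `CookieContainer` only collects cookies sent in HTTP response headers. Cookies assigned by JavaScript through `document.cookie` are not captured, because nothing in this sketch executes scripts.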
I also worked with Rohit Agarwal's BrowserSession class together with HtmlAgilityPack.
But for me, subsequent calls of the "Get" function didn't work, because new cookies were set every time.
That's why I added some functions of my own. My solution is far from perfect (it's just a quick and dirty fix), but it worked for me. If you don't want to spend a lot of time investigating the BrowserSession class, here is what I did:
The added/modified functions are the following:
What it does: it basically saves the cookies from the initial POST response and adds the same CookieContainer to the requests made later. I do not fully understand why it was not working in the initial version, because the AddCookiesTo function appears to do the same thing: if (Cookies != null && Cookies.Count > 0) request.CookieContainer.Add(Cookies);
Anyhow, with these added functions it should work fine now.
It can be used like this:
All subsequent calls should use:
I hope it helps if you're facing the same problem.
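As a rough illustration of the idea in this answer (keep the cookies from the initial POST response and send them with every later request), here is a minimal sketch. It is not the BrowserSession class and not the modified functions referred to above; the class name `CookieSession` and its members are hypothetical, and it uses plain `HttpWebRequest` with a single shared `CookieContainer`.

```csharp
using System.IO;
using System.Net;
using System.Text;

public class CookieSession // hypothetical name, not BrowserSession
{
    // Shared by every request made through this instance.
    private readonly CookieContainer _cookies = new CookieContainer();
    public CookieContainer Cookies => _cookies;

    // Sends a form-encoded POST (for example a login) and returns the body.
    public string Post(string url, string formData)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = _cookies;   // response cookies are stored here
        byte[] body = Encoding.UTF8.GetBytes(formData);
        request.ContentLength = body.Length;
        using (Stream s = request.GetRequestStream())
            s.Write(body, 0, body.Length);
        return ReadBody(request);
    }

    // Later GETs send the cookies collected by earlier requests.
    public string Get(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = _cookies;
        return ReadBody(request);
    }

    private static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd();
    }
}
```

The returned HTML string can then be handed to HtmlAgilityPack with `HtmlDocument.LoadHtml`. Because both methods assign the same container, the session cookie set by the POST is available to each following GET on the same host.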