HTMLAgilityPack and load timeouts

Posted on 2024-11-05 02:06:52

I'm using HTMLAgilityPack in a parser that I have up on a server, but I'm having issues with one of the websites that I'm parsing: every day around 6am they tend to shut down their servers for maintenance, which throws off the Load() method of HtmlWeb and makes my app crash. Do any of you guys have a safer way of loading a website into HTMLAgilityPack, or maybe some way to do error checking in C# to prevent my app from crashing? (My C# is a little rusty.) Here is my code right now:

HtmlWeb webGet = new HtmlWeb();
HtmlDocument document = webGet.Load(dealsiteLink); //The Load() method here stalls the program because it takes 1 or 2 minutes before it realizes the website is down

Thank you!

Comments (2)

神回复 2024-11-12 02:06:52

Just surround the call with a try-catch:

HtmlWeb webGet = new HtmlWeb();

HtmlDocument document = null; // stays null if the load fails
try
{
    document = webGet.Load(dealsiteLink);
}
catch (WebException ex) // WebException lives in System.Net
{
    // Logic to retry (maybe in 10 minutes) goes here
}

The exact retry logic will depend on how your application is structured; you will probably find that the try-catch block needs to go much higher up in your application than this.

I think WebException is the exception you should catch, but I can't be sure because I can't find the documentation. You might find that you also need to catch TimeoutException.
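
To make the "logic to retry" comment a bit more concrete, here is a minimal sketch of a retry wrapper. RetryHelper and WithRetry are hypothetical names, not part of HtmlAgilityPack, and the sketch assumes the failure surfaces as a WebException, which (as said above) is not confirmed by documentation.

```csharp
using System;
using System.Net;
using System.Threading;

public static class RetryHelper
{
    // Hypothetical helper (not part of HtmlAgilityPack): runs `action`,
    // retrying when it throws WebException. Once maxAttempts is reached,
    // the last WebException propagates to the caller.
    public static T WithRetry<T>(Func<T> action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (WebException) when (attempt < maxAttempts)
            {
                // The site is probably down; wait before the next attempt.
                Thread.Sleep(delay);
            }
        }
    }
}
```

With the code from the question, a call might look like `RetryHelper.WithRetry(() => webGet.Load(dealsiteLink), 3, TimeSpan.FromMinutes(10))`. The caller still needs a try-catch around that call for the case where every attempt fails.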

独闯女儿国 2024-11-12 02:06:52

Try calling WebRequest.GetResponse on the website's home page and catch WebException. If you get a WebException, maybe wait a while and try again until you get a response back; once you get a response, proceed with HtmlAgilityPack's Load method.

Check this

http://msdn.microsoft.com/en-us/library/system.net.webrequest.getresponse.aspx#Y700
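
A rough sketch of that probe, assuming a plain GET against the home page is acceptable to the site. SiteProbe and IsReachable are hypothetical names. Setting Timeout on the request is what keeps the check from stalling for a minute or two when the server is down. Note that WebRequest is marked obsolete on newer .NET versions in favor of HttpClient, so treat this as an illustration of the idea rather than the recommended API.

```csharp
using System;
using System.Net;

public static class SiteProbe
{
    // Hypothetical helper: returns true if the server answers the request
    // successfully within timeoutMs milliseconds, false otherwise.
    public static bool IsReachable(string url, int timeoutMs)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Timeout = timeoutMs; // give up quickly instead of stalling

            // GetResponse throws WebException on timeouts, connection
            // failures, and HTTP error status codes.
            using (var response = request.GetResponse())
            {
                return true;
            }
        }
        catch (WebException)
        {
            return false;
        }
    }
}
```

The parser could then call `SiteProbe.IsReachable(dealsiteLink, 5000)` first and only run `webGet.Load(dealsiteLink)` when it returns true. The site can still go down between the probe and the load, so the try-catch around Load remains worth keeping.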
