Foreach loop takes a very long time to exit
I'm scraping a webpage that contains about 250 table cells (td tags), using WatiN and WatinCSSSelectors.
First I select all td tags with attribute 'width=90%':
var allMainTDs = browser.CssSelectAll("td[width=\"90%\"]");
Then I use a foreach loop to copy the contents of the var into a List. The int counter is there to show which td tag the loop is currently at.
List<Element> eletd = new List<Element>();
int i = 0;
foreach (Element td in allMainTDs)
{
    eletd.Add(td);
    i++;
    Console.WriteLine(i);
}
It reaches the 250th tag fairly quickly. But then it takes about 6 minutes (timed with a Stopwatch object) to move on to the next statement. What is happening here?
Comments (3)
You could try this:
A foreach loop is roughly equivalent to the following code (not exactly, but close enough):
The behavior you describe points to the cleanup portion of this code. It's possible that the enumerator for the result of the CssSelectAll call has a heavy Dispose method. You could confirm this by replacing your loop with something like the code above and omitting the finally block, or by setting breakpoints to confirm that Dispose takes forever to run.
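The expansion that answer describes can be sketched roughly as below. This is a minimal, self-contained illustration, not WatiN's actual behavior: SlowDisposeSequence is a hypothetical stand-in for the CssSelectAll result, its 500 ms Thread.Sleep is an assumed placeholder for whatever cleanup the real enumerator might do, and the try/finally only approximates what the compiler generates for foreach. Timing Dispose separately from the loop body shows where the delay would appear if the hypothesis is right.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

// Hypothetical stand-in for the CssSelectAll result: it enumerates quickly,
// but its enumerator does slow work in Dispose.
class SlowDisposeSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator() { return new SlowEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    class SlowEnumerator : IEnumerator<int>
    {
        int current;
        public int Current { get { return current; } }
        object IEnumerator.Current { get { return current; } }
        public bool MoveNext() { return ++current <= 250; }
        public void Reset() { current = 0; }
        public void Dispose() { Thread.Sleep(500); } // simulated heavy cleanup (assumption)
    }
}

static class Program
{
    static void Main()
    {
        var items = new List<int>();
        var loopTimer = Stopwatch.StartNew();
        var disposeTimer = new Stopwatch();

        // Approximately what the compiler emits for:
        //   foreach (int x in new SlowDisposeSequence()) items.Add(x);
        IEnumerator<int> e = new SlowDisposeSequence().GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                items.Add(e.Current);
            }
            loopTimer.Stop();
        }
        finally
        {
            // Runs after the last item and before the statement following the loop.
            disposeTimer.Start();
            e.Dispose();
            disposeTimer.Stop();
        }

        Console.WriteLine("items: " + items.Count);
        Console.WriteLine("loop ms: " + loopTimer.ElapsedMilliseconds);
        Console.WriteLine("dispose ms: " + disposeTimer.ElapsedMilliseconds);
    }
}
```

To apply the same check to the code in the question, one could swap SlowDisposeSequence for allMainTDs and int for Element, assuming the collection's enumerator is exposed as IEnumerator<Element>. If nearly all of the elapsed time lands in the Dispose measurement, the cleanup step is the likely culprit.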
If you are on .NET 4.0 and your execution environment allows for parallelism, you should probably try the