Html Agility Pack cannot find a list option using XPath

Posted on 2024-11-11 16:03:22

This is related to my previous question, but it seems I have another corner case where Html Agility Pack doesn't work as expected.

Here's the Html (stripped down to the essentials, and sensitive information removed):

<html>
<select id="one-time-payment-form:vendor-select-supplier">
    <option value="1848">Frarma Express</option>
    <option value="2119">Maderas Garcia</option>
    <option value="1974">Miaris, S.A.</option>
    <option value="3063">Ricoh Panama</option>
    <option value="3840">UNO EXPRESS</option>
    <option value="68">Garrett Blaser Gretsch</option>
    <option value="102">Oriel Antonio Grau</option>
</select>
</html>

And here's the code:

const string xpath = "//*[contains(@id, 'one-time-payment-form:vendor-select-')]/option[contains(text(), 'UNO EXPRESS')]";
var driver = new FirefoxDriver(new FirefoxProfile()) { Url = "PATH_TO_FILE_CONTAINING_HTML_SHOWN_ABOVE" };
Thread.Sleep(2000);

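// Can WebDriver find it?
// Note: FindElementByXPath throws NoSuchElementException when nothing matches, so e is never null here.
var e = driver.FindElementByXPath(xpath);
Console.WriteLine(e != null ? "WebDriver success" : "WebDriver failure");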

//Can Html Agility Pack find it?
var source = driver.PageSource;
var htmlDoc = new HtmlDocument { OptionFixNestedTags = true };
HtmlNode.ElementsFlags.Remove("form");
htmlDoc.LoadHtml(source);
var nodes = htmlDoc.DocumentNode.SelectNodes(xpath);
Console.WriteLine(nodes!=null ? "Html Agility Pack success" : "Html Agility Pack failure");

driver.Quit();

When I run the code, the console reads:

WebDriver success
Html Agility Pack failure

So WebDriver clearly has no problem locating the item with the XPath //*[contains(@id, 'one-time-payment-form:vendor-select-')]/option[contains(text(), 'UNO EXPRESS')], but Html Agility Pack cannot find it in the same page source.

Any ideas?

Comments (1)

ˉ厌 2024-11-18 16:03:22

This is "by design", and it is the same idea for OPTION as for FORM. For historical reasons, Html Agility Pack handles some tags differently. Back in the HTML 3.2 days, OPTION was not always closed, because HTML 3.2 does not require the closing tag.

Try adding this before the call to LoadHtml:

HtmlNode.ElementsFlags.Remove("option");
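
To check the suggestion without a browser, here is a minimal sketch that parses a trimmed copy of the HTML from the question directly. It assumes the HtmlAgilityPack NuGet package is referenced; the inline HTML string stands in for driver.PageSource, and the Program class is only illustrative. Depending on the Html Agility Pack version, "option" may no longer be present in ElementsFlags, in which case the Remove call should simply do nothing.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Trimmed copy of the markup from the question (stand-in for driver.PageSource).
        const string html = @"<html>
<select id=""one-time-payment-form:vendor-select-supplier"">
    <option value=""3063"">Ricoh Panama</option>
    <option value=""3840"">UNO EXPRESS</option>
</select>
</html>";

        const string xpath =
            "//*[contains(@id, 'one-time-payment-form:vendor-select-')]/option[contains(text(), 'UNO EXPRESS')]";

        // ElementsFlags is a static table consulted while parsing,
        // so the entry has to be removed before LoadHtml is called.
        HtmlNode.ElementsFlags.Remove("option");

        var htmlDoc = new HtmlDocument { OptionFixNestedTags = true };
        htmlDoc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = htmlDoc.DocumentNode.SelectNodes(xpath);
        Console.WriteLine(nodes != null ? "Html Agility Pack success" : "Html Agility Pack failure");
    }
}
```

Commenting out the Remove line and running again is a quick way to see whether the installed version still treats OPTION specially.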