HTML xpath tree dump? Using Ruby Watir

Posted 2024-12-22 02:58:50

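Help! While carefully stepping through irb to control a browser (Firefox and Chrome) with the Watir library, the xpath addresses seem too shifty to rely on. For example, at one moment the xpath for the account balance seems consistent, so I use that address in my script. Sometimes it works, but too often it crashes with "element not found", even though every time I step through manually the data is there on the page (confirmed with Firebug's inspector).

Yes, these are Ajax-reliant sites, but they don't change that much... bank websites that stay pretty much the same across visits.

So the question: is there some way watir-webdriver can simply give me a long, verbose dump of everything it currently sees in the DOM, in the form of an xpath tree? It would help me troubleshoot.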

深居我梦 2024-12-29 02:58:50

The big answer is to not use xpath, but instead to use Watir's API the way it is intended to be used.

When it comes to specifying elements in browser automation, xpath is by and large evil: it is SLOW, the code it produces is often (as you are finding) very brittle, and it is nearly impossible to read and make sense of. Use it only as a last resort, when nothing else will work.

If you are using a Watir API (as with Watir or Watir-webdriver), you want to identify the element by its own attributes, such as class, name, id, text, etc. If that doesn't work, identify it via the closest container that wraps the element and can itself be found uniquely. If that doesn't work either, identify a sibling or sub-element and use the .parent method to walk 'up' the DOM to the 'parent' container element.

To illustrate the brittleness and poor readability, take the following xpath from the comments and consider code that passes it to element_by_xpath:

/html/body/form/div[6]/div/table/tbody/tr[2]/td[2]/table/tbody/tr[2]/td/p/table[2]/tbody/tr/td[2]/p/table/tbody/tr[3]/td[2]

Then compare it to this, where the entire line of code is shorter than the xpath alone:

browser.cell(:text => "Total Funds Avail. for Trading").parent.cell(:index => 1).text

or, to be a bit less brittle, replace the index with some attribute of the cell whose text you want:

browser.cell(:text => "Total Funds Avail. for Trading").parent.cell(:class => "balanceSnapShotCellRight").text

The xpath example is very difficult to make any sense of: there is no telling what element you are after or why the code might be selecting that element. And since it contains so many index values, any change to the page design, or just an extra row in a table above the one you want, will break that code.

The Watir version is much easier to make sense of. I can tell just by reading it what the script is trying to find on the page and how it is locating it. Extra rows in the table or other changes to the page layout will not break the code. The one exception is re-arranging the columns in the table, and even that can be avoided by using the class or some other characteristic of the target cell, as the class-based version does.
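The difference can be reproduced without a browser. The following is only a sketch: it uses Ruby's bundled REXML library on an invented, well-formed table rather than Watir on a live page, and the helper names (`page`, `by_position`, `by_label`) and the sample balance are made up for illustration. `by_label` mimics the shape of `cell(:text => ...).parent.cell(:class => ...)`; it is not how Watir itself is implemented.

```ruby
require 'rexml/document'

LABEL = "Total Funds Avail. for Trading"

# Invented sample row; the class name is the one used in the answer.
ROW = '<tr><td>Total Funds Avail. for Trading</td>' \
      '<td class="balanceSnapShotCellRight">$1,234.56</td></tr>'

# Builds a tiny page, optionally with extra rows placed above the balance row.
def page(extra_rows = "")
  REXML::Document.new("<html><body><table>#{extra_rows}#{ROW}</table></body></html>")
end

# Positional lookup: depends entirely on row and column indexes.
def by_position(doc)
  cell = REXML::XPath.first(doc, "/html/body/table/tr[1]/td[2]")
  cell && cell.text
end

# Label-based lookup: find the cell with the label text, step up to its row,
# then pick the sibling cell by class.
def by_label(doc, label)
  label_cell = REXML::XPath.match(doc, "//td").find { |td| td.text == label }
  return nil unless label_cell
  value_cell = label_cell.parent.get_elements("td").find do |td|
    td.attributes["class"] == "balanceSnapShotCellRight"
  end
  value_cell && value_cell.text
end

plain   = page
shifted = page('<tr><td>Notice</td><td>n/a</td></tr>')

puts "positional, original page: #{by_position(plain)}"
puts "positional, extra row:     #{by_position(shifted)}"
puts "by label, original page:   #{by_label(plain, LABEL)}"
puts "by label, extra row:       #{by_label(shifted, LABEL)}"
```

With one extra row inserted above, the positional path silently returns a different cell, while the label-based lookup still returns the balance.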

For that matter, if the use of the class is unique to that element on the page then

browser.cell(:class => 'balanceSnapShotCellRight').text

would work just fine, as long as there is only one cell with that class in the table.

Now, to be fair, I know there are ways to use xpath more elegantly to do something similar to the Watir code above. While that is true, it is still not as easy to read and work with, and it is not how most people commonly (mis)use xpath to select objects, especially if they have used recorders that generate brittle, cryptic xpath code like the sample above.

The answers to this SO question describe the three basic approaches to identifying elements in Watir. Each answer covers one approach; which one you use depends on what works best in a given situation.

If a given page is giving you trouble, start a question here about it and include a sample of the HTML before/after/around the element you are trying to work with, and the folks here can generally point the way.

If you've not done so, work through some of the tutorials in the Watir wiki, and notice how seldom xpath is used.

Lastly, you mention Firewatir. Don't use Firewatir: it's out of date, no longer being developed, and will not work with any recent version of FF. Instead use Watir-Webdriver to drive Firefox or Chrome (or IE).

骑趴 2024-12-29 02:58:50

You just need to output the "innerXml" (I don't know Watir) of the node selected by this XPath expression:

Update:

In case that by "dump" you mean something different, such as a set of the XPath expressions each selecting a node, then have a look at the answer of this question:

https://stackoverflow.com/a/4747858/36305
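For the second reading of "dump" (one XPath expression per node), a minimal Ruby sketch follows. It is an illustration under assumptions, not the method from the linked answer: it works on a markup string, which with watir-webdriver could presumably come from `browser.html`, and it parses that string with Ruby's bundled REXML. The helper names (`collect_paths`, `xpath_dump`) and the sample markup are invented.

```ruby
require 'rexml/document'

# Invented sample; in practice the string would be the page source.
SAMPLE_MARKUP = '<html><body><table><tr>' \
                '<td>Total Funds Avail. for Trading</td>' \
                '<td class="balanceSnapShotCellRight">$1,234.56</td>' \
                '</tr></table></body></html>'

# Recursively records [xpath, own text] for an element and its descendants.
# Child steps always carry a 1-based index among same-named siblings.
def collect_paths(element, path, rows)
  rows << [path, element.texts.map(&:value).join.strip]
  counts = Hash.new(0)
  element.each_element do |child|
    counts[child.name] += 1
    collect_paths(child, "#{path}/#{child.name}[#{counts[child.name]}]", rows)
  end
  rows
end

# Hypothetical entry point: parse the markup and dump every element's path.
def xpath_dump(markup)
  root = REXML::Document.new(markup).root
  collect_paths(root, "/#{root.name}", [])
end

xpath_dump(SAMPLE_MARKUP).each do |path, text|
  puts "#{path}  #{text}".rstrip
end
```

REXML only accepts well-formed XML, so real-world HTML will often fail to parse; an HTML parser such as Nokogiri (whose nodes expose a `path` method) is likely the more practical choice on a live page.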
