How to use regular expressions in Selenium locators
I'm using Selenium RC and I would like, for example, to get all the link elements whose href attribute matches:
http://[^/]*\d+com
I would like to use something like:
sel.get_attribute( '//a[regx:match(@href, "http://[^/]*\d+.com")]/@name' )
which would return a list of the name attributes of all the links that match the regex.
Thanks
Comments (5)
The answer above is probably the right way to find all of the links that match a regex, but I thought it would also help to answer the other part of the question: how to use a regex in XPath locators. You need the XPath matches() function, for example in a locator such as:
xpath=//div[matches(@id, 'che.*boxes')]
(Used with click, this would click the div with id=checkboxes, or id=cheANYTHINGHEREboxes.)
Be aware, though, that matches() is not supported by all native browser implementations of XPath. Most conspicuously, using it in FF3 throws an error: invalid xpath[2].
If you have trouble with a particular browser (as I did with FF3), try Selenium's allowNativeXpath("false") to switch to the JavaScript XPath interpreter. It will be slower, but it seems to work with more XPath functions, including matches() and ends-with(). :)
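As a rough sketch of the two steps described above, in Python: it assumes `sel` is a Selenium RC client object exposing `allow_native_xpath` and `click` (the Python counterparts of allowNativeXpath and click). The helper name `click_div_by_id_regex` is made up for illustration, and the pattern is assumed to contain no single quotes.

```python
def click_div_by_id_regex(sel, id_pattern):
    """Click the first div whose id matches id_pattern using XPath matches()."""
    if "'" in id_pattern:
        # The pattern is embedded in a single-quoted XPath string literal.
        raise ValueError("pattern must not contain single quotes")
    # Ask Selenium RC to use its JavaScript XPath engine instead of the
    # browser's native one, which may not implement matches().
    sel.allow_native_xpath("false")
    locator = "xpath=//div[matches(@id, '%s')]" % id_pattern
    sel.click(locator)
    return locator
```

Calling `click_div_by_id_regex(sel, "che.*boxes")` would send the locator `xpath=//div[matches(@id, 'che.*boxes')]`. Whether it resolves still depends on the XPath engine in use.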
You can use the Selenium command getAllLinks to get an array of the ids of the links on the page. You can then loop through that array and check each href with getAttribute, which takes a locator followed by an @ and the attribute name (for example, a link's id followed by @href). The loop can be written in Java or any other Selenium RC client language.
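A minimal Python sketch of that loop, assuming `sel` is a Selenium RC client object with `get_all_links` and `get_attribute`. The function name `names_of_links_matching` is hypothetical. Links without an id come back as empty strings and are skipped, since they cannot be addressed by id. The sketch also assumes every remaining link has both href and name attributes; a missing attribute may raise an error in the real client.

```python
import re

def names_of_links_matching(sel, href_pattern):
    """Return the name attribute of every link whose href matches the regex."""
    href_re = re.compile(href_pattern)
    names = []
    for link_id in sel.get_all_links():
        if not link_id:
            continue  # link has no id, so there is no id-based locator for it
        # Attribute locator: element locator, then "@", then the attribute name.
        href = sel.get_attribute("id=%s@href" % link_id)
        if href_re.match(href):  # match() anchors at the start of the href
            names.append(sel.get_attribute("id=%s@name" % link_id))
    return names
```

This makes one or two round trips to the Selenium server per link, so it may be slow on pages with many links.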
A possible solution is to use sel.get_eval() and write a JS script that returns a list of the links, something like the answer to this question:
selenium: Is it possible to use the regexp in selenium locators
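One way this could look from Python, as an untested-against-a-browser sketch: it assumes `sel.get_eval` returns the value of the script's last expression as a string, and that the page under test is reachable as `this.browserbot.getCurrentWindow()` inside the script. The function name `link_names_via_js` and the newline-joined return format are choices made for this illustration.

```python
import json

def link_names_via_js(sel, href_pattern):
    """Collect the names of links whose href matches a regex, filtering in JS."""
    script = (
        "var doc = this.browserbot.getCurrentWindow().document;"
        "var re = new RegExp(%s);"
        "var links = doc.getElementsByTagName('a');"
        "var names = [];"
        "for (var i = 0; i < links.length; i++) {"
        "  if (re.test(links[i].href)) { names.push(links[i].name); }"
        "}"
        "names.join('\\n');"
    ) % json.dumps(href_pattern)  # json.dumps yields a valid JS string literal
    result = sel.get_eval(script)
    # Links with an empty name produce empty entries; drop them.
    return [name for name in result.split("\n") if name]
```

Note that JavaScript's `RegExp.test` matches anywhere in the string, so add `^` to the pattern if it should be anchored at the start of the href.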
Here are some alternative methods for Selenium RC as well. These aren't pure Selenium solutions; they combine Selenium with your programming language's own data structures.
You can get the HTML page source, then run a regular expression over it to return the set of matching links. Use regex groups to separate out URLs, link text/IDs, etc., and then pass them back to Selenium to click on or navigate to.
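A small Python sketch of the page-source approach, assuming the source was fetched with something like `sel.get_html_source()` and that href values are double-quoted. `LINK_RE` and `find_links` are names invented for this example, and a regex like this is only a rough HTML scanner, not a parser.

```python
import re

# Named groups separate the URL from the link text.
LINK_RE = re.compile(
    r'<a\b[^>]*?\bhref="(?P<href>[^"]*)"[^>]*>(?P<text>.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def find_links(html, href_pattern):
    """Return (href, text) pairs for links whose href matches href_pattern."""
    href_re = re.compile(href_pattern)
    return [
        (m.group("href"), m.group("text").strip())
        for m in LINK_RE.finditer(html)
        if href_re.match(m.group("href"))
    ]
```

Each returned pair can then be handed back to Selenium, for example by opening the href or by clicking a `link=` locator built from the text.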
Another method is to get the HTML page source, or the innerHTML of a parent/root element (via DOM locators), then convert the HTML to XML and load it as a DOM object in your programming language. You can then traverse the DOM with the desired XPath (with or without regular expressions) and obtain a node set of only the links of interest. From there, parse out the link text/ID or URL and pass it back to Selenium to click on or navigate to.
On request, I put together some examples. They are in mixed languages, since the post didn't appear to be language-specific anyway; I used what I had available to hack them together. They aren't fully tested, or tested at all, but I've worked with bits of the code before in other projects, so treat them as proof-of-concept code for the solutions I just mentioned.
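For the DOM-based method, a Python sketch using only the standard library might look like the following. It assumes the markup has already been converted to well-formed XML without namespaces (real-world HTML usually needs a tidy step first). ElementTree supports only a subset of XPath and no regex functions, so the XPath selects candidate links and the regex is applied in Python. The function name `links_from_dom` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

def links_from_dom(xhtml, href_pattern):
    """Parse well-formed markup and return details of links whose href matches."""
    href_re = re.compile(href_pattern)
    root = ET.fromstring(xhtml)
    matches = []
    # XPath subset: every <a> descendant that carries an href attribute.
    for a in root.findall(".//a[@href]"):
        href = a.get("href")
        if href_re.match(href):
            matches.append({
                "href": href,
                "id": a.get("id"),  # None when the link has no id
                "text": "".join(a.itertext()).strip(),
            })
    return matches
```

The id, text, or href of each match can then be turned into a Selenium locator or URL.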
Selenium's By.Id and By.CssSelector methods do not support regex, and By.XPath supports it only where XPath 2.0 is enabled. If you want to use a regex, you can work around this in your own code.
Note: the code for this is untested. Also, the method can be optimized by figuring out a way to eliminate the second search.
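The following Python sketch is not the answer's original snippet; it is one possible single-search variant of the idea, using plain XPath 1.0 to fetch candidates and a regex to filter them client-side. It assumes `driver` is a WebDriver-style object whose `find_elements(by, value)` accepts the strategy string "xpath" and whose elements expose `get_attribute`. The function name is made up for illustration.

```python
import re

def find_elements_by_attribute_regex(driver, attribute, pattern):
    """Return elements whose given attribute matches the regex anywhere."""
    regex = re.compile(pattern)
    # XPath 1.0 is enough here: fetch every element that has the attribute.
    candidates = driver.find_elements("xpath", "//*[@%s]" % attribute)
    return [
        el for el in candidates
        # get_attribute may return None, so fall back to an empty string.
        if regex.search(el.get_attribute(attribute) or "")
    ]
```

For example, `find_elements_by_attribute_regex(driver, "id", r"^item_\d+$")` would keep only elements with ids such as item_12. Each attribute read is a separate call to the browser, so narrowing the XPath helps on large pages.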