XPath in Firefox / GreaseMonkey choking on entities
I am writing a fairly basic GreaseMonkey script that locates text in a specific element and then uses that text to do things later. The relevant bits of code are as follows:
In the HTML there is a span with the class 'someclass', which contains a small string of text:
<span class="someclass">some text</span>
Then in the JavaScript I am trying to find this class and pull its contents (the 'some text') into a variable using the standard XPath jazz:
document.evaluate("//span[@class='someclass']/text()", document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
Here's the problem: when I run this on pages where 'some text' is a basic string with basic characters, everything works fine, but when I run it on pages where 'some text' contains entities, it fails. For example, these are all fine and XPath returns the text I want:
<span class="someclass">some text</span>
<span class="someclass">some other text</span>
<span class="someclass">sometext</span>
<span class="someclass">some text 12345</span>
However, this gives me an error:
<span class="someclass">some text's text</span>
The error returned is:
Error: The expression is not a legal expression.
Source File: file:///blahblahblah.user.js
Line: (the JS line I gave above)
I found a few results on here and on Google talking about how XPath has trouble with entities, but they were all doing things like [text() = 'blah &raquo; blah']; in other words, their entities are in the XPath query itself. Mine aren't; they're in the text that I'm trying to return from the XPath query.
Is this the same problem? Is there any easy way around it?
Thanks!
Comments (1)
The problem is that a string literal in an XPath expression must be surrounded by either quotes or apostrophes and must not contain the surrounding character.
A literal string that contains both quotes and apostrophes needs to be transformed (in your case by your JavaScript program) into one that doesn't contain both of these types of characters.
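When the text contains only one of the two characters, no transformation is needed: it is enough to pick the delimiter the text does not contain. A minimal sketch of that idea follows; the helper name `quoteWithFreeDelimiter` and the `text()=` predicate are my own illustration, not something from the question.

```javascript
// Hypothetical helper: wrap a string in whichever delimiter it does not contain.
function quoteWithFreeDelimiter(value) {
  if (!value.includes("'")) return "'" + value + "'";
  if (!value.includes('"')) return '"' + value + '"';
  // Both characters present: no single XPath 1.0 literal can hold this value.
  throw new Error("value contains both quote types");
}

// Build an expression around text that contains an apostrophe.
const expr = "//span[text()=" + quoteWithFreeDelimiter("some text's text") + "]";
console.log(expr); // //span[text()="some text's text"]
```

A value containing both characters makes the helper throw, which is the case the rest of this answer deals with.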
The simplest way to do this is to replace each instance of one of these types of characters with its character entity: say, replace every ' with &apos; and use ' as the surrounding character for the literal string. A second way is to replace the literal with an XPath expression that evaluates to the same string.
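For the second approach, XPath 1.0's `concat()` function is the usual tool. Keep in mind that XPath 1.0 has no escape syntax inside a string literal, and an entity such as `&apos;` is only decoded when the expression is embedded in XML markup (an XSLT stylesheet, say), not when it is an ordinary JavaScript string handed to `document.evaluate`. The sketch below is one way to do it, with a helper name (`toXPathConcat`) that I made up for illustration:

```javascript
// Hypothetical helper: build an XPath 1.0 expression that evaluates to `value`,
// even when `value` contains both apostrophes and double quotes.
function toXPathConcat(value) {
  // Split on apostrophes; each piece is then safe inside '...'.
  const pieces = value.split("'").map((piece) => "'" + piece + "'");
  // No apostrophe at all: a plain literal is enough.
  if (pieces.length === 1) return pieces[0];
  // Rejoin the pieces with a double-quoted apostrophe between them.
  return "concat(" + pieces.join(', "\'", ') + ")";
}

console.log(toXPathConcat('x\'y"z')); // concat('x', "'", 'y"z')
```

The result can be dropped into a larger expression wherever a string literal would otherwise go, for example `"//span[text()=" + toXPathConcat(userText) + "]"`, where `userText` is an assumed variable.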
Warning: It is not a good idea to use untrusted data to create an XPath expression, because this may result in XPath injection. To avoid XPath injection, if your programming language and function libraries allow it, always compile your XPath expression and run it by passing the data as parameters.
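As far as I know, the browser's `document.evaluate` has no way to bind variables, so in a userscript one alternative is to keep the XPath expression constant and do the comparison in JavaScript. The following is a rough sketch under that assumption; `findTextNodes` is a hypothetical helper, `doc` stands for the page's `document`, and the expression is the one from the question.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, spelled out so the
// helper does not depend on a global XPathResult object.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: `wanted` may be untrusted text; it never becomes part
// of the XPath expression.
function findTextNodes(doc, wanted) {
  const snapshot = doc.evaluate(
    "//span[@class='someclass']/text()",
    doc,
    null,
    ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  const matches = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    const node = snapshot.snapshotItem(i);
    // Plain string comparison: quotes and apostrophes need no special care.
    if (node.nodeValue === wanted) matches.push(node);
  }
  return matches;
}
```

In a GreaseMonkey script this would be called as `findTextNodes(document, someText)`, with `someText` being whatever string you are looking for.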