Getting data from elements with the same id in an HTML document
I am using the PHP DOMDocument class to parse an HTML file, with this code:
$dom = new DOMDocument();
@$dom->loadHTMLFile($file_path);
$dom->getElementById("my_id");
to fetch the data of the element with the ID "my_id". The problem is that the HTML document contains multiple elements with the same ID, and I want the data from all of those elements.
The HTML code:
<div id="my_id">
phone number 123
</div>
<div id="my_id">
address somewhere
</div>
<div id="my_id">
date of birth
</div>
I know an ID is supposed to be unique, but that is not the case here.
In this case, will getElementById() return an array?
Comments (4)
No. If anything, getElementById() will return a DOMElement. In case of multiple returned nodes, the result would be a DOMNodeList, but that doesn't apply here.
Furthermore, DOM will not recognize your IDs until you validate the document against a DTD or Schema file that defines the id attribute as an actual XML ID attribute, which is different from other attributes. That's why DOMAttr has a method isId, and XML requires IDs to be of unique value. As VolkerK pointed out in the comments, when using loadHTMLFile, this validation will occur automatically.
See my answer to "Simplify PHP DOM XML parsing - how?" for more detailed information.
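To make the return type concrete, here is a minimal sketch. It assumes a short inline string (a hypothetical stand-in for the file in the question) loaded with loadHTML rather than loadHTMLFile, and it only checks what kind of value comes back: a single DOMElement or null, not a list.

```php
<?php
// Hypothetical inline markup standing in for the file from the question.
$html = '<div id="my_id">phone number 123</div>'
      . '<div id="my_id">address somewhere</div>';

$dom = new DOMDocument();
// Duplicate ids make libxml report errors; collect them instead of using @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$element = $dom->getElementById('my_id');

// getElementById() yields one DOMElement (or null), never a node list.
$isSingleElement = $element instanceof DOMElement;
$isList = $element instanceof DOMNodeList;

var_dump($isSingleElement);
var_dump($isList);
```

Since the call can only ever hand back one node, collecting every element that shares the id needs a different lookup.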
Nope. You'll find that the value of getElementById() is undefined, though you will be able to find out that the element is a DIV.
Maybe an XPath query on the id attribute can help.
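As a rough sketch of that idea, the snippet below uses DOMXPath to match on the id attribute value rather than relying on ID lookup. The markup is the sample from the question held in an inline string (an assumption, in place of loadHTMLFile and $file_path), and the variable names are illustrative only.

```php
<?php
// Sample markup copied from the question; three elements share one id.
$html = <<<HTML
<div id="my_id">
phone number 123
</div>
<div id="my_id">
address somewhere
</div>
<div id="my_id">
date of birth
</div>
HTML;

$dom = new DOMDocument();
// Duplicate ids make libxml report errors; collect them instead of using @.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Compare the attribute value directly, so every matching element is returned.
$nodes = $xpath->query('//*[@id="my_id"]');

$values = [];
foreach ($nodes as $node) {
    // textContent includes the surrounding newlines, so trim them.
    $values[] = trim($node->textContent);
}

print_r($values);
```

The query returns a DOMNodeList, so the loop works the same whether one element or many carry the id.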
If there's absolutely no way you (or somebody else) can fix the incoming data (which, as has been pointed out, is the only really right thing to do), this might be a case where SimpleHTMLDOM's more lenient parsing turns out to be fruitful.
I haven't tried how it deals with this, but I could imagine that
works as needed.