How to convert XML to JSON using PHP?
There are multiple threads about converting XML to JSON in PHP, and I already have the following code, which works pretty well:
function jsonPrepareXml(object $domNode): void
{
    foreach ($domNode->childNodes as $node) {
        if ($node->hasChildNodes()) {
            jsonPrepareXml($node);
        } else {
            if ($domNode->hasAttributes() && strlen($domNode->nodeValue) !== 0) {
                $domNode->setAttribute("nodeValue", $node->textContent);
                $node->nodeValue = "";
            }
        }
    }
}

$dom = new \DOMDocument();
$dom->loadXML(FileHelpers::fileGetContents($file), LIBXML_NOCDATA);
jsonPrepareXml($dom);
$xmlData = $dom->saveXML();
$sxml = \simplexml_load_string($xmlData);
$json = \json_decode(
    \json_encode($sxml, JSON_THROW_ON_ERROR),
    null,
    512,
    JSON_THROW_ON_ERROR
);
Now I have run into the problem that, in some XML files, text inside CDATA sections gets truncated. I was not able to find out what those files have in common. The text is not always cut off after the same number of characters. And if I copy only the CDATA section into an otherwise empty XML file for debugging, the whole text is read.
So I thought I would remove the LIBXML_NOCDATA constant, as libxml reads the whole text when it parses the sections as CDATA. But then the conversion to JSON fails, as the CDATA is not converted.
So I tried converting the CDATA nodes to normal text nodes in the jsonPrepareXml() function, like this:
elseif ($node instanceof \DOMCdataSection) {
    $node = new \DOMText((string) $node->nodeValue);
}
But this does not change anything.
Are there any ideas on how to fix this issue?
Of course, it would be great if the original function worked, but I was not able to fix it, even with different PHP or libxml versions. So I gave up on that.
Currently, I'm on PHP 8.0.11.
Update:
So far I was not able to publish an XML file that triggers the error, as the files contained a lot of personal data. But now I do have one XML file that shows the error quite nicely: https://drive.google.com/file/d/10iyiH1O6oKG9Zbv91He1_KlCQlhdeZoO/view?usp=sharing
If I load the file with the following code, the output ends with 'Majapahit Empire, the city' at day 4.
<?php declare(strict_types=1);
$dom = new \DOMDocument();
$dom->loadXML(FileHelpers::fileGetContents($file), LIBXML_NOCDATA);
header("Content-type: text/plain");
echo $dom->saveXML();
So the truncation already shows up here, without my function that prepares the attributes for the JSON conversion. As stated, I can remove LIBXML_NOCDATA, but then I get empty nodes when converting to JSON.
So I am looking for a fix, or at least a workaround, that converts all the CDATA nodes into normal text nodes.
The main issue really is the CDATA nodes and not the jsonPrepareXml function. I just wanted to use that function for the workaround.
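A note on the elseif attempt above: assigning new \DOMText(...) to $node only rebinds the local loop variable, so the CDATA node stays in the tree. Below is a minimal sketch of one possible workaround, assuming the document is loaded without LIBXML_NOCDATA. The helper name cdataToText is hypothetical and not part of the original code. It creates the text node through the owning document and swaps it in with replaceChild():

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: replace every CDATA section below $node
 * with a plain text node that carries the same content.
 */
function cdataToText(DOMNode $node): void
{
    $doc = $node instanceof DOMDocument ? $node : $node->ownerDocument;

    // Copy the live child list first, because replaceChild() mutates it.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMCdataSection) {
            // Create the replacement via the document so it belongs to the same tree.
            $text = $doc->createTextNode($child->data);
            $node->replaceChild($text, $child);
        } elseif ($child->hasChildNodes()) {
            cdataToText($child);
        }
    }
}
```

Called as cdataToText($dom) before jsonPrepareXml($dom), this should leave only plain text nodes for the later SimpleXML step. Whether it avoids the truncation seen with LIBXML_NOCDATA is untested here.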
I have no idea whether this solves your CDATA/XML issue, but as commented, it looked fishy to me. Here is my algorithm:
If it does not yet fully solve your issue, read on for more options:
For more controlled JSON encoding of XML, including with SimpleXML, I've written a blog-post series that deals with common problem cases and shows how you can implement your own XML-to-JSON style in PHP:
As you use both DOMDocument and SimpleXML, using only SimpleXML might match your needs, too.
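As a rough illustration of the SimpleXML-only route, the sketch below parses a small, made-up XML string directly with simplexml_load_string() and passes LIBXML_NOCDATA there, skipping DOMDocument entirely. The sample data and variable names are assumptions for illustration, and the sketch does not show whether the truncation also occurs on this path:

```php
<?php
declare(strict_types=1);

// Made-up sample input; the real file from the question is not reproduced here.
$xml = '<days><day><![CDATA[Majapahit Empire, the city & more]]></day></days>';

// LIBXML_NOCDATA makes libxml merge CDATA sections into text nodes,
// so json_encode() sees the content as an ordinary string.
$sxml = simplexml_load_string($xml, SimpleXMLElement::class, LIBXML_NOCDATA);
if ($sxml === false) {
    throw new RuntimeException('XML could not be parsed');
}

$data = json_decode(
    json_encode($sxml, JSON_THROW_ON_ERROR),
    true, // decode to associative arrays
    512,
    JSON_THROW_ON_ERROR
);
```

Elements that carry both attributes and text are not covered by this sample; the jsonPrepareXml() step in the question exists for exactly that case.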
The later encoding examples in particular show how to integrate with the JsonSerializable interface. Alternatively, this would also be possible with DOMDocument and your own node class(es); compare DOMBlaze, see ref.
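To make the JsonSerializable idea concrete, here is a minimal sketch and not the code from the blog series. The class name JsonXmlElement, the "@" prefix for attributes, and the "#text" key are all assumptions chosen for illustration. Casting a SimpleXMLElement to string includes CDATA content, so this variant does not rely on LIBXML_NOCDATA:

```php
<?php
declare(strict_types=1);

// Hypothetical subclass that controls its own JSON shape.
class JsonXmlElement extends SimpleXMLElement implements JsonSerializable
{
    public function jsonSerialize(): mixed
    {
        $result = [];

        // Attributes become "@name" keys.
        foreach ($this->attributes() as $name => $value) {
            $result['@' . $name] = (string) $value;
        }

        // Child elements are grouped by tag name, always as a list;
        // json_encode() calls jsonSerialize() on each of them in turn.
        foreach ($this->children() as $name => $child) {
            $result[$name][] = $child;
        }

        // Direct text content, including CDATA sections.
        $text = trim((string) $this);
        if ($text !== '') {
            $result['#text'] = $text;
        }

        return $result;
    }
}

// Made-up sample input for illustration.
$xml = '<days><day n="4"><![CDATA[a & b]]></day><day n="5">plain</day></days>';
$sxml = simplexml_load_string($xml, JsonXmlElement::class);
$data = json_decode(json_encode($sxml, JSON_THROW_ON_ERROR), true, 512, JSON_THROW_ON_ERROR);
```

With this shape, attributes and text live side by side, so no extra pass that moves text into a "nodeValue" attribute is needed. The trade-off is that every child element appears as a list, even when there is only one.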