PHP Invalid Character Error
I'm getting this error when running this code:
Fatal error: Uncaught exception 'DOMException' with message 'Invalid Character Error' in test.php:29 Stack trace: #0 test.php(29): DOMDocument->createElement('1OhmStable', 'a') #1 {main} thrown in test.php on line 29
The nodes from the original XML file do contain invalid characters, but since I am stripping the invalid characters from the node names, the nodes should be created. What type of encoding do I need to apply to the original XML document? Do I need to decode the output of saveXML()?
function __cleanData($c)
{
    return preg_replace("/[^A-Za-z0-9]/", "", $c);
}

$xml = new DOMDocument('1.0', 'UTF-8');
$xml->load('test.xml');
$xml->formatOutput = true;
$append = array();
foreach ($xml->getElementsByTagName('product') as $product)
{
    foreach ($product->getElementsByTagName('name') as $name)
    {
        $append[] = $name;
    }
    foreach ($append as $a)
    {
        $nodeName = __cleanData($a->textContent);
        $element = $xml->createElement(htmlentities($nodeName), 'a');
    }
    $product->removeChild($xml->getElementsByTagName('details')->item(0));
    $product->appendChild($element);
}
$result = $xml->saveXML();
$file = "data.xml";
file_put_contents($file, $result);
This is what the original XML looks like:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
<products>
    <product>
        <modelNumber>M100</modelNumber>
        <itemId>1553725</itemId>
        <details>
            <detail>
                <name>1 Ohm Stable</name>
                <value>600 x 1</value>
            </detail>
        </details>
    </product>
</products>
The new document is supposed to look like this:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/v1/xsl/xml_pretty_printer.xsl" type="text/xsl"?>
<products>
    <product>
        <modelNumber>M100</modelNumber>
        <itemId>1553725</itemId>
        <1 Ohm Stable>
        </1 Ohm Stable>
    </product>
</products>
Simply put, you cannot use an element name that starts with a number.
See also: php parse xml - error: StartTag: invalid element name
A nice article: http://www.xml.com/pub/a/2001/07/25/namingparts.html
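The point above can be checked directly against DOMDocument. The following is a minimal sketch, not part of the original answer; the helper name canCreateElement() is made up for illustration. It only reports whether createElement() accepts a given name.

```php
<?php
// Hypothetical helper: returns true if DOMDocument accepts $name as an element name.
function canCreateElement(DOMDocument $doc, string $name): bool
{
    try {
        $doc->createElement($name);
        return true;
    } catch (DOMException $e) {
        // An invalid name is rejected with "Invalid Character Error".
        return false;
    }
}

$doc = new DOMDocument('1.0', 'UTF-8');

var_dump(canCreateElement($doc, '1OhmStable'));  // bool(false): starts with a digit
var_dump(canCreateElement($doc, 'OhmStable1'));  // bool(true): digit after a letter
var_dump(canCreateElement($doc, '_1OhmStable')); // bool(true): underscore may start a name
```

So the cleaned value from the question, '1OhmStable', fails only because of its first character; moving or prefixing the digit is enough to get past createElement().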
You have not said where you get that error. In case it happens after you have cleaned the value, this is my guess:
This replacement is not written for UTF-8 encoded strings (which DOMDocument uses). You can make it UTF-8 compatible by adding the u modifier (PCRE8) to the pattern.
It's just a guess; I suggest you state more precisely in your question which part of your code triggers the error.
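As a rough sketch of what that suggestion might look like (the function name cleanDataUtf8() is an assumption, not from the original post), here is the question's pattern with the u modifier added. With u, preg_replace() treats both pattern and subject as UTF-8 and returns null if the subject is not valid UTF-8.

```php
<?php
// Hypothetical variant of the question's __cleanData() with the u modifier added.
function cleanDataUtf8(string $text): ?string
{
    // Returns null when $text is not valid UTF-8.
    return preg_replace('/[^A-Za-z0-9]/u', '', $text);
}

echo cleanDataUtf8("1 Ohm Stable (\u{03A9})"), "\n"; // prints "1OhmStable"
var_dump(cleanDataUtf8("\xFF"));                     // NULL: invalid UTF-8 input
```

For this ASCII-only character class and valid UTF-8 input, the result is the same with or without u, and it still begins with a digit, so the modifier alone would not be expected to make the DOMException go away.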
Even if __cleanData() removes all characters other than the Latin letters a-z and digits, that does not guarantee the result is a valid XML name. Your function can return strings that begin with a digit, but digits are illegal name start characters in XML; they can only appear after the first character of a name. Spaces are also forbidden in names, which is another reason your expected XML output cannot be produced.
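One possible way to act on this, sketched under several assumptions: isValidXmlName() and makeXmlName() are hypothetical helpers, the check covers only an ASCII subset of the XML name rules, an underscore prefix is used to repair names, and the original text is kept in a made-up label attribute. The element content here is the detail's value rather than the placeholder 'a' from the question.

```php
<?php
// Hypothetical helper: checks a name against an ASCII-only subset of the XML name rules.
function isValidXmlName(string $name): bool
{
    // First character: letter or underscore; then letters, digits, '.', '_' or '-'.
    return preg_match('/\A[A-Za-z_][A-Za-z0-9._-]*\z/', $name) === 1;
}

// Hypothetical helper: derive an element name from free text.
function makeXmlName(string $text): string
{
    $name = preg_replace('/[^A-Za-z0-9]/', '', $text);
    // Prefix with an underscore when the cleaned name is empty or starts with a digit.
    return isValidXmlName($name) ? $name : '_' . $name;
}

// Trimmed-down version of the sample document from the question.
$source = '<products><product><modelNumber>M100</modelNumber>'
    . '<details><detail><name>1 Ohm Stable</name><value>600 x 1</value></detail></details>'
    . '</product></products>';

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->loadXML($source);

foreach ($doc->getElementsByTagName('product') as $product) {
    // Look up <details> inside this product, not in the whole document.
    $details = $product->getElementsByTagName('details')->item(0);
    if ($details === null) {
        continue;
    }
    foreach ($details->getElementsByTagName('detail') as $detail) {
        $label = $detail->getElementsByTagName('name')->item(0)->textContent;
        $value = $detail->getElementsByTagName('value')->item(0)->textContent;

        $element = $doc->createElement(makeXmlName($label));
        $element->appendChild($doc->createTextNode($value));
        $element->setAttribute('label', $label); // keep the original text as data
        $product->appendChild($element);
    }
    $product->removeChild($details);
}

echo $doc->saveXML($doc->documentElement), "\n";
```

Keeping the human-readable text in an attribute (or in element content) avoids the name restrictions entirely, since attribute values may contain spaces and leading digits.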
Make sure the scripts have the same encoding: if it's UTF-8, make sure the files have no byte order mark (BOM) at the very beginning.
To do that, open your XML file in a text editor such as Notepad++ and convert it to "UTF-8 without BOM".
I had a similar error, but with a JSON file.
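If a BOM is suspected and re-saving the file in an editor is not convenient, the bytes can also be dropped in code before parsing. This is only a sketch; stripUtf8Bom() is a hypothetical helper, and whether a BOM is actually involved in the question's error is not established.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 byte order mark, if present.
function stripUtf8Bom(string $data): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($data, $bom, 3) === 0 ? substr($data, 3) : $data;
}

$withBom = "\xEF\xBB\xBF" . '<products/>';

var_dump(stripUtf8Bom($withBom) === '<products/>');       // bool(true)
var_dump(stripUtf8Bom('<products/>') === '<products/>');  // bool(true): unchanged without a BOM

// Possible use with the question's file (file name taken from the question):
// $xml->loadXML(stripUtf8Bom(file_get_contents('test.xml')));
```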