Using SimpleXML to interpret BLAST XML output: hyphen issue? Nested object access syntax issue?
I'm trying to use SimpleXML to read some NCBI BLAST XML output, and I'm able to access some of the output, but not other bits of it.
Here's the relevant part of the XML (some unrelated segments excised for readability):
<?xml version="1.0"?> <!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">
<BlastOutput>
<BlastOutput_program>blastn</BlastOutput_program>
<BlastOutput_db>allconstructs.fasta</BlastOutput_db>
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_query-ID>Query_1</Iteration_query-ID>
<Iteration_query-def>gene_1_query</Iteration_query-def>
<Iteration_query-len>1005</Iteration_query-len>
And here's my code (note: $qdef and $qlen are arrived at differently to make sure I hadn't made some stupid mistake in setting/using the $output variable):
$blast = simplexml_load_string($xml);
$output = $xml->BlastOutput_iterations->Iteration;
$qprog = $blast->BlastOutput_program;
$qdef = $xml->BlastOutput_iterations->Iteration->{'Iteration_query-def'};
$qlen = $output->{'Iteration_query-len'};
echo "Query Program: ".$qprog."<br/>Query: ".$qdef."<br/>Query Length: " .$qlen;
Here's the output:
Query Program: blastn
Query:
Query Length:
If I remove the {''} around Iteration_query-def and Iteration_query-len, it treats them as integers and returns zero for both.
Am I doing something wrong? I can't figure out anything I'm doing differently other than the {''} stuff between the BlastOutput_program bit and the two other variables. If I add the {''} stuff to BlastOutput_program, though, it still works fine and produces correct output for that. What's the deal?
Update: It works using xpath, as follows:
$qlen = $blast->xpath('BlastOutput_iterations/Iteration/Iteration_query-def');
But I'd still really like to know if that's the only way of doing it or if there's a way to do it like I've shown above.
Comments (1)
Got it. A friend pointed out this site, which showed what I was doing wrong: I needed to specify the index of the XML elements that potentially had multiple entries.
E.g.
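The example that originally followed is missing from this copy. As a stand-in, here is a minimal sketch of indexed access, not the answerer's own code. It assumes the XML from the question with closing tags added so that it parses, and with the DOCTYPE line dropped for brevity; the variable names `$iteration`, `$matches` and `$qdefViaXpath` are made up for illustration. Note that the question's snippet reads from `$xml` (the raw string) on two lines, whereas this sketch reads from `$blast` (the parsed object) throughout, which may also have contributed to the empty output.

```php
<?php
// Trimmed copy of the XML from the question. The closing tags are an
// assumption added here so that the fragment is well-formed.
$xml = <<<XML
<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_program>blastn</BlastOutput_program>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_query-def>gene_1_query</Iteration_query-def>
      <Iteration_query-len>1005</Iteration_query-len>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
XML;

$blast = simplexml_load_string($xml);

// Index the element that may repeat, reading from $blast, not from $xml.
$iteration = $blast->BlastOutput_iterations->Iteration[0];

// Hyphenated names need the {'...'} form; without the braces PHP parses
// the hyphen as a subtraction operator.
$qdef = (string) $iteration->{'Iteration_query-def'};
$qlen = (int) $iteration->{'Iteration_query-len'};

// xpath() returns an array of elements, so take the first match.
$matches = $blast->xpath('BlastOutput_iterations/Iteration/Iteration_query-def');
$qdefViaXpath = (string) $matches[0];

echo "Query: $qdef, length: $qlen\n";
```

With this input the script should print `Query: gene_1_query, length: 1005`. Casting to `string` or `int` is a design choice that turns the SimpleXMLElement objects into plain values before they are compared or stored. SimpleXML generally resolves `->Iteration` to the first matching element even without `[0]`, so the explicit index mainly documents which iteration is meant when several are present.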