Parsing an XML file in VB.NET
I have the following XML file. I would like to get the values of the first Hsp_qseq, Hsp_hseq and Hsp_midline elements under the Hsp tag, in VB.NET, from the file out.xml
<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "NCBI_BlastOutput.dtd">
<BlastOutput>
<BlastOutput_program>blastn</BlastOutput_program>
<BlastOutput_version>BLASTN 2.2.25+</BlastOutput_version>
<BlastOutput_reference>Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), "A greedy algorithm for aligning DNA sequences", J Comput Biol 2000; 7(1-2):203-14.</BlastOutput_reference>
<BlastOutput_db>positive_Controls</BlastOutput_db>
<BlastOutput_query-ID>Query_1</BlastOutput_query-ID>
<BlastOutput_query-def>rs8192709_C Positive Contol Common Sequence</BlastOutput_query-def>
<BlastOutput_query-len>249</BlastOutput_query-len>
<BlastOutput_param>
<Parameters>
<Parameters_expect>10</Parameters_expect>
<Parameters_sc-match>1</Parameters_sc-match>
<Parameters_sc-mismatch>-2</Parameters_sc-mismatch>
<Parameters_gap-open>0</Parameters_gap-open>
<Parameters_gap-extend>0</Parameters_gap-extend>
<Parameters_filter>L;m;</Parameters_filter>
</Parameters>
</BlastOutput_param>
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_query-ID>Query_1</Iteration_query-ID>
<Iteration_query-def>rs8192709_C Positive Contol Common Sequence</Iteration_query-def>
<Iteration_query-len>249</Iteration_query-len>
<Iteration_hits>
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gnl|BL_ORD_ID|0</Hit_id>
<Hit_def>rs8192709_C Positive Contol Common Sequence</Hit_def>
<Hit_accession>0</Hit_accession>
<Hit_len>249</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>460.936057665848</Hsp_bit-score>
<Hsp_score>249</Hsp_score>
<Hsp_evalue>9.74431021697707e-133</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from>
<Hsp_query-to>249</Hsp_query-to>
<Hsp_hit-from>1</Hsp_hit-from>
<Hsp_hit-to>249</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>249</Hsp_identity>
<Hsp_positive>249</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>249</Hsp_align-len>
<Hsp_qseq>GGTCAGGATAAAAGGCCCAGTTGGAGGCTGCAGCAGGGTGCAGGGCAGTCAGACCAGGACCATGGAACTCAGCGTCCTCCTCTTCCTTGCACTCCTCACAGGACTCTTGCTACTCCTGGTTCAGCGCCACCCTAACACCCATGACCGCCTCCCACCAGGGCCCCGCCCTCTGCCCCTTTTGGGAAACCTTCTGCAGATGGATAGAAGAGGCCTACTCAAATCCTTTCTGAGGGTAAGACACAGACGAAT</Hsp_qseq>
<Hsp_hseq>GGTCAGGATAAAAGGCCCAGTTGGAGGCTGCAGCAGGGTGCAGGGCAGTCAGACCAGGACCATGGAACTCAGCGTCCTCCTCTTCCTTGCACTCCTCACAGGACTCTTGCTACTCCTGGTTCAGCGCCACCCTAACACCCATGACCGCCTCCCACCAGGGCCCCGCCCTCTGCCCCTTTTGGGAAACCTTCTGCAGATGGATAGAAGAGGCCTACTCAAATCCTTTCTGAGGGTAAGACACAGACGAAT</Hsp_hseq>
<Hsp_midline>
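|||||||||</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
<Hit>
<Hit_num>2</Hit_num>
<Hit_id>gnl|BL_ORD_ID|29</Hit_id>
<Hit_def>rs8192709_R Positive Control Rare Sequence </Hit_def>
<Hit_accession>29</Hit_accession>
<Hit_len>249</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>455.396108708835</Hsp_bit-score>
<Hsp_score>246</Hsp_score>
<Hsp_evalue>4.53358655933358e-131</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from>
<Hsp_query-to>249</Hsp_query-to>
<Hsp_hit-from>1</Hsp_hit-from>
<Hsp_hit-to>249</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>248</Hsp_identity>
<Hsp_positive>248</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>249</Hsp_align-len>
<Hsp_qseq>GGTCAGGATAAAAGGCCCAGTTGGAGGCTGCAGCAGGGTGCAGGGCAGTCAGACCAGGACCATGGAACTCAGCGTCCTCCTCTTCCTTGCACTCCTCACAGGACTCTTGCTACTCCTGGTTCAGCGCCACCCTAACACCCATGACCGCCTCCCACCAGGGCCCCGCCCTCTGCCCCTTTTGGGAAACCTTCTGCAGATGGATAGAAGAGGCCTACTCAAATCCTTTCTGAGGGTAAGACACAGACGAAT</Hsp_qseq>
<Hsp_hseq>GGTCAGGATAAAAGGCCCAGTTGGAGGCTGCAGCAGGGTGCAGGGCAGTCAGACCAGGACCATGGAACTCAGCGTCCTCCTCTTCCTTGCACTCCTCACAGGACTCTTGCTACTCCTGGTTCAGTGCCACCCTAACACCCATGACCGCCTCCCACCAGGGCCCCGCCCTCTGCCCCTTTTGGGAAACCTTCTGCAGATGGATAGAAGAGGCCTACTCAAATCCTTTCTGAGGGTAAGACACAGACGAAT</Hsp_hseq>
<Hsp_midline>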
||||</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
</Iteration_hits>
<Iteration_stat>
<Statistics>
<Statistics_db-num>58</Statistics_db-num>
<Statistics_db-len>24590</Statistics_db-len>
<Statistics_hsp-len>15</Statistics_hsp-len>
<Statistics_eff-space>5550480</Statistics_eff-space>
<Statistics_kappa>0.46</Statistics_kappa>
<Statistics_lambda>1.28</Statistics_lambda>
<Statistics_entropy>0.85</Statistics_entropy>
</Statistics>
</Iteration_stat>
</Iteration>
</BlastOutput_iterations>
</BlastOutput>
I am trying the following code, but I don't know how many times I have to call the .Read function.
Or is there a better way to do this?
Thanks
I'd use XPath to pull out the information you need from the file since it allows you to query for exactly the nodes you need.
XPath queries can be quite hairy looking, but for simple operations it's fairly easy to get started with. Here's some sample code that pulls out the values of those nodes you mentioned using XPath and prints their values to the console:
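A minimal sketch of that kind of code, not necessarily the exact sample originally posted. It assumes out.xml sits in the working directory and .NET Framework 4.0 or later (for DtdProcessing); the names Settings, HspPath and ElementName are made up for illustration. DTD processing is switched off on the assumption that NCBI_BlastOutput.dtd is not available locally:

```vb
Imports System
Imports System.Xml
Imports System.Xml.XPath

Module BlastXPathSketch
    Sub Main()
        ' Assumption: the DTD named in the DOCTYPE is not on disk,
        ' so tell the reader to skip it instead of trying to load it.
        Dim Settings As New XmlReaderSettings()
        Settings.DtdProcessing = DtdProcessing.Ignore
        Settings.XmlResolver = Nothing

        Using Reader As XmlReader = XmlReader.Create("out.xml", Settings)
            Dim Doc As New XPathDocument(Reader)
            Dim Nav As XPathNavigator = Doc.CreateNavigator()

            ' Simple path from the root element down to every Hsp element.
            Dim HspPath As String =
                "/BlastOutput/BlastOutput_iterations/Iteration" &
                "/Iteration_hits/Hit/Hit_hsps/Hsp"

            For Each ElementName As String In New String() {"Hsp_qseq", "Hsp_hseq", "Hsp_midline"}
                ' Select returns an iterator over all matching nodes.
                Dim Foo As XPathNodeIterator = Nav.Select(HspPath & "/" & ElementName)
                While Foo.MoveNext()
                    Console.WriteLine(ElementName & ": " & Foo.Current.Value)
                End While
            Next
        End Using
    End Sub
End Module
```

This prints the three values for every Hsp in the file, not only the first one.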
The only interesting part of the code above is the
Dim Foo = Nav.Select("...")
bits: the argument is the XPath query expression for the information you want. In this case it's a simple path from the root down to the node you're after, but much more powerful queries are possible. Select returns an iterator over the matched nodes, so it's then just a case of iterating through and processing each node that's returned.
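Since the question asks only for the first Hsp, a slightly more specific query can do the filtering for you. The following is a sketch under the same assumptions (out.xml in the working directory, .NET Framework 4.0 or later, DTD skipped, illustrative variable names); the expression (//Hsp)[1] picks the first Hsp element in document order:

```vb
Imports System
Imports System.Xml
Imports System.Xml.XPath

Module FirstHspSketch
    Sub Main()
        ' Assumption: NCBI_BlastOutput.dtd is not available, so skip the DOCTYPE.
        Dim Settings As New XmlReaderSettings()
        Settings.DtdProcessing = DtdProcessing.Ignore
        Settings.XmlResolver = Nothing

        Dim Nav As XPathNavigator
        Using Reader As XmlReader = XmlReader.Create("out.xml", Settings)
            ' XPathDocument reads the whole file in its constructor,
            ' so the reader can be closed right afterwards.
            Nav = New XPathDocument(Reader).CreateNavigator()
        End Using

        ' First Hsp element anywhere in the document.
        Dim FirstHsp As XPathNavigator = Nav.SelectSingleNode("(//Hsp)[1]")
        If FirstHsp Is Nothing Then
            Console.WriteLine("No Hsp element found.")
            Return
        End If

        For Each ElementName As String In New String() {"Hsp_qseq", "Hsp_hseq", "Hsp_midline"}
            ' Relative path: look for the child element under the first Hsp only.
            Dim Child As XPathNavigator = FirstHsp.SelectSingleNode(ElementName)
            If Child IsNot Nothing Then
                Console.WriteLine(ElementName & ": " & Child.Value)
            End If
        Next
    End Sub
End Module
```

SelectSingleNode returns Nothing when there is no match, hence the checks before reading Value.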