Collecting attributes from XML using Nokogiri
I have this XML:
<RECEIPT receiptDate="2012-02-10T12:46:26.661Z" submissionFile="E.coli_ENT_WS.submission.xml" success="false">
<EXPERIMENT alias="ENT 23" status="PUBLIC"/>
<EXPERIMENT alias="WS 23" status="PUBLIC"/>
<RUN alias="ENT 23" status="PUBLIC"/>
<RUN alias="WS 23" status="PUBLIC"/>
<SAMPLE alias="ENT 23" status="PUBLIC"/>
<SAMPLE alias="WS 23" status="PUBLIC"/>
<STUDY alias="ENT 23" status="PUBLIC"/>
<STUDY alias="WS 23" status="PUBLIC"/>
<SUBMISSION alias="E.coli_ENT_WS"/>
<MESSAGES>
<ERROR> In run(ENT 23), the FC018_s_6_sequence_L70.txt.md5 not found </ERROR>
<ERROR> In run(ENT 23) found the file format(Illumina_native_fastq), but requires SPOT_DESCRIPTOR information in the experiment(ENT 23)</ERROR>
<ERROR>The Illumina_native_fastq file format required gzip compression for submission.</ERROR>
<ERROR> FILE attribute quality_scoring_system is required</ERROR>
<ERROR>Same file FC018_s_6_sequence_L70.txt found in Run(WS 23) has been used with other Run</ERROR>
<ERROR> In run(WS 23), the FC018_s_6_sequence_L70.txt.md5 not found </ERROR>
<ERROR> In run(WS 23) found the file format(Illumina_native_fastq), but requires SPOT_DESCRIPTOR information in the experiment(WS 23)</ERROR>
<ERROR>The Illumina_native_fastq file format required gzip compression for submission.</ERROR>
<ERROR> FILE attribute quality_scoring_system is required</ERROR>
<INFO> VALIDATE action for the following XML: E.coli_ENT_WS.study.xml E.coli_ENT_WS.sample.xml E.coli_ENT_WS.experiment.xml E.coli_ENT_WS.run.xml </INFO>
<INFO>Inform_on_error is not filled in; auto populated from Submission account. </INFO>
<INFO>Number of files in drop box = 2 & Number of files in Submission = 1</INFO>
<INFO>Deprecated element ignored: CENTER_NAME</INFO>
<INFO>Deprecated element PROJECT_ID converted to RELATED_STUDY</INFO>
<INFO>Deprecated element ignored: CENTER_NAME</INFO>
<INFO>Deprecated element PROJECT_ID converted to RELATED_STUDY</INFO>
<INFO> SPOT_DESCRIPTOR is missing</INFO>
<INFO> SPOT_DESCRIPTOR is missing</INFO>
<INFO>Experiment (ENT 23) SPOTDESCRIPTOR is optional is null</INFO>
<INFO>Experiment (WS 23) SPOTDESCRIPTOR is optional is null</INFO>
<INFO> In run(ENT 23) file name (FC018_s_6_sequence_L70.txt) mentioned is not found among the submitted files</INFO>
<INFO> In run(WS 23) file name (FC018_s_6_sequence_L70.txt) mentioned is not found among the submitted files</INFO>
</MESSAGES>
<ACTIONS>VALIDATE</ACTIONS>
<ACTIONS>VALIDATE</ACTIONS>
<ACTIONS>VALIDATE</ACTIONS>
<ACTIONS>VALIDATE</ACTIONS>
<ACTIONS>HOLD</ACTIONS>
</RECEIPT>
I am able to retrieve all the element tags, mainly EXPERIMENT, ERROR, INFO, ACTIONS, and MESSAGES.
What I would like to retrieve is the attributes from elements like EXPERIMENT and RECEIPT.
I am using Nokogiri for my parsing.
My code is like this:
@req_test = %x[curl -F "SUBMISSION=@xml/#{@experiment.alias}.submission.xml" -F "STUDY=@xml/#{@experiment.alias}.study.xml" -F "SAMPLE=@xml/#{@experiment.alias}.sample.xml" -F "RUN=@xml/#{@experiment.alias}.run.xml" -F "EXPERIMENT=@xml/#{@experiment.alias}.experiment.xml" https://www-test.ebi.ac.uk/ena/submit/drop-box/submit/]
@doc = Nokogiri::XML(@req_test)
# collecting all the errors
@expt = @doc.xpath("//ERROR")
# Collecting all the INFO
@info = @doc.xpath("//INFO")
That is my controller. My view is just for display:
<h3>This is the ERRORS Collected</h3>
<ul>
<% @expt.each do |expt| %>
<li><%= expt %><br /></li>
<% end %>
</ul>
<br />
<h3>This is the INFO Collected</h3>
<ul>
<% @info.each do |info| %>
<li><%= info %><br /></li>
<% end %>
</ul>
The application renders something like this:
This is the ERRORS Collected
In run(ENT 23), the FC018_s_6_sequence_L70.txt.md5 not found
In run(ENT 23) found the file format(Illumina_native_fastq), but requires SPOT_DESCRIPTOR information in the experiment(ENT 23)
The Illumina_native_fastq file format required gzip compression for submission.
FILE attribute quality_scoring_system is required
Same file FC018_s_6_sequence_L70.txt found in Run(WS 23) has been used with other Run
In run(WS 23), the FC018_s_6_sequence_L70.txt.md5 not found
In run(WS 23) found the file format(Illumina_native_fastq), but requires SPOT_DESCRIPTOR information in the experiment(WS 23)
The Illumina_native_fastq file format required gzip compression for submission.
FILE attribute quality_scoring_system is required
This is the INFO Collected
VALIDATE action for the following XML: E.coli_ENT_WS.study.xml E.coli_ENT_WS.sample.xml E.coli_ENT_WS.experiment.xml E.coli_ENT_WS.run.xml
Inform_on_error is not filled in; auto populated from Submission account.
Number of files in drop box = 2 & Number of files in Submission = 1
Deprecated element ignored: CENTER_NAME
Deprecated element PROJECT_ID converted to RELATED_STUDY
Deprecated element ignored: CENTER_NAME
Deprecated element PROJECT_ID converted to RELATED_STUDY
SPOT_DESCRIPTOR is missing
SPOT_DESCRIPTOR is missing
Experiment (ENT 23) SPOTDESCRIPTOR is optional is null
Experiment (WS 23) SPOTDESCRIPTOR is optional is null
In run(ENT 23) file name (FC018_s_6_sequence_L70.txt) mentioned is not found among the submitted files
In run(WS 23) file name (FC018_s_6_sequence_L70.txt) mentioned is not found among the submitted files
Could someone please suggest a method or option for retrieving these attributes?
Comments (1)
It's not clear to me what you are trying to do, or what your problem is. Below are a variety of answers that might help.
For any element you can use Nokogiri::XML::Node#attributes to get a hash mapping each attribute's name to a Nokogiri::XML::Attr (which has a .value you can read).

Instead of attributes (a Hash) you can also use .attribute_nodes, which gives you a straight array of Attrs (with a .name and .value each).

Alternatively, while iterating through your experiment elements you could read the value of a known attribute directly from each node (returning a string such as "ENT 23").

If you're trying to extract the attributes on their own, you could also select the attribute nodes themselves, if you wanted to get an array of just those attributes anywhere in the document (each has a .name and .value). If you only want all the alias attributes on a particular element (e.g. EXPERIMENT), you can restrict the selection to that element.