How to parse XML using a SAX parser
I'm following this tutorial.
It works great but I would like it to return an array with all the strings instead of a single string with the last element.
Any ideas how to do this?
So you want to build an XML parser to parse an RSS feed like this one.
Now you have two SAX implementations you can work with: either org.xml.sax or android.sax. I'm going to explain the pros and cons of both after posting a short example of each.
android.sax Implementation
Let's start with the android.sax implementation. You first have to define the XML structure using the RootElement and Element objects.
In any case I would work with POJOs (Plain Old Java Objects) to hold your data. Here are the POJOs needed.
Channel.java
This class implements the Serializable interface so you can put it into a Bundle and do something with it.
Now we need a class to hold our items. In this case I'm just going to extend the ArrayList class.
Items.java
That's it for our items container. We now need a class to hold the data of every single item.
Item.java
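The original listings did not survive on this page, so here is a minimal sketch of what the three POJOs described above might look like. The field names (title, link, description) are assumptions based on the tags mentioned later in this answer, and the classes are package-private only so that they fit into one file; in a real project each would be a public class in its own file.

```java
import java.io.Serializable;
import java.util.ArrayList;

// Holds the data of one <item> element (field names are assumed).
class Item implements Serializable {
    private static final long serialVersionUID = 1L;
    private String title;
    private String link;
    private String description;

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getLink() { return link; }
    public void setLink(String link) { this.link = link; }
    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }
}

// The items container: nothing more than an ArrayList of Item.
class Items extends ArrayList<Item> {
    private static final long serialVersionUID = 1L;
}

// Channel-level data plus the container with all parsed items.
class Channel implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Items items = new Items();
    private String title;
    private String link;
    private String description;

    public Items getItems() { return items; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getLink() { return link; }
    public void setLink(String link) { this.link = link; }
    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }
}
```

Item is Serializable as well, because a Channel can only be serialized if everything it holds is serializable too.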
Example:
Now that was a very quick example, as you can see. The major advantage of the android.sax implementation is that you can define the structure of the XML you have to parse and then just add an event listener to the appropriate elements. The disadvantage is that the code gets quite repetitive and bloated.
org.xml.sax Implementation
The org.xml.sax handler implementation is a bit different. Here you don't specify or declare your XML structure but just listen for events. The most widely used events are the following:
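The list itself is missing here. In practice, the callbacks that most handlers override are startDocument, endDocument, startElement, endElement and characters. As a rough sketch (the class name EventLogger and its log helper are made up for illustration), this handler simply records each event in the order the parser fires it:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that logs every SAX event it receives.
class EventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument");
    }

    @Override
    public void endDocument() {
        events.add("endDocument");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip whitespace-only text so the log stays readable.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("characters:" + text);
        }
    }

    // Parses the given XML string and returns the recorded events.
    static List<String> log(String xml) {
        EventLogger handler = new EventLogger();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.events;
    }
}
```

Calling EventLogger.log on a small document is a quick way to see in which order the callbacks arrive before writing a real handler.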
An example handler implementation using the Channel object above looks like this.
Example
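The original handler listing is not preserved on this page. The following is only a sketch of the idea under stated assumptions: FeedItem is a cut-down stand-in for the Item POJO, FeedHandler and its parse helper are hypothetical names, and the dispatch happens in endElement with a text buffer rather than in startElement. It also addresses the question directly: every item is added to a list, so you get all of them back instead of only the last one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the Item POJO, trimmed to keep the sketch short.
class FeedItem {
    String title;
    String link;
    String description;
}

class FeedHandler extends DefaultHandler {
    String channelTitle;
    final List<FeedItem> items = new ArrayList<>();

    private final StringBuilder text = new StringBuilder();
    private FeedItem current;
    private boolean inItem; // true while we are between <item> and </item>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for this element
        if ("item".equals(qName)) {
            inItem = true;
            current = new FeedItem();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver text in several chunks, so append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if ("item".equals(qName)) {
            items.add(current); // keep every item, not just the last one
            inItem = false;
        } else if ("title".equals(qName)) {
            if (inItem) current.title = value; else channelTitle = value;
        } else if ("link".equals(qName) && inItem) {
            current.link = value;
        } else if ("description".equals(qName) && inItem) {
            current.description = value;
        }
    }

    // Parses an RSS string and returns the filled handler.
    static FeedHandler parse(String xml) {
        FeedHandler handler = new FeedHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
        return handler;
    }
}
```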
Now to be honest, I can't really tell you any real advantage of this handler implementation over the android.sax one. I can, however, tell you the disadvantage, which should be pretty obvious by now. Take a look at the else if statement in the startElement method. Because the tags <title>, <link> and <description> appear both at the channel level and inside each item, we have to track where in the XML structure we are at the moment. That is, if we encounter an <item> start tag we set the inItem flag to true to ensure that we map the correct data to the correct object, and in the endElement method we set that flag to false when we encounter an </item> tag, to signal that we are done with that item.
In this example it is pretty easy to manage, but parsing a more complex structure with repeating tags at different levels becomes tricky. There you'd have to either use enums, for example, to set your current state plus a lot of switch/case statements to check where you are, or, more elegantly, use some kind of tag tracker built on a tag stack.
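To make the tag stack idea a bit more concrete, here is one possible sketch. It is not from the original answer: PathTrackingHandler, valuesByPath and collect are made-up names, and storing values by slash-separated path is just one way to use the stack. The handler pushes each tag on startElement, pops it on endElement, and uses the current stack as the key, so a <title> under the channel and a <title> under an item end up in different buckets without any flags.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical tag tracker: the stack always mirrors the open elements.
class PathTrackingHandler extends DefaultHandler {
    final Map<String, List<String>> valuesByPath = new LinkedHashMap<>();

    private final Deque<String> stack = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        stack.addLast(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            // Key is the full path, e.g. /rss/channel/item/title
            String path = "/" + String.join("/", stack);
            valuesByPath.computeIfAbsent(path, k -> new ArrayList<>()).add(value);
        }
        stack.removeLast();
        text.setLength(0);
    }

    // Parses the XML string and returns all text values grouped by path.
    static Map<String, List<String>> collect(String xml) {
        PathTrackingHandler handler = new PathTrackingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.valuesByPath;
    }
}
```

The trade-off is that you compare path strings instead of tag names, but the handler no longer grows a new flag for every nesting level.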
Many tasks require different kinds of XML files for different purposes. I won't try to cover the whole field; I'll just describe, from my own experience, what I needed it for.
Java is perhaps my favorite programming language. That love is reinforced by the fact that you can solve almost any problem with it without reinventing the wheel.
So, I needed to build a client-server setup on top of a database, so that a client could remotely add entries to the database on the server. Needless to say, the input data has to be validated and so on, but that's not the point here.
As the transport, I chose without hesitation to send the information as an XML file of the following form:
To keep things readable, I'll just say that it holds information about doctors at medical institutions: last name, first name, unique id, and so on. In general, a series of data records. The file arrives safely on the server side, and then parsing begins.
Of the two parsing options (SAX vs DOM) I chose SAX because it works faster, and because it was the first one I came across :)
So. As you know, to work with the parser we need to override the relevant methods of DefaultHandler. To begin, import the required packages.
Now we can start writing our parser.
Let's start with the startDocument() method. As the name implies, it reacts to the start-of-document event. Here you can hang various actions such as allocating memory or resetting values, but our example is pretty simple, so we just mark the start of work with an appropriate message:
Next, the parser walks through the document and encounters an element of its structure. The startElement() method fires. Its signature is: startElement(String namespaceURI, String localName, String qName, Attributes atts). Here namespaceURI is the namespace, localName is the local name of the element, qName is the qualified name (the namespace prefix and the local name, separated by a colon), and atts holds the attributes of the element. In our case everything is simple: it is enough to take qName and store it in a helper string, thisElement. This marks which element we are currently in.
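To see how the three name arguments differ, here is a small sketch that is not part of the original article. NameInspector and inspect are made-up names, and the urn:example:doctors namespace is invented for the demo. Note that the factory has to be switched to namespace-aware mode; by default the JDK parser is not namespace-aware, which is why working with qName alone is enough for the simple format used here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records the three names of every start tag.
class NameInspector extends DefaultHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) {
        seen.add(namespaceURI + "|" + localName + "|" + qName);
    }

    // Parses the XML string with namespace support switched on.
    static List<String> inspect(String xml) {
        NameInspector handler = new NameInspector();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed for namespaceURI and localName
            factory.newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.seen;
    }
}
```

For an element written as d:doctor with the prefix d bound to urn:example:doctors, the namespace URI is urn:example:doctors, the local name is doctor, and the qualified name is the prefixed form d:doctor.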
Next, having entered an element, we get to its value. This is where the characters() method comes in. Its signature is: characters(char[] ch, int start, int length). Everything is clear here: ch is the character array containing the text inside the element, and start and length are helper values giving the starting offset within the array and the number of characters.
Ah, yes, I almost forgot. The object into which the parsed data is stored is of type Doctors. This class is already defined and has all the necessary setters and getters.
Next, obviously, the element ends and the next one follows. endElement() is responsible for the end of an element. It signals that the element has ended, and you can do anything you need at this point. We do just that: we clear thisElement.
Having gone through the whole document, we reach the end of the file, and endDocument() fires. Here we can free memory, print some diagnostics, and so on. In our case we just report that parsing has finished.
So we have a class that parses XML in our format. Here is the full text:
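The original listing is not preserved here, so what follows is only a reconstruction sketch of the steps described above. The element names (doctor, id, firstName, lastName), the handler name DoctorsHandler, the parse helper and the log messages are all assumptions; only the Doctors type, thisElement and the overridden methods come from the text.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Data holder; the article only says it has setters and getters.
class Doctors {
    private int id;
    private String firstName;
    private String lastName;

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }
    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
}

class DoctorsHandler extends DefaultHandler {
    private final Doctors doctor = new Doctors();
    private String thisElement = ""; // name of the element we are currently in

    Doctors getResult() { return doctor; }

    @Override
    public void startDocument() {
        System.out.println("Start parsing XML...");
    }

    @Override
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) {
        thisElement = qName;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Assumes each value arrives in one chunk, which holds for short text.
        String value = new String(ch, start, length);
        switch (thisElement) {
            case "id": doctor.setId(Integer.parseInt(value.trim())); break;
            case "firstName": doctor.setFirstName(value); break;
            case "lastName": doctor.setLastName(value); break;
            default: break; // whitespace between elements and unknown tags
        }
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) {
        thisElement = ""; // we have left the element
    }

    @Override
    public void endDocument() {
        System.out.println("Stop parsing XML...");
    }

    // Parses one doctor record from an XML string.
    static Doctors parse(String xml) {
        DoctorsHandler handler = new DoctorsHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getResult();
    }
}
```

Clearing thisElement in endElement is what keeps the whitespace between tags from being written into a field.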
I hope this helped convey the essence of the SAX parser.
Don't judge my first article too harshly :) I hope it was useful to at least someone.
UPD: To run this parser, you can use this code:
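The snippet itself is missing from this page. Running any DefaultHandler generally comes down to creating a SAXParser through SAXParserFactory and passing the handler to parse. In the sketch below, ParserRunner, CountingHandler and countElements are made-up names, and the counting handler is only a stand-in for the real parser class so that the example is self-contained.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class ParserRunner {
    // Stand-in handler that only counts start tags.
    static class CountingHandler extends DefaultHandler {
        int elements;

        @Override
        public void startElement(String namespaceURI, String localName, String qName, Attributes atts) {
            elements++;
        }
    }

    // Creates a parser, runs the handler over the stream, returns the count.
    static int countElements(InputStream in) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            CountingHandler handler = new CountingHandler();
            parser.parse(in, handler); // fires the handler callbacks
            return handler.elements;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

SAXParser.parse also has overloads that take a java.io.File or an InputSource, so the same call works for a file that was received on the server side.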