Error when parsing XML with SAX after <br>

Posted on 2024-12-17 11:11:35

<description>
SEBI : Decision taken by a listed investment company to dispose of a part of its
       investment is not “price sensitive information” within meaning of SEBI
      (Prohibition of Insider Trading) Regulations, 1992<br>;
      By <b>  [2011] 15 taxmann.com 229 (SAT)</b> 
</description>

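This is the XML I want to parse. I want to read the data after <br>. I am able to parse the text before <br>, but not the text after <br>.

This is my handler class code: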

package com.exercise;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class RSSHandler extends DefaultHandler {

    final int state_unknown = 0;
    final int state_title = 1;
    final int state_description = 2;
    final int state_link = 3;
    final int state_pubdate = 4;
    int currentState = state_unknown;

    RSSFeed feed;
    RSSItem item;

    boolean itemFound = false;

    RSSHandler(){
    }

    RSSFeed getFeed(){
        return feed;
    }

    @Override
    public void startDocument() throws SAXException {
        // TODO Auto-generated method stub
        feed = new RSSFeed();
        item = new RSSItem();

    }



    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // TODO Auto-generated method stub

        if (localName.equalsIgnoreCase("item")){
            itemFound = true;
            item = new RSSItem();
            currentState = state_unknown;
        }
        else if (localName.equalsIgnoreCase("title")){
            currentState = state_title;
        }
        else if (localName.equalsIgnoreCase("description")){
            currentState = state_description;
        }
        else if (localName.equalsIgnoreCase("link")){
            currentState = state_link;
        }
        else if (localName.equalsIgnoreCase("pubdate")){
            currentState = state_pubdate;
        }
        else{
            currentState = state_unknown;
        }

    }


    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // TODO Auto-generated method stub
        currentState = state_unknown;
        if (localName.equalsIgnoreCase("item")){
            feed.addItem(item);
        }


    }

    @Override
    public void characters(char ch[], int start, int length)
            throws SAXException {
        // TODO Auto-generated method stub
        StringBuilder buf = new StringBuilder();

        if (buf != null) {
            for (int i = start; i < start + length; i++) {
                buf.append(ch[i]);
            }

            String strCharacters = buf.toString();

            if (itemFound == true) {
                // "item" tag found, it's item's parameter
                switch (currentState) {
                case state_title:
                    item.setTitle(strCharacters);
                    break;
                case state_description:
                    item.setDescription(strCharacters);  //here data coming
                    break;
                case state_link:
                    item.setLink(strCharacters);
                    break;
                case state_pubdate:
                    item.setPubdate(strCharacters);
                    break;
                default:
                    break;
                }
            } else {
                // no "item" tag found yet, it's feed's parameter
                switch (currentState) {
                case state_title:
                    feed.setTitle(strCharacters);
                    break;
                case state_description:
                    feed.setDescription(strCharacters);
                    break;
                case state_link:
                    feed.setLink(strCharacters);
                    break;
                case state_pubdate:
                    feed.setPubdate(strCharacters);
                    break;
                default:
                    break;
                }
            }

            currentState = state_unknown;
        }
    }
}

Comments (4)

乖乖兔^ω^ 2024-12-24 11:11:35

Something is wrong with the first text you pasted.
Try posting the XML again in code mode (4 spaces at the beginning of each line).

My suspicion is that your XML is URL-encoded and that you will have to decode it before you start handling it.
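
If the feed text really were URL-encoded, decoding it before parsing could look roughly like the sketch below. The class name FeedDecoding and the sample input are assumptions made for illustration; only URLDecoder.decode comes from the standard library. Note that URLDecoder also turns '+' into a space, so it should only be applied to text that is known to be URL-encoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not part of the original poster's code.
class FeedDecoding {
    // Decodes percent-encoded text such as "%3Cbr%3E" back to "<br>".
    static String urlDecode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }
}
```

The decoded string would then be handed to the SAX parser instead of the raw text.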

征﹌骨岁月お 2024-12-24 11:11:35

&amp; is an XML entity reference and means &.

By default, SAX will do the conversion for you, so if your source XML says hello&amp;goodbye you should see hello&goodbye.
Go through this link. It might solve your problem.
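
The entity conversion can be observed with a small experiment. The sketch below is illustrative only: the class EntityChunks, its chunks method and the <greeting> element are assumed names. It records every piece of text the parser passes to characters(). How many pieces there are depends on the parser, and many parsers split the text at entity references, so a handler that keeps only one piece can lose whatever follows an entity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original poster's code.
class EntityChunks {
    // Returns each chunk of text the parser passed to characters(), in order.
    static List<String> chunks(String xml) {
        final List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                out.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return out;
    }
}
```

Joining the chunks for the input <greeting>hello&amp;goodbye</greeting> gives hello&goodbye, whatever the number of chunks turns out to be.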

山川志 2024-12-24 11:11:35

As posted, that XML is not valid. You will probably need to escape the quotes in the doc as well.

I don't know if that is your issue, but it will be a contributor.

(The quotes are around "price sensitive information".)

岛歌少女 2024-12-24 11:11:35

I think the problem in your case is that you are initializing the StringBuilder inside characters(), so a new object is created on every call. Instead of initializing it in characters(), try initializing it in startElement():

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {

        // buf must be declared as a field of the handler
        // so that characters() can append to it
        buf = new StringBuilder();
        ..........
    }
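
A fuller version of this idea might look like the following. It is a minimal sketch under stated assumptions, not the original poster's handler: AccumulatingHandler, parseDescription and the sample inputs are hypothetical, and the <br> is assumed to arrive escaped as &lt;br&gt; so that the document is well-formed. The sketch matches on qName because the default JAXP factory is not namespace-aware; the question's code matches on localName.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects all text of <description> across
// several characters() calls instead of keeping only one chunk.
class AccumulatingHandler extends DefaultHandler {
    private final StringBuilder buf = new StringBuilder();
    private boolean inDescription = false;
    private String description;

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        if (qName.equalsIgnoreCase("description")) {
            inDescription = true;
            buf.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inDescription) {
            buf.append(ch, start, length); // may be called several times
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equalsIgnoreCase("description")) {
            description = buf.toString(); // the full text is known only here
            inDescription = false;
        }
    }

    String getDescription() {
        return description;
    }

    // Illustrative helper: parses an XML string and returns the description text.
    static String parseDescription(String xml) {
        try {
            AccumulatingHandler handler = new AccumulatingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getDescription();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Because the buffer is reset only when <description> starts and read only when it ends, text after an entity reference or inside a nested element such as <b> is kept rather than dropped.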