Error parsing XML with JDOM - Content is not allowed in prolog

Posted on 2024-10-15 09:26:50 · 1374 characters · 7 views · 0 comments


I get this error while parsing an XML file with JDOM.
I receive a stream of data that is an XML document combined with a PDF as an attachment, so when I try to create a document from it, this error is thrown.
I printed the stream. On the console it shows a lot of junk characters (the PDF contents), but in WordPad it looks like this:

------=_Part_2_23286828.1296553488632
Content-Type: text/xml; charset=utf-8

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
....
....
....
    <Attachment>
        <URI>Filename.pdf</URI>
    </Attachment>
</SOAP-ENV:Envelope>
------=_Part_2_23286828.1296553488632
Content-Type: application/pdf; name="Filename.pdf"
Content-Transfer-Encoding: binary
Content-ID: </Attachment[1]/URI[1]>
Content-Disposition: attachment; filename="Filename.pdf"

%PDF-1.4
%âãÏÓ
4 0 obj <</Type/XObject/ColorSpace/DeviceRGB/Subtype/Image/BitsPerComponent 8/Width 579/Length 52722/Height 480/Filter/DCTDecode>>stream
ÿØÿà 

Please note that the XML between <SOAP-ENV:Envelope> and </SOAP-ENV:Envelope> is well-formed.
How can I create a JDOM document from it? I guess by removing the content before and after the XML start/end tags, but how can I do that cleanly?
I read that BOMInputStream from Apache Commons IO is helpful, but I believe it is only available in version 2.x and I am using version 1.3.1.
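
For the narrow question of stripping the MIME wrapper by hand, a minimal sketch might look like the following. It is only an illustration under several assumptions: the whole response fits in memory, the XML part comes first, and the boundary string is already known (here it is hard-coded from the dump above, where each delimiter line is `--` followed by the boundary). The names `MultipartXmlExtractor` and `firstPartBody` are made up for this example, the sample envelope is a shortened stand-in for the real one, and the JDK's own DOM parser is used only to show that the extracted bytes parse. The same bytes could be handed to JDOM's `SAXBuilder.build(InputStream)` instead.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: not part of JDOM or any library.
class MultipartXmlExtractor {

    /** Returns the body of the first MIME part, or null if none is found. */
    static byte[] firstPartBody(byte[] message, String boundary) {
        // ISO-8859-1 maps every byte to exactly one char, so the binary
        // PDF part cannot shift the indexes computed below.
        String text = new String(message, StandardCharsets.ISO_8859_1);
        String delimiter = "--" + boundary;

        int start = text.indexOf(delimiter);
        if (start < 0) return null;

        // Part headers end at the first blank line (CRLF CRLF or LF LF).
        int crlf = text.indexOf("\r\n\r\n", start);
        int lf = text.indexOf("\n\n", start);
        int bodyStart;
        if (crlf >= 0 && (lf < 0 || crlf < lf)) bodyStart = crlf + 4;
        else if (lf >= 0) bodyStart = lf + 2;
        else return null;

        // The body runs up to the next delimiter line.
        int bodyEnd = text.indexOf(delimiter, bodyStart);
        if (bodyEnd < 0) return null;

        // trim() drops the line break before the delimiter; that is safe
        // for an XML part but would not be for a binary part.
        return text.substring(bodyStart, bodyEnd).trim()
                   .getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws Exception {
        String boundary = "----=_Part_2_23286828.1296553488632";
        // Shortened stand-in for the real response.
        String sample =
            "--" + boundary + "\r\n"
            + "Content-Type: text/xml; charset=utf-8\r\n"
            + "\r\n"
            + "<SOAP-ENV:Envelope xmlns:SOAP-ENV=\"http://schemas.xmlsoap.org/soap/envelope/\">"
            + "<Attachment><URI>Filename.pdf</URI></Attachment>"
            + "</SOAP-ENV:Envelope>\r\n"
            + "--" + boundary + "\r\n"
            + "Content-Type: application/pdf; name=\"Filename.pdf\"\r\n"
            + "\r\n"
            + "%PDF-1.4\r\n"
            + "--" + boundary + "--\r\n";

        byte[] xml = firstPartBody(
            sample.getBytes(StandardCharsets.ISO_8859_1), boundary);

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                              .parse(new ByteArrayInputStream(xml));
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

Passing the untouched stream to the parser fails because the first thing it sees is the boundary line rather than `<`, which is what "Content is not allowed in prolog" reports. A dedicated MIME or SOAP library is likely to be more robust than this kind of string search.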

I hope this explains my problem; if not, please let me know.
Thank you.

UPDATE
At first I didn't realize it would be this cumbersome.
I am actually making a call from one servlet to another (doPost) using HttpURLConnection, and the response comes back in the form of this stream.
I am now also exploring whether I can extract the XML part using any of the methods provided by HttpURLConnection/URLConnection.
I would appreciate it if anyone could shed more light on this.
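
As far as I know, URLConnection has no method that splits a multipart body, but `getContentType()` returns the Content-Type header of the response, and for a multipart response that header normally carries the boundary as a parameter. The sketch below shows one way to pull the boundary out of such a value. The header string in `main` is hypothetical (the post does not show the actual response headers), `ContentTypeBoundary` and `boundaryOf` are made-up names, and splitting on `;` is a simplification that would break if a quoted parameter value itself contained a semicolon.

```java
import java.util.Locale;

// Hypothetical helper: not part of java.net or any library.
class ContentTypeBoundary {

    /** Returns the boundary parameter of a Content-Type value, or null. */
    static String boundaryOf(String contentType) {
        if (contentType == null) return null;
        for (String param : contentType.split(";")) {
            String p = param.trim();
            // Split on the first '=' only: the boundary itself may contain '='.
            int eq = p.indexOf('=');
            if (eq < 0) continue;
            String name = p.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("boundary")) continue;
            String value = p.substring(eq + 1).trim();
            // Strip optional surrounding quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value;
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed header value; with a live connection it would come from
        // HttpURLConnection.getContentType().
        String header = "multipart/related; type=\"text/xml\"; "
                + "boundary=\"----=_Part_2_23286828.1296553488632\"";
        System.out.println(boundaryOf(header));
    }
}
```

The returned value, prefixed with `--`, is the delimiter line that separates the parts in the body.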


Comments (1)

旧伤慢歌 2024-10-22 09:26:50

This message conforms to the SOAP with Attachments specification (http://www.w3.org/TR/SOAP-attachments). In Java, the way to parse these messages is to use an implementation of SAAJ (SOAP with Attachments API for Java: http://download.oracle.com/javaee/5/tutorial/doc/bnbhf.html). There are a couple of different implementations of SAAJ out there. My personal favorite is the Spring-WS implementation; another option is Apache Axiom.

My suggestion would be to use either Spring-WS or Apache Axis to process this message rather than trying to do it manually from an input stream. Are you trying to do this on the server side or on the client side?
