Strange interaction between jdom and ssis
I apologize for the long post, but this problem is not easily stated.
I recently wrote a piece of Java to reconfigure some SSIS packages for a colleague, using jdom to parse and manipulate the XML. The program worked, but the resulting files crashed. We were able to trace the crash to an odd mostly-nonprinting character in the original files, which was not reproduced in the files written by jdom.
What's strange about this character is that it doesn't show up in all editors. The Oxygen XML editor, for example, doesn't display it at all. In Notepad, however, the original copyright notice appears like this:
<DTS:Property DTS:Name="TaskContact">Execute SQL Task; Microsoft Corporation; Microsoft
SQL Server v9; Â© 2004 Microsoft Corporation; All Rights
Reserved;http://www.microsoft.com/sql/support/default.asp;1</DTS:Property>
and the transformed version of the same element:
<DTS:Property DTS:Name="TaskContact">Execute SQL Task; Microsoft Corporation; Microsoft
SQL Server v9; © 2004 Microsoft Corporation; All Rights
Reserved;http://www.microsoft.com/sql/support/default.asp;1</DTS:Property>
(the problem character is the Â just before the copyright symbol)
Running a global replace on the packages in question, where Â -> "" and © -> "(c)", made the problem go away, but now it turns out that the problem comes back when unmodified elements are put into the modified packages, so now I'm not as sure what is at the root of the problem.
Again, I'm sorry for the long post, but I didn't want to leave out any details. Any insights or suggestions would be greatly appreciated; I'm pretty well stumped.
My colleague will be sending me the error messages from his attempts to load these; I can post them if they're useful.
Comments (1)
As to the root of the problem: the file is being written in one encoding and read in another. See my answer to this question: "£ becomes Â£ Why? XML ISO encoding issue?"
The same explanation applies here; just substitute the copyright symbol, © (Unicode U+00A9), for the pound sign, £. Hopefully you can find the place where the encoding mixup is occurring.
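To make the mixup concrete, here is a minimal sketch using only the Java standard library. It assumes the package was written as UTF-8 and then read as ISO-8859-1; which encodings are actually involved in your pipeline is something you would need to confirm. The class and method names (EncodingMixup, writeUtf8ReadLatin1, repair, hex) are made up for illustration and are not part of JDOM or SSIS. In UTF-8 the copyright symbol is the two bytes C2 A9, and a reader that treats each byte as one character shows them as Â followed by ©.

```java
import java.nio.charset.StandardCharsets;

class EncodingMixup {
    // Encode text as UTF-8, then (wrongly) decode the same bytes as ISO-8859-1.
    static String writeUtf8ReadLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the mixup: recover the raw bytes and decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated hex pairs, useful for inspecting a file.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // \u00A9 is the copyright symbol; escapes keep this source file ASCII-only.
        String original = "\u00A9 2004 Microsoft Corporation";
        String garbled = writeUtf8ReadLatin1(original);

        // Bytes of the copyright symbol in UTF-8.
        System.out.println(hex("\u00A9".getBytes(StandardCharsets.UTF_8)));
        // The garbled text gains one extra character, U+00C2, at the front.
        System.out.println(garbled.charAt(0) == '\u00C2');
        System.out.println(garbled.length() - original.length());
        // Decoding with the right charset restores the original text.
        System.out.println(repair(garbled).equals(original));
    }
}
```

If this is what is happening, deleting the extra character by search and replace only hides the symptom. The more durable fix is to make the reader and the writer agree on one encoding, and to make sure the encoding named in the XML declaration matches the bytes actually written.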