RDF reading/parsing errors
I have some RDF files that I want to import into a triplestore (AllegroGraph), but on the first file I get a SAX parser error stating that there is an unrecognized character. After removing the line in question, the import works.
I then ran the RDF, including the offending line, through the W3C RDF validator and Jena, but all I got were some warnings about undefined languages (and nothing at all about the offending line).
Could you please suggest a method (in Java if possible) for finding errors in RDF files?
Edit: The line in question is:
<gn:alternateName xml:lang="got">𐌰𐍆𐌲𐌰𐌽𐌹𐍃𐍄𐌰𐌽</gn:alternateName>
Comments (1)
You can use Sesame's Rio parser to do validation. There are instructions in this blog post on how to work with Rio in general. For validation specifically, the trick is to create and attach a ParseErrorListener, which receives detailed warnings and errors from the parser.
However, since you mention that the problem you encounter is at the SAX/XML level, you could also just use a generic XML validator to see what's wrong. The most likely cause (though it's hard to tell without more details) is that there is an incorrectly encoded character somewhere in the file.
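As a rough illustration of the generic XML check, here is a minimal sketch that uses only the SAX parser bundled with the JDK, not Rio or AllegroGraph. The class name `XmlWellFormednessCheck` and the method `check` are made up for this example. It only tests XML well-formedness and reports the line and column of each problem; it knows nothing about RDF itself, and a different parser (such as the one inside the triplestore) may accept or reject different input.

```java
import java.io.CharConversionException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this example only.
public class XmlWellFormednessCheck {

    /** Parses the stream as XML and returns one message per reported problem. */
    public static List<String> check(InputStream in) throws Exception {
        final List<String> problems = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // RDF/XML relies on namespaces

        // Collect recoverable problems; fatal errors are thrown by the
        // default fatalError() implementation and handled below.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add(describe("warning", e));
            }

            @Override
            public void error(SAXParseException e) {
                problems.add(describe("error", e));
            }
        };

        try {
            factory.newSAXParser().parse(in, handler);
        } catch (SAXParseException e) {
            // Not well-formed, e.g. a character that XML does not allow.
            problems.add(describe("fatal", e));
        } catch (CharConversionException e) {
            // Some parsers report bytes that do not match the declared
            // encoding as an I/O-level exception without a position.
            problems.add("encoding: " + e.getMessage());
        }
        return problems;
    }

    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java XmlWellFormednessCheck path/to/file.rdf
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            List<String> problems = check(in);
            if (problems.isEmpty()) {
                System.out.println("No XML-level problems reported.");
            }
            for (String problem : problems) {
                System.out.println(problem);
            }
        }
    }
}
```

If this check reports nothing for the file that AllegroGraph rejects, the file is probably well-formed XML. In that case the difference more likely lies in how the importing parser handles certain characters, and attaching a ParseErrorListener to Rio as described above would be the next thing to try.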