Using Jena to read an ontology, feed it RDF triples, and produce correct RDF string output

Posted on 2024-09-02 01:34:12

I have an ontology, which I read in with Jena to help me scrape some RDFa triples from a website. I don't currently store these triples in a Jena model, but that is fairly straightforward to do; it's next on my to-do list.

The area I am struggling with, though, is getting Jena to output correct RDF for the ontology I have. The ontology uses OWL and RDFS definitions, but when I pass some example triples into the model, they don't appear correctly. It is almost as if Jena doesn't know anything about the ontology. The output is still valid RDF; it's just not coming out in the form I was hoping for.

Am I correct in thinking that Jena should be able to produce well-written RDF (not just valid RDF) for the triples I have collected, based on the ontology, or does this go beyond what it is capable of?

Many thanks for any input.

Update 1

Examples:

This is what we currently have:

<rdf:Description rdf:about='http://theinternet.com/%3fq=Club/325'>
    <j.0:hasName>Manchester United</j.0:hasName>
    <j.0:hasPlayer>
        <rdf:Description rdf:about='http://theinternet.com/%3fq=player/291/'>
        </rdf:Description>
    </j.0:hasPlayer>
    <j.0:hasEmblem>http://theinternet.com/images/manutd.jpg</j.0:hasEmblem>
    <j.0:hasWebsite>http://www.manutd.com/</j.0:hasWebsite>
</rdf:Description>

</rdf:RDF>

This is what we ideally want:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
         xmlns:ontology="http://theinternet.com/ontology.rdf#">

<rdf:Description rdf:about='http://theinternet.com/%3fq=Club/325'>
    <rdf:type rdf:resource='ontology:Club' />
    <ontology:hasName>Manchester United</ontology:hasName>
    <ontology:hasPlayer>
        <rdf:Description rdf:about='http://theinternet.com/%3fq=player/291/'>
            <rdf:type rdf:resource='ontology:Player' />
        </rdf:Description>
    </ontology:hasPlayer>
    <ontology:hasEmblem>http://theinternet.com/images/manutd.jpg</ontology:hasEmblem>
    <ontology:hasWebsite>http://www.manutd.com/</ontology:hasWebsite>
</rdf:Description>

</rdf:RDF>

To me it just looks like Jena is missing things to do with the ontology, such as the resource types etc. I have this feeling I'm using Jena wrongly.

Comments (2)

云之铃。 2024-09-09 01:34:12

If you want well-written RDF (XML, I assume), use the RDF/XML-ABBREV writer. The defaults are usually fine; however, you will find tuning instructions here.

Without an example of the problem output it's difficult to know what your problem is. Are you seeing things like <j.0:SomeClass>? That's a prefix issue. If the prefixes are defined in the original RDFa document then you've lost them somehow, but it ought to be easy to fix. Otherwise you can set them manually on the model using the methods in PrefixMapping (which Model extends).

Updated answer

Thanks for the example. Prefixes are the main issue here.

model.setNsPrefix("ontology", "http://theinternet.com/ontology.rdf#");
model.setNsPrefix("dc",   DC_11.NS);
model.setNsPrefix("owl",  OWL.NS);
model.setNsPrefix("rdfs", RDFS.NS);
model.setNsPrefix("xsd",  XSD.NS);

(DC_11.NS et al. are defined in the Jena vocabulary package.)
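As a rough end-to-end sketch of the prefix fix, the following builds a small model from the question's sample data, registers the `ontology` prefix, and serialises with the RDF/XML-ABBREV writer. It assumes a current Apache Jena release (the `org.apache.jena` packages; older releases used `com.hp.hpl.jena`), and the class name `PrefixedRdfXml` and method `buildRdfXml` are illustrative only. The `rdf:type` statements are added by hand here, which is an assumption about how the scraped data would be typed.

```java
import java.io.StringWriter;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

public class PrefixedRdfXml {
    // Namespace taken from the question's desired output.
    static final String ONT = "http://theinternet.com/ontology.rdf#";

    static String buildRdfXml() {
        Model model = ModelFactory.createDefaultModel();
        // Without this mapping the writer invents prefixes such as j.0.
        model.setNsPrefix("ontology", ONT);

        Property hasName = model.createProperty(ONT, "hasName");
        Property hasPlayer = model.createProperty(ONT, "hasPlayer");

        // rdf:type objects are full URIs, not prefixed names.
        Resource player = model
                .createResource("http://theinternet.com/%3fq=player/291/")
                .addProperty(RDF.type, model.createResource(ONT + "Player"));

        model.createResource("http://theinternet.com/%3fq=Club/325")
                .addProperty(RDF.type, model.createResource(ONT + "Club"))
                .addProperty(hasName, "Manchester United")
                .addProperty(hasPlayer, player);

        // Serialise with the abbreviating RDF/XML writer.
        StringWriter out = new StringWriter();
        model.write(out, "RDF/XML-ABBREV");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildRdfXml());
    }
}
```

With the prefix registered, property elements should come out as `ontology:hasName` and so on rather than `j.0:hasName`.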

Note that rdf:resource (like rdf:about) takes a full URI, so

<rdf:type rdf:resource='ontology:Club' />

does not work. Using the showDoctypeDeclaration option will abbreviate using XML entities.
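To make the difference concrete, here is a hand-written fragment (not actual Jena output) showing the two forms that do work: the full URI spelled out, and the same kind of URI abbreviated through an XML entity declared in the DOCTYPE, which is the style the showDoctypeDeclaration option is described as producing.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY ontology "http://theinternet.com/ontology.rdf#">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ontology="http://theinternet.com/ontology.rdf#">
  <rdf:Description rdf:about="http://theinternet.com/%3fq=Club/325">
    <!-- Full URI spelled out -->
    <rdf:type rdf:resource="http://theinternet.com/ontology.rdf#Club"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://theinternet.com/%3fq=player/291/">
    <!-- Same idea, abbreviated with an XML entity -->
    <rdf:type rdf:resource="&ontology;Player"/>
  </rdf:Description>
</rdf:RDF>
```

Namespace prefixes only apply to element and attribute names, whereas entities are expanded inside attribute values, which is why `&ontology;Club` resolves to the full URI and `ontology:Club` does not.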

Incidentally, which RDFa parser did you use? The prefix definitions ought to pass through.

赠我空喜 2024-09-09 01:34:12

You are missing the rdf:type properties because you haven't loaded any ontology containing the required rdfs:domain or rdfs:range statements and I don't think you've used any reasoner to make these inferences.

You can load the domain or range statements along with the rest of the data, or Jena has a facility for automatically loading an ontology when it sees an owl:imports statement. I'd suggest the former to keep things simple.

Jena's RDFS reasoner, documented here http://jena.sourceforge.net/inference/, will do the reasoning you want.
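A minimal sketch of that inference, assuming a current Apache Jena release (`org.apache.jena` packages). The two schema statements below stand in for the real ontology file and are assumptions: the question does not show which rdfs:domain or rdfs:range axioms the ontology actually contains. The class name `InferTypes` and its helper methods are illustrative only.

```java
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class InferTypes {
    static final String ONT = "http://theinternet.com/ontology.rdf#";
    static final String CLUB = "http://theinternet.com/%3fq=Club/325";
    static final String PLAYER = "http://theinternet.com/%3fq=player/291/";

    static InfModel buildInferredModel() {
        // Schema: assumed axioms standing in for the real ontology.
        Model schema = ModelFactory.createDefaultModel();
        Property hasPlayer = schema.createProperty(ONT, "hasPlayer");
        schema.add(hasPlayer, RDFS.domain, schema.createResource(ONT + "Club"));
        schema.add(hasPlayer, RDFS.range, schema.createResource(ONT + "Player"));

        // Data: the scraped triple, with no rdf:type statements at all.
        Model data = ModelFactory.createDefaultModel();
        Resource player = data.createResource(PLAYER);
        data.createResource(CLUB).addProperty(hasPlayer, player);

        // Wrap schema and data in an RDFS inference model.
        return ModelFactory.createRDFSModel(schema, data);
    }

    // True if the model (including inferences) says subject is of type cls.
    static boolean hasType(Model m, String subject, String cls) {
        return m.contains(m.getResource(subject), RDF.type, m.getResource(cls));
    }

    public static void main(String[] args) {
        InfModel inf = buildInferredModel();
        System.out.println("club typed:   " + hasType(inf, CLUB, ONT + "Club"));
        System.out.println("player typed: " + hasType(inf, PLAYER, ONT + "Player"));
    }
}
```

Writing the inference model out (rather than the raw data model) is what would make the inferred rdf:type statements appear in the serialised RDF.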

btw, I've found Sesame to be a lot easier to use and more robust than Jena for large-scale stuff, although for scraping a few triples either would be fine.

bbtw, Turtle (a subset of N3) is much easier to read and edit than RDF/XML. It's well worth learning. I've been working with RDF constantly for the last 3 years and now convert all RDF/XML to Turtle before dealing with any raw data (although I do have a nice tool that writes everything in a useful order and automatically inserts backreference comments etc.).

good luck
