How do I process the RDF version of a DBpedia page with Jena?

Posted 2024-10-16 07:58:46

Every DBpedia page, e.g.

http://dbpedia.org/page/Ireland

has a link to an RDF file.
In my application I need to analyse the RDF and run some logic on it.
I could rely on the DBpedia SPARQL endpoint, but I'd prefer to download the RDF and parse it locally, to have full control over it.

I installed Jena and I'm trying to parse the RDF and extract, for example, a property called "geo:geometry".

I'm trying with:

StringReader sr = new StringReader( node.rdfCode )      
Model model = ModelFactory.createDefaultModel()
model.read( sr, null )

How can I query the model to get the info I need?

For example, if I wanted to get the statement:

<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<geo:geometry xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" rdf:datatype="http://www.openlinksw.com/schemas/virtrdf#Geometry">POINT(-7 53)</geo:geometry>
</rdf:Description>

Or

<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<dbpprop:countryLargestCity xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="en">Dublin</dbpprop:countryLargestCity>
</rdf:Description>

What is the right filter?

Many thanks!
Mulone


顾北清歌寒 2024-10-23 07:58:46


Once you have the file parsed in a Jena model you can iterate and filter with something like:

// Property used to filter the model
Property geoProperty =
    model.createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#",
                         "geometry");

// Iterator based on a SimpleSelector
StmtIterator iter =
    model.listStatements(new SimpleSelector(null, geoProperty, (RDFNode) null));

// Loop over the statements that match the SimpleSelector
while (iter.hasNext()) {
    Statement stmt = iter.nextStatement();
    System.out.print(stmt.getSubject().toString());
    System.out.print(stmt.getPredicate().toString());
    System.out.println(stmt.getObject().toString());
}

SimpleSelector lets you pass any (subject, predicate, object) pattern to match statements in the model; a null argument matches anything in that position. In your case, since you only care about a specific predicate, the first and third constructor arguments are null.
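To make the null-as-wildcard idea concrete, here is a minimal sketch in plain Java with no Jena dependency. `Triple` and `TriplePattern` are hypothetical stand-ins invented for illustration; this is not how Jena implements SimpleSelector, only the same matching rule applied to a list of string triples.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an RDF statement; not a Jena class.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical helper that mimics the (subject, predicate, object) pattern idea.
final class TriplePattern {
    // Returns the triples that match the pattern; a null argument matches anything.
    static List<Triple> select(List<Triple> triples, String s, String p, String o) {
        List<Triple> hits = new ArrayList<>();
        for (Triple t : triples) {
            if (matches(s, t.subject) && matches(p, t.predicate) && matches(o, t.object)) {
                hits.add(t);
            }
        }
        return hits;
    }

    // A null "wanted" value is a wildcard; otherwise the values must be equal.
    private static boolean matches(String wanted, String actual) {
        return wanted == null || wanted.equals(actual);
    }
}
```

Calling `TriplePattern.select(triples, null, geometryUri, null)` corresponds to the selector above: only the predicate is fixed, so every subject and object is accepted.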

Filtering on two different properties

For more complex filtering you can override the selects method of SimpleSelector (a class, not an interface), like this:

final Property geoProperty = /* like before */;
final Property countryLargestCityProperty =
    model.createProperty("http://dbpedia.org/property/",
                         "countryLargestCity");

// Accept every statement whose predicate is one of the two properties
SimpleSelector selector = new SimpleSelector(null, null, (RDFNode) null) {
    @Override
    public boolean selects(Statement s) {
        return s.getPredicate().equals(geoProperty)
            || s.getPredicate().equals(countryLargestCityProperty);
    }
};
StmtIterator iter = model.listStatements(selector);
while (iter.hasNext()) {
    /* same as in the previous example */
}

Edit: including a full example

Here is a full example that works for me.

import com.hp.hpl.jena.util.FileManager;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.SimpleSelector;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.StmtIterator;
import com.hp.hpl.jena.rdf.model.Statement;

public class TestJena {

    public static void main(String[] args) {
        FileManager fManager = FileManager.get();
        fManager.addLocatorURL();
        Model model = fManager.loadModel("http://dbpedia.org/data/Ireland.rdf");

        Property geoProperty =
            model.createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#",
                                 "geometry");

        StmtIterator iter =
            model.listStatements(new SimpleSelector(null, geoProperty, (RDFNode) null));

        //Loop to traverse the statements that match the SimpleSelector
        while (iter.hasNext()) {
            Statement stmt = iter.nextStatement();
            if (stmt.getObject().isLiteral()) {
                Literal obj = (Literal) stmt.getObject();
                System.out.println("The geometry predicate value is " + 
                                                          obj.getString());
            }   
        }   
    }   

}

This full example prints out:

The geometry predicate value is POINT(-7 53)
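If the application needs numbers rather than the raw literal, the printed value can be split into coordinates. The sketch below is plain Java and makes two assumptions not stated above: that the literal is a simple two-dimensional WKT point, and that the first number is the longitude and the second the latitude (which fits Ireland at roughly 7°W, 53°N). `WktPoint` is a hypothetical helper name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for literals such as "POINT(-7 53)".
final class WktPoint {
    // Two decimal numbers separated by whitespace inside POINT( ... ).
    private static final Pattern POINT = Pattern.compile(
        "^\\s*POINT\\s*\\(\\s*(-?\\d+(?:\\.\\d+)?)\\s+(-?\\d+(?:\\.\\d+)?)\\s*\\)\\s*$");

    final double longitude; // assumed: first coordinate (x)
    final double latitude;  // assumed: second coordinate (y)

    private WktPoint(double longitude, double latitude) {
        this.longitude = longitude;
        this.latitude = latitude;
    }

    // Parses the literal or throws if it is not a simple 2D point.
    static WktPoint parse(String text) {
        Matcher m = POINT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a WKT point: " + text);
        }
        return new WktPoint(Double.parseDouble(m.group(1)),
                            Double.parseDouble(m.group(2)));
    }
}
```

In the full example, `WktPoint.parse(obj.getString())` could be called inside the loop in place of the println.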

Notes on Linked Data

http://dbpedia.org/page/Ireland is the HTML document version of the resource http://dbpedia.org/resource/Ireland.

To get the RDF you should resolve either:

http://dbpedia.org/data/Ireland.rdf

or

http://dbpedia.org/resource/Ireland with Accept: application/rdf+xml in the HTTP header.
With curl it would be something like:

curl -L -H 'Accept: application/rdf+xml' http://dbpedia.org/resource/Ireland
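The same content negotiation can be done from Java. This is a minimal sketch using the JDK's own java.net.http client (Java 11 or later), not a Jena API; `RdfFetcher` and its method names are made up for illustration, and error handling is left out.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper that asks a Linked Data server for RDF/XML.
final class RdfFetcher {
    // Builds a GET request carrying the Accept header used in the curl command.
    static HttpRequest rdfRequest(String resourceUri) {
        return HttpRequest.newBuilder(URI.create(resourceUri))
                .header("Accept", "application/rdf+xml")
                .GET()
                .build();
    }

    // Sends the request, following redirects like curl -L, and returns the body.
    static String fetch(String resourceUri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(rdfRequest(resourceUri), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

The string returned by `RdfFetcher.fetch("http://dbpedia.org/resource/Ireland")` could then be passed to `model.read(new StringReader(rdf), null)` as in the question.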
