Parsing N-Triples RDF in Java

Posted 2024-11-04 06:14:00

I want to parse an RDF file that is in N-Triples format.

I could write my own parser, but I would rather use a library, and Jena seems unnecessarily complicated for this purpose (or at least I can't find anything in its documentation that explains how to read N-Triples in a sensible way).

Could you either point me to a useful library or, if you know Sesame or Jena well, explain how they can solve this?

Comments (3)

情痴 2024-11-11 06:14:00

With Jena it is not so difficult:

Given a file rdfexample.ntriple containing the following RDF in N-TRIPLE form (example taken from here):

<http://www.recshop.fake/cd/Hide your heart> <http://www.recshop.fake/cd#year> "1988" .
<http://www.recshop.fake/cd/Hide your heart> <http://www.recshop.fake/cd#price> "9.90" .
<http://www.recshop.fake/cd/Hide your heart> <http://www.recshop.fake/cd#company> "CBS Records" .
<http://www.recshop.fake/cd/Hide your heart> <http://www.recshop.fake/cd#country> "UK" .
<http://www.recshop.fake/cd/Hide your heart> <http://www.recshop.fake/cd#artist> "Bonnie Tyler" .
<http://www.recshop.fake/cd/Empire Burlesque> <http://www.recshop.fake/cd#year> "1985" .
<http://www.recshop.fake/cd/Empire Burlesque> <http://www.recshop.fake/cd#price> "10.90" .
<http://www.recshop.fake/cd/Empire Burlesque> <http://www.recshop.fake/cd#company> "Columbia" .
<http://www.recshop.fake/cd/Empire Burlesque> <http://www.recshop.fake/cd#country> "USA" .
<http://www.recshop.fake/cd/Empire Burlesque> <http://www.recshop.fake/cd#artist> "Bob Dylan" .

the following code

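import java.io.InputStream;

// Jena 2.x package names, matching the Model class named in this answer.
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.FileManager;

public class ReadNTriples { // class name is arbitrary
    public static void main(String[] args) {
        String fileNameOrUri = "src/a/rdfexample.ntriple";
        Model model = ModelFactory.createDefaultModel();
        InputStream is = FileManager.get().open(fileNameOrUri);
        if (is != null) {
            // Parse the stream as N-Triples (no base URI), then re-serialize it as Turtle.
            model.read(is, null, "N-TRIPLE");
            model.write(System.out, "TURTLE");
        } else {
            System.err.println("cannot read " + fileNameOrUri);
        }
    }
}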

reads the file, and prints it out in TURTLE form:

<http://www.recshop.fake/cd/Hide your heart>
      <http://www.recshop.fake/cd#artist>
              "Bonnie Tyler" ;
      <http://www.recshop.fake/cd#company>
              "CBS Records" ;
      <http://www.recshop.fake/cd#country>
              "UK" ;
      <http://www.recshop.fake/cd#price>
              "9.90" ;
      <http://www.recshop.fake/cd#year>
              "1988" .

<http://www.recshop.fake/cd/Empire Burlesque>
      <http://www.recshop.fake/cd#artist>
              "Bob Dylan" ;
      <http://www.recshop.fake/cd#company>
              "Columbia" ;
      <http://www.recshop.fake/cd#country>
              "USA" ;
      <http://www.recshop.fake/cd#price>
              "10.90" ;
      <http://www.recshop.fake/cd#year>
              "1985" .

So, with Jena you can easily parse RDF in any serialization it supports into a com.hp.hpl.jena.rdf.model.Model object (org.apache.jena.rdf.model.Model in Jena 3 and later), which you can then manipulate programmatically.
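To see what the Turtle output above does with the input, here is a rough, library-free sketch of the grouping step only: triples that share a subject are folded into a single statement with a `;`-separated predicate-object list. The class `TurtleGroupingSketch`, its `render` method, and the example.org IRIs are made up for illustration; this is not how Jena's writer is implemented, and it handles no prefixes, sorting, or escaping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; Jena's real Turtle writer does much more.
class TurtleGroupingSketch {

    // Groups already-tokenized triples {subject, predicate, object} by subject
    // and renders each group as one Turtle statement.
    static String render(List<String[]> triples) {
        // LinkedHashMap keeps subjects in first-seen order.
        Map<String, List<String[]>> bySubject = new LinkedHashMap<>();
        for (String[] t : triples) {
            bySubject.computeIfAbsent(t[0], k -> new ArrayList<>()).add(t);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<String[]>> e : bySubject.entrySet()) {
            sb.append(e.getKey()).append('\n');
            List<String[]> group = e.getValue();
            for (int i = 0; i < group.size(); i++) {
                String[] t = group.get(i);
                // ';' continues the same subject, '.' ends the statement.
                String end = (i == group.size() - 1) ? " ." : " ;";
                sb.append("      ").append(t[1]).append(' ').append(t[2])
                  .append(end).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder data, not taken from the example file.
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {"<http://example.org/cd/1>", "<http://example.org/cd#year>", "\"1988\""});
        triples.add(new String[] {"<http://example.org/cd/1>", "<http://example.org/cd#country>", "\"UK\""});
        triples.add(new String[] {"<http://example.org/cd/2>", "<http://example.org/cd#year>", "\"1985\""});
        System.out.print(render(triples));
    }
}
```

Unlike the Jena output, the sketch keeps predicates in input order rather than sorting them, which is enough to show why the Turtle form repeats each subject only once.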

花伊自在美 2024-11-11 06:14:00

If you just want to parse N-Triples and don't need anything beyond basic processing and querying, you could try NxParser. It is a very simple piece of Java code that parses any N-Triples-like format (N-Quads and so on) and gives you an iterator over the statements in the file. If you only want N-Triples, you can easily ignore statements with fewer or more than three terms.

Adapting the example on the linked page gives the following simple code:

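import java.io.FileInputStream;

// Package names as I recall them from the NxParser distribution; adjust if your version differs.
import org.semanticweb.yars.nx.Node;
import org.semanticweb.yars.nx.parser.NxParser;

NxParser nxp = new NxParser(new FileInputStream("filetoparse.nq"),false);

while (nxp.hasNext()) 
{
  Node[] ns = nxp.next();
  if (ns.length == 3)
  {
    //Only process triples; statements with any other number of terms are skipped
    //Replace the print statements with whatever you want
    for (Node n: ns) 
    {
      System.out.print(n.toN3());
      System.out.print(" ");
    }
    System.out.println(".");
  }
}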
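To make the "keep only statements with exactly three terms" idea concrete without any dependency, here is a minimal sketch that uses only `java.util.regex`. The class `NxLineSketch` and its methods are hypothetical and not part of NxParser, and the regex covers only the simple cases (IRIs, blank nodes with alphanumeric labels, and literals with an optional language tag or datatype). It is not a conforming N-Triples parser and does no validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NxParser: splits one N-Triples/N-Quads-like
// line into its terms using a deliberately simplified regex.
class NxLineSketch {

    // One term: <iri>, _:label, or "literal" with optional @lang or ^^<datatype>.
    private static final Pattern TERM = Pattern.compile(
        "<[^>]*>"
        + "|_:[A-Za-z0-9]+"
        + "|\"(?:[^\"\\\\]|\\\\.)*\"(?:@[A-Za-z0-9-]+|\\^\\^<[^>]*>)?");

    // Returns the terms found on the line, in order; comment and blank lines yield none.
    static List<String> terms(String line) {
        List<String> out = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return out;
        }
        Matcher m = TERM.matcher(trimmed);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // True when the line carries exactly three terms, i.e. looks like a triple.
    static boolean isTriple(String line) {
        return terms(line).size() == 3;
    }

    public static void main(String[] args) {
        // Placeholder lines for illustration.
        String[] lines = {
            "<http://example.org/s> <http://example.org/p> \"1988\" .",
            "<http://example.org/s> <http://example.org/p> <http://example.org/o> <http://example.org/g> .",
            "# a comment"
        };
        for (String line : lines) {
            System.out.println(isTriple(line) + "  " + terms(line));
        }
    }
}
```

For real data a tested parser is still the better choice, since edge cases such as Unicode escapes and malformed lines are where hand-rolled regexes tend to break.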

忘你却要生生世世 2024-11-11 06:14:00

Old question, but since you explicitly ask about different libraries, I thought I'd show how to do simple RDF parsing with Eclipse RDF4J's Rio parser (disclosure: I am one of the RDF4J developers).

For example, to parse the file and put all triples in a Model, just do this:

FileInputStream in = new FileInputStream("/path/to/file.nt");

Model m = Rio.parse(in, RDFFormat.NTRIPLES);

If you want to immediately print the parser output to stdout (for example in Turtle format), do something like this:

FileInputStream in = new FileInputStream("/path/to/file.nt");

RDFParser parser = Rio.createParser(RDFFormat.NTRIPLES);
// Send each parsed statement straight to a Turtle writer on stdout.
parser.setRDFHandler(Rio.createWriter(RDFFormat.TURTLE, System.out));
parser.parse(in, "");

Of course, there are more ways to use these basic tools; have a look at the toolkit documentation for details.

By the way, the Rio parsers are available as separate Maven artifacts, so if you wish to use only the parsers, without the rest of the RDF4J tools, you can do so.
