Best way to create a SPARQL endpoint for an RDBMS (MySQL database)

Posted 2024-09-02 18:53:07

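I am (or would like to be) experimenting with some Linked Open Data sets, in particular the ones released by governments.

I have an RDBMS (MySQL, to be specific). I designed it with semantic web ideas in mind: I store information as objects, predicates, and classes that define the objects. All objects are in turn related to one another through statements of the form subject --> predicate --> object (where the subject comes from the objects table).

I want to be able to query other RDF triple stores from my application, and to let other triple stores query my data. Is it possible to "set something up" that makes this work?

I have looked at Jena. Using Jena seems to mean that I must use it as the storage layer instead of MySQL. The only problem is that I have included a new concept called category (which I don't believe is part of the semantic web languages). I use categories to help display information (they carry no other meaning), but using Jena seems to mean that I could no longer organise predicates under categories for easier viewing.

I am working in Java, so a Java API is preferred.

I may also have misunderstood the purpose of Jena. Perhaps it is useful here, but I am not sure how.

I am sure this question will look rather silly in four days, but at the moment I am a bit confused about how to proceed.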

高跟鞋的旋律 2024-09-09 18:53:07

I'm not sure what you mean by "a new concept called category". Perhaps you can give an example?

If you mean that you want to add additional metadata, perhaps as a way of organizing information in the user interface, there is no need to extend the semantic web languages or storage systems - they can already do what you want.

Suppose you have data for a school from the UK Government schools dataset (using Turtle encoding for brevity):

@prefix sch-ont:  <http://education.data.gov.uk/def/school/>.
<http://education.data.gov.uk/id/school/135412>
    a sch-ont:School;
    sch-ont:establishmentStatus
        <http://education.data.gov.uk/def/school/EstablishmentStatus_Open>;
    sch-ont:MSOA <http://statistics.data.gov.uk/id/msoa/E02000001>;
    sch-ont:establishmentName "Guildhall School of Music and Drama";
    ...

You can directly query that data from the SPARQL end-point, or you can download the data and store it locally in your own triple store. Either way, you're perfectly at liberty to add extra information that's useful to your users. For example:
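As a rough illustration of querying an endpoint directly, here is a minimal Java sketch that builds a SPARQL Protocol request using only the JDK's `java.net.http` classes (Java 11 or later). The endpoint URL and the class name `SparqlSelectRequest` are placeholders invented for this example; substitute whatever address the data publisher documents. The sketch only builds the request, so nothing is sent until you pass it to an `HttpClient`.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class SparqlSelectRequest {
    // Hypothetical endpoint URL (an assumption, not taken from the dataset's documentation).
    static final String ENDPOINT = "http://example.org/sparql";

    // Ask for the name of the school used in the Turtle example.
    static final String QUERY = String.join("\n",
        "PREFIX sch-ont: <http://education.data.gov.uk/def/school/>",
        "SELECT ?name WHERE {",
        "  <http://education.data.gov.uk/id/school/135412>",
        "      sch-ont:establishmentName ?name .",
        "}");

    // The SPARQL Protocol accepts a query as the URL-encoded "query" parameter of a GET request.
    static HttpRequest build(String endpoint, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint + "?query=" + encoded))
            .header("Accept", "application/sparql-results+json") // ask for JSON results
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(ENDPOINT, QUERY);
        System.out.println(request.method() + " " + request.uri());
        // To actually run the query:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

In practice a library such as Jena would handle the request and parse the results for you; the point here is only that an endpoint is an ordinary HTTP service.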

@prefix ankurs-app: <http://ankur.org/example/app/vocab/display#>.
<http://education.data.gov.uk/id/school/135412> 
        ankurs-app:category ankurs-app:wkdCool.

You can store this new triple in the same graph as the downloaded data, or you can store it in a separate named graph to indicate that it has a different provenance than the source data. Either way, it's then simple to query, whether programmatically from Jena or via a SPARQL query.
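To make the named-graph option more concrete, here is a small Java sketch that only assembles the SPARQL text involved: an `INSERT DATA` update that puts the category triple into a separate graph, and a `SELECT` that joins the source data with that graph. The graph IRI, the class name `CategoryGraphQueries`, and the assumption that the downloaded school data sits in the default graph are all illustrative choices, not something the dataset or Jena prescribes.

```java
final class CategoryGraphQueries {
    // Hypothetical graph IRI holding application-only annotations (an assumption).
    static final String DISPLAY_GRAPH = "http://ankur.org/example/app/graph/display";
    // Vocabulary namespace from the Turtle example.
    static final String APP = "http://ankur.org/example/app/vocab/display#";

    // SPARQL 1.1 Update that stores one category triple in the annotation graph.
    // Inputs are assumed to be trusted constants; they are not escaped here.
    static String insertCategory(String subjectIri, String categoryLocalName) {
        return String.join("\n",
            "PREFIX ankurs-app: <" + APP + ">",
            "INSERT DATA {",
            "  GRAPH <" + DISPLAY_GRAPH + "> {",
            "    <" + subjectIri + "> ankurs-app:category ankurs-app:" + categoryLocalName + " .",
            "  }",
            "}");
    }

    // SELECT that joins source data (assumed to be in the default graph)
    // with the categories kept in the named graph.
    static String selectNamesByCategory(String categoryLocalName) {
        return String.join("\n",
            "PREFIX sch-ont: <http://education.data.gov.uk/def/school/>",
            "PREFIX ankurs-app: <" + APP + ">",
            "SELECT ?school ?name WHERE {",
            "  ?school sch-ont:establishmentName ?name .",
            "  GRAPH <" + DISPLAY_GRAPH + "> {",
            "    ?school ankurs-app:category ankurs-app:" + categoryLocalName + " .",
            "  }",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(insertCategory("http://education.data.gov.uk/id/school/135412", "wkdCool"));
        System.out.println();
        System.out.println(selectNamesByCategory("wkdCool"));
    }
}
```

Keeping the display categories in their own graph means they can be dropped or rebuilt without touching the government data, and the same query text can be run from Jena or sent to any SPARQL 1.1 endpoint.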

Designing a layout for efficiently querying schemaless, triple-centric data is a well-studied and hard problem. Most RDF platforms, including Jena, have well-optimised code for querying and updating triples in their own database schemas. You would need very good reasons to embark on your own relational table layout :)

If you really do need to take an existing relational table scheme and map it to a Jena RDF model, look at D2RQ.
