How to improve the SPARQL query performance of SDB?

Posted on 2024-12-05 01:38:23

In my application, I use Jena's SDB as the SPARQL store, with DB2 as the database server, but I find that SPARQL query performance is very poor.

Can anyone help me solve this problem? How can I improve SPARQL query performance, especially with SDB?

Below are my test case data and the SPARQL query.

Test case:

The dataset contains 13294 RDF triples in total. The query result count is 420.
The query took 42 seconds.

The SPARQL query is:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?s ?name ?ownerId ?status ?time
  ?value ?startTime ?endTime ?description
WHERE
{
  # The rdf:type object is a string literal here, as posted; use
  # <http://www.w3c.com/schemas/cp#Event> if the data stores the type as an IRI.
  ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "http://www.w3c.com/schemas/cp#Event" .
  ?s <http://www.w3c.com/schemas/cp#time> ?time .
  ?s <http://www.w3c.com/schemas/cp#ownerId> ?ownerId .
  ?s <http://www.w3c.com/schemas/cp#name> ?name .
  ?s <http://www.w3c.com/schemas/cp#value> ?value .
  ?s <http://www.w3c.com/schemas/cp#_status> ?status .
  ?s <http://www.w3c.com/schemas/cp#start_Time> ?startTime .
  ?s <http://www.w3c.com/schemas/cp#end_Time> ?endTime .
  ?s <http://www.w3c.com/schemas/cp#description> ?description .
  FILTER(xsd:dateTime(?time) >= "2011-08-12T00:00:00"^^xsd:dateTime
    && xsd:dateTime(?time) <= "2011-09-18T23:59:59"^^xsd:dateTime)
}

Comments (1)

简美 2024-12-12 01:38:23

The query performance of any SQL-backed triplestore like SDB is always going to be worse than that of a native triplestore, because an SQL-backed store has to compile SPARQL down into SQL, which often produces horrendously complex SQL queries.

So, taking your example, you've asked for 9 triple patterns to be matched, which will generate an SQL SELECT containing 9 INNER JOIN operations. That alone will take a lot of time.
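To make the join growth concrete, here is a minimal Python sketch of how a list of triple patterns sharing one subject could be turned into a single self-joining SELECT. Everything in it is an assumption for illustration: the one-table `triples(s, p, o)` layout, the `patterns_to_sql` helper and the sample rows are hypothetical, and the SQL that SDB actually emits against its own schema will look different.

```python
import sqlite3


def patterns_to_sql(predicates, table="triples"):
    """Build one SELECT that matches every predicate on a shared subject.

    Hypothetical helper: each triple pattern gets its own alias (t0, t1, ...)
    of the same table, joined back to t0 on the subject column.
    """
    if not predicates:
        raise ValueError("need at least one triple pattern")
    columns = ["t0.s AS s"]
    joins = []
    for i in range(len(predicates)):
        columns.append(f"t{i}.o AS o{i}")
        if i > 0:
            joins.append(
                f"INNER JOIN {table} AS t{i} ON t{i}.s = t0.s AND t{i}.p = ?"
            )
    sql = "\n".join(
        [f"SELECT DISTINCT {', '.join(columns)}", f"FROM {table} AS t0"]
        + joins
        + ["WHERE t0.p = ?"]
    )
    # Placeholders are bound in textual order: the JOIN predicates first,
    # then the predicate of the first pattern in the WHERE clause.
    params = list(predicates[1:]) + [predicates[0]]
    return sql, params


CP = "http://www.w3c.com/schemas/cp#"
PREDICATES = [CP + name for name in ("time", "ownerId", "name")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ev1", CP + "time", "2011-08-15T10:00:00"),
        ("ev1", CP + "ownerId", "42"),
        ("ev1", CP + "name", "demo"),
        ("ev2", CP + "time", "2011-10-01T00:00:00"),  # ev2 has no ownerId/name
    ],
)

sql, params = patterns_to_sql(PREDICATES)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

In this simplified layout every extra triple pattern adds another reference to the same table, so the nine patterns in the question would give nine aliases for the database to join on the subject column.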

Then you are applying a FILTER to those triple patterns. The problem here is that unless the filter expression is very simple, or close enough to SQL to be converted into it, the FILTER has to be evaluated in Java code in memory. What this means in practice is that you are selecting all the possible events in the triplestore and then filtering for the date range in memory using Java, which is always going to make your query slower.
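The difference between filtering in application memory and filtering inside the database can be sketched in Python with an in-memory SQLite table. This is only an illustration of the general trade-off, not of SDB's actual behaviour: the `triples(s, p, o)` table, the sample rows and both helper functions are hypothetical.

```python
import sqlite3
from datetime import datetime

TIME = "http://www.w3c.com/schemas/cp#time"
START = datetime(2011, 8, 12, 0, 0, 0)
END = datetime(2011, 9, 18, 23, 59, 59)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ev1", TIME, "2011-08-01T09:00:00"),
        ("ev2", TIME, "2011-08-15T10:00:00"),
        ("ev3", TIME, "2011-09-18T23:59:59"),
        ("ev4", TIME, "2011-10-01T00:00:00"),
    ],
)


def filter_in_memory(conn):
    """Fetch every candidate row, then apply the date range in Python."""
    fetched = conn.execute(
        "SELECT s, o FROM triples WHERE p = ? ORDER BY s", (TIME,)
    ).fetchall()
    kept = [s for s, o in fetched if START <= datetime.fromisoformat(o) <= END]
    return kept, len(fetched)


def filter_in_sql(conn):
    """Let the database apply the range, so only matching rows come back.

    Comparing the strings works here only because every timestamp uses the
    same fixed-width ISO 8601 format.
    """
    fetched = conn.execute(
        "SELECT s FROM triples WHERE p = ? AND o >= ? AND o <= ? ORDER BY s",
        (TIME, START.isoformat(), END.isoformat()),
    ).fetchall()
    return [s for (s,) in fetched], len(fetched)


in_memory_result, in_memory_fetched = filter_in_memory(conn)
in_sql_result, in_sql_fetched = filter_in_sql(conn)
print(in_memory_result, in_memory_fetched)
print(in_sql_result, in_sql_fetched)
```

Both functions keep the same events, but the in-memory version has to pull every candidate row out of the database first. With thousands of events joined across many patterns, that extra transfer and per-row evaluation is where the time goes.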

Unless there is a specific reason you want to use SDB, I'd really suggest looking at Jena's native triple store, TDB or TDB2. It is designed to perform the kinds of joins required by SPARQL queries much more efficiently, and the way it stores data allows it to evaluate more complicated filters, like your date range one, much faster.
