How can I improve the SPARQL query performance of SDB?

Posted on 2024-12-05 01:38:23

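In my application I use Jena's SDB as the SPARQL store, with DB2 as the database server. However, I find that SPARQL query performance is very poor.

Can anyone help me with this? How can I improve SPARQL query performance, and the query performance of SDB in particular?

Below are my test case data and the SPARQL query:

Test case:

The total number of RDF triples is 13294. The query returns 420 results. The query took 42 seconds.

The SPARQL query is: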

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ?s ?name ?ownerId ?status ?time
  ?value ?startTime ?endTime ?description 
WHERE 
{
  ?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "http://www.w3c.com/schemas/cp#Event" .
  ?s <http://www.w3c.com/schemas/cp#time> ?time .
  ?s <http://www.w3c.com/schemas/cp#ownerId> ?ownerId .
  ?s <http://www.w3c.com/schemas/cp#name>  ?name .
  ?s <http://www.w3c.com/schemas/cp#value> ?value .
  ?s <http://www.w3c.com/schemas/cp#_status> ?status .
  ?s <http://www.w3c.com/schemas/cp#start_Time> ?startTime .
  ?s <http://www.w3c.com/schemas/cp#end_Time> ?endTime .
  ?s <http://www.w3c.com/schemas/cp#description> ?description .
  FILTER(xsd:dateTime(?time) >= "2011-08-12T00:00:00"^^xsd:dateTime  
    && xsd:dateTime(?time) <= "2011-09-18T23:59:59"^^xsd:dateTime) 
}



简美 2024-12-12 01:38:23

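The query performance of any SQL-backed triplestore such as SDB will generally be worse than that of a native triplestore, because a SQL-backed store has to compile SPARQL down to SQL, and this often produces extremely complex SQL queries.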

Taking your example: you are asking to match 9 triple patterns, which generates a SQL SELECT with 9 INNER JOIN operations. That alone takes a lot of time.
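To make the join cost concrete, here is a minimal sketch of how a group of triple patterns sharing one subject might map onto self-joins of a triples table. It assumes a hypothetical single table `triples(s, p, o)` and a hypothetical helper `bgp_to_sql`; SDB's real schema and generated SQL are different (and typically need further joins to resolve node values), so the exact join count here is only illustrative.

```python
import sqlite3

def bgp_to_sql(predicates):
    """Build SQL for patterns `?s <p_i> ?o_i` that all share the subject ?s.

    Assumption: a single hypothetical table triples(s, p, o).
    Each pattern gets its own alias of that table, joined on the subject.
    """
    if not predicates:
        raise ValueError("need at least one triple pattern")
    n = len(predicates)
    columns = ["t0.s"] + [f"t{i}.o" for i in range(n)]
    lines = [f"SELECT DISTINCT {', '.join(columns)}", "FROM triples t0"]
    for i in range(1, n):
        lines.append(f"INNER JOIN triples t{i} ON t{i}.s = t0.s")
    lines.append("WHERE " + " AND ".join(f"t{i}.p = ?" for i in range(n)))
    return "\n".join(lines), list(predicates)

def demo():
    """Run a three-pattern query against a tiny in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
        ("ev1", "name", "Launch"),
        ("ev1", "time", "2011-08-15T10:00:00"),
        ("ev1", "status", "open"),
        ("ev2", "name", "Review"),
        ("ev2", "time", "2011-10-01T09:00:00"),
        # ev2 has no status triple, so it cannot match all three patterns
    ])
    sql, params = bgp_to_sql(["name", "time", "status"])
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return sql, rows
```

Every additional triple pattern adds another alias of the same table to the join, which is why a query with many patterns turns into a wide multi-way join on the SQL side.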

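You then apply a FILTER over those triple patterns. The problem is that unless the filter expression is very simple, or close enough to SQL that it can be translated, the FILTER has to be evaluated in memory in Java code. In practice this means you are selecting every possible event in the triple store and then filtering by date range in memory in Java, which will always make your query slow.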
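The difference between filtering in application memory and filtering inside the database can be sketched as follows. This is an illustration only, not SDB's behaviour: it assumes a hypothetical `events(s, time)` table with ISO 8601 timestamps stored as text, and uses Python in place of the Java layer. The date range is the one from the question.

```python
import sqlite3
from datetime import datetime

START = datetime(2011, 8, 12, 0, 0, 0)
END = datetime(2011, 9, 18, 23, 59, 59)

def make_db():
    """Create a tiny in-memory table of events (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (s TEXT, time TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", [
        ("ev1", "2011-08-01T08:00:00"),
        ("ev2", "2011-08-15T10:00:00"),
        ("ev3", "2011-09-18T23:59:59"),
        ("ev4", "2011-10-01T09:00:00"),
    ])
    return conn

def filter_in_memory(conn):
    """Fetch every row, then apply the date range in application code."""
    rows = conn.execute("SELECT s, time FROM events").fetchall()
    kept = [s for s, t in rows if START <= datetime.fromisoformat(t) <= END]
    return len(rows), sorted(kept)  # (rows transferred, matching subjects)

def filter_in_database(conn):
    """Let the database apply the range, so only matching rows come back."""
    rows = conn.execute(
        "SELECT s FROM events WHERE time BETWEEN ? AND ?",
        (START.isoformat(), END.isoformat()),
    ).fetchall()
    return len(rows), sorted(s for (s,) in rows)
```

Both functions return the same matching subjects, but the in-memory version has to transfer and parse every row first. With thousands of events and a wide join feeding the filter, that extra work adds up.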

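Unless you have a specific reason to use SDB, I would really recommend looking at TDB or TDB2, Jena's native triple stores. They are designed to perform the kinds of joins that SPARQL queries need much more efficiently, and the way they store data lets them evaluate more complex filters (such as date range filters) faster.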