Query times tend to be much slower than for conventional databases, even for simple queries. Also, many RDF stores don't support standard database features such as transactions and crash recovery.
One of the shortcomings we have come across in using RDF triple stores for general programming is that most engines don't support aggregation in queries (min, max, group by).
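When the engine cannot aggregate, the usual workaround is to pull the raw triples out and aggregate in application code. Here is a minimal sketch of that, assuming the triples have already been fetched as plain (subject, predicate, object) tuples. The `ex:` names and the `salary_range_by_dept` helper are made up for illustration and do not come from any particular engine.

```python
from collections import defaultdict

# Hypothetical data: triples already fetched from the store as plain tuples.
TRIPLES = [
    ("ex:alice", "ex:dept", "ex:sales"),
    ("ex:alice", "ex:salary", 50000),
    ("ex:bob", "ex:dept", "ex:sales"),
    ("ex:bob", "ex:salary", 42000),
    ("ex:carol", "ex:dept", "ex:it"),
    ("ex:carol", "ex:salary", 61000),
]


def objects(triples, predicate):
    """Map each subject to its object for one predicate.

    Assumes at most one value per subject for that predicate.
    """
    return {s: o for s, p, o in triples if p == predicate}


def salary_range_by_dept(triples):
    """Client-side stand-in for GROUP BY department with MIN/MAX of salary."""
    dept = objects(triples, "ex:dept")
    salary = objects(triples, "ex:salary")
    groups = defaultdict(list)
    for person, department in dept.items():
        if person in salary:
            groups[department].append(salary[person])
    return {d: (min(values), max(values)) for d, values in groups.items()}


if __name__ == "__main__":
    for department, (low, high) in sorted(salary_range_by_dept(TRIPLES).items()):
        print(department, low, high)
```

This works, but every triple has to cross the wire before it can be grouped, which is part of why the missing feature hurts.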
A checklist we use to decide between an RDBMS and an RDF store is the following:
RDBMS if:
- static schema
- very large amount of data
- no RDF export needed
- Lucene support needed (easy via Hibernate Search, for example)
- strong data consistency requirements (money involved etc.)

RDF if:
- schema is not fixed, or is dynamic
- small to large amount of data
- RDF export needed
- loose data consistency requirements
Refactoring RDBMS schemas for ongoing projects can be quite an overhead if you don't have the correct tools.
Lucene support is provided by some RDF engines as well, but it is not as well documented or supported as Hibernate Search.
Scalability of RDF engines is also improving steadily as ideas from the NoSQL side are incorporated into them, but if you go with the standard engines, Jena and Sesame, the division above is still quite valid.
One issue that has not been mentioned yet is that updating triplestores to reflect changing data is often more work than for an RDBMS or OODBMS, because there is no notion of an 'object' or 'row' - only triples and resources. Deleting a domain object therefore requires care, or you will end up with a lot of garbage left in the triplestore. The absence of cascading deletes is a closely related issue.
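To make the garbage problem concrete, here is a rough sketch of a hand-rolled cascading delete over an in-memory set of triples. The `ex:` data and the idea of an explicit `owned_predicates` set are assumptions made for the example; a real store would need the equivalent logic expressed through whatever update mechanism it offers.

```python
def naive_delete(triples, resource):
    """Remove only the triples whose subject is the resource (leaves garbage)."""
    return {(s, p, o) for s, p, o in triples if s != resource}


def cascading_delete(triples, resource, owned_predicates):
    """Remove a resource plus everything it owns.

    Resources reachable through `owned_predicates` are deleted recursively,
    and every triple that mentions a deleted resource as subject or object
    is dropped.
    """
    to_delete = set()
    stack = [resource]
    while stack:
        node = stack.pop()
        if node in to_delete:
            continue
        to_delete.add(node)
        for s, p, o in triples:
            if s == node and p in owned_predicates:
                stack.append(o)
    return {
        (s, p, o)
        for s, p, o in triples
        if s not in to_delete and o not in to_delete
    }


# Hypothetical domain object: an order that owns one order line.
STORE = {
    ("ex:order1", "rdf:type", "ex:Order"),
    ("ex:order1", "ex:hasLine", "ex:line1"),
    ("ex:line1", "ex:quantity", 2),
    ("ex:line1", "ex:product", "ex:widget"),
    ("ex:widget", "rdfs:label", "Widget"),
    ("ex:customer1", "ex:placed", "ex:order1"),
}

if __name__ == "__main__":
    print(len(naive_delete(STORE, "ex:order1")))
    print(cascading_delete(STORE, "ex:order1", {"ex:hasLine"}))
```

The naive version leaves the order line and the customer's dangling reference behind. The cascading version only works because the caller states which predicates mean "owns", which is exactly the knowledge a row or object boundary would otherwise give you for free.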
On the plus side, RDF can be helpful even for everyday applications because you can flexibly add new subclasses, relationships or sub-relationships between entities without necessarily breaking any code, and easily add annotations, comments etc to resources.
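As a small illustration of that flexibility, the sketch below looks up instances of a class by following `rdfs:subClassOf` links in a plain list of triples. The class names and helper functions are invented for the example. The point is only that adding a new subclass later is a matter of adding triples, not changing the lookup code.

```python
def subclasses_of(triples, cls):
    """Return cls together with all of its transitive subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        parent = frontier.pop()
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(triples, cls):
    """Return every resource typed as cls or as one of its subclasses."""
    classes = subclasses_of(triples, cls)
    return {s for s, p, o in triples if p == "rdf:type" and o in classes}


# Hypothetical starting data.
DATA = [
    ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
    ("ex:rex", "rdf:type", "ex:Dog"),
]

# Later, a new subclass and an instance of it are added as ordinary triples.
EXTENDED = DATA + [
    ("ex:Puppy", "rdfs:subClassOf", "ex:Dog"),
    ("ex:bo", "rdf:type", "ex:Puppy"),
]

if __name__ == "__main__":
    print(sorted(instances_of(DATA, "ex:Animal")))
    print(sorted(instances_of(EXTENDED, "ex:Animal")))
```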
Further to Peteris's answer, there are some key differences between how you model data for a triple store and how you model it with other techniques such as OOP, relational databases or XML (rows, classes, properties etc.).
Whether triple stores are appropriate depends very much on what you want to do, and on whether you can find one with the right performance characteristics for your application.
People have a tendency to characterise triple stores as schema-less databases, but realistically, unless you are using some form of schema/ontology, they aren't particularly useful. If you want to use SPARQL to get stuff out, there need to be some schema patterns in the store that you can write queries against.
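A toy version of triple-pattern matching shows why. This is only a sketch of the idea behind a single SPARQL triple pattern, not how any real engine is implemented, and the `foaf:name` / `ex:fullName` data is made up. A query can only bind variables where the data follows the predicate it was written against.

```python
def match(triples, pattern):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


# Hypothetical store where two sources used different predicates for a name.
PEOPLE = [
    ("ex:a", "foaf:name", "Alice"),
    ("ex:b", "ex:fullName", "Bob"),
]

if __name__ == "__main__":
    # Only resources that follow the expected pattern are found.
    for row in match(PEOPLE, ("?person", "foaf:name", "?name")):
        print(row["?person"], row["?name"])
```

Bob is in the store but invisible to the query, which is what "schema-less" costs you in practice when nobody agrees on the patterns.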
Personally, I still use relational databases for a lot of things. I'm using RDF and triple stores for an increasing amount of stuff, but that doesn't mean I'm ready to throw out what works well.
As a final point, even if you go with a relational database for the time being, there are technologies like DB2RDF that can convert relational databases to RDF, so you can stick with a database for now and export it to RDF in the future as desired.
There are no one-size-fits-all tools. Triple stores are appropriate and usable today for some kinds of tasks and not for others.
A similar question was asked on semanticoverflow.com and the common answer was the same: "use whatever is appropriate".