Triplestore for large datasets

Posted on 2024-10-16 08:34:11

Comments (6)

墨落成白 2024-10-23 08:34:11

You should consider using the OpenLink Virtuoso store. It is available under an open-source license and scales to billions of triples. You can use it via the Sesame and Jena APIs.

See here for an overview of large-scale triple stores. Virtuoso is definitely easier to set up than BigData. Besides that, I have used the Sesame NativeStore, which doesn't scale too well.

4Store is also a good choice, although I haven't used it. One benefit of Virtuoso over 4Store is that you can easily mix standard relational models with RDF, since Virtuoso is a relational database under the hood.

北斗星光 2024-10-23 08:34:11

4store: Scalable RDF storage

Quoting the 4store website:

"4store's main strengths are its performance, scalability and stability. It does not provide many features over and above RDF storage and SPARQL queries, but if you are looking for a scalable, secure, fast and efficient RDF store, then 4store should be on your shortlist."

Personally, I have tested 4store with very large databases (up to 2 billion triples) with very good results. 4store is written in C, runs on 64-bit Linux/Unix platforms, and the current version, 1.1.1, partially implements SPARQL 1.1.

4store can be deployed on a cluster of commodity servers, which may boost query performance, and assertion throughput can reach 100 KTriples/second. Even on a single server you will get quite decent performance.
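To put the throughput figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the 100 KTriples/second peak rate is sustained for the whole load, which a real import may not manage; the function name is made up for illustration.

```python
def estimated_load_hours(triples: int, triples_per_second: float) -> float:
    """Return the hours needed to assert `triples` at a constant rate.

    Assumption: the rate stays constant for the whole load, which is
    optimistic for a real import.
    """
    if triples_per_second <= 0:
        raise ValueError("triples_per_second must be positive")
    seconds = triples / triples_per_second
    return seconds / 3600


if __name__ == "__main__":
    # 2 billion triples at the quoted peak of 100,000 triples per second.
    hours = estimated_load_hours(2_000_000_000, 100_000)
    print(f"{hours:.1f} hours")  # prints "5.6 hours"
```

So even at the peak rate, loading a dataset of the size mentioned above takes several hours, which is worth factoring into any evaluation.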

Here at the University of Southampton, 4store is our choice for very big datasets in research projects and also for our Webmaster team; see Data Stores for Southampton and ECS Open Data.

There is also a list of all the libraries you can use to query and administer 4store: Client Libraries. In addition, 4store's IRC channel has an active community of users who will help if you run into any issues.

If you are a Linux/Unix user, 4store is definitely a good choice.

风吹短裙飘 2024-10-23 08:34:11

I would also recommend 4store, but in the spirit of full disclosure, I was the lead architect :)

If you want to take advantage of the standardisation of RDF stores, then you should look to use a Java library that implements SPARQL, rather than one that exposes a native Java API.

Otherwise you could end up stuck with whichever store you choose first, because of the effort of moving between them, which is the typical SQL migration hell.
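As a minimal sketch of what talking to a store through the standard protocol looks like: the answer is about Java, but the idea is language-neutral, so it is shown here in Python with only the standard library. The endpoint URL is hypothetical, and the sketch assumes the store accepts SPARQL Protocol queries over HTTP GET and can return the SPARQL JSON results format. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List


def build_query_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL Protocol query request (HTTP GET, JSON results).

    Assumption: `endpoint` has no query string of its own.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(body: str) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON results document into one dict per row."""
    doc = json.loads(body)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(endpoint: str, query: str) -> List[Dict[str, str]]:
    """Send the query and return the parsed rows (needs a live endpoint)."""
    with urllib.request.urlopen(build_query_request(endpoint, query)) as resp:
        return parse_bindings(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical local endpoint; point this at whichever store you run.
    rows = run_query(
        "http://localhost:8000/sparql/",
        "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
    )
    for row in rows:
        print(row)
```

Nothing in this code is specific to one store, so switching stores should mostly come down to changing the endpoint URL, provided the new store implements the same protocol.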

迷你仙 2024-10-23 08:34:11

I am personally quite happy with GraphDB. It runs quite well on medium hardware (a server with 256 GB of RAM) with 15 billion triples, and it is accessible via both the Sesame and Jena interfaces (although the Jena one is beta-ish).

If you can afford it, an Oracle 12c instance is not bad, and it might fit in with an existing Oracle infrastructure (back-ups etc.).

Virtuoso 7.1 scales very well and can deal with humongous data volumes at reasonable cost. Unfortunately, its SPARQL standards compliance is spotty.

柠檬色的秋千 2024-10-23 08:34:11

@Steve - I don't know how to comment, so I guess I am going to answer two questions at once.

JDBC driver for SPARQL below:

http://code.google.com/p/jdbc4sparql/

supports SPARQL Protocol and SPARUL (over the SPARQL protocol as an update, not over the SPARUL protocol).
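For readers unfamiliar with sending updates over the SPARQL protocol, here is a hedged Python sketch. It follows the form-encoded POST that the later SPARQL 1.1 Protocol standardised (an `update` parameter in the request body); it is not the jdbc4sparql driver's API. The endpoint URL and the function name are hypothetical, and pre-standard SPARUL implementations may expect something different.

```python
import urllib.parse
import urllib.request


def build_update_request(endpoint: str, update: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol update request (form-encoded HTTP POST)."""
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical update endpoint; the real path depends on the store.
    request = build_update_request(
        "http://localhost:8000/update/",
        'INSERT DATA { <http://example.org/a> <http://example.org/p> "v" }',
    )
    with urllib.request.urlopen(request) as response:
        print(response.status)
```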

@myahya

4Store is highly recommended, so worth appraising as a candidate.

Virtuoso also has native JDBC drivers and supports large datasets (up to 12 billion triples):

www.openlinksw.com/wiki/main/Main/

Oracle also has an offering, but be prepared to pay big bucks:

http://www.oracle.com/technetwork/database/options/semantic-tech/index.html

迟到的我 2024-10-23 08:34:11

In addition to 4Store, Virtuoso, and Owlim, Bigdata is also worth looking at.
