Selecting some distinct and some non-distinct values in SPARQL

Posted on 2024-10-25 13:14:06

I'm trying to query DBPedia for a list of properties relating to a given class in the ontology, but since the human-readable "labels" aren't always clear, I'd also like to provide an example from the database. The problem is that while I want to select all distinct properties, I only want a single example of each property. Here's how my query looks without capturing the example:

SELECT DISTINCT ?prop ?title WHERE {
    ?thing ?prop [].
    ?thing a <http://dbpedia.org/ontology/Currency>.
    ?prop rdf:type rdf:Property.
    ?prop rdfs:label ?title.
} ORDER BY DESC(COUNT(DISTINCT ?thing))
LIMIT 100

If I change it in this way, I start getting duplicate values for ?prop:

SELECT DISTINCT ?prop ?title ?example WHERE {
    ?thing ?prop ?example.
    ?thing a <http://dbpedia.org/ontology/Currency>.
    ?prop rdf:type rdf:Property.
    ?prop rdfs:label ?title.
} ORDER BY DESC(COUNT(DISTINCT ?thing))
LIMIT 100

I'm very new to using SPARQL and database queries in general, so it's not at all clear to me how to do this. Ideally, I'd have something like DISTINCT(?prop) ?title ?example, which selects every unique value for prop, and returns its title and an example.

Comments (3)

拒绝两难 2024-11-01 13:14:07

Here is an alternative way to achieve what you want with subqueries:

SELECT ?prop ?title ?example 
WHERE 
{
    ?thing a <http://dbpedia.org/ontology/Currency>.
    ?prop rdf:type rdf:Property.
    { SELECT ?title ?example WHERE { ?thing ?prop ?example . ?prop rdfs:label ?title. } LIMIT 1 }
}
LIMIT 100

This has the advantage of being SPARQL 1.1 standards compliant. As I stated in my comment, ordering by an aggregate is not permitted by the standard, so you are using a vendor-specific extension, which will limit the portability of your query.
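
One caveat worth checking against your endpoint: SPARQL 1.1 evaluates a subquery before joining it to the enclosing pattern, and only the variables the subquery projects are visible outside it. A subquery meant to return one example per property therefore generally needs to project ?prop and group on it. A minimal sketch under that assumption follows. The PREFIX declarations, the dbo: abbreviation, and the inner variable name ?value are additions for illustration and are not part of the query above.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT ?prop ?title ?example
WHERE
{
    # Inner query: one arbitrary example value per property used on currencies.
    {
        SELECT ?prop (SAMPLE(?value) AS ?example)
        WHERE {
            ?thing a dbo:Currency .
            ?thing ?prop ?value .
        }
        GROUP BY ?prop
    }
    # Outer pattern: keep declared properties and attach their labels.
    # A property with several labels still produces one row per label.
    ?prop rdf:type rdf:Property .
    ?prop rdfs:label ?title .
}
LIMIT 100
```

Grouping inside the subquery keeps the de-duplication of examples separate from the label lookup, which may make it easier to add a language filter on ?title later.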

If you do want to order by an aggregated value in a way that is portable across SPARQL 1.1 implementations then you must first project it like so:

SELECT ?s (COUNT(?p) AS ?predicates) WHERE
{
  ?s ?p ?o
} GROUP BY ?s ORDER BY DESC(?predicates)
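
Applied to the first query in the question, the same pattern might look like the sketch below. It assumes the goal is to list properties by how many currencies use them. The alias ?things and the PREFIX declarations are illustrative additions. Because ?prop and ?title are projected without being aggregated, both are listed in GROUP BY.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

# Project the count first, then order by the projected variable.
SELECT ?prop ?title (COUNT(DISTINCT ?thing) AS ?things)
WHERE {
    ?thing a dbo:Currency .
    ?thing ?prop [] .
    ?prop rdf:type rdf:Property .
    ?prop rdfs:label ?title .
}
GROUP BY ?prop ?title
ORDER BY DESC(?things)
LIMIT 100
```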

凉宸 2024-11-01 13:14:07

If you don't care which example you get but you do care about speed, SAMPLE can be much faster than GROUP BY:

SELECT ?prop (SAMPLE(?title) AS ?title) (SAMPLE(?example) AS ?example) 
WHERE {
    ?thing ?prop ?example.
    ?thing a <http://dbpedia.org/ontology/Currency>.
    ?prop rdf:type rdf:Property.
    ?prop rdfs:label ?title.
} LIMIT 100

You probably won't notice the difference on DBpedia since it caches query results, but I noticed a huge difference when using other endpoints.

I ran into the same issue as the OP while creating an autocomplete service that queries multiple SPARQL endpoints. I needed to find a single link related to a search term; the link itself wasn't very important, but the speed of the query was.
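
The query above depends on some leniency from the endpoint. Strict SPARQL 1.1 processors may reject it for two reasons: a variable bound with AS must not already be in scope, and a variable projected next to aggregates has to appear in GROUP BY. A stricter variant is sketched below. The aliases ?anyTitle and ?anyExample and the PREFIX declarations are made up for illustration, and whether it stays as fast as the original depends on the endpoint.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

# SAMPLE returns an arbitrary value from each group, so no ordering work is implied.
SELECT ?prop (SAMPLE(?title) AS ?anyTitle) (SAMPLE(?example) AS ?anyExample)
WHERE {
    ?thing ?prop ?example .
    ?thing a dbo:Currency .
    ?prop rdf:type rdf:Property .
    ?prop rdfs:label ?title .
}
GROUP BY ?prop
LIMIT 100
```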

装纯掩盖桑 2024-11-01 13:14:06

In your second query, DISTINCT applies to the combination of values of ?prop, ?title and ?example. Therefore you're not actually getting any duplicates. Take, for instance, the following two rows returned by the second query:

dbpedia2:subunitName    "subunit name "@en  "cent"@en
dbpedia2:subunitName    "subunit name "@en  "centavo"@en

They aren't duplicates because the third column, ?example, has two different values: "cent"@en and "centavo"@en.

One possible way to solve this is to use GROUP BY and MIN to get just the lowest-ranked value for ?title and ?example, i.e.:

SELECT ?prop (MIN(?title) AS ?minTitle) (MIN(?example) AS ?minExample) WHERE {
    ?thing ?prop ?example.
    ?thing a <http://dbpedia.org/ontology/Currency>.
    ?prop rdf:type rdf:Property.
    ?prop rdfs:label ?title.
} GROUP BY ?prop