Selecting some distinct and some non-distinct labels in SPARQL
I'm trying to query DBPedia for a list of properties relating to a given class in the ontology, but since the human-readable "labels" aren't always clear, I'd also like to provide an example from the database. The problem is that while I want to select all distinct properties, I only want a single example of each property. Here's how my query looks without capturing the example:
SELECT DISTINCT ?prop ?title WHERE {
    ?thing ?prop [].
    ?thing a <http://dbpedia.org/ontology/Currency>.
    ?prop rdf:type rdf:Property.
    ?prop rdfs:label ?title.
} ORDER BY DESC(COUNT(DISTINCT ?thing))
LIMIT 100
If I change it in this way, I start getting duplicate values for ?prop:
SELECT DISTINCT ?prop ?title ?example WHERE {
    ?thing ?prop ?example.
    ?thing a <http://dbpedia.org/ontology/Currency>.
    ?prop rdf:type rdf:Property.
    ?prop rdfs:label ?title.
} ORDER BY DESC(COUNT(DISTINCT ?thing))
LIMIT 100
I'm very new to using SPARQL and database queries in general, so it's not at all clear to me how to do this. Ideally, I'd have something like DISTINCT(?prop) ?title ?example, which selects every unique value for prop, and returns its title and an example.
Comments (3)
Here is an alternative way to achieve what you want with subqueries:
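A minimal sketch of what such a subquery could look like, not necessarily the exact query this answer had in mind. The inner SELECT groups by ?prop and uses SAMPLE to keep one arbitrary value per property; the outer query then attaches the label. The variable name ?value and the dbo: prefix are assumptions added for illustration.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT ?prop ?title ?example WHERE {
  # Outer pattern: attach the human-readable label to each property.
  ?prop rdf:type rdf:Property ;
        rdfs:label ?title .
  {
    # Inner query: exactly one row per property used on a Currency,
    # with one arbitrarily chosen value as the example.
    SELECT ?prop (SAMPLE(?value) AS ?example) WHERE {
      ?thing a dbo:Currency ;
             ?prop ?value .
    }
    GROUP BY ?prop
  }
}
LIMIT 100
```

Note that a property with several rdfs:label values would still produce one row per label in this sketch; aggregating the label as well avoids that.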
This has the advantage that it is SPARQL 1.1 standards compliant. As I stated in my comment, ordering by an aggregate is not permitted by the standard, so you are using a vendor-specific extension, which will limit the portability of your query.
If you do want to order by an aggregated value in a way that is portable across SPARQL 1.1 implementations then you must first project it like so:
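A hedged sketch of projecting the aggregate before ordering by it: the count is bound to a variable in the SELECT clause and ORDER BY refers to that variable. The names ?usage, ?label and ?value are assumptions introduced here, and SAMPLE is used so each property yields a single row.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT ?prop
       (SAMPLE(?label) AS ?title)
       (SAMPLE(?value) AS ?example)
       (COUNT(DISTINCT ?thing) AS ?usage)   # project the aggregate first
WHERE {
  ?thing a dbo:Currency ;
         ?prop ?value .
  ?prop rdf:type rdf:Property ;
        rdfs:label ?label .
}
GROUP BY ?prop
ORDER BY DESC(?usage)                       # then order by the projected variable
LIMIT 100
```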
If you don't care about the example but you care about speed, SAMPLE can be much faster than GROUP BY. You probably won't notice the difference on DBpedia since it caches query results, but I noticed a huge difference when using other endpoints.
I ran into the same issue the OP had while creating an autocomplete service that queries multiple SPARQL endpoints. I needed to find a single link related to a search term; the link itself wasn't very important, but the speed of the query was.
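One way the autocomplete case might look, as a sketch under assumptions rather than the query actually used: with no GROUP BY clause, all matches form a single implicit group and SAMPLE returns one arbitrary member of it. The search term "euro" and the variable names ?resource and ?link are placeholders.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Return one arbitrary resource whose label starts with the search term.
SELECT (SAMPLE(?resource) AS ?link) WHERE {
  ?resource rdfs:label ?label .
  # "euro" is a placeholder for the user's search term.
  FILTER(STRSTARTS(LCASE(STR(?label)), "euro"))
}
```

Whether this is actually faster depends on the endpoint, so it is worth timing against each store you query.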
In your second query the DISTINCT applies to the combination of values of ?prop, ?title and ?example. Therefore you're not getting any duplicates. For instance, two rows from the second query that share the same ?prop and ?title aren't duplicates, because the third column, ?example, has two different values: "cent"@en and "centavo"@en.
One possible way to solve this is to use GROUP BY and MIN to get just the lowest ranked value for ?label and ?example, i.e.:
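A minimal sketch of the GROUP BY and MIN approach, assuming the label is bound to ?label in the pattern and projected as ?title to match the question; ?value and the dbo: prefix are likewise names introduced for illustration.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT ?prop
       (MIN(?label) AS ?title)     # lowest ranked label per property
       (MIN(?value) AS ?example)   # lowest ranked value per property
WHERE {
  ?thing a dbo:Currency ;
         ?prop ?value .
  ?prop rdf:type rdf:Property ;
        rdfs:label ?label .
}
GROUP BY ?prop                     # one result row per distinct ?prop
LIMIT 100
```

Because the rows are grouped by ?prop alone, each property appears once regardless of how many labels or values it has.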