HTTP Error 500 when running a SPARQL query against the DBpedia endpoint with Python's SPARQLWrapper

Posted on 2024-12-10 05:43:25

I am trying to retrieve all properties whose labels contain a certain string. I use the following query:

    SELECT ?p ?l count(?p) as ?count WHERE { 
    ?someobj ?p ?s .
    ?p a <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .        
    ?p <http://www.w3.org/2000/01/rdf-schema#label> ?l . 
    ?l bif:contains "string" . 
    FILTER (lang(?l) = 'en'). 
    FILTER (!isLiteral(?someobj)). 
    } ORDER BY DESC(?count) LIMIT 5

When I issue the query through the public DBpedia endpoint at http://dbpedia.org/sparql, it works and returns what I want. However, when I do the same through SPARQLWrapper in my Python script, I keep getting:

    File "E:\thesis\sem_web21.py", line 254, in findWord
      results = sparql.query().convert()
    File "build/bdist.linux-i686/egg/SPARQLWrapper/Wrapper.py", line 355, in query
      return QueryResult(self._query())
    File "build/bdist.linux-i686/egg/SPARQLWrapper/Wrapper.py", line 334, in _query
      raise e
    HTTPError: HTTP Error 500: SPARQL Request Failed

I have tried variations of the query: with and without counting and sorting, with and without a limit. I keep getting HTTP 500 errors. I don't think the endpoint is unstable, because other queries in the same script run without problems; only this query fails.

Similar queries to retrieve objects work fine (both at the public endpoint and through my script):

    SELECT ?s ?l count(?s) as ?count WHERE { 
    ?someobj ?p ?s . 
    ?s <http://www.w3.org/2000/01/rdf-schema#label> ?l . 
    ?l bif:contains "computer" . 
    FILTER (!regex(str(?s), '^http://dbpedia.org/resource/Category:')). 
    FILTER (!regex(str(?s), '^http://dbpedia.org/resource/List')). 
    FILTER (!regex(str(?s), '^http://sw.opencyc.org/')). 
    FILTER (lang(?l) = 'en'). 
    FILTER (!isLiteral(?someobj)). 
    } ORDER BY DESC(?count) LIMIT 20

Any idea what could be causing this? Or any idea how I could retrieve a more specific error? Thanks in advance.

云仙小弟 2024-12-17 05:43:25

I think this is a timeout on DBpedia's side, because the endpoint looks the pattern up in several different graphs. When you run the query through the DBpedia web interface, the request always includes the URI of the graph you are querying. So try adding that graph to your query:

    SELECT ?p ?l count(?p) as ?count FROM <http://dbpedia.org> WHERE { 
    ?someobj ?p ?s .
    ?p a <http://www.w3.org/1999/02/22-rdf-syntax-ns#Property> .        
    ?p <http://www.w3.org/2000/01/rdf-schema#label> ?l . 
    ?l bif:contains "string" . 
    FILTER (lang(?l) = 'en'). 
    FILTER (!isLiteral(?someobj)). 
    } ORDER BY DESC(?count) LIMIT 5

Then try it again.

I tested this with the following Python script:

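    import sys
    import traceback
    import urllib.parse
    import urllib.request

    def query_e(query, epr, soft_limit=True):
        # Python 3 port of the original urllib/urllib2 script.
        # soft_limit is kept from the original signature but is not used.
        try:
            # URL-encode the SPARQL query as the "query" parameter
            params = urllib.parse.urlencode({'query': query})
            opener = urllib.request.build_opener(urllib.request.HTTPHandler)
            # Send a GET request and ask for JSON results
            request = urllib.request.Request(epr + '?' + params, method='GET')
            request.add_header('Accept', 'application/json')
            url = opener.open(request)
            data = url.read()
            return data
        except Exception:
            # Print the full traceback, then re-raise the original exception
            traceback.print_exc(file=sys.stdout)
            raise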
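
On the second part of the question (getting a more specific error): an HTTP 500 response usually still has a body, and the server may put its own error message there. The sketch below is a minimal illustration using only the Python 3 standard library, not SPARQLWrapper. The function names `build_url`, `describe_http_error`, and `run_query` are hypothetical, and the 30-second timeout is an arbitrary choice. It also shows the other way to name the graph: instead of a `FROM` clause, the SPARQL protocol accepts a `default-graph-uri` request parameter.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_url(endpoint, query, default_graph=None):
    """Build a SPARQL protocol GET URL, optionally naming the default graph."""
    params = [("query", query)]
    if default_graph:
        # default-graph-uri is the SPARQL protocol parameter for the default graph
        params.append(("default-graph-uri", default_graph))
    return endpoint + "?" + urllib.parse.urlencode(params)


def describe_http_error(err):
    """Return the status, reason, and response body of an HTTPError as text."""
    body = err.read().decode("utf-8", errors="replace")
    return "HTTP %d %s\n%s" % (err.code, err.reason, body)


def run_query(endpoint, query, default_graph=None, timeout=30):
    """Run the query; on an HTTP error, print the server's message and re-raise."""
    request = urllib.request.Request(
        build_url(endpoint, query, default_graph),
        headers={"Accept": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # The body often says more than the bare "500" status line
        print(describe_http_error(err))
        raise
```

Calling `run_query("http://dbpedia.org/sparql", query, "http://dbpedia.org")` with the failing query should then print whatever explanation the endpoint sends back, if it sends one, before the exception propagates.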
