JSON字符串解码错误
我正在调用 URL:
http://code.google.com/feeds/issues/p/chromium/issues/full/291?alt=json
使用 urllib2 并使用 json 模块进行解码,
url = "http://code.google.com/feeds/issues/p/chromium/issues/full/291?alt=json"
request = urllib2.Request(query)
response = urllib2.urlopen(request)
issue_report = json.loads(response.read())
但遇到以下错误:
ValueError: Invalid control character at: line 1 column 1120 (char 1120)
我尝试检查标头,得到以下信息:
Content-Type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *
Expires: Sun, 03 Jul 2011 17:38:38 GMT
Date: Sun, 03 Jul 2011 17:38:38 GMT
Cache-Control: private, max-age=0, must-revalidate, no-transform
Vary: Accept, X-GData-Authorization, GData-Version
GData-Version: 1.0
ETag: W/"CUEGQX47eCl7ImA9WxJaFEw."
Last-Modified: Tue, 04 Aug 2009 19:20:20 GMT
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Server: GSE
Connection: close
我还尝试添加编码参数,如下所示:
issue_report = json.loads(response.read() , encoding = 'UTF-8')
我仍然遇到相同的错误。
I am calling the URL :
http://code.google.com/feeds/issues/p/chromium/issues/full/291?alt=json
using urllib2 and decoding using the json module
url = "http://code.google.com/feeds/issues/p/chromium/issues/full/291?alt=json"
request = urllib2.Request(query)
response = urllib2.urlopen(request)
issue_report = json.loads(response.read())
I run into the following error :
ValueError: Invalid control character at: line 1 column 1120 (char 1120)
I tried checking the header and I got the following :
Content-Type: application/json; charset=UTF-8
Access-Control-Allow-Origin: *
Expires: Sun, 03 Jul 2011 17:38:38 GMT
Date: Sun, 03 Jul 2011 17:38:38 GMT
Cache-Control: private, max-age=0, must-revalidate, no-transform
Vary: Accept, X-GData-Authorization, GData-Version
GData-Version: 1.0
ETag: W/"CUEGQX47eCl7ImA9WxJaFEw."
Last-Modified: Tue, 04 Aug 2009 19:20:20 GMT
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Server: GSE
Connection: close
I also tried adding an encoding parameter as follows :
issue_report = json.loads(response.read() , encoding = 'UTF-8')
I still run into the same error.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
此时,提要中包含来自 JPEG 的原始数据; JSON 格式错误,所以这不是你的错。向 Google 报告错误。
The feed has raw data from a JPEG in it at that point; the JSON is malformed, so it's not your fault. Report a bug to Google.
您可以考虑使用
lxml
代替,因为 JSON 格式错误。它的 XPath 支持使得使用 XML 变得非常简单:相当容易操作元素,例如从标签中剥离名称空间、转换为字典:
You could consider using
lxml
instead, since the JSON is malformed. It's XPath support makes working with XML pretty straight-forward:Fairly easy to manipulate elements, for instance to strip namespace from tags, convert to dict: