雅虎 BOSS Python 库，ExpatError

发布于 2024-08-04 14:37:55 字数 1607 浏览 13 评论 0原文

我尝试安装 Yahoo BOSS mashup 框架，但在运行提供的示例时遇到问题。示例 1、2、5 和 6 可以工作，但是 3 和 5 可以工作。 4 给出 Expat 错误。以下是 ex3.py 的输出：

gpython examples/ex3.py
    examples/ex3.py:33: Warning: 'as' will become a reserved keyword in Python 2.6
Traceback (most recent call last):
  File "examples/ex3.py", line 27, in <module>
    digg = db.select(name="dg", udf=titlef, url="http://digg.com/rss_search?search=google+android&area=dig&type=both&section=news")
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 214, in select
    tb = create(name, data=data, url=url, keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 201, in create
    return WebTable(name, d=rest.load(url), keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/crawl/rest.py", line 38, in load
    return xml2dict.fromstring(dl)
  File "/usr/lib/python2.5/site-packages/yos/crawl/xml2dict.py", line 41, in fromstring
    t = ET.fromstring(s)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 963, in XML
    parser.feed(text)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 1245, in feed
    self._parser.Parse(data, 0)
    xml.parsers.expat.ExpatError: syntax error: line 1, column 0

看起来两个示例在尝试查询 Digg.com 时都失败了。以下是在 ex3.py 的代码中构造的查询：

diggf = lambda r: {"title": r["title"]["value"], "diggs": int(r["diggCount"]["value"])}
digg = db.select(name="dg", udf=diggf, url="http://digg.com/rss_search?search=google+android&area=dig&type=both&section=news")

原文

I tried to install the Yahoo BOSS mashup framework, but am having trouble running the examples provided. Examples 1, 2, 5, and 6 work, but 3 & 4 give Expat errors. Here is the output from ex3.py:

gpython examples/ex3.py
    examples/ex3.py:33: Warning: 'as' will become a reserved keyword in Python 2.6
Traceback (most recent call last):
  File "examples/ex3.py", line 27, in <module>
    digg = db.select(name="dg", udf=titlef, url="http://digg.com/rss_search?search=google+android&area=dig&type=both§ion=news")
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 214, in select
    tb = create(name, data=data, url=url, keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/yql/db.py", line 201, in create
    return WebTable(name, d=rest.load(url), keep_standards_prefix=keep_standards_prefix)
  File "/usr/lib/python2.5/site-packages/yos/crawl/rest.py", line 38, in load
    return xml2dict.fromstring(dl)
  File "/usr/lib/python2.5/site-packages/yos/crawl/xml2dict.py", line 41, in fromstring
    t = ET.fromstring(s)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 963, in XML
    parser.feed(text)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 1245, in feed
    self._parser.Parse(data, 0)
    xml.parsers.expat.ExpatError: syntax error: line 1, column 0

It looks like both examples are failing when trying to query Digg.com. Here is the query that is constructed in ex3.py's code:

diggf = lambda r: {"title": r["title"]["value"], "diggs": int(r["diggCount"]["value"])}
digg = db.select(name="dg", udf=diggf, url="http://digg.com/rss_search?search=google+android&area=dig&type=both§ion=news")

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

暗藏城府 2024-08-11 14:37:55

问题是 digg 搜索字符串。应该是“s=”。不是“搜索=”

回复收藏 0 原文

彩虹直至黑白 2024-08-11 14:37:55

我相信示例中一定是一个错误：它正在获取 JSON 结果（实际上，如果您在浏览器中复制并粘贴该 URL，您将下载一个文件名 search.json ，该文件以 ie 开头，

{"results":[{"profile_image_url":
"http://a3.twimg.com/profile_images/255524395/KEN_OMALLEY_REVISED_normal.jpg",
"created_at":"Mon, 14 Sep 2009 14:52:07 +0000","from_user":"twilightlords",

完全正常的 JSON；但随后改为使用 json 或 simplejson 等模块解析它时，它会尝试将其解析为 XML —— 显然这种尝试失败了

（可能需要引起维护该代码的人的注意，以便他们可以合并它）。）要么要求 XML 而不是 JSON 输出，要么使用适当的方法解析生成的 JSON，而不是尝试将其视为 XML（不确定如何最好地实现任一更改，因为我不熟悉该代码）。

I believe that must be an error in the example: it's getting a JSON result (indeed if you copy and paste that URL in your browser, you'll download a file names search.json which starts with

{"results":[{"profile_image_url":
"http://a3.twimg.com/profile_images/255524395/KEN_OMALLEY_REVISED_normal.jpg",
"created_at":"Mon, 14 Sep 2009 14:52:07 +0000","from_user":"twilightlords",

i.e. perfectly normal JSON; but then instead of parsing it with modules such as json or simplejson, it tries to parse it as XML -- and obviously this attempt fails.

I believe the fix (which probably needs to be brought to the attention of whoever maintains that code so they can incorporate it) is either to ask for XML instead of JSON output, OR to parse the resulting JSON with appropriate means instead of trying to look at it as XML (not sure how to best implement either change, as I'm not familiar with that code).

回复收藏 0 原文

~没有更多了~