Classifying book authors as fiction vs. non-fiction

Posted on 2024-10-15 21:21:42

For my own personal purposes, I have a list of about 300 authors (full names) of various books. I want to partition this list into "fiction authors" and "non-fiction authors". If an author writes both, the category with the majority of their books wins.

I looked at the Amazon Product Search API: I can search by author (in Python), but I could not find a way to get the book category (fiction vs. the rest):

>>> node = api.item_search('Books', Author='Richard Dawkins')
>>> for book in node.Items.Item:
...     print(book.ItemAttributes.Title)

What are my options? I prefer to do this in Python.
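
The majority rule described above can be sketched independently of any book API. This is only a minimal illustration: the function name `classify_author`, the label strings, and the placeholder author names are all assumptions made for the example, and the per-book labels would have to come from whatever data source ends up providing them.

```python
from collections import Counter


def classify_author(book_labels):
    """Return 'fiction' or 'non-fiction' by majority over per-book labels.

    book_labels: iterable of strings, one per book, each expected to be
    'fiction' or 'non-fiction'. Returns None when there are no usable
    labels or when the two counts are tied.
    """
    counts = Counter(book_labels)
    fiction = counts["fiction"]
    non_fiction = counts["non-fiction"]
    if fiction == non_fiction:
        return None  # tie or no data: leave the decision to the caller
    return "fiction" if fiction > non_fiction else "non-fiction"


# Hypothetical, hand-made labels for illustration only
labels_by_author = {
    "Author A": ["fiction", "fiction", "non-fiction"],
    "Author B": ["non-fiction"],
    "Author C": [],
}

result = {name: classify_author(labels) for name, labels in labels_by_author.items()}
print(result)
```

Returning `None` for ties and empty lists keeps the undecided authors visible so they can be checked by hand.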

Comments (3)

轻拂→两袖风尘 2024-10-22 21:21:42

Well, you can try another service: the Google Book Search API. To use it from Python, have a look at gdata-python-api. In its protocol, each entry in the result feed has a <dc:subject> node, which is probably what you need:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/"
      xmlns:gbs="http://schemas.google.com/books/2008" 
      xmlns:dc="http://purl.org/dc/terms"
      xmlns:gd="http://schemas.google.com/g/2005">
  <id>http://www.google.com/books/feeds/volumes</id>
  <updated>2008-08-12T23:25:35.000</updated>

<!-- a lot of information here; those nodes were removed to save space -->
  <entry>

    <dc:creator>Jane Austen</dc:creator>
    <dc:creator>James Kinsley</dc:creator>
    <dc:creator>Fiona Stafford</dc:creator>
    <dc:date>2004</dc:date>
    <dc:description>
      If a truth universally acknowledged can shrink quite so rapidly into 
      the opinion of a somewhat obsessive comic character, the reader may reasonably feel ...
    </dc:description>
    <dc:format>382</dc:format>
    <dc:identifier>8cp-Z_G42g4C</dc:identifier>
    <dc:identifier>ISBN:0192802380</dc:identifier>
    <dc:publisher>Oxford University Press, USA</dc:publisher>
    <dc:subject>Fiction</dc:subject>
    <dc:title>Pride and Prejudice</dc:title>
    <dc:title>A Novel</dc:title>
  </entry>
</feed>

Of course, this protocol also returns some extra information about the book that you may not need (such as whether or not it is visible on Google Books).
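
Reading the <dc:subject> node out of a feed like the one above could look roughly like this. It is a sketch using only the standard library on a trimmed-down copy of the sample feed, not the gdata client itself; the helper names `subjects_per_entry` and `is_fiction` are made up for the example, and treating any subject that contains the word "fiction" as fiction is an assumption about how the subjects are worded.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample feed above
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "dc": "http://purl.org/dc/terms",
}

# Trimmed-down copy of the sample feed, for illustration only
SAMPLE_FEED = """<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/terms">
  <entry>
    <dc:creator>Jane Austen</dc:creator>
    <dc:subject>Fiction</dc:subject>
    <dc:title>Pride and Prejudice</dc:title>
  </entry>
</feed>"""


def subjects_per_entry(feed_xml):
    """Return a list of (title, [subjects]) tuples, one per <entry>."""
    root = ET.fromstring(feed_xml.encode("utf-8"))
    results = []
    for entry in root.findall("atom:entry", NS):
        title = entry.findtext("dc:title", default="", namespaces=NS)
        subjects = [s.text.strip() for s in entry.findall("dc:subject", NS) if s.text]
        results.append((title, subjects))
    return results


def is_fiction(subjects):
    """Assumed heuristic: some subject contains the standalone word 'fiction'."""
    return any("fiction" in s.lower().split() for s in subjects)


for title, subjects in subjects_per_entry(SAMPLE_FEED):
    print(title, subjects, is_fiction(subjects))
```

Matching on the standalone word keeps subjects such as "Non-fiction" from being counted as fiction, but real feeds may use other wording, so the heuristic should be checked against actual results.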

空袭的梦i 2024-10-22 21:21:42

Did you look at BrowseNodes? I have not used this API before, but BrowseNodes seem to correspond to Amazon's product categories. Maybe you can find more information there.

傲鸠 2024-10-22 21:21:42

After spending some time experimenting with the Amazon API, it looks like it doesn't provide the kind of information you want.

The documentation doesn't mention categories of that type, and if you serialise what the API sends back, there is not a single mention of fiction or non-fiction categories.

You can use the following to print a nicely formatted XML string containing everything the API sends (you might want to redirect it to a file for easier reading).

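from lxml import etree

# `api` is the same API client object used in the question
node = api.item_search('Books', Author='Richard Dawkins')

# Serialise the whole response as indented XML
print(etree.tostring(node, pretty_print=True).decode())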