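Retrieving data from a public Google spreadsheet using the gdata library?
I'm working in Python and trying to retrieve data from a public Google spreadsheet (this one, whose key appears in the code below), but I'm struggling with the developer documentation.
I'd like to avoid client authentication if possible, since it is a public spreadsheet.
Here is my current code, using the gdata library: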
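import gdata.spreadsheet.service
# Unauthenticated client for the Spreadsheets service
client = gdata.spreadsheet.service.SpreadsheetsService()
key = '0Atncguwd4yTedEx3Nzd2aUZyNmVmZGRHY3Nmb3I2ZXc'
# This is the call that raises BadStatusLine
worksheets_feed = client.GetWorksheetsFeed(key)
This fails on the last line, the GetWorksheetsFeed call, with a BadStatusLine error.
How can I read data from the spreadsheet?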
I want to start out by echoing your sentiment that the documentation is really poor. But here's what I've been able to figure out so far.
Published vs. Public
It is very important that your spreadsheet be "Published to the web" as opposed to just being "Public on the web." The first is achieved through the "File -> Publish to the web..." menu item. The second is achieved by clicking the "Share" button in the upper right-hand corner of the spreadsheet.
I checked, and your spreadsheet with key = '0Atncguwd4yTedEx3Nzd2aUZyNmVmZGRHY3Nmb3I2ZXc' is only "Public on the web." I made a copy of it to play around with for my example code. My copy has a key = '0Aip8Kl9b7wdidFBzRGpEZkhoUlVPaEg2X0F2YWtwYkE' which you will see in my sample code later.
This "Public on the web" vs. "Published to the web" nonsense is obviously a common point of confusion. It is actually documented in a red box in the "Visibilities and Projections" section of the main API documentation. However, that document is really hard to read.
Visibility and Projections
As that same document says, there are projections other than "full". In fact (this part is undocumented), "full" doesn't seem to play nicely with a visibility of "public".
You can more or less glean from the pydocs that many of the methods on the SpreadsheetsService object take "visibility" and "projection" parameters. I know only of the "public" and "private" visibilities. If you learn of any others, I'd like to know about them too. It seems that "public" is what you should use when making unauthenticated calls.
As for projections, it is even more complicated. I know of the "full", "basic", and "values" projections. I only found the "values" projection by luck, while reading the source code of the excellent Tabletop JavaScript library. And, guess what, that's the secret missing ingredient that makes things work.
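To make the two parameters concrete: in the legacy Spreadsheets feeds, visibility and projection appear as the last path segments of the feed URL. The sketch below only builds those URLs and makes no network call; the helper names worksheets_feed_url and list_feed_url are hypothetical and not part of gdata.

```python
# Sketch: where visibility and projection sit in a legacy feed URL.
# Both helper functions are hypothetical illustrations, not gdata APIs.

FEED_ROOT = "https://spreadsheets.google.com/feeds"


def worksheets_feed_url(key, visibility="public", projection="values"):
    """URL of the feed that lists the worksheets of one spreadsheet."""
    return "{0}/worksheets/{1}/{2}/{3}".format(
        FEED_ROOT, key, visibility, projection
    )


def list_feed_url(key, sheet_key, visibility="public", projection="values"):
    """URL of the row-by-row (list) feed of one worksheet."""
    return "{0}/list/{1}/{2}/{3}/{4}".format(
        FEED_ROOT, key, sheet_key, visibility, projection
    )
```

Seen this way, passing visibility='public' and projection='values' to a SpreadsheetsService method simply selects a different feed from the default private/full one.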
Working Code
Here is some code you can use to query the worksheets from my copy of your spreadsheet.
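A minimal sketch of such a query, assuming the legacy gdata package is installed and the spreadsheet has been published to the web. It is an illustration built from the parameters discussed above, not necessarily the exact listing from the original answer; make_client and worksheet_titles are hypothetical helper names.

```python
# Sketch only: assumes the legacy gdata package and a spreadsheet that has
# been "Published to the web". The helper names are illustrative.


def make_client():
    """Create an unauthenticated SpreadsheetsService client."""
    # Imported lazily so this module still loads where gdata is missing.
    from gdata.spreadsheet.service import SpreadsheetsService
    return SpreadsheetsService()


def worksheet_titles(client, key, projection="values"):
    """Return (title, worksheet id) pairs for a published spreadsheet."""
    feed = client.GetWorksheetsFeed(
        key, visibility="public", projection=projection
    )
    # Each entry's id is a URL; its last path segment is the worksheet id.
    return [
        (sheet.title.text, sheet.id.text.rsplit("/", 1)[-1])
        for sheet in feed.entry
    ]
```

Calling worksheet_titles(make_client(), '0Aip8Kl9b7wdidFBzRGpEZkhoUlVPaEg2X0F2YWtwYkE') would then list the worksheets in my copy, and the worksheet id in each pair is what the list feed expects as its sheet key.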
Tips
When working with terribly documented Python APIs, I find it really helpful to call the built-in dir() function in a running Python interpreter to find out what kind of information I can get from the objects. In this case it doesn't help much, because the abstraction on top of the XML- and URL-based API is pretty poor.
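As a small illustration of that kind of poking around, the helper below filters the output of dir() down to the public attribute names. The name public_names is hypothetical; only dir() itself is a Python built-in.

```python
# Sketch: list an object's public attributes while exploring an API.


def public_names(obj):
    """Return the attribute names of obj that do not start with '_'."""
    return [name for name in dir(obj) if not name.startswith("_")]
```

In an interpreter you might call public_names(feed) or public_names(feed.entry[0]) on whatever a gdata call returned, to see which fields are worth inspecting next.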
By the way, I'm sure you are going to want to start dealing with the actual data in the spreadsheet, so I'll toss in one more pointer. Each entry in the list feed is one row, and its data is available as a dictionary; for the first row, that is GetListFeed(key, sheet_key, visibility='public', projection='values').entry[0].custom.
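A sketch of how that pointer might be used to turn every row into a plain dictionary. It assumes, as in the legacy gdata package, that each value in entry.custom is an element object whose text attribute holds the cell contents; rows_as_dicts is a hypothetical helper name, and client is an unauthenticated SpreadsheetsService instance.

```python
# Sketch only: assumes a gdata SpreadsheetsService-like client and a
# worksheet that is reachable with visibility='public'.


def rows_as_dicts(client, key, sheet_key):
    """Return one {column name: cell text} dict per spreadsheet row."""
    feed = client.GetListFeed(
        key, sheet_key, visibility="public", projection="values"
    )
    rows = []
    for entry in feed.entry:
        # entry.custom maps column names to elements; keep just the text.
        rows.append(
            dict((name, cell.text) for name, cell in entry.custom.items())
        )
    return rows
```

The column names come from the header row of the worksheet, so the exact keys depend on the spreadsheet.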