Extracting information from tuples (Python)
I'm currently using the httplib library in Python 2.7 to obtain some headers from a website to establish a) the filesize of a download and b) the last modified date of the file. I've used some online tools and these details do exist.
I'm currently scripting my Python code and it appears to work correctly bringing back the required information. Nonetheless, the response containing the header information is a list containing a number of tuples. A sample of the response is below:-
[('content-length', '2501479'),
('accept-ranges', 'bytes'),
('vary', 'Accept-Encoding'),
('server', 'off'),
('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
('etag', '"2c8171a-262b67-4afb368edfffc"'),
('date', 'Thu, 20 Oct 2011 16:01:11 GMT'),
('content-type', 'text/plain')]
What I am looking to do is basically extract the file size ("2501479") and the date ("Thu, 20 Oct 2011 04:30:01 GMT"). Any ideas how I can go about doing this? I originally tried variable[0], but this returns ('content-length', '2501479'). How can I return solely the file size (in theory, the second part of the first tuple in the list)?
Comments (5)
First, you can make it a little easier to work with by turning your list of tuples into a dictionary:
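A minimal sketch of that conversion, using a shortened copy of the sample response from the question. The names headers and header_map are placeholders chosen for illustration:

```python
# Shortened sample of the list of (name, value) tuples shown in the question.
headers = [
    ('content-length', '2501479'),
    ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
    ('content-type', 'text/plain'),
]

# dict() accepts any iterable of 2-tuples, so lookups no longer depend on order.
header_map = dict(headers)

file_size = header_map['content-length']      # still a string at this point
last_modified = header_map['last-modified']
```

Looking values up by header name means the code keeps working if the server returns the headers in a different order.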
For the date, I would turn it into a datetime object using the email.utils.parsedate function:
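One way this could look, assuming the header value is the string from the question. email.utils.parsedate returns a time tuple rather than a datetime, so the sketch passes its first six fields to the datetime constructor; the variable names are placeholders:

```python
import datetime
import email.utils

last_modified = 'Thu, 20 Oct 2011 04:30:01 GMT'

# parsedate returns a 9-item time tuple, or None if the string cannot be parsed.
parsed = email.utils.parsedate(last_modified)
if parsed is None:
    raise ValueError('unparseable date header: %r' % last_modified)

# The first six items are year, month, day, hour, minute, second.
modified_at = datetime.datetime(*parsed[:6])
```

Note that the result is a naive datetime: parsedate does not carry the time zone through, which is harmless here only because the header is in GMT.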
First, convert the tuples into a dict, and then convert the value to int to get a number, for example int(dict(variable)['content-length']).
You simply have to index it again in order to access the element inside the tuple, like variable[0][1] for the size and variable[4][1] for the date of last modification.
Note: This only works when the indices of content-length and last-modified are always the same.
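A short sketch of indexing twice, using the sample response from the question. The name variable follows the question; the positions 0 and 4 are simply where the two headers happen to sit in that sample:

```python
variable = [
    ('content-length', '2501479'),
    ('accept-ranges', 'bytes'),
    ('vary', 'Accept-Encoding'),
    ('server', 'off'),
    ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
    ('etag', '"2c8171a-262b67-4afb368edfffc"'),
    ('date', 'Thu, 20 Oct 2011 16:01:11 GMT'),
    ('content-type', 'text/plain'),
]

# The first index selects a tuple from the list, the second selects its value.
file_size = variable[0][1]
last_modified = variable[4][1]
```

If the server ever adds, drops, or reorders a header, these positions will silently point at the wrong values.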
You've got tuples inside a list. Luckily, you can index into them the same way you index into the list itself.
So v = x[0] will give you, as you state, the tuple ('content-length', '2501479'), and v[0] will give you 'content-length' and v[1] will give you '2501479' (although you probably want to call int(v[1]) on that, perhaps with some error checking).
You may be better off putting that list into a dict, though, so you can be certain you are getting the content length even if the order should ever change.
Thankfully, the syntax is almost the same: it uses the [] operator. However, I am going to leave it to you to look at the Python documentation to see how to convert a list into a dict (can't do everything for you!!)
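For the int conversion with error checking mentioned above, a rough sketch might look like this. The helper name content_length is hypothetical, and returning None on failure is just one possible policy:

```python
def content_length(headers):
    """Return the content length as an int, or None if missing or malformed."""
    value = dict(headers).get('content-length')
    try:
        return int(value)
    except (TypeError, ValueError):
        # TypeError: header absent (value is None); ValueError: not a number.
        return None


size = content_length([('content-length', '2501479'), ('server', 'off')])
```

Using .get() instead of [] avoids a KeyError when the server omits the header altogether.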