Downloading a wmv from a url in Python 2.6
I have a wmv file at a particular url that I want to grab and save as a file using Python. My script uses urllib2 to authenticate and read the bytes and save them locally in chunks. However, once I open the file, no video player recognizes it. When I download the wmv manually from a browser, the file plays fine, but oddly enough ends up being about 500kb smaller than the file I end up with using Python. What's going on? Is there header information I need to somehow exclude?
Comments (3)
What Transfer-Encoding is the server sending back? I would bet it is sending back Transfer-Encoding: chunked, which is ending up in your data.
http://en.wikipedia.org/wiki/Chunked_transfer_encoding
From what I understand, urllib works at the HTTP level and should properly remove headers in subsequent chunks. I took a look at the data returned by read() and it's all bytes.
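I was writing my file with mode 'w' on a Windows machine. Writing binary data should be done with mode 'wb' or the EOLs will be incorrect: in text mode, Windows writes every \n byte as \r\n, which corrupts the video and would also explain why my file came out larger than the browser's download.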
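For anyone landing here later, this is a minimal sketch of the chunked save with the file opened in binary mode. The helper name `save_stream`, the chunk size, and the fake in-memory body are made up for illustration; the real script would pass in the response object returned by `urllib2.urlopen(...)` instead, which I am assuming behaves like any file-like object with a `read(size)` method.

```python
import io
import os
import tempfile


def save_stream(response, dest_path, chunk_size=8192):
    """Copy a file-like object to dest_path in chunks; return bytes written."""
    total = 0
    # 'wb' is the important part: binary mode writes the bytes unchanged,
    # whereas 'w' on Windows turns each "\n" byte into "\r\n".
    with open(dest_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty read means the stream is exhausted
                break
            out.write(chunk)
            total += len(chunk)
    return total


# Stand-in for the HTTP response body (hypothetical data, not a real wmv).
# It deliberately contains many "\n" bytes, which text mode would mangle.
fake_body = b"\x30\x26\xb2\x75" + b"line1\nline2\n" * 1000
dest = os.path.join(tempfile.mkdtemp(), "video.wmv")
written = save_stream(io.BytesIO(fake_body), dest)
```

With the file opened as 'wb', the size on disk should match the number of bytes read from the response, which is a quick sanity check against the size the browser download reports.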