Sync a local file with an HTTP server location (in Python)
I have an HTTP server that hosts a large file, and Python clients (GUI apps) that download it.
I want the clients to download the file only when needed, but to have an up-to-date copy on each run.
My idea is that each client requests the file on every run, sending the If-Modified-Since HTTP header with the modification time of the existing local file, if any. Can someone suggest how to do this in Python?
Can someone suggest an alternative, easy way to achieve my goal?
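For reference, the If-Modified-Since idea described above could be sketched roughly as follows, using only the standard library. The function names `build_conditional_request` and `sync_file` are made up for illustration, and the sketch assumes the server honours If-Modified-Since and that the client and server clocks are reasonably in sync, since the local modification time is the time of the last download rather than the server's own timestamp.

```python
import os
import shutil
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, local_path):
    """Build a GET request; add If-Modified-Since if a local copy exists."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        mtime = os.path.getmtime(local_path)
        # HTTP dates are expressed in GMT, e.g. "Sun, 06 Nov 1994 08:49:37 GMT".
        request.add_header("If-Modified-Since", formatdate(mtime, usegmt=True))
    return request


def sync_file(url, local_path):
    """Return True if a new copy was downloaded, False on 304 Not Modified."""
    request = build_conditional_request(url, local_path)
    try:
        # urlopen raises HTTPError for 304, so the local file is only
        # opened for writing when the server actually sends a body.
        with urllib.request.urlopen(request) as response, \
                open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False
        raise
    return True
```

A client would call `sync_file(url, path)` once at startup. Note that this simple version overwrites the local file in place, so an interrupted download leaves a truncated file behind.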
Comments (3)
You can add a header called ETag (a hash of your file: md5sum, sha256, etc.) to compare whether two files differ, instead of using the last-modified date.
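A minimal sketch of the ETag approach, assuming the server sends an ETag response header and honours If-None-Match on later requests. The helper names and the `.etag` sidecar file used to remember the last value are assumptions made for this example, not part of any library.

```python
import os
import shutil
import urllib.error
import urllib.request


def build_etag_request(url, local_path, etag_path):
    """Build a GET request; add If-None-Match if we hold a file and its ETag."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path) and os.path.exists(etag_path):
        with open(etag_path, "r", encoding="utf-8") as f:
            etag = f.read().strip()
        if etag:
            request.add_header("If-None-Match", etag)
    return request


def sync_with_etag(url, local_path):
    """Return True if a new copy was downloaded, False on 304 Not Modified."""
    etag_path = local_path + ".etag"  # hypothetical sidecar file
    request = build_etag_request(url, local_path, etag_path)
    try:
        with urllib.request.urlopen(request) as response:
            new_etag = response.headers.get("ETag")
            with open(local_path, "wb") as out:
                shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as err:
        if err.code == 304:  # server copy matches the saved ETag
            return False
        raise
    if new_etag:
        # Remember the ETag so the next run can send If-None-Match.
        with open(etag_path, "w", encoding="utf-8") as f:
            f.write(new_etag)
    return True
```

Unlike a date comparison, this does not depend on the client's clock, only on the server returning the same ETag for unchanged content.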
I'm assuming some things here, but...
One solution would be to have a separate script on the server (check.php) that creates a hash/checksum of each file you're hosting. If a file's checksum differs from that of the local copy, the client downloads the file. This means that if the content of the file on the server changes, the client will notice the change, since the checksum will differ.
Do an MD5 hash of the file contents, put it in a database or something similar, and check against it before downloading anything.
Your solution would work too, but it requires the server to actually include the "modified" date in the headers of its response to the GET request (some server software does not do this).
I'd suggest setting up a database that looks something like:
[ID] [File_name] [File_hash]
0001 moo.txt asd124kJKJhj124kjh12j
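On the client side, the check against the stored hash might look something like this. It is only a sketch: how the client obtains the server's hash (for example by querying check.php) is left out, and `file_md5` and `needs_download` are hypothetical names. The file is hashed in chunks so that large files are not read into memory at once.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(local_path, server_hash):
    """True if there is no local copy or its MD5 differs from server_hash."""
    if not os.path.exists(local_path):
        return True
    return file_md5(local_path) != server_hash.strip().lower()
```

The client would fetch the hash first (a few bytes) and only start the large download when `needs_download` returns True.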
It seems to me the easiest solution is hosting the file in Mercurial and using the Mercurial API to find the file's hash, downloading the file if the hash has changed.
Calculating the hash can be done as in the answer to this question; for downloading the file, urllib will be enough.
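For the download step only, a rough urllib sketch could look like the following; the Mercurial lookup is not shown. `download_to` is a made-up helper name. It streams the response into a temporary file next to the target and renames it at the end, so an interrupted download does not clobber the existing copy.

```python
import os
import shutil
import tempfile
import urllib.request


def download_to(url, local_path, chunk_size=1024 * 1024):
    """Stream url into local_path, replacing it only after a full download."""
    directory = os.path.dirname(os.path.abspath(local_path))
    # The temporary file lives in the same directory so os.replace
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as out, \
                urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, out, chunk_size)
        os.replace(tmp_path, local_path)
    except BaseException:
        # Clean up the partial file, then let the error propagate.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```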