Caching options in Python, or speeding up urlopen

Posted 2024-09-14 05:28:58

Hey all, I have a site that looks up info for the end user, is written in Python, and requires several urlopen commands. As a result it takes a while for a page to load. I was wondering if there is a way to make it faster? Is there an easy way to cache in Python, or a way to make the urlopen scripts run last?

The urlopens access the Amazon API to get prices, so the site needs to be somewhat up to date. The only option I can think of is to make a script that populates a MySQL db and run it every now and then, but that would be a nuisance.

Thanks!

Comments (5)

要走就滚别墨迹 2024-09-21 05:28:58

httplib2 understands HTTP request caching, abstracts away some of urllib/urllib2's messiness, and has other goodies like gzip support.

http://code.google.com/p/httplib2/
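
For example, pointing httplib2 at a cache directory is all it takes to get header-driven caching; the URL below is a hypothetical stand-in for your Amazon API call:

    import httplib2

    # ".cache" is a directory httplib2 creates and uses to store responses;
    # caching honours the HTTP headers (ETag, Cache-Control) the server sends.
    h = httplib2.Http(".cache")
    resp, content = h.request("http://example.com/prices")  # placeholder URL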

But besides using that to get the data, if the dataset is not very big, I would also implement some kind of function caching/memoization. For example:
http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize

It wouldn't be too hard to modify that decorator to allow for time-based expiry, e.g. only caching the result for 15 minutes.
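
A minimal sketch of such a time-based memoizer; get_price and its URL are hypothetical stand-ins for the real Amazon lookup:

    import time
    import functools
    import urllib2

    def timed_cache(seconds):
        # cache each result for `seconds` seconds (a minimal sketch)
        def decorator(func):
            cache = {}  # maps args -> (timestamp, result)
            @functools.wraps(func)
            def wrapper(*args):
                now = time.time()
                if args in cache and now - cache[args][0] < seconds:
                    return cache[args][1]
                result = func(*args)
                cache[args] = (now, result)
                return result
            return wrapper
        return decorator

    @timed_cache(15 * 60)  # keep each price for 15 minutes
    def get_price(asin):
        # hypothetical endpoint standing in for the real Amazon API call
        return urllib2.urlopen("http://example.com/price/" + asin).read()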

If the results are bigger, you need to start looking into memcached/redis.

流年已逝 2024-09-21 05:28:58

There are several things you can do.

  • The urllib caching mechanism is temporarily disabled, but you could easily roll your own by storing the data you get from Amazon in memory or in a file somewhere.

  • Similarly to the above, you could have a separate script that refreshes the prices every so often, and cron it to run every half an hour (say). These could be stored wherever.

  • You could run the URL fetching in a new thread/process, since it is mostly waiting anyway; a rough threaded sketch follows below.
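
As a sketch of that threaded option, assuming placeholder URLs in place of the real Amazon API calls:

    import threading
    import urllib2

    def fetch(url, results):
        # each thread spends most of its time blocked on network I/O,
        # so running the fetches in parallel cuts the total wait
        results[url] = urllib2.urlopen(url).read()

    urls = ["http://example.com/price/1",  # hypothetical price URLs
            "http://example.com/price/2"]
    results = {}
    threads = [threading.Thread(target=fetch, args=(u, results)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # results now maps each URL to its response body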

看海 2024-09-21 05:28:58

How often do the price(s) change? If they're pretty constant (say once a day, or every hour or so), just go ahead and write a cron script (or equivalent) that retrieves the values and stores them in a database or text file or whatever it is you need.
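
A minimal sketch of such a cron script, with a hypothetical endpoint and item list standing in for the real API:

    # refresh_prices.py -- run from cron, e.g.:  0 * * * * python refresh_prices.py
    import csv
    import time
    import urllib2

    ASINS = ["B000000001", "B000000002"]        # hypothetical item ids
    PRICE_URL = "http://example.com/price/%s"   # hypothetical endpoint

    out = open("prices.csv", "wb")
    writer = csv.writer(out)
    for asin in ASINS:
        price = urllib2.urlopen(PRICE_URL % asin).read().strip()
        writer.writerow([asin, price, int(time.time())])  # value + fetch time
    out.close()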

I don't know if you can check the timestamp data from the Amazon API - if they report that sort of thing.

各空 2024-09-21 05:28:58

You could use memcached. It is designed for exactly that, and this way you could easily share the cache with different programs/scripts. It is really easy to use from Python; check:

Good examples of python-memcache (memcached) being used in Python?

Then you update memcached when a key is not there (and also from some cron script), and you're ready to go.
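
A rough sketch with the python-memcache client; fetch_price_from_amazon is a hypothetical stand-in for the real API call:

    import urllib2
    import memcache  # python-memcache client

    mc = memcache.Client(["127.0.0.1:11211"])

    def fetch_price_from_amazon(asin):
        # placeholder for the real Amazon API call
        return urllib2.urlopen("http://example.com/price/" + asin).read()

    def get_price(asin):
        key = "price:" + asin
        price = mc.get(key)
        if price is None:
            price = fetch_price_from_amazon(asin)
            mc.set(key, price, time=15 * 60)  # memcached expires it after 15 min
        return price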

Another, simpler, option would be to cook your own cache, probably storing the data in a dictionary and/or using cPickle to serialize it to disk (if you want the data to be shared between different runs).
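
And a minimal sketch of that cPickle approach, with a hypothetical file name and item id:

    import os
    import time
    import cPickle as pickle

    CACHE_FILE = "prices.pickle"  # hypothetical location

    def load_cache():
        if os.path.exists(CACHE_FILE):
            return pickle.load(open(CACHE_FILE, "rb"))
        return {}

    def save_cache(cache):
        pickle.dump(cache, open(CACHE_FILE, "wb"), pickle.HIGHEST_PROTOCOL)

    cache = load_cache()
    cache["B000000001"] = ("19.99", time.time())  # store value plus timestamp
    save_cache(cache)

Storing a timestamp next to each value lets later runs decide whether an entry is still fresh enough to use.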

绻影浮沉 2024-09-21 05:28:58

If you need to grab from multiple sites at once you might try asyncore: http://docs.python.org/library/asyncore.html

This way you can easily load multiple pages at once.
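
A bare-bones sketch following the HTTP-client pattern from the asyncore documentation; the hosts below are placeholders:

    import asyncore
    import socket

    class PriceFetcher(asyncore.dispatcher):
        def __init__(self, host, path):
            asyncore.dispatcher.__init__(self)
            self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
            self.connect((host, 80))
            self.buffer = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
            self.data = []

        def handle_connect(self):
            pass

        def handle_close(self):
            self.close()

        def handle_read(self):
            self.data.append(self.recv(8192))

        def writable(self):
            return len(self.buffer) > 0

        def handle_write(self):
            sent = self.send(self.buffer)
            self.buffer = self.buffer[sent:]

    # placeholder hosts; asyncore.loop() drives all fetches concurrently,
    # and each fetcher's .data holds its raw response afterwards
    fetchers = [PriceFetcher("www.example.com", "/"),
                PriceFetcher("www.example.org", "/")]
    asyncore.loop()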
