The Memcache 1 MB limit in Google App Engine

Posted 2024-10-18 09:16:15

How do you store an object bigger than 1 MB in memcache? Is there a way to split it up, but still have the data accessible under the same key?


Comments (5)

笑梦风尘 2024-10-25 09:16:15


I use the following module ("blobcache") for storing values larger than 1 MB in GAE's memcache.

import pickle
import random
from google.appengine.api import memcache


MEMCACHE_MAX_ITEM_SIZE = 900 * 1024


def delete(key):
  chunk_keys = memcache.get(key)
  if chunk_keys is None:
    return False
  chunk_keys.append(key)
  memcache.delete_multi(chunk_keys)
  return True


def set(key, value):
  pickled_value = pickle.dumps(value)

  # delete previous entity with the given key
  # in order to conserve available memcache space.
  delete(key)

  pickled_value_size = len(pickled_value)
  chunk_keys = []
  for pos in range(0, pickled_value_size, MEMCACHE_MAX_ITEM_SIZE):
    # TODO: use memcache.set_multi() for speedup, but don't forget
    # about batch operation size limit (32Mb currently).
    chunk = pickled_value[pos:pos + MEMCACHE_MAX_ITEM_SIZE]

    # pos makes chunk keys distinct within a single value; the random
    # suffix distinguishes different values that may be written
    # concurrently under the same key. The ':' separators keep the
    # (key, pos, suffix) triple unambiguous.
    chunk_key = '%s:%d:%d' % (key, pos, random.getrandbits(31))

    is_success = memcache.set(chunk_key, chunk)
    if not is_success:
      return False
    chunk_keys.append(chunk_key)
  return memcache.set(key, chunk_keys)


def get(key):
  chunk_keys = memcache.get(key)
  if chunk_keys is None:
    return None
  chunks = []
  for chunk_key in chunk_keys:
    # TODO: use memcache.get_multi() for speedup.
    # Don't forget about the batch operation size limit (currently 32Mb).
    chunk = memcache.get(chunk_key)
    if chunk is None:
      return None
    chunks.append(chunk)
  pickled_value = ''.join(chunks)
  try:
    return pickle.loads(pickled_value)
  except Exception:
    return None

总攻大人 2024-10-25 09:16:15


There are memcache methods set_multi and get_multi that take a dictionary and a prefix as arguments.

If you could split your data into a dictionary of chunks you could use this. Basically, the prefix would become your new key name.

You'd have to keep track of the names of the chunks somehow. Also, ANY of the chunks could be evicted from memcache at any time, so you'd also need some way to reconstitute partial data.
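A minimal sketch of that idea, assuming hypothetical helper names: the pure splitting logic is exercised directly, while the actual GAE `memcache.set_multi` / `get_multi` calls (which accept a `key_prefix` argument) are shown only as comments since they need the App Engine runtime:

```python
CHUNK_SIZE = 900 * 1024


def to_chunk_dict(data):
    """Split bytes into {'0': ..., '1': ...}, suitable for set_multi."""
    return {str(i): data[pos:pos + CHUNK_SIZE]
            for i, pos in enumerate(range(0, len(data), CHUNK_SIZE))}


def from_chunk_dict(chunks):
    """Reassemble the dict returned by get_multi."""
    return b''.join(chunks[str(i)] for i in range(len(chunks)))


# On GAE you would then do, roughly:
#   memcache.set_multi(to_chunk_dict(data), key_prefix='mykey:')
#   chunks = memcache.get_multi([str(i) for i in range(n)], key_prefix='mykey:')
#   if len(chunks) == n:      # all chunks survived eviction
#       data = from_chunk_dict(chunks)
#   else:                      # partial data -- rebuild from the source

data = b'a' * (2 * 1024 * 1024)
chunks = to_chunk_dict(data)
assert from_chunk_dict(chunks) == data
```

Using the chunk index as the key and the original key as the prefix means you only need to remember the chunk count (or store it under the bare key) to reassemble the value.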

铜锣湾横着走 2024-10-25 09:16:15


The best way to store a large blob of data into memcache is to split it up into chunks and use set_multi and get_multi to efficiently store and retrieve the data.

But be aware that it's possible for some parts to drop from cache and others to remain.

You can also cache data in the application instance by storing it in a global variable, but this is less ideal as it won't be shared across instances and is more likely to disappear.

Support for uploading to the blobstore from within the application is on the GAE roadmap; you might want to keep an eye out for that, as well as for integration with Google Storage.

著墨染雨君画夕 2024-10-25 09:16:15


As other guys have mentioned, you can add and retrieve multiple values from memcache at once. Interestingly, while the App Engine blog says these bulk operations can handle up to 32 MB, the official documentation still says they're limited to 1 MB. So definitely test it out, and maybe pester Google about updating their documentation. Also keep in mind that some of your chunks might get evicted from memcache before others.

I'd recommend googling "python compress string" and thinking about serializing and compressing your object before sending it to memcache.
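A simple sketch of that suggestion using the standard-library `zlib` module (the helper names are made up for illustration): pickle the object, compress the bytes, and cache the compressed blob, which for repetitive data often lands well under the 1 MB limit:

```python
import pickle
import zlib


def compress_for_cache(obj):
    # Pickle first, then zlib-compress the resulting bytes.
    return zlib.compress(pickle.dumps(obj))


def decompress_from_cache(blob):
    return pickle.loads(zlib.decompress(blob))


value = {'rows': ['row-%d' % i for i in range(50000)]}
blob = compress_for_cache(value)
assert decompress_from_cache(blob) == value
# Repetitive data compresses far below its pickled size.
assert len(blob) < len(pickle.dumps(value))
```

The trade-off is CPU time on every get and set, so it pays off mainly for values that compress well and are read infrequently relative to their size.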

You might also want to ask this guy what he means about having an extension that allows him to store larger objects in memcache.

梦开始←不甜 2024-10-25 09:16:15


A nice workaround is to use layer_cache.py, a Python class written and used at Khan Academy (open source). Basically it's a combination of an in-memory cache (the cachepy module) with memcache used as a way of syncing the in-memory cache across instances. Find the source here and read Ben Kamens' blog post about it here.
