A solution for storing 300 MB in memory on Google App Engine
I am using Google App Engine in Python. I have 5000 people in my database. The entire list of 5000 people objects takes up 300 MB of memory.
I have been trying to store this in memory using blobcache, a module written [here][1].
I am running into pickle "OutOfMemory" issues, and am looking for a solution that involves storing these 5000 objects into a database, and then retrieving them all at once.
My person model looks like this.
class PersonDB(db.Model):
    serialized = db.BlobProperty()
    pid = db.StringProperty()
Each person is an object that has many attributes and methods associated with it, so I decided to pickle each person object and store it as the serialized field. The pid just allows me to query the person by their id. My person looks something like this:
class Person():
    def __init__(self, sex, mrn, age):
        self.sex = sex
        self.age = age  # exact age
        self.record_number = mrn
        self.locations = []

    def makeAgeGroup(self, ageStr):
        ageG = ageStr
        return int(ageG)

    def addLocation(self, healthdistrict):
        self.locations.append(healthdistrict)
When I store all 5000 people at once into my database, I get a Server 500 error. Does anyone know why? My code for this is as follows:
# people is my list of 5000 people objects
def write_people(self, people):
    for person in people:
        personDB = PersonDB()
        personDB.serialized = pickle.dumps(person)
        personDB.pid = person.record_number
        personDB.put()
How would I retrieve all 5000 of these objects at once in my App Engine method?
My idea is to do something like this:
def get_patients(self):
    # Get my list of 5000 people back from the database
    people_from_db = db.GqlQuery("SELECT * FROM PersonDB")
    people = []
    for person in people_from_db:
        people.append(pickle.loads(person.serialized))
    return people
Thanks for the help in advance, I've been stuck on this for a while!!
3 Answers
You should not have all 5000 users in memory at once. Only retrieve the one you need.
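For example, a minimal sketch of fetching a single person by pid, assuming the PersonDB model and pickled serialized field from the question (the helper name get_patient is illustrative, not from the original post):

import pickle

def get_patient(pid):
    # Fetch only the one PersonDB entity whose pid matches, instead of all 5000
    person_db = PersonDB.all().filter('pid =', pid).get()
    if person_db is None:
        return None
    return pickle.loads(person_db.serialized)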
For this size of data, why not use the blobstore and memcache?
In terms of performance (from highest to lowest):
Check out the Google I/O videos from this year; there is a great one on using the blobstore for exactly this sort of thing. There is a significant performance (and cost) penalty associated with the DB for some use cases.
(for the pedantic readers, the read performance of the last three will be effectively the same, but there are significant differences in write time/cost)
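A rough sketch of the memcache layer for individual records, assuming the PersonDB model from the question (the name get_patient_cached and the one-hour expiry are illustrative assumptions; note that each memcache value is capped at roughly 1 MB, so the full 300 MB list cannot be cached as a single value):

import pickle
from google.appengine.api import memcache

def get_patient_cached(pid):
    # Try memcache first; fall back to the datastore on a miss
    data = memcache.get('person:' + pid)
    if data is None:
        person_db = PersonDB.all().filter('pid =', pid).get()
        if person_db is None:
            return None
        data = person_db.serialized
        # Cache per person (memcache values are limited to ~1 MB each),
        # with an illustrative one-hour expiry
        memcache.set('person:' + pid, data, time=3600)
    return pickle.loads(data)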
You could also check out the PerformanceEngine project for App Engine:
https://github.com/ocanbascil/PerformanceEngine