Storing 300 MB in memory on Google App Engine

I am using Google App Engine in Python. I have 5000 people in my database. The entire list of 5000 person objects takes up 300 MB of memory.

I have been trying to store this in memory using blobcache, a module written [here][1].

I am running into pickle "OutOfMemory" issues, and am looking for a solution that involves storing these 5000 objects into a database, and then retrieving them all at once.

My person model looks like this.

from google.appengine.ext import db

class PersonDB(db.Model):
    serialized = db.BlobProperty()    # pickled Person instance
    pid = db.StringProperty()         # record number, used to look the person up

Each person is an object that has many attributes and methods associated with it, so I decided to pickle each person object and store it as the serialized field. The pid just allows me to query the person by their id. My person looks something like this

class Person():
    def __init__(self, sex, mrn, age):
        self.sex = sex
        self.age = age  # exact age
        self.record_number = mrn
        self.locations = []

    def makeAgeGroup(self, ageStr):
        return int(ageStr)

    def addLocation(self, healthdistrict):
        self.locations.append(healthdistrict)
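
For reference, looking a single person up again by pid with the model above could be done roughly like the sketch below; get_person and person_id are illustrative names, not part of the original code, and PersonDB is assumed to be the model defined earlier:

import pickle

def get_person(person_id):
    # pid stores the same string as person.record_number, so we can filter on it
    row = PersonDB.all().filter('pid =', person_id).get()
    if row is None:
        return None
    # Unpickle the stored blob back into a Person instance
    return pickle.loads(row.serialized)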

When I store all 5000 people at once into my database, I get a Server 500 error. Does anyone know why? My code for this is as follows:

# people is my list of 5000 Person objects
def write_people(self, people):
    for person in people:
        personDB = PersonDB()
        personDB.serialized = pickle.dumps(person)
        personDB.pid = person.record_number
        personDB.put()
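
One likely reason for the 500 is the request deadline: the loop above makes 5000 separate datastore round trips, one put() per person. Below is a minimal sketch of the same write using batched puts; db.put() taking a list of entities is part of the old db API, but the 500-entity batch size and the restructured loop are assumptions, not something from the original code:

import pickle

from google.appengine.ext import db

def write_people(self, people, batch_size=500):
    # Build all the entities first, then write them in batches so each
    # datastore RPC covers many entities instead of one put() per person.
    entities = [PersonDB(serialized=pickle.dumps(p, pickle.HIGHEST_PROTOCOL),
                         pid=p.record_number)
                for p in people]
    for start in range(0, len(entities), batch_size):
        db.put(entities[start:start + batch_size])

Even batched, a write of this size may still be better run from a task queue than from a user-facing request.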

How would I retrieve all 5000 of these objects at once in my App Engine method?

My idea is to do something like this

def get_patients(self):
    # Get my list of 5000 people back from the database
    people_from_db = db.GqlQuery("SELECT * FROM PersonDB")
    people = []
    for person in people_from_db:
        people.append(pickle.loads(person.serialized))
    return people
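
As a variation on the idea above, an explicit fetch() pulls all rows in one call instead of iterating the query in small implicit batches; a rough sketch, where the 5000 limit simply matches the known number of entities:

import pickle

from google.appengine.ext import db

def get_patients(self):
    # Fetch every PersonDB row at once, then unpickle each stored blob
    rows = db.GqlQuery("SELECT * FROM PersonDB").fetch(5000)
    return [pickle.loads(row.serialized) for row in rows]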

Thanks for the help in advance, I've been stuck on this for a while!!

3 Answers

忆沫 2024-11-25 02:37:58

You should not have all 5000 users in memory at once. Only retrieve the one you need.

渔村楼浪 2024-11-25 02:37:58

For this size of data why not use a blobstore and memcache?

In terms of performance (from highest to lowest):

  • local instance memory (your data set is too large)
  • memcache (partition your data into several keys and you should be fine, and it's very fast! see the partitioning sketch after this answer)
  • blobstore + memcache (persist to blobstore rather than DB)
  • db + memcache (persist to db)

Check out the Google IO videos from this year; there is a great one on using the blobstore for exactly this sort of thing. There is a significant performance (and cost) penalty associated with the DB for some use cases.

(for the pedantic readers, the read performance of the last three will be effectively the same, but there are significant differences in write time/cost)
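
To make the memcache option above concrete: each memcache value is limited to roughly 1 MB, so a large pickled payload has to be split across several keys. A minimal sketch of that partitioning follows; the helper names, key scheme, and 900 KB chunk size are illustrative choices, not something from the answer:

import pickle

from google.appengine.api import memcache

CHUNK_SIZE = 900 * 1024  # stay safely below memcache's ~1 MB per-value limit

def cache_blob(key_prefix, data):
    # Pickle once, then spread the byte string over numbered memcache keys.
    blob = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
    chunks = {}
    index = 0
    for offset in range(0, len(blob), CHUNK_SIZE):
        chunks['%s:%d' % (key_prefix, index)] = blob[offset:offset + CHUNK_SIZE]
        index += 1
    chunks['%s:count' % key_prefix] = index
    memcache.set_multi(chunks)

def load_blob(key_prefix):
    count = memcache.get('%s:count' % key_prefix)
    if count is None:
        return None
    keys = ['%s:%d' % (key_prefix, i) for i in range(count)]
    parts = memcache.get_multi(keys)
    if len(parts) != count:
        return None  # a chunk was evicted; fall back to blobstore or the datastore
    return pickle.loads(''.join(parts[k] for k in keys))

If any chunk has been evicted, load_blob returns None and the caller would fall back to blobstore or the datastore and re-populate the cache.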

水水月牙 2024-11-25 02:37:58

You could also check out the PerformanceEngine project for App Engine:
https://github.com/ocanbascil/PerformanceEngine
