Is there any way to make the Django remote API run faster on GAE?

Posted on 2024-11-17 22:38:13

Following up on this question here.

I finally wrote up a code generation tool to wrap all my database data into something like this:

Pdtfaamt(fano=212373,comsname='SMM',pdtcode='20PLFCL',kind='1',fatype='S',itemno='A3',itemamt=75,type=0).save()
Pdtfaamt(fano=212374,comsname='SMM',pdtcode='20PLFCL',kind='1',fatype='S',itemno='E1',itemamt=75,type=0).save()
Pdtfaamt(fano=212375,comsname='SMM',pdtcode='20PLFCL',kind='1',fatype='S',itemno='E6',itemamt=75,type=0).save()
Pdtfaamt(fano=212376,comsname='SMM',pdtcode='20PLFCL',kind='1',fatype='C',itemno='A3',itemamt=3,type=1).save()

Yes, that's right! I pulled the entire database out and transformed the data into data-population code so that I can migrate my database up to GAE.

So I deployed the django-nonrel project and used the django-nonrel remote API to trigger the data population process.

It works okay, except for one problem: it's extremely slow. Could anyone tell me how I can improve the speed? By my calculation, it may take up to 30 days to get all my data up and running on GAE.

P.S. I am using django-nonrel, with djangoappengine as the backend.

Comments (2)

笑叹一世浮沉 2024-11-24 22:38:13

Write your import script to take advantage of Python's multiprocessing Pool:

import multiprocessing

def import_thing(data):
    # Build one entity from a dict of field values and save it.
    thing = ThingEntity(**data)
    thing.put()

def main():
    # One dict per record; keys must be quoted strings.
    data = [{'fano': '212374', 'comsname': 'SMM', },
            {'fano': '212374', 'comsname': '212375', },
            # ...etc
            ]
    pool = multiprocessing.Pool(4)  # 4 worker processes import items in parallel
    pool.map(import_thing, data)

Since the App Engine production servers like having lots of connections, you should experiment with the pool size to find the best number. This will not help when importing to the dev server, as it's single-threaded.

Also important: make sure you put entities in batches of, say, 10-20 rather than one at a time, or the round-trips will kill your performance. So an improved script should work in chunks like:

# Each sublist is one batch; import_batch (not shown) receives one sublist per call.
data = [
    [item1, item2, item3],
    [item4, item5, item6],
    [item7, item8, item9],
]
pool.map(import_batch, data)
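
The answer leaves import_batch and the chunking step undefined, so here is a minimal sketch of how they might look. The names chunked, put_many and the batch_size/workers parameters are assumptions made for illustration, not part of the original answer. put_many is a stand-in so the sketch runs without the App Engine SDK; on App Engine it would be replaced by a single call that accepts a list of entities (for example db.put from google.appengine.ext.db).

```python
import multiprocessing


def chunked(items, size):
    """Split a list into consecutive sublists of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def put_many(entities):
    """Hypothetical stand-in for one batched datastore write.

    On App Engine, a single call that takes a list of entities would go
    here, so that the whole batch costs one round-trip.
    """
    return len(entities)


def import_batch(batch):
    """Build every entity in the batch, then write them all in one call."""
    # dict(row) stands in for ThingEntity(**row) from the answer above.
    entities = [dict(row) for row in batch]
    return put_many(entities)


def main(rows, batch_size=20, workers=4):
    """Import `rows` in batches, spread across `workers` processes."""
    batches = chunked(rows, batch_size)
    pool = multiprocessing.Pool(workers)
    try:
        written = pool.map(import_batch, batches)
    finally:
        pool.close()
        pool.join()
    return sum(written)  # total number of records written
```

Call main(rows) from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Both batch_size and workers are worth varying, since the best values depend on entity size and on how many concurrent connections the production servers handle well.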
焚却相思 2024-11-24 22:38:13

You probably want to look into the Mapper API.
