Populating a Google App Engine application's datastore with 20,000 strings
I'm trying to create and store 20,000 random codes in my local datastore before trying this on appspot. This is the model:
    from google.appengine.ext import db

    class PromotionCode(db.Model):
        code = db.StringProperty(required=True)
And this is the class that handles the populate request (only a logged-in admin may use it). It creates random alphanumeric codes and tries to store 20,000 of them in the datastore:
    import string
    from random import choice

    from google.appengine.ext import webapp

    class Populate(webapp.RequestHandler):
        def GenerateCode(self):
            # Build an 8-character code from letters and digits, then uppercase it.
            chars = string.letters + string.digits
            code = ""
            for i in range(8):
                code = code + choice(chars)
            return code.upper()

        def get(self):
            codes = ""
            code_list = []
            for i in range(20000):
                new_code = self.GenerateCode()
                promotion_code = PromotionCode(code=new_code)
                code_list.append(promotion_code)
                codes = codes + "<br>" + new_code
            # Single batch put of all 20,000 entities.
            db.put(code_list)
            self.response.out.write("populating datastore...<br>")
            self.response.out.write(codes)
I thought I could try batching all those put() calls, so I build a list of entities (code_list) and store it with a single db.put(). It takes 2-5 minutes to run locally.
Is it possible to do it faster without using the bulkuploader option? As it stands, I'm getting a 500 server error. Or maybe it could be done in consecutive calls or steps...
Comments (4)
Why not just change your code above to insert 100 at a time, and just run something like:
from your command line? The entries are random anyway, you don't need to remember any state.
You can get the ACSID cookie by logging in manually and inspecting the cookies from your browser.
The sleep between requests will prevent you from spinning up a gigantic number of instances or hitting short-term quotas.
The task queue suggestion is good if this is something you need to automate, but if it's a one-time thing you might as well keep it simple.
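The command-line snippet this answer refers to did not survive, so here is a rough Python sketch of the same idea, not the original command: hit the populate URL once per batch of 100, with a pause between requests. The URL, the cookie placeholder, and the names fetch_with_cookie and run_batches are all made up for illustration, and it assumes the handler has already been changed to insert 100 codes per request.

```python
import time
import urllib.request


def fetch_with_cookie(url, acsid):
    """Issue one GET carrying the admin ACSID cookie and return the HTTP status."""
    request = urllib.request.Request(url, headers={"Cookie": "ACSID=" + acsid})
    with urllib.request.urlopen(request) as response:
        return response.status


def run_batches(fetch, total=20000, batch_size=100, pause=1.0, sleep=time.sleep):
    """Call fetch() once per batch, sleeping between calls.

    Returns the number of requests made.
    """
    requests_needed = -(-total // batch_size)  # ceiling division
    for i in range(requests_needed):
        status = fetch()
        if status != 200:
            raise RuntimeError("request %d failed with status %s" % (i + 1, status))
        if i < requests_needed - 1:
            sleep(pause)  # spread the load; no sleep needed after the last call
    return requests_needed


# Hypothetical usage (URL and cookie value are placeholders):
# url = "https://your-app-id.appspot.com/populate"
# run_batches(lambda: fetch_with_cookie(url, "PASTE_ACSID_VALUE_HERE"))
```

With the defaults this makes 200 requests, one per batch of 100, and stops at the first response that is not a 200 so a failed batch does not go unnoticed.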
Can you batch the process in task queues?
Set a high batch size for each task in the queue...
You can achieve it faster that way.
I don't understand why you have to create 20,000 in advance as opposed to creating each as needed on the fly, but I bet you could speed up your code quite a bit. Something like this (untested):
Not printing out the codes may save time.
I'm sure others here can do better...
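The untested snippet from this answer is missing here, so the following is only a guess at the kind of speed-up meant, not the answerer's code. It builds each code with a single join instead of repeated string concatenation, skips building the HTML output, and slices the result into smaller batches. Two things differ from the question's code and are my assumptions: it draws directly from uppercase letters and digits (the original uppercases a mixed-case draw, so each letter is twice as likely as each digit), and it discards duplicate codes. The names generate_code, generate_unique_codes and batched are hypothetical.

```python
import random
import string

# 36 symbols: A-Z and 0-9, drawn uniformly (an assumption, see the note above).
ALPHABET = string.ascii_uppercase + string.digits


def generate_code(length=8, rng=random):
    """Return one random code built with a single join."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def generate_unique_codes(count, length=8, rng=random):
    """Return `count` distinct codes; the set silently drops collisions."""
    codes = set()
    while len(codes) < count:
        codes.add(generate_code(length, rng))
    return list(codes)


def batched(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

In the handler, each slice from batched() could be turned into PromotionCode entities and passed to db.put() separately, with a batch size chosen to stay within whatever limits apply to a single put.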
If your task won't complete in the 30 second request deadline, you can break it up into chunks - which should be easy since they're all doing the same thing - and run them in tasks on the Task Queue. You should probably do all your work there anyway, so you don't force the user to wait for it to complete before returning a response.
Like Jeff, though, I'm puzzled why you'd want to generate 20,000 of these upfront rather than just generating them when you need them.
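To make the chunking idea concrete, here is a small sketch of how the 20,000 codes might be split into task-sized pieces. It only plans the chunk sizes and hands each one to a callback; the actual enqueueing is left as a parameter, since it depends on your setup. The names plan_chunks, enqueue_chunks and store_codes are hypothetical, and the chunk size of 500 is just an example value.

```python
def plan_chunks(total, chunk_size):
    """Split `total` items into chunk sizes, e.g. 250 by 100 gives [100, 100, 50]."""
    if total < 0 or chunk_size <= 0:
        raise ValueError("total must be >= 0 and chunk_size must be > 0")
    full, rest = divmod(total, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])


def enqueue_chunks(total, chunk_size, enqueue):
    """Call enqueue(size) once per chunk and return the number of tasks created.

    On App Engine, `enqueue` might wrap the deferred library, for example
    lambda n: deferred.defer(store_codes, n), where store_codes is a
    hypothetical function that generates and puts n codes.
    """
    sizes = plan_chunks(total, chunk_size)
    for size in sizes:
        enqueue(size)
    return len(sizes)
```

The request handler then only has to enqueue the tasks and return, so each task does a small amount of work well inside its own deadline.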