Amazon Elastic MapReduce - keeping the server alive?
I am testing jobs in EMR, and every test takes a long time to start up. Is there a way to keep the server/master node alive in Amazon EMR? I know this can be done with the API, but I would like to know whether it can also be done in the AWS console.
Comments (3)
You cannot do this from the AWS console. To quote the developer guide
You can only do this via the CLI or the API, by creating a job flow and then adding steps to it.
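As a rough illustration of creating a job flow that stays up, here is a minimal Python sketch. It assumes boto3 (the current AWS SDK for Python, which is not mentioned in this thread) and builds the arguments for its `run_job_flow` call with `KeepJobFlowAliveWhenNoSteps` set, so the cluster waits for more steps instead of terminating. The helper names, bucket, release label, instance types, and region are placeholders, and the default EMR roles are assumed to exist in the account.

```python
def build_keep_alive_cluster_request(name, log_uri, instance_count=3):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All concrete values (release label, instance types, role names)
    are assumptions for illustration; adjust them to your account.
    """
    return {
        "Name": name,
        "LogUri": log_uri,
        "ReleaseLabel": "emr-6.15.0",  # assumed release label
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance type
            "SlaveInstanceType": "m5.xlarge",   # assumed instance type
            "InstanceCount": instance_count,
            # The key setting: keep the cluster running when no steps remain.
            "KeepJobFlowAliveWhenNoSteps": True,
            "TerminationProtected": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default role
        "ServiceRole": "EMR_DefaultRole",      # assumed default role
    }


def start_cluster(request, region="us-east-1"):
    """Send the request to EMR and return the new cluster (job flow) ID."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("emr", region_name=region)
    response = client.run_job_flow(**request)
    return response["JobFlowId"]


if __name__ == "__main__":
    request = build_keep_alive_cluster_request(
        "test-cluster", "s3://example-bucket/logs/"  # hypothetical bucket
    )
    print(request["Instances"]["KeepJobFlowAliveWhenNoSteps"])
```

A cluster started this way keeps accruing charges until it is terminated explicitly, so it is worth shutting it down once testing is finished.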
You can't do this with the web console, but through the API and programming tools you can add multiple steps to a long-running job flow, which is what I do. That way you can fire off jobs one after another on the same long-running cluster without having to create a new one each time.
If you are familiar with Python, I highly recommend the Boto library. The other AWS API tools let you do this as well.
If you follow the Boto EMR tutorial, you'll find some examples.
Just to give you an idea, this is what I do (with streaming jobs):
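The poster's own snippet is not preserved here. As a stand-in, the following is a minimal sketch of adding a streaming step to an already running cluster. It assumes boto3 rather than the original Boto library, and the helper names, S3 paths, and cluster ID are hypothetical. The step runs Hadoop streaming through `command-runner.jar`, and `add_job_flow_steps` queues it on the existing cluster.

```python
def build_streaming_step(name, mapper_uri, reducer_uri, input_uri, output_uri):
    """Build one Hadoop streaming step definition for add_job_flow_steps."""
    # Hadoop streaming refers to the scripts by file name once they are
    # shipped to the nodes via -files.
    mapper_file = mapper_uri.rsplit("/", 1)[-1]
    reducer_file = reducer_uri.rsplit("/", 1)[-1]
    return {
        "Name": name,
        # CONTINUE keeps the cluster alive even if this step fails.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hadoop-streaming",
                "-files", f"{mapper_uri},{reducer_uri}",
                "-mapper", mapper_file,
                "-reducer", reducer_file,
                "-input", input_uri,
                "-output", output_uri,
            ],
        },
    }


def submit_steps(cluster_id, steps, region="us-east-1"):
    """Queue steps on an existing cluster and return their step IDs."""
    import boto3  # imported here so the step builder works without boto3

    client = boto3.client("emr", region_name=region)
    response = client.add_job_flow_steps(JobFlowId=cluster_id, Steps=steps)
    return response["StepIds"]


if __name__ == "__main__":
    step = build_streaming_step(
        "wordcount",
        "s3://example-bucket/code/mapper.py",   # hypothetical paths
        "s3://example-bucket/code/reducer.py",
        "s3://example-bucket/input/",
        "s3://example-bucket/output/run-1/",
    )
    print(step["HadoopJarStep"]["Args"])
    # submit_steps("j-XXXXXXXXXXXXX", [step])  # placeholder cluster ID
```

Each test run can then be submitted as a new step with a fresh output path, which avoids waiting for a cluster to boot every time.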
To keep the machine alive, start an interactive Pig session; the machine then won't shut down. You can then execute your map/reduce logic from the command line using: