Amazon Elastic MapReduce - keep the server alive?

Posted 2024-08-26 15:27:55


I am testing jobs in EMR, and every test takes a long time to start up. Is there a way to keep the server/master node alive in Amazon EMR? I know this can be done with the API, but I wanted to know whether it can also be done in the AWS console.


3 Answers

白色秋天 2024-09-02 15:27:55


You cannot do this from the AWS console. To quote the developer guide:

The Amazon Elastic MapReduce tab in the AWS Management Console does not support adding steps to a job flow.

You can only do this via the CLI or API, by creating a job flow and then adding steps to it:

$ ./elastic-mapreduce --create --alive --stream
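The elastic-mapreduce Ruby CLI shown above has since been retired. Assuming the current AWS CLI, a rough equivalent would be the following sketch (cluster name, release label, instance settings, and the cluster ID are placeholders, not values from this thread):

```shell
# Create a cluster that keeps running after its steps finish;
# --no-auto-terminate is the keep-alive flag in the modern CLI.
aws emr create-cluster \
    --name "my-long-running-cluster" \
    --release-label emr-6.15.0 \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --no-auto-terminate

# Later, submit additional steps to the still-running cluster by ID:
aws emr add-steps \
    --cluster-id j-XXXXXXXXXXXXX \
    --steps Type=STREAMING,Name="My step",Args=[-input,s3://mybucket/in,-output,s3://mybucket/out,-mapper,mapper.py,-reducer,reducer.py]
```

The second command is what replaces re-creating a cluster for every test run.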
请帮我爱他 2024-09-02 15:27:55


You can't do this with the web console - but through the API and programming tools, you will be able to add multiple steps to a long-running job, which is what I do. That way you can fire off jobs one after the other on the same long-running cluster, without having to re-create a new one each time.

If you are familiar with Python, I highly recommend the Boto library. The other AWS API tools let you do this as well.

If you follow the Boto EMR tutorial, you'll find worked examples.

Just to give you an idea, this is what I do (with streaming jobs):

import sys
import time

import boto
from boto.emr.step import StreamingStep

# Connect to EMR
conn = boto.connect_emr()

# Start a long-running job flow; keep_alive=True stops EMR from
# terminating the cluster once its steps have finished.
jobid = conn.run_jobflow(name='My jobflow',
                         log_uri='s3://<my log uri>/jobflow_logs',
                         keep_alive=True)

# Create your streaming job
step = StreamingStep(...)

# Add the step to the job flow
conn.add_jobflow_steps(jobid, [step])

# Poll until the step reaches a terminal state
while True:
    state = conn.describe_jobflow(jobid).steps[-1].state
    if state == "COMPLETED":
        break
    if state in ("FAILED", "TERMINATED", "CANCELLED"):
        sys.stderr.write("EMR job failed! State = %s!\n" % state)
        sys.exit(1)
    time.sleep(60)

# Create your next job here and add it to the same cluster
step = StreamingStep(...)
conn.add_jobflow_steps(jobid, [step])

# Repeat :)
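The busy-wait loop in that answer is a pattern worth factoring out. A minimal sketch of a reusable poller follows; the `get_state` callable stands in for `conn.describe_jobflow(jobid).steps[-1].state`, and the function and parameter names here are illustrative, not part of Boto:

```python
import time

# States in which a step will never complete.
TERMINAL_FAILURES = {"FAILED", "TERMINATED", "CANCELLED"}

def wait_for_step(get_state, poll_seconds=60, sleep=time.sleep):
    """Poll get_state() until the step finishes.

    Returns normally on "COMPLETED"; raises RuntimeError if the
    step lands in any failure state. The sleep argument is
    injectable so the loop can be exercised without real waiting.
    """
    while True:
        state = get_state()
        if state == "COMPLETED":
            return state
        if state in TERMINAL_FAILURES:
            raise RuntimeError("EMR step ended in state %s" % state)
        sleep(poll_seconds)

# Exercise it with a stubbed state sequence instead of a live cluster:
states = iter(["PENDING", "RUNNING", "COMPLETED"])
result = wait_for_step(lambda: next(states), sleep=lambda _: None)
```

Injecting `sleep` keeps the polling cadence out of the control-flow logic, which makes the terminal-state handling easy to test.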
终陌 2024-09-02 15:27:55


To keep the machine alive, start an interactive Pig session; then the machine won't shut down. You can then exercise your map/reduce logic from the command line using:

cat infile.txt | yourMapper | sort | yourReducer > outfile.txt
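To make that pipeline concrete, here is a hedged word-count sketch of what `yourMapper` and `yourReducer` might look like as Hadoop Streaming scripts (the script name and stage arguments are illustrative, not from this thread). Each stage reads lines on stdin and writes tab-separated key/value pairs on stdout, which is exactly the contract the shell pipeline above mimics:

```python
import sys

def mapper(lines):
    # Emit one "word\t1" pair per word, as a streaming mapper would.
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word

def reducer(sorted_pairs):
    # Input arrives sorted by key (the `sort` in the pipeline), so
    # equal words are adjacent; sum counts over each run of keys.
    current, total = None, 0
    for pair in sorted_pairs:
        word, count = pair.split("\t")
        if word != current:
            if current is not None:
                yield "%s\t%d" % (current, total)
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield "%s\t%d" % (current, total)

if __name__ == "__main__":
    # Run as: cat infile.txt | python wordcount.py map | sort | python wordcount.py reduce
    stage = sys.argv[1]
    stream = mapper(sys.stdin) if stage == "map" else reducer(sys.stdin)
    for out in stream:
        print(out)
```

Because both stages are plain filters over stdin/stdout, the same script works unchanged whether it is driven by the shell pipeline on the master node or submitted as a real streaming step.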