Launching an Amazon Elastic MapReduce job remotely?
I'm working on a small project to get acquainted with Amazon Web Services. I'm trying to make a simple web application: when a button is pressed, a MapReduce job is launched and the output is returned to the browser.
What would be the best way to do this? Also, is there a way to launch an Amazon Elastic MapReduce job from the command line?
Comments (2)
You can use the AWS SDK for whatever language your web application is written in to call EMR and submit a job. I work mostly with Python, so I'm most familiar with the Python Boto library, which makes it pretty painless to upload code and data to S3, configure a job flow, and launch it.
You won't want to launch the job and return the results in the same HTTP request, because it takes several minutes just to start the cluster before the job can run. A web application with pages that don't respond for minutes isn't a good user experience. Submitting a job flow, however, seems to take only a few seconds. You'll need to create the job flow and keep track of the job flow IDs in your web application. Given a job flow ID, you shouldn't have much trouble retrieving log data or output from the job flow once the user comes back and the job is complete.
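To make the "keep track of the job flow ID" part concrete, here's a rough sketch of a status check a later request could run. It assumes boto3 (the current AWS SDK for Python) rather than the older Boto library, and it takes the EMR client as a parameter so the logic doesn't depend on how you create it. The function names and the three page labels are made up for illustration; `describe_cluster` and the state names come from the EMR API.

```python
# Cluster states after which EMR will do no further work on the job flow.
FINISHED_STATES = {"TERMINATED", "TERMINATED_WITH_ERRORS"}


def job_flow_state(emr_client, job_flow_id):
    """Return the raw EMR state string for a job flow (cluster) ID.

    emr_client is expected to behave like boto3.client("emr").
    """
    response = emr_client.describe_cluster(ClusterId=job_flow_id)
    return response["Cluster"]["Status"]["State"]


def status_for_page(emr_client, job_flow_id):
    """Map the EMR state to a label the web page can show (hypothetical labels)."""
    state = job_flow_state(emr_client, job_flow_id)
    if state == "TERMINATED_WITH_ERRORS":
        return "failed"
    if state in FINISHED_STATES:
        return "finished"
    # STARTING, BOOTSTRAPPING, RUNNING, WAITING, TERMINATING
    return "pending"


# Usage in a request handler (assumes boto3 is installed and AWS
# credentials are configured; the ID below is a placeholder):
#   import boto3
#   emr = boto3.client("emr", region_name="us-east-1")
#   label = status_for_page(emr, "j-XXXXXXXXXXXXX")
```

The page that receives the button press only stores the ID; a second page (or a polling request from the browser) calls something like `status_for_page` and reads the output from S3 once the label is "finished".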
Here's an example of how one could launch an Elastic MR job with Boto:
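The snippet from the original answer isn't included on this page. As a stand-in, here's a minimal sketch that uses boto3, the successor to Boto, so it is not the answerer's code. It submits one Hadoop streaming step and returns the job flow ID. The bucket paths, job name, instance types, and the `emr-6.15.0` release label are all assumptions you'd replace with your own; the two role names are the EMR defaults and must already exist in your account.

```python
def build_streaming_step(name, mapper_s3, reducer_s3, input_s3, output_s3):
    """Describe one Hadoop streaming step as a dict for run_job_flow."""
    return {
        "Name": name,
        # Shut the cluster down if the step fails, so it doesn't keep billing.
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hadoop-streaming",
                # Ship both scripts from S3 to every node.
                "-files", f"{mapper_s3},{reducer_s3}",
                # Refer to the scripts by file name once they are shipped.
                "-mapper", mapper_s3.rsplit("/", 1)[-1],
                "-reducer", reducer_s3.rsplit("/", 1)[-1],
                "-input", input_s3,
                "-output", output_s3,
            ],
        },
    }


def launch_job_flow(emr_client, step, log_uri, release_label="emr-6.15.0"):
    """Submit a job flow with a single step and return its ID.

    emr_client is expected to behave like boto3.client("emr").
    """
    response = emr_client.run_job_flow(
        Name="streaming-demo",  # hypothetical job name
        LogUri=log_uri,
        ReleaseLabel=release_label,  # assumed release; pick one available to you
        Applications=[{"Name": "Hadoop"}],
        Instances={
            "InstanceGroups": [
                {"Name": "Master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # Terminate the cluster automatically after the last step.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        Steps=[step],
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    return response["JobFlowId"]


# Usage (assumes boto3 is installed and credentials are configured;
# "my-bucket" is a placeholder):
#   import boto3
#   emr = boto3.client("emr", region_name="us-east-1")
#   step = build_streaming_step(
#       "wordcount",
#       "s3://my-bucket/code/mapper.py", "s3://my-bucket/code/reducer.py",
#       "s3://my-bucket/input/", "s3://my-bucket/output/")
#   job_flow_id = launch_job_flow(emr, step, "s3://my-bucket/logs/")
```

The call returns within seconds with the job flow ID, which is the value the web application should store against the user's request.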
Did you give this a look yet? http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873 It's from the dev side and might help you along.