Hadoop streaming maximum line length
I'm working on a Hadoop streaming workflow for Amazon Elastic MapReduce and it involves serializing some binary objects and streaming those into Hadoop. Does Hadoop have a maximum line length for streaming input?
I started to just test with larger and larger lines but figured I would ask here first.
1 Answer
There appears to be no imposed limit on line length. Since asking the question I have been writing code that serializes binary objects, encodes them in Base64, and puts them into a stream for processing. As a result, some of the lines are quite long, and Hadoop chews right along with no complaints.
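
As a rough sketch of that approach (the function names, the use of pickle as the serializer, and the dummy key/value output are my own illustration, not from the original post), the encoding step and a Hadoop streaming mapper in Python might look like this:

import base64
import pickle
import sys

def encode_records(objects, out):
    """Write one base64-encoded record per line as Hadoop streaming input."""
    for obj in objects:
        blob = pickle.dumps(obj)  # serialize the binary object
        # base64 text contains no newlines or tabs, so the whole object fits on one line
        out.write(base64.b64encode(blob).decode("ascii") + "\n")

def mapper(stdin=sys.stdin, stdout=sys.stdout):
    """Streaming mapper: decode each (possibly very long) line back into an object."""
    for line in stdin:
        obj = pickle.loads(base64.b64decode(line.strip()))
        # ... process obj here, then emit tab-separated key/value pairs
        stdout.write("record\t1\n")

if __name__ == "__main__":
    mapper()

Base64 matters here because Hadoop streaming treats newlines as record separators and tabs as the key/value separator; the encoded text contains neither, so arbitrarily long binary payloads pass through intact.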