Reading a file line by line in Amazon S3?
Is it possible to read a file line by line with Amazon S3? I'd like to let people upload large files somewhere, then have some code (probably running on Amazon) read each file line by line and do something with it, probably in a map-reduce, multithreaded fashion. Or maybe just load 1000 lines at a time... Any suggestions?
Comments (3)
Amazon S3 does support range requests, but it's not designed to read a file line by line.
However, it looks like Amazon Elastic MapReduce might be a good fit for what you are looking for. Transfers between S3 and the EC2 instances used will be very fast, and then you can divide up the work in any way you please.
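For what it's worth, range requests can still be turned into a line reader with a little buffering. Below is a minimal Python sketch of that idea, not anything from the answer above: `iter_lines_by_range` and `s3_fetcher` are hypothetical names, the chunk size is arbitrary, and the S3 adapter assumes `boto3` is installed and credentials are already configured. Lines are split on `\n` only, so a `\r\n` file would keep its trailing `\r`.

```python
from typing import Callable, Iterator, Tuple


def iter_lines_by_range(
    fetch_range: Callable[[int, int], bytes],
    total_size: int,
    chunk_size: int = 1024 * 1024,
) -> Iterator[bytes]:
    """Yield lines (without the trailing newline) from a remote object,
    fetching it chunk_size bytes at a time via fetch_range(start, end)."""
    leftover = b""
    offset = 0
    while offset < total_size:
        # HTTP Range offsets are inclusive on both ends.
        end = min(offset + chunk_size, total_size) - 1
        chunk = fetch_range(offset, end)
        offset = end + 1
        parts = (leftover + chunk).split(b"\n")
        # The last piece may be an incomplete line; carry it into the next chunk.
        leftover = parts.pop()
        yield from parts
    if leftover:
        # Final line of an object that does not end with a newline.
        yield leftover


def s3_fetcher(bucket: str, key: str) -> Tuple[Callable[[int, int], bytes], int]:
    """Hypothetical adapter: returns (fetch_range, object_size) for an S3 object."""
    import boto3  # third-party; assumed to be installed and configured

    s3 = boto3.client("s3")
    size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]

    def fetch_range(start: int, end: int) -> bytes:
        resp = s3.get_object(Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")
        return resp["Body"].read()

    return fetch_range, size
```

Usage would look something like `fetch, size = s3_fetcher("my-bucket", "big.txt")` followed by `for line in iter_lines_by_range(fetch, size): ...`, where the bucket and key are placeholders. Because each chunk is an independent request, different byte ranges could also be handed to different workers, as long as each worker deals with the partial line at its boundaries.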
Here is a simple example, using PHP 7 and Laravel 5, of how to read a file line by line from Amazon S3:
S3StreamReader.php
S3StreamFactory.php
Example of usage:
Even if you don't use Laravel, you can still use this code, since Laravel just uses the league/flysystem-aws-s3-v3 package.
Here's an example snippet in PHP that seems to do what you're asking (it grabs the first 1000 lines of file.txt and concatenates them). It's a bit contrived, but the idea can be implemented in other languages or using other techniques. The key is to treat S3 the same as you would any other file system, such as one on Windows or Linux; the only difference is that you use your S3 key credentials and set the file path to s3://your_directory_tree/your_file.txt.
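Since the answer says the idea carries over to other languages, here is a rough Python sketch of the same thing: stream the object, take the first 1000 lines, or walk it 1000 lines at a time. The helper names (`first_lines_joined`, `batches`, `s3_lines`) are made up for illustration, and the S3 part assumes `boto3` with credentials already configured. The lines are joined with `\n` here, which is just one choice.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def first_lines_joined(lines: Iterable[bytes], limit: int = 1000) -> bytes:
    """Concatenate the first `limit` lines, separated by newlines."""
    return b"\n".join(islice(lines, limit))


def batches(lines: Iterable[bytes], size: int = 1000) -> Iterator[List[bytes]]:
    """Yield successive lists of up to `size` lines until the input runs out."""
    it = iter(lines)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def s3_lines(bucket: str, key: str) -> Iterator[bytes]:
    """Hypothetical helper: stream an S3 object line by line."""
    import boto3  # third-party; assumed to be installed and configured

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    # StreamingBody.iter_lines() reads the response incrementally
    # instead of loading the whole object into memory.
    return body.iter_lines()
```

With that in place, `first_lines_joined(s3_lines("my-bucket", "file.txt"))` would give the first 1000 lines, and `for chunk in batches(s3_lines("my-bucket", "file.txt")): ...` would process the file 1000 lines at a time (bucket and key are placeholders).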