当前位置：文江博客话题详情

下载整个 S3 存储桶？

发布于 2024-12-23 00:34:53 字数 128 浏览 5 评论 0原文

我注意到似乎没有从 AWS 管理控制台下载整个 s3 存储桶的选项。

有没有一种简单的方法可以抓取我一个桶中的所有东西？我正在考虑将根文件夹公开，使用 wget 获取全部内容，然后再次将其设为私有，但我不知道是否有更简单的方法。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

陪你到最终 2024-12-30 00:34:54

如果您使用 Visual Studio，请下载“AWS Toolkit for Visual Studio”。

安装后，转到Visual Studio - AWS Explorer - S3 - 您的存储桶 - 双击

在窗口中您将能够选择所有文件。右键单击并下载文件。

回复收藏 0 原文

￡烟消云散 2024-12-30 00:34:54

对于 Windows，S3 浏览器是我发现的最简单的方法。它是一款优秀的软件，并且免费供非商业用途使用。

回复收藏 0 原文

笔芯 2024-12-30 00:34:54

将此命令与 AWS CLI 结合使用：

aws s3 cp s3://bucketname . --recursive

Use this command with the AWS CLI:

aws s3 cp s3://bucketname . --recursive

回复收藏 0 原文

无力看清 2024-12-30 00:34:54

另一个可以帮助某些 OS X 用户的选项是传输。

它是一个 FTP 程序，还可以让您连接到 S3 文件。而且，它还可以选择将任何 FTP 或 S3 存储挂载为 Finder 中的文件夹，但这仅适用于有限的时间。

回复收藏 0 原文

贪了杯 2024-12-30 00:34:54

AWS SDK API只是将整个文件夹和存储库上传到AWS S3以及在本地下载整个AWS S3存储桶的最佳选择。

要将整个文件夹上传到AWS S3：aws s3sync。 s3://BucketName

要在本地下载整个 AWS S3 存储桶：aws s3sync s3://BucketName 。

您还可以分配像 BucketName 这样的路径/Path 用于 AWS S3 存储桶中要下载的特定文件夹。

回复收藏 0 原文

江心雾 2024-12-30 00:34:54

我已经为 S3 做了一些开发，但还没有找到下载整个存储桶的简单方法。

如果您想使用 Java 进行编码，可以使用 jets3t lib 轻松创建列表存储桶并迭代该列表以下载它们。

首先，从 AWS 管理控制台获取一组公钥，以便您可以创建一个 S3service 对象：

AWSCredentials awsCredentials = new AWSCredentials(YourAccessKey, YourAwsSecretKey);
s3Service = new RestS3Service(awsCredentials);

然后，获取存储桶对象的数组：

S3Object[] objects = s3Service.listObjects(YourBucketNameString);

最后，迭代该数组以一次下载一个对象：

S3Object obj = s3Service.getObject(bucket, fileName);
            file = obj.getDataInputStream();

我将线程安全单例中的连接代码。由于显而易见的原因，必要的 try/catch 语法已被省略。

如果您更愿意使用 Python 编写代码，则可以使用 Boto。

查看 BucketExplorer 后，“下载整个存储桶”可能会执行以下操作你想要的。

I've done a bit of development for S3 and I have not found a simple way to download a whole bucket.

If you want to code in Java the jets3t lib is easy to use to create a list of buckets and iterate over that list to download them.

First, get a public private key set from the AWS management consule so you can create an S3service object:

AWSCredentials awsCredentials = new AWSCredentials(YourAccessKey, YourAwsSecretKey);
s3Service = new RestS3Service(awsCredentials);

Then, get an array of your buckets objects:

S3Object[] objects = s3Service.listObjects(YourBucketNameString);

Finally, iterate over that array to download the objects one at a time with:

S3Object obj = s3Service.getObject(bucket, fileName);
            file = obj.getDataInputStream();

I put the connection code in a threadsafe singleton. The necessary try/catch syntax has been omitted for obvious reasons.

If you'd rather code in Python you could use Boto instead.

After looking around BucketExplorer, "Downloading the whole bucket" may do what you want.

回复收藏 0 原文

自此以后，行同陌路 2024-12-30 00:34:54

AWS CLI 是本地下载整个 S3 存储桶的最佳选择。

安装 AWS CLI。< /p>
配置 AWS CLI 以使用默认安全凭证和默认 AWS 区域。
要下载整个 S3 存储桶，请使用命令
aws s3 同步 s3://yourbucketname 本地路径

针对不同 AWS 服务的 AWS CLI 参考：AWS 命令行界面

回复收藏 0 原文

殊姿 2024-12-30 00:34:54

如果您只想从 AWS 下载存储桶，请首先在您的计算机中安装 AWS CLI。在终端中，将目录更改为要下载文件的位置并运行此命令。

aws s3 sync s3://bucket-name .

如果您还想同步本地和 s3 目录（如果您在本地文件夹中添加了一些文件），请运行以下命令：

aws s3 sync . s3://bucket-name

If you only want to download the bucket from AWS, first install the AWS CLI in your machine. In terminal change the directory to where you want to download the files and run this command.

aws s3 sync s3://bucket-name .

If you also want to sync the both local and s3 directories (in case you added some files in local folder), run this command:

aws s3 sync . s3://bucket-name

回复收藏 0 原文

谜泪 2024-12-30 00:34:54

要添加另一个 GUI 选项，我们使用 WinSCP 的 S3 功能。连接非常简单，只需要用户界面中的访问密钥和秘密密钥。然后，您可以从任何可访问的存储桶中浏览和下载所需的任何文件，包括嵌套文件夹的递归下载。

由于通过安全性清除新软件可能是一项挑战，而且 WinSCP 相当普遍，因此仅使用它而不是尝试安装更专业的实用程序确实有益。

回复收藏 0 原文

-小熊_ 2024-12-30 00:34:54

aws s3 sync s3://<source_bucket> <local_destination>

是一个很好的答案，但如果对象位于存储类 Glacier 灵活检索中，即使文件已恢复，它也不起作用。在这种情况下，您需要添加标志 --force-glacier-transfer 。

aws s3 sync s3://<source_bucket> <local_destination>

is a great answer, but it won't work if the objects are in storage class Glacier Flexible Retrieval, even if the the files have been restored. In that case you need to add the flag --force-glacier-transfer .

回复收藏 0 原文

九命猫 2024-12-30 00:34:54

您可以使用 MinIO Client 执行此操作，如下所示：mc cp -r https://s3 -us-west-2.amazonaws.com/bucketName/ localdir

MinIO 还支持会话、可断点下载、上传等。 MinIO 支持 Linux、OS X 和 Windows 操作系统。它是用 Golang 编写的，并在 Apache Version 2.0 下发布。

回复收藏 0 原文

書生途 2024-12-30 00:34:54

如果您将 Firefox 与 S3Fox 一起使用，则可以选择所有文件（Shift-选择第一个和最后一个），然后右键单击并下载所有文件。

我已经完成了 500 多个文件，没有任何问题。

回复收藏 0 原文

等风来 2024-12-30 00:34:54

您可以使用同步下载整个 S3 存储桶。例如，下载当前目录下名为bucket1的整个存储桶。

aws s3 sync s3://bucket1 .

You can use sync to download whole S3 bucket. For example, to download whole bucket named bucket1 on current directory.

aws s3 sync s3://bucket1 .

回复收藏 0 原文

游魂 2024-12-30 00:34:54

除了对aws s3sync的建议之外，我还建议查看s5cmd。

根据我的经验，我发现对于多次下载或大量下载，这比 AWS CLI 快得多。

s5cmd 支持通配符，因此类似这样的内容可以工作：

s5cmd cp s3://bucket-name/* ./folder

回复收藏 0 原文

九命猫 2024-12-30 00:34:54

如果那里只有文件（没有子目录），快速解决方案是选择所有文件（第一个文件单击，最后一个文件Shift+单击）并点击输入或右键单击并选择打开。对于大多数数据文件，这会将它们直接下载到您的计算机上。

回复收藏 0 原文

意犹 2024-12-30 00:34:54

尝试以下命令：

aws s3sync yourBucketnameDirectory yourLocalDirectory

例如，如果您的存储桶名称为 myBucket 并且本地目录为 c:\local，则：

aws s3sync s3://myBucket c:\local

有关 awscli 的更多信息，请检查此
aws cli 安装

回复收藏 0 原文

抽个烟儿 2024-12-30 00:34:54

使用 awscli 将文件下载/上传到 s3 总是更好。同步将帮助您轻松恢复。

aws s3 sync s3://bucketname/ .

It's always better to use awscli for downloading / uploading files to s3. Sync will help you to resume without any hassle.

aws s3 sync s3://bucketname/ .

回复收藏 0 原文

泅渡 2024-12-30 00:34:54

在 Windows 中，我首选的 GUI 工具是 CloudBerry Explorer 免费软件
亚马逊S3。它有一个相当精美的文件浏览器和类似 FTP 的界面。

回复收藏 0 原文

我是男神闪亮亮 2024-12-30 00:34:54

以下是复制整个存储桶所需执行的操作的摘要：

1. 创建可以使用 AWS s3 存储桶进行操作的用户

请遵循此官方文章：配置基础知识

不要忘记：

勾选“编程访问”，以便能够通过以下方式处理 AWS命令行界面。
向您的用户添加正确的 IAM 策略，以允许他与 s3 存储桶交互

2. 下载、安装和配置 AWS CLI

请参阅此链接来进行配置：https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html

您可以使用以下命令添加您在创建应用程序时获得的密钥用户：

$ aws configure
AWS Access Key ID [None]: <your_access_key>
AWS Secret Access Key [None]: <your_secret_key>
Default region name [None]: us-west-2
Default output format [None]: json

3. 使用以下命令下载内容

您可以使用递归 cp 命令，但 awssync 命令是 f：

aws s3 sync s3://your_bucket /local/path

在真正执行操作之前查看下载的文件是什么下载后，您可以使用 --dryrun 选项。
要提高速度，您可以调整 s3 max_concurrent_requests 和 max_queue_size 属性。请参阅：http://docs.aws.amazon.com/ cli/latest/topic/s3-config.html
您可以使用 --exclude 和 --include 选项排除/包含某些文件。请参阅：https://docs.aws.amazon.com/cli/latest /reference/s3/

例如，以下命令将显示存储桶中存在的所有 .png 文件。重播不带 --dryrun 的命令以下载结果文件。

aws s3 sync s3://your_bucket /local/path --recursive --exclude "*" --include "*.png" --dryrun

Here is a summary of what you have to do to copy an entire bucket:

1. Create a user that can operate with AWS s3 bucket

Follow this official article: Configuration basics

Don't forget to:

tick "programmatic access" in order to have the possibility to deal with with AWS via CLI.
add the right IAM policy to your user to allow him to interact with the s3 bucket

2. Download, install and configure AWS CLI

See this link allowing to configure it: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html

You can use the following command in order to add the keys you got when you created your user:

$ aws configure
AWS Access Key ID [None]: <your_access_key>
AWS Secret Access Key [None]: <your_secret_key>
Default region name [None]: us-west-2
Default output format [None]: json

3. Use the following command to download content

You can a recursive cp commande, but aws sync command is f:

aws s3 sync s3://your_bucket /local/path

To see what would be the dowloaded files before really do the download, you can use the --dryrun option.
To improve speed, you can adjust s3 max_concurrent_requests and max_queue_size properties. See: http://docs.aws.amazon.com/cli/latest/topic/s3-config.html
You can exclude/include some files using --exclude and --include options. See: https://docs.aws.amazon.com/cli/latest/reference/s3/

For example, the below command will show all the .png file presents in the bucket. Replay the command without --dryrun to make the resulting files be downloaded.

aws s3 sync s3://your_bucket /local/path --recursive --exclude "*" --include "*.png" --dryrun

回复收藏 0 原文

朦胧时间 2024-12-30 00:34:54

您只需要传递 --recursive & --在 aws s3 cp 命令中包含“*”，如下所示：aws --region "${BUCKET_REGION}" s3 cp s3://${BUCKET}$ {BUCKET_PATH}/ ${LOCAL_PATH}/tmp --recursive --include "*" 2>&1

回复收藏 0 原文

沧桑㈠ 2024-12-30 00:34:54

Windows 用户需要从此链接下载 S3EXPLORER，该链接还包含安装说明：- http://s3browser.com/ download.aspx
然后向您提供 AWS 凭证，例如 s3explorer 的密钥、访问密钥和区域，此链接包含 s3explorer 的配置说明：复制粘贴链接在浏览器中： s3browser.com/s3browser-first-run.aspx
现在您的所有 s3 存储桶将在 s3explorer 的左侧面板上可见。
只需选择存储桶，然后单击左上角的存储桶菜单，然后从菜单中选择将所有文件下载到选项。以下是相同的屏幕截图：

存储桶选择屏幕

然后浏览文件夹以下载特定位置的存储桶
单击“确定”即可开始下载。

回复收藏 0 原文

各空 2024-12-30 00:34:54

只需使用 aws s3 sync 命令即可下载存储桶的所有内容。
例如：aws s3sync s3://<存储桶名称>;到<目的地/路径>
注意：在继续之前执行aws configure

回复收藏 0 原文

ゃ懵逼小萝莉 2024-12-30 00:34:54

awssync 是完美的解决方案。它不是双向的。它是从源到目的地的单向方式。另外，如果您的存储桶中有很多项目，最好先创建 s3 端点，以便下载速度更快（因为下载不是通过 Internet 而是通过 Intranet 进行）并且不收费

回复收藏 0 原文

許願樹丅啲祈禱 2024-12-30 00:34:53

AWS CLI

有关更多信息，请参阅“AWS CLI 命令参考”。

AWS 最近发布了他们的命令行工具，其工作方式与 boto 非常相似，可以使用

sudo easy_install awscli

或

sudo pip install awscli

安装后，您可以简单地运行：

aws s3 sync s3://<source_bucket> <local_destination>

例如：

aws s3 sync s3://mybucket .

将 mybucket 中的所有对象下载到当前目录。

并将输出：

download: s3://mybucket/test.txt to test.txt
download: s3://mybucket/test2.txt to test2.txt

这将使用单向同步下载所有文件。除非您指定 --delete，并且不会更改或删除 S3 上的任何文件。

您还可以执行 S3 存储桶到 S3 存储桶，或本地到 S3 存储桶同步。

查看文档和其他示例。

虽然上面的示例是如何下载完整存储桶，但您还可以通过执行递归下载文件夹。

aws s3 cp s3://BUCKETNAME/PATH/TO/FOLDER LocalFolderName --recursive

这将指示 CLI 递归下载 PATH/TO/FOLDER 目录中的所有文件和文件夹密钥BUCKETNAME 存储桶。

AWS CLI

See the "AWS CLI Command Reference" for more information.

AWS recently released their Command Line Tools, which work much like boto and can be installed using

sudo easy_install awscli

sudo pip install awscli

Once installed, you can then simply run:

aws s3 sync s3://<source_bucket> <local_destination>

For example:

aws s3 sync s3://mybucket .

will download all the objects in mybucket to the current directory.

And will output:

download: s3://mybucket/test.txt to test.txt
download: s3://mybucket/test2.txt to test2.txt

This will download all of your files using a one-way sync. It will not delete any existing files in your current directory unless you specify --delete, and it won't change or delete any files on S3.

You can also do S3 bucket to S3 bucket, or local to S3 bucket sync.

Check out the documentation and other examples.

Whereas the above example is how to download a full bucket, you can also download a folder recursively by performing

aws s3 cp s3://BUCKETNAME/PATH/TO/FOLDER LocalFolderName --recursive

This will instruct the CLI to download all files and folder keys recursively within the PATH/TO/FOLDER directory within the BUCKETNAME bucket.

回复收藏 0 原文

星 2024-12-30 00:34:53

您可以使用 s3cmd 下载您的存储桶：

s3cmd --configure
s3cmd sync s3://bucketnamehere/folder /destination/folder

您可以使用另一个工具，名为 rclone。这是 Rclone 文档中的代码示例：

rclone sync /home/local/directory remote:bucket

You can use s3cmd to download your bucket:

s3cmd --configure
s3cmd sync s3://bucketnamehere/folder /destination/folder

There is another tool you can use called rclone. This is a code sample in the Rclone documentation:

rclone sync /home/local/directory remote:bucket

回复收藏 0 原文

岁月蹉跎了容颜 2024-12-30 00:34:53

我使用了几种不同的方法将 Amazon S3 数据复制到本地计算机，包括 s3cmd，到目前为止最简单的方法是 Cyberduck。

您所需要做的就是输入您的亚马逊凭证，并使用简单的界面下载、上传、同步您的任何存储桶、文件夹或文件。

回复收藏 0 原文

叹沉浮 2024-12-30 00:34:53

您有多种选择可以执行此操作，但最好的选择是使用 AWS CLI。

以下是演练：

在您的计算机中下载并安装 AWS CLI：
- 使用以下命令安装 AWS CLI MSI 安装程序 (Windows)。
- 使用以下命令安装 AWS CLI适用于 Linux、OS X 或 Unix 的捆绑安装程序。
配置 AWS CLI：
确保您输入了创建帐户时收到的有效访问和秘密密钥。
使用以下方式同步 S3 存储桶：
```
aws s3 同步 s3://yourbucket /local/path
```
在上面的命令中，替换以下字段：
- yourbucket >>>您要下载的 S3 存储桶。
- /local/path >>>您要下载所有文件的本地系统中的路径。

You've many options to do that, but the best one is using the AWS CLI.

Here's a walk-through:

Download and install AWS CLI in your machine:
- Install the AWS CLI using the MSI Installer (Windows).
- Install the AWS CLI using the Bundled Installer for Linux, OS X, or Unix.
Configure AWS CLI:
Make sure you input valid access and secret keys, which you received when you created the account.
Sync the S3 bucket using:
```
aws s3 sync s3://yourbucket /local/path
```
In the above command, replace the following fields:
- yourbucket >> your S3 bucket that you want to download.
- /local/path >> path in your local system where you want to download all the files.

回复收藏 0 原文

七婞 2024-12-30 00:34:53

要使用 AWS S3 CLI 下载：

aws s3 cp s3://WholeBucket LocalFolder --recursive
aws s3 cp s3://Bucket/Folder LocalFolder --recursive

要使用代码下载，请使用 AWS 开发工具包。

要使用 GUI 下载，请使用 Cyberduck。

To download using AWS S3 CLI:

aws s3 cp s3://WholeBucket LocalFolder --recursive
aws s3 cp s3://Bucket/Folder LocalFolder --recursive

To download using code, use the AWS SDK.

To download using GUI, use Cyberduck.

回复收藏 0 原文

甜点 2024-12-30 00:34:53

@Layke 的答案很好，但如果您有大量数据并且不想永远等待，您应该阅读“AWS CLI S3 配置"。

以下命令将告诉 AWS CLI 使用 1,000 个线程来执行作业（每个线程是一个小文件或多部分副本的一部分）并展望 100,000 个作业：

aws configure set default.s3.max_concurrent_requests 1000
aws configure set default.s3.max_queue_size 100000

运行这些作业后，您可以使用简单的 sync 命令：

aws s3 sync s3://source-bucket/source-path s3://destination-bucket/destination-path

或者

aws s3 sync s3://source-bucket/source-path c:\my\local\data\path

在具有 CPU 4 核和 16GB RAM 的系统上，对于像我这样的情况（3-50GB 文件），同步/复制速度从大约 9.5MiB/s 变为700+MiB/s，比默认配置速度提升70倍。

The answer by @Layke is good, but if you have a ton of data and don't want to wait forever, you should read "AWS CLI S3 Configuration".

The following commands will tell the AWS CLI to use 1,000 threads to execute jobs (each a small file or one part of a multipart copy) and look ahead 100,000 jobs:

aws configure set default.s3.max_concurrent_requests 1000
aws configure set default.s3.max_queue_size 100000

After running these, you can use the simple sync command:

aws s3 sync s3://source-bucket/source-path s3://destination-bucket/destination-path

aws s3 sync s3://source-bucket/source-path c:\my\local\data\path

On a system with CPU 4 cores and 16GB RAM, for cases like mine (3-50GB files) the sync/copy speed went from about 9.5MiB/s to 700+MiB/s, a speed increase of 70x over the default configuration.

回复收藏 0 原文