s3cmd failed too many times

Posted 2024-11-03 04:31:28

I used to be a happy s3cmd user. However recently when I try to transfer a large zip file (~7Gig) to Amazon S3, I am getting this error:

$> s3cmd put thefile.tgz s3://thebucket/thefile.tgz

....
  20480 of 7563176329     0% in    1s    14.97 kB/s  failed
WARNING: Upload failed: /thefile.tgz ([Errno 32] Broken pipe)
WARNING: Retrying on lower speed (throttle=1.25)
WARNING: Waiting 15 sec...
thefile.tgz -> s3://thebucket/thefile.tgz  [1 of 1]
       8192 of 7563176329     0% in    1s     5.57 kB/s  failed
ERROR: Upload of 'thefile.tgz' failed too many times. Skipping that file.

I am using the latest s3cmd on Ubuntu.

Why is this happening, and how can I solve it? If it cannot be resolved, what alternative tool can I use?


甜妞爱困 2024-11-10 04:31:28

And now in 2014, the aws cli has the ability to upload big files in lieu of s3cmd.

http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-set-up.html has install / configure instructions, or often:

$ wget https://s3.amazonaws.com/aws-cli/awscli-bundle.zip
$ unzip awscli-bundle.zip
$ sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
$ aws configure

followed by

$ aws s3 cp local_file.tgz s3://thereoncewasans3bucket

will get you satisfactory results.
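If large uploads still stall, the cli's multipart behaviour can be tuned through its S3 configuration keys; the commands below use the documented `aws configure set` interface, but the threshold and chunk-size values are only illustrative, not from the original answer:

```shell
$ aws configure set default.s3.multipart_threshold 64MB
$ aws configure set default.s3.multipart_chunksize 16MB
```

These write to ~/.aws/config, so they persist across sessions for the default profile.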

给妤﹃绝世温柔 2024-11-10 04:31:28

I've just come across this problem myself. I've got a 24GB .tar.gz file to put into S3.

Uploading smaller pieces will help.

There is also a ~5GB file size limit, so I'm splitting the file into pieces that can be re-assembled when they are downloaded later.

split -b100m ../input-24GB-file.tar.gz input-24GB-file.tar.gz-

The last part of that line is a 'prefix'. Split will append 'aa', 'ab', 'ac', etc to it. The -b100m means 100MB chunks. A 24GB file will end up with about 240 100mb parts, called 'input-24GB-file.tar.gz-aa' to 'input-24GB-file.tar.gz-jf'.

To combine them later, download them all into a directory and:

cat input-24GB-file.tar.gz-* > input-24GB-file.tar.gz

It could also be valuable to take md5sums of the original and the split files and store them in the S3 bucket, or better, if the file is not too big, to use a system like parchive, which can check for and even repair some download problems.
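The round trip is easy to sanity-check on a throwaway file before committing to a 24GB transfer; none of this demo is from the original answer, but it uses the same commands:

```shell
# Make a test file, record its checksum, split it, then reassemble and verify.
dd if=/dev/urandom of=big.bin bs=1024 count=300 2>/dev/null
md5sum big.bin > big.bin.md5      # store this alongside the parts in S3
split -b 100k big.bin big.bin-    # produces big.bin-aa, big.bin-ab, big.bin-ac
rm big.bin                        # pretend only the parts were uploaded
cat big.bin-* > big.bin           # after downloading, recombine in order
md5sum -c big.bin.md5             # prints "big.bin: OK" when nothing was corrupted
```

`cat` with a glob expands the parts in lexical order, which is exactly the order split created them in, so no manual bookkeeping is needed.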

旧伤慢歌 2024-11-10 04:31:28

I tried all of the other answers but none worked. It looks like s3cmd is fairly sensitive.
In my case the s3 bucket was in the EU. Small files would upload but when it got to ~60k it always failed.

When I changed ~/.s3cfg it worked.

Here are the changes I made:

host_base = s3-eu-west-1.amazonaws.com

host_bucket = %(bucket)s.s3-eu-west-1.amazonaws.com
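The same edit can be scripted; this sketch works on a throwaway copy (the sed invocation is mine, not from the answer; point it at your real ~/.s3cfg only after backing it up):

```shell
# Demo config with the default endpoints; substitute ~/.s3cfg in real use.
printf 'host_base = s3.amazonaws.com\nhost_bucket = %%(bucket)s.s3.amazonaws.com\n' > s3cfg.demo
sed -i \
    -e 's|^host_base = .*|host_base = s3-eu-west-1.amazonaws.com|' \
    -e 's|^host_bucket = .*|host_bucket = %(bucket)s.s3-eu-west-1.amazonaws.com|' \
    s3cfg.demo
cat s3cfg.demo   # both lines now point at the eu-west-1 endpoint
```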

破晓 2024-11-10 04:31:28

I had the same problem with ubuntu s3cmd.

s3cmd --guess-mime-type --acl-public put test.zip s3://www.jaumebarcelo.info/teaching/lxs/test.zip
test.zip -> s3://www.jaumebarcelo.info/teaching/lxs/test.zip  [1 of 1]
 13037568 of 14456364    90% in  730s    17.44 kB/s  failed
WARNING: Upload failed: /teaching/lxs/test.zip (timed out)
WARNING: Retrying on lower speed (throttle=0.00)
WARNING: Waiting 3 sec...
test.zip -> s3://www.jaumebarcelo.info/teaching/lxs/test.zip  [1 of 1]
  2916352 of 14456364    20% in  182s    15.64 kB/s  failed
WARNING: Upload failed: /teaching/lxs/test.zip (timed out)
WARNING: Retrying on lower speed (throttle=0.01)
WARNING: Waiting 6 sec...

The solution was to update s3cmd with the instructions from s3tools.org:

Debian & Ubuntu

Our DEB repository has been carefully created in the most compatible
way – it should work for Debian 5 (Lenny), Debian 6 (Squeeze), Ubuntu
10.04 LTS (Lucid Lynx) and for all newer and possibly for some older Ubuntu releases. Follow these steps from the command line:

  • Import S3tools signing key:

    wget -O- -q http://s3tools.org/repo/deb-all/stable/s3tools.key | sudo apt-key add -

  • Add the repo to sources.list:

    sudo wget -O/etc/apt/sources.list.d/s3tools.list http://s3tools.org/repo/deb-all/stable/s3tools.list

  • Refresh package cache and install the newest s3cmd:

    sudo apt-get update && sudo apt-get install s3cmd

奢华的一滴泪 2024-11-10 04:31:28

This error occurs when Amazon returns an error: they seem to then disconnect the socket to keep you from uploading gigabytes of request just to get back "no, that failed" in response. This is why some people are getting it due to clock skew, some people are getting it due to policy errors, and others are running into size limitations requiring the use of the multi-part upload API. It isn't that everyone is wrong, or even looking at different problems: these are all different symptoms of the same underlying behavior in s3cmd.

As most error conditions are going to be deterministic, s3cmd's behavior of throwing away the error message and retrying slower is kind of crazy unfortunate :(. To get the actual error message, you can go into /usr/share/s3cmd/S3/S3.py (remembering to delete the corresponding .pyc so the changes are used) and add a print e in the send_file function's except Exception, e: block.

In my case, I was trying to set the Content-Type of the uploaded file to "application/x-debian-package". Apparently, s3cmd's S3.object_put 1) does not honor a Content-Type passed via --add-header and yet 2) fails to overwrite the Content-Type added via --add-header as it stores headers in a dictionary with case-sensitive keys. The result is that it does a signature calculation using its value of "content-type" and then ends up (at least with many requests; this might be based on some kind of hash ordering somewhere) sending "Content-Type" to Amazon, leading to the signature error.

In my specific case today, it seems like -M would cause s3cmd to guess the right Content-Type, but it seems to do that based on filename alone... I would have hoped that it would use the mimemagic database based on the contents of the file. Honestly, though: s3cmd doesn't even manage to return a failed shell exit status when it fails to upload the file, so combined with all of these other issues it is probably better to just write your own one-off tool to do the one thing you need... it is almost certain that in the end it will save you time when you get bitten by some corner-case of this tool :(.

燕归巢 2024-11-10 04:31:28

s3cmd 1.0.0 does not support multi-part yet. I tried 1.1.0-beta and it works just fine. You can read about the new features here: http://s3tools.org/s3cmd-110b2-released

扮仙女 2024-11-10 04:31:28

In my case the reason for the failure was the server's time being ahead of the S3 time: my server (located in US East) was set to GMT+4 while I was using Amazon's US East storage facility.

After adjusting my server to the US East time, the problem was gone.

So要识趣 2024-11-10 04:31:28

I experienced the same issue, it turned out to be a bad bucket_location value in ~/.s3cfg.

This blog post led me to the answer.

If the bucket you’re uploading to doesn’t exist (or you mistyped it), it’ll fail with that error. Thank you, generic error message. - See more at: http://jeremyshapiro.com/blog/2011/02/errno-32-broken-pipe-in-s3cmd/

After inspecting my ~/.s3cfg I saw that it had:

bucket_location = Sydney

Rather than:

bucket_location = ap-southeast-2

Correcting this value to use the proper name(s) solved the issue.

焚却相思 2024-11-10 04:31:28

For me, the following worked:

In .s3cfg, I changed the host_bucket

host_bucket = %(bucket)s.s3-external-3.amazonaws.com

小草泠泠 2024-11-10 04:31:28

s3cmd version 1.1.0-beta3 or later will automatically use multipart uploads to allow sending arbitrarily large files. You can control the chunk size it uses, too, e.g.

s3cmd --multipart-chunk-size-mb=1000 put hugefile.tar.gz s3://mybucket/dir/

This will do the upload in 1 GB chunks.
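S3's multipart API caps an upload at 10,000 parts, so the chunk size also bounds the maximum object size; for the asker's ~7 GB file even the 1 GB chunks above are generous. A quick arithmetic sketch (the byte count is taken from the question's output):

```shell
# Parts needed for the 7,563,176,329-byte file at a 1000 MB chunk size.
file_bytes=7563176329
chunk_bytes=$((1000 * 1024 * 1024))
echo $(( (file_bytes + chunk_bytes - 1) / chunk_bytes ))   # ceiling division -> 8
```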

红玫瑰 2024-11-10 04:31:28

I encountered the same broken pipe error because the security group policy was set wrongly. I blame the S3 documentation.

I wrote about how to set the policy correctly in my blog, which is:

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::example_bucket",
      "Condition": {}
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion",
        "s3:GetObject",
        "s3:GetObjectAcl",
        "s3:GetObjectVersion",
        "s3:GetObjectVersionAcl",
        "s3:PutObject",
        "s3:PutObjectAcl",
        "s3:PutObjectAclVersion"
      ],
      "Resource": "arn:aws:s3:::example_bucket/*",
      "Condition": {}
    }
  ]
}
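One way to avoid hand-editing mistakes with a policy like this is to keep it in a file and validate it before attaching it. The sketch below writes an abridged version of the policy (fewer actions, same shape), checks that it parses, and shows how it could be attached; the aws iam command and the user and policy names are my illustration, not part of the answer:

```shell
# Write an abridged version of the policy to a file.
cat > policy.json <<'EOF'
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": "arn:aws:s3:::example_bucket",
      "Condition": {}
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": "arn:aws:s3:::example_bucket/*",
      "Condition": {}
    }
  ]
}
EOF
# Catch copy-paste damage before AWS rejects the request with an opaque error.
python3 -m json.tool < policy.json > /dev/null && echo "policy.json parses"
# With credentials configured, attach it to the IAM user s3cmd runs as:
#   aws iam put-user-policy --user-name s3cmd-user \
#       --policy-name s3cmd-upload --policy-document file://policy.json
```

Note how the bucket-level actions and the object-level actions sit in separate statements: ListBucket applies to the bucket ARN, while PutObject applies to `arn:aws:s3:::example_bucket/*`.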

世俗缘 2024-11-10 04:31:28

In my case, I fixed this by just adding the right permissions:

Bucket > Properties > Permissions 
"Authenticated Users"
- List
- Upload/Delete
- Edit Permissions

吃颗糖壮壮胆 2024-11-10 04:31:28

I encountered a similar error which eventually turned out to be caused by a time drift on the machine. Correctly setting the time fixed the issue for me.
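A quick check for this condition (my sketch, not from the answer; the curl and ntpdate lines are left as comments because they need network access and privileges):

```shell
# Print the machine's idea of UTC; S3 signs requests against UTC, and a skew
# of more than about 15 minutes gets them rejected.
date -u
# Compare against the Date header S3 itself returns:
#   curl -sI https://s3.amazonaws.com | grep -i '^date:'
# If the clocks disagree, resync, e.g. on Ubuntu:
#   sudo ntpdate pool.ntp.org
```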

探春 2024-11-10 04:31:28

Search for .s3cfg file, generally in your Home Folder.

If you have it, you got the villain. Changing the following two parameters should help you.

socket_timeout = 1000
multipart_chunk_size_mb = 15

星星的轨迹 2024-11-10 04:31:28

I addressed this by simply not using s3cmd. Instead, I've had great success with the python project, S3-Multipart on GitHub. It does uploading and downloading, along with using as many threads as desired.
