I have a service A which constantly updates a set of files in an S3 bucket.
More or less, it is equivalent to something like this:
while true
do
generate file
aws s3 cp <file> s3://<bucket>/<file>
sleep a little
done
I have a service B which reads that file once in a while to update the data inside itself. I want a single instance of service A while service B runs 100 instances.
So service B is equivalent to:
while true
do
aws s3 cp s3://<bucket>/<file> <file>
update variable holding this data
sleep a little
done
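A minimal Python sketch of service B's loop, assuming the boto3 library and AWS credentials; the bucket/key names and the `update` callback are placeholders, not from the original post. Polling the object's ETag lets each of the 100 instances detect that the object was replaced and re-download only when it actually changed:

```python
def should_refresh(last_etag, current_etag):
    """True when the S3 object differs from the copy we last downloaded."""
    return last_etag is None or last_etag != current_etag


def run_service_b(bucket, key, update, interval_s=30):
    """Polling loop for service B. `update` is a hypothetical callback that
    receives the new file contents. Requires boto3 and AWS credentials."""
    import time
    import boto3  # third-party dependency, assumed installed

    s3 = boto3.client("s3")
    last_etag = None
    while True:
        # HEAD is cheap; the ETag changes whenever the object is replaced.
        etag = s3.head_object(Bucket=bucket, Key=key)["ETag"]
        if should_refresh(last_etag, etag):
            # A single GET returns one consistent version of the object.
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            update(body)
            last_etag = etag
        time.sleep(interval_s)
```

If the object embeds its own serial number or timestamp (as the answer below suggests as an alternative), the ETag check can be skipped and the version checked after download instead.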
At the moment, the <file> name always remains the same. I'm wondering whether this can cause issues. What happens when I upload a new version of the file? Is the old version still available until the copy in service B is done, or does the file get overwritten by service A?
i.e. under all operating systems I know of, if I write to a file, a read at the same location sees the new data, not the old one. In other words, with a standard OS file, a read that overlaps a write may see mangled data (a mix of old and new data).
Are S3 objects the same as standard OS files in this respect, or are they safer, i.e. is the old object left intact until the new upload completes?
Note: I'm particularly interested in official S3 documentation about how this specific case works. My searches have, so far, come up empty.
The answer is in the Amazon S3 User Guide, in the section on the Amazon S3 data consistency model.
Here is the pertinent paragraph:
"Updates to a single key are atomic. For example, if you make a PUT request to an existing key from one thread and perform a GET request on the same key from a second thread concurrently, you will get either the old data or the new data, but never partial or corrupt data."
This clearly says that the data you GET will not be overwritten mid-read as it could be with a standard file. You also won't know whether it is the old or the new instance (unless you define some metadata, or put a date or serial number in the file). However, when dealing with large files, the AWS CLI automatically switches to multipart transfers, and a transfer split across several requests could straddle an overwrite, leaving you with parts of the old file and parts of the new file. To avoid the issue, you must make sure the copy is done without multipart transfers.
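One way to follow that advice in Python with boto3 (a sketch; the function names and bucket/key parameters are illustrative, and boto3 plus AWS credentials are assumed): raise the transfer manager's multipart threshold above the file size, so the upload goes out as a single atomic PutObject. Note that a single non-multipart PUT is capped at 5 GiB.

```python
def fits_single_put(size_bytes, limit_bytes=5 * 1024**3):
    """Amazon S3 caps a single (non-multipart) PUT at 5 GiB."""
    return 0 <= size_bytes <= limit_bytes


def upload_single_part(bucket, key, path):
    """Upload `path` as one atomic PutObject by keeping the transfer
    below the multipart threshold. Names here are placeholders."""
    import os
    import boto3  # third-party dependency, assumed installed
    from boto3.s3.transfer import TransferConfig

    size = os.path.getsize(path)
    if not fits_single_put(size):
        raise ValueError("file exceeds the 5 GiB single-PUT limit")
    # With the threshold above the file size, upload_file issues one
    # PutObject call instead of a multipart upload.
    cfg = TransferConfig(multipart_threshold=size + 1)
    boto3.client("s3").upload_file(path, bucket, key, Config=cfg)
```

The equivalent knob for the plain AWS CLI is the `multipart_threshold` setting in the CLI's S3 configuration.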