How to configure checkpoints on an XTDB node using AWS S3

Posted on 2025-02-10 17:48:35

I am using XTDB 1.21.0 deployed on AWS ECS (Fargate), with checkpoints configured (every 30 minutes) and stored in an S3 bucket (RocksDB). After a couple of successful checkpoints, they seem to fail constantly with an XTDB warning caused by an exception in the HTTP request to AWS, as shown below:

This leaves the S3 bucket with incomplete checkpoints (i.e., a folder containing a set of SSTs and other RocksDB files, with no associated EDN index file):

The XTDB documentation mentions that an optional S3Configurator can be passed in the node configuration, and after a bit of Googling I figured that makeClient should be overridden so that connectionAcquisitionTimeout can be set:

NettyNioAsyncHttpClient.builder()
.maxConcurrency(200)
.connectionAcquisitionTimeout(Duration.ofMillis(20000))

I am not too familiar with Netty, so I would appreciate it if someone could help with the right incantation.

Also, I am configuring the XT node from an EDN file, and haven't figured out how to write an S3 configurator in an EDN file (or whether it is even possible).

Thanks in advance!

忘你却要生生世世 2025-02-17 17:48:35

This can happen with large datasets: the default S3 client creates a new async request for each object, and the number of objects can be very large, particularly when using the RocksDB index. Internally, the client uses connectionAcquisitionTimeout as a form of backpressure, so that incoming requests don't wait indefinitely for a connection from the connection pool. In this case, however, we are the only source of these requests, and we definitely want them to complete before the node starts, so it is reasonable to set connectionAcquisitionTimeout to something very high (the default is only 10 seconds). A good limit might be the maximum amount of time you are willing to wait for the node to start before failing.

This appears to be a non-optional parameter of the SDK, which I can only assume is a sensible default strategy for requests coming from an external source. In our case, we essentially want it to behave as if it were a synchronous operation.

Configuring this in Clojure with XTDB would look something like this:

(ns foo.db
  (:require
   [xtdb.api :as xtdb]
   [xtdb.checkpoint]
   [xtdb.rocksdb]
   [xtdb.s3.checkpoint])
  (:import
   (java.time Duration)
   (software.amazon.awssdk.http.nio.netty NettyNioAsyncHttpClient)
   (software.amazon.awssdk.services.s3 S3AsyncClient)
   (xtdb.checkpoint Checkpointer)
   (xtdb.s3 S3Configurator)))

(def s3-configurator
  (reify S3Configurator
    (makeClient [this]
      (.. (S3AsyncClient/builder)
          (httpClientBuilder
           (.. (NettyNioAsyncHttpClient/builder)
               (connectionAcquisitionTimeout
                (Duration/ofSeconds 600)) ;; Set a high limit here

               ;; We can rely on the defaults for maxConcurrency and
               ;; maxPendingConnectionAcquires
               ;; (maxConcurrency (Integer. 200))
               ;; (maxPendingConnectionAcquires (Integer. 10000))

               ))
          (build)))))

(defn start-node!
  []
  (xtdb/start-node
    {:xtdb/index-store
     {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                 :db-dir "/var/xtdb/idxs"
                 :checkpointer {:xtdb/module 'xtdb.checkpoint/->checkpointer
                                :store {:xtdb/module 'xtdb.s3.checkpoint/->cp-store
                                        :configurator (constantly s3-configurator)
                                        :bucket "checkpoints"}
                                :approx-frequency "PT3H"}}}}))
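
On the EDN part of the question, here is a minimal sketch of what the same configuration might look like in an EDN file. It rests on two assumptions that are not confirmed above: that XTDB resolves a symbol given for :configurator the same way it resolves other :xtdb/module symbols, and that a hypothetical factory function foo.db/->s3-configurator (one argument, returning the s3-configurator defined earlier) is available on the node's classpath. The file name is also hypothetical.

```clojure
;; xtdb-node.edn (hypothetical file name)
;;
;; Assumes foo.db additionally defines a one-argument factory fn, e.g.
;;   (defn ->s3-configurator [_opts] s3-configurator)
;; and that the foo.db namespace is on the classpath of the running node.
;; Symbols are not quoted in an EDN file.
{:xtdb/index-store
 {:kv-store {:xtdb/module xtdb.rocksdb/->kv-store
             :db-dir "/var/xtdb/idxs"
             :checkpointer {:xtdb/module xtdb.checkpoint/->checkpointer
                            :store {:xtdb/module xtdb.s3.checkpoint/->cp-store
                                    ;; Assumption: resolved like any other module reference
                                    :configurator {:xtdb/module foo.db/->s3-configurator}
                                    :bucket "checkpoints"}
                            :approx-frequency "PT3H"}}}}
```

The reify form itself cannot be written in EDN, since EDN is pure data, so the configurator still has to live in compiled Clojure (or Java) code, with the EDN file only pointing at it by name.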
