emr-dynamodb-connector writing to DDB (Throughput exceeds the current throughput limit for your account.)

Posted on 2025-02-04 04:05:39

I am trying to write 1 TB / 30 million documents to a DDB table.
The DDB table is set to on-demand capacity.
For that I am using the emr-dynamodb-connector, running a Spark job on an EMR cluster. The code looks like this:

JobConf ddbConfWrite = new JobConf(spark.sparkContext().hadoopConfiguration());
// Target DynamoDB table for the write.
ddbConfWrite.set("dynamodb.output.tableName", tableName);
// Let the connector use 50% of the table's write throughput.
ddbConfWrite.set("dynamodb.throughput.write.percent", "0.5");
ddbConfWrite.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat");
ddbConfWrite.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat");

ddbInsertFormattedRDD.saveAsHadoopDataset(ddbConfWrite);

But while the job is writing, it ramps up and eventually tries to insert documents at a rate that hits the account limit. Below is the exception I am getting:

com.amazonaws.services.dynamodbv2.model.RequestLimitExceededException: Throughput exceeds the current throughput limit for your account. Please contact AWS Support at https://aws.amazon.com/support request a limit increase (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: RequestLimitExceeded


Comments (2)

記憶穿過時間隧道 2025-02-11 04:05:39

You can file a ticket as suggested to raise the soft limit on account capacity. It’s just there as a sanity check that you actually mean to run at that capacity (and incur that cost).

南街九尾狐 2025-02-11 04:05:39

Digging into the code, throughputPerTask is being set by

Math.floor(configuredThroughput * throughputPercent)

where configuredThroughput defaults to 40,000 if the table is set to on-demand.
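
By that reading, the question's dynamodb.throughput.write.percent of 0.5 combined with the 40,000 on-demand default gives each task a target of Math.floor(40,000 * 0.5) = 20,000 write units, which would explain why the job ramps up past the account-level throughput limit.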

configuredThroughput can be overridden through the property defined by String WRITE_THROUGHPUT = "dynamodb.throughput.write".

It seems the lower bound for write capacity is 4,000 units, so if you want to be very safe, set ddbConfWrite.set("dynamodb.throughput.write", "8000");
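
Putting that together with the question's snippet, a minimal sketch of the suggested configuration could look like the following. It reuses the question's spark, tableName, and ddbInsertFormattedRDD variables and relies only on the property names already quoted above; treat it as an illustration, not the connector's documented recipe.

// org.apache.hadoop.mapred.JobConf, same write path as the question.
JobConf ddbConfWrite = new JobConf(spark.sparkContext().hadoopConfiguration());
ddbConfWrite.set("dynamodb.output.tableName", tableName);

// Give the connector an explicit throughput figure so it does not fall back
// to the 40,000 on-demand default (the answer's "very safe" value).
ddbConfWrite.set("dynamodb.throughput.write", "8000");

// With the formula above, this targets Math.floor(8000 * 0.5) = 4,000
// write units, i.e. the apparent lower bound.
ddbConfWrite.set("dynamodb.throughput.write.percent", "0.5");

ddbInsertFormattedRDD.saveAsHadoopDataset(ddbConfWrite);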

Refs:
https://github.com/awslabs/emr-dynamodb-connector/blob/master/emr-dynamodb-hadoop/src/main/java/org/apache/hadoop/dynamodb/DynamoDBConstants.java

https://github.com/awslabs/emr-dynamodb-connector/blob/master/emr-dynamodb-hadoop/src/main/java/org/apache/hadoop/dynamodb/write/WriteIopsCalculator.java

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/on-demand-capacity-mode.html
