emr-dynamodb-connector cannot write to DDB (Throughput exceeds the current throughput limit for your account.)
I am trying to write 1 TB / 30 million documents to a DDB table.
The DDB table is set to on-demand capacity.
For this I am using emr-dynamodb-connector,
running a Spark job on an EMR cluster. The code looks like this:
// Hadoop JobConf for the DynamoDB write, built from the Spark context's Hadoop configuration
JobConf ddbConfWrite = new JobConf(spark.sparkContext().hadoopConfiguration());
ddbConfWrite.set("dynamodb.output.tableName", tableName);
// Throttle writes to 50% of the table's write capacity
ddbConfWrite.set("dynamodb.throughput.write.percent", "0.5");
ddbConfWrite.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat");
ddbConfWrite.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat");
ddbInsertFormattedRDD.saveAsHadoopDataset(ddbConfWrite);
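For reference, ddbInsertFormattedRDD is assumed here to be a JavaPairRDD<Text, DynamoDBItemWritable>, the pair type DynamoDBOutputFormat consumes. A minimal sketch of building one, assuming a hypothetical JavaRDD<String> sourceRDD and an attribute named "id":

import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.dynamodb.DynamoDBItemWritable;
import org.apache.hadoop.io.Text;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import org.apache.spark.api.java.JavaPairRDD;
import scala.Tuple2;

// Map each source record to (Text, DynamoDBItemWritable); the Text key is ignored on write.
JavaPairRDD<Text, DynamoDBItemWritable> ddbInsertFormattedRDD =
    sourceRDD.mapToPair(record -> {
        Map<String, AttributeValue> attrs = new HashMap<>();
        attrs.put("id", new AttributeValue().withS(record)); // "id" is a hypothetical key attribute
        DynamoDBItemWritable item = new DynamoDBItemWritable();
        item.setItem(attrs);
        return new Tuple2<>(new Text(""), item);
    });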
However, during the write it ends up inserting documents at a rate that eventually hits the account limit. Below is the exception I get:
com.amazonaws.services.dynamodbv2.model.RequestLimitExceededException: Throughput exceeds the current throughput limit for your account. Please contact AWS Support at https://aws.amazon.com/support to request a limit increase (Service: AmazonDynamoDBv2; Status Code: 400; ...
Answer:
You can file a ticket as suggested to raise the soft limit on account capacity. It’s just there as a sanity check that you actually mean to run at that capacity (and incur that cost).
Digging into the code, throughputPerTask is being set by dividing configuredThroughput (scaled by the write percent) across the map tasks, where configuredThroughput is 40,000 by default if the table is set to on-demand. configuredThroughput can be set via String WRITE_THROUGHPUT = "dynamodb.throughput.write".
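In other words, the connector computes a per-task write rate roughly like this (a simplified sketch of the WriteIopsCalculator math, not the library's exact code; the task count is illustrative):

long configuredThroughput = 40000;  // DEFAULT_CAPACITY_FOR_ON_DEMAND, assumed for on-demand tables
double throughputPercent = 0.5;     // dynamodb.throughput.write.percent
int totalMapTasks = 10;             // illustrative number of Hadoop map tasks

long targetIops = (long) Math.floor(configuredThroughput * throughputPercent); // 20,000
long throughputPerTask = Math.max(targetIops / totalMapTasks, 1);              // 2,000 writes/sec per task

So with the on-demand default, the job aims far above what the account limit will actually allow.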
Seems the lower bound for write capacity is 4,000 units, so if you want to be very safe, set
ddbConfWrite.set("dynamodb.throughput.write", "8000");
Refs:
https://github.com/awslabs/emr-dynamodb-connector/blob/master/emr-dynamodb-hadoop/src/main/java/org/apache/hadoop/dynamodb/DynamoDBConstants.java
https://github.com/awslabs/emr-dynamodb-connector/blob/master/emr-dynamodb-hadoop/src/main/java/org/apache/hadoop/dynamodb/write/WriteIopsCalculator.java
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/on-demand-capacity-mode.html