I am using DynamoDbEnhancedAsyncClient to query DynamoDB using a GSI and pagination. Below is the code I am using. I am trying to limit the number of items per page and the number of pages sent to the subscriber of the Mono. I need the records in each page sorted in descending order by timestamp, which is the sort key in my GSI. For this I am using scanIndexForward(false) below. However, I am not getting any records in the page, even though there are 4 records in total in DynamoDB.
SdkPublisher<Page<Customer>> query = secindex.query(QueryEnhancedRequest.builder()
        .queryConditional(queryconditional)
        .scanIndexForward(false)
        .limit(2)
        .build());
Mono.from(PagePublisher.create(query.limit(1)));
secindex is the DynamoDbAsyncIndex for the GSI. As per the above code, 1 page with 2 records should be returned to the client, but none are returned. Also, if I remove scanIndexForward(false), the result is as expected but sorted in ascending order. How do I make it return a limited number of records in descending order? Does pagination work differently when scanIndexForward() is supplied?
Without knowing 100% what the filters on your Dynamo call are, I can only guess, but I've seen this sort of thing many times.
Correction: Limit is applied before the query result is returned, not after. The step list below originally had this wrong. Because any additional filter is applied after the limit, this can still result in 2 items being read, both being filtered out, and an ultimate return of 0.
End correction.
A DynamoDB Query only uses the Hash Key/Range Key to decide which items to read, with some basic Range Key conditions (gt, lt, between, begins with, etc.). All other filters, on attributes that are not Hash/Range, are applied after the items have been read and limited, and before the response is returned to you.
1 - Query Dynamo with a Hash Key/Range Key combination and any condition on the Range Key.
2 - Items that match are read in Sort Key order, up to the Limit or 1 MB of data, whichever comes first. Anything more than that needs additional calls.
3 - The filter is applied to what has been read and limited.
4 - Whatever is left is returned.
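The order of operations above can be reproduced with a small in-memory model. This is only a sketch, not the AWS SDK: the Item class, the status attribute, the SAMPLE data and queryPage are all made up for illustration, and it assumes the real query has a filter on a non-key attribute, which the question does not confirm.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy in-memory model of one Query call; this is NOT the AWS SDK.
class LimitThenFilterSketch {

    // Hypothetical item: a timestamp (the sort key) and one non-key attribute.
    static final class Item {
        final long timestamp;
        final String status;

        Item(long timestamp, String status) {
            this.timestamp = timestamp;
            this.status = status;
        }
    }

    // Hypothetical data: 4 records, only the two oldest pass the filter.
    static final List<Item> SAMPLE = List.of(
            new Item(1, "ACTIVE"), new Item(2, "ACTIVE"),
            new Item(3, "DELETED"), new Item(4, "DELETED"));

    // Hypothetical filter on a non-key attribute.
    static final Predicate<Item> ACTIVE = item -> item.status.equals("ACTIVE");

    // Models a single page: order by sort key, cut to `limit` items read,
    // and only then apply the filter to what was read.
    static List<Item> queryPage(List<Item> partition, boolean scanIndexForward,
                                int limit, Predicate<Item> filter) {
        Comparator<Item> bySortKey = Comparator.comparingLong(item -> item.timestamp);
        return partition.stream()
                .sorted(scanIndexForward ? bySortKey : bySortKey.reversed())
                .limit(limit)   // the limit caps how many items are evaluated
                .filter(filter) // the filter only sees the limited items
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(queryPage(SAMPLE, true, 2, ACTIVE).size());  // ascending
        System.out.println(queryPage(SAMPLE, false, 2, ACTIVE).size()); // descending
    }
}
```

With this sample data the ascending call prints 2 and the descending call prints 0, which mirrors the symptom in the question: the two newest items are read first and both are filtered out.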
This means that when you use filter conditions on a Dynamo query, you often do not get back what you expect: the matching items may be on a later page while nothing on the current page matches the filter, so you get back 0.
Since you are also using Limit, and the data is read in descending Sort Key order (because scanIndexForward is false), if the first two values read don't match your other filters, you get 0 items back.
I would recommend you try querying all the items without any filters beyond just Hash/Range key - no other attribute filters.
Then filter the response manually on your side.
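As a rough idea of what filtering on your side could look like, here is a sketch that walks already-fetched pages in the order they came back and keeps the first matches. It does not call DynamoDB at all: firstMatches, SAMPLE_PAGES and the "timestamp:status" strings are hypothetical stand-ins for your pages of Customer items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Client-side filtering over pages that were queried WITHOUT attribute filters.
class ClientSideFilterSketch {

    // Hypothetical pages, newest first, two items per page ("timestamp:status").
    static final List<List<String>> SAMPLE_PAGES = List.of(
            List.of("4:DELETED", "3:DELETED"),
            List.of("2:ACTIVE", "1:ACTIVE"));

    // Keeps the first `wanted` items that pass the predicate, preserving the
    // order in which the pages returned them, and stops as soon as it has enough.
    static <T> List<T> firstMatches(Iterable<List<T>> pages, Predicate<T> keep, int wanted) {
        List<T> matches = new ArrayList<>();
        if (wanted <= 0) {
            return matches;
        }
        for (List<T> page : pages) {
            for (T item : page) {
                if (keep.test(item)) {
                    matches.add(item);
                    if (matches.size() == wanted) {
                        return matches;
                    }
                }
            }
        }
        return matches; // fewer than `wanted` items matched overall
    }

    public static void main(String[] args) {
        System.out.println(firstMatches(SAMPLE_PAGES, s -> s.endsWith(":ACTIVE"), 2));
    }
}
```

Here the first page contributes nothing, but the loop simply moves on to the next page instead of handing an empty result to the caller.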
You should also be aware of the internal pagination of the SDKs for Dynamo: a single call returns at most 1 MB of data from Dynamo. Anything beyond that requires a second call that includes the LastEvaluatedKey returned in the first page of results. https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.Pagination.html has more information.
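The LastEvaluatedKey loop looks roughly like the sketch below. It is a toy model, not SDK code: Page, fetch and readAll are hypothetical names, the "table" is just a list of sort keys, and the page size stands in for the 1 MB cap. One simplification to note: this model returns no key once the data is exhausted, while the real service can hand back a key on a page that turns out to be the last.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of following LastEvaluatedKey across calls; this is NOT the AWS SDK.
class PaginationSketch {

    // Hypothetical stand-in for one Query response.
    static final class Page {
        final List<Integer> items;
        final Integer lastEvaluatedKey; // null means there is nothing more to read

        Page(List<Integer> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    // Hypothetical stand-in for one call: returns at most pageSize (>= 1) keys
    // that come after exclusiveStartKey in the already-sorted list.
    static Page fetch(List<Integer> sortedKeys, Integer exclusiveStartKey, int pageSize) {
        int from = exclusiveStartKey == null ? 0 : sortedKeys.indexOf(exclusiveStartKey) + 1;
        int to = Math.min(from + pageSize, sortedKeys.size());
        List<Integer> items = sortedKeys.subList(from, to);
        Integer last = to < sortedKeys.size() ? sortedKeys.get(to - 1) : null;
        return new Page(items, last);
    }

    // Keeps calling fetch, passing the previous page's key, until no key comes back.
    static List<Integer> readAll(List<Integer> sortedKeys, int pageSize) {
        List<Integer> all = new ArrayList<>();
        Integer startKey = null;
        do {
            Page page = fetch(sortedKeys, startKey, pageSize);
            all.addAll(page.items);
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
        return all;
    }

    public static void main(String[] args) {
        // Four keys in descending order, three per call: two calls are needed.
        System.out.println(readAll(List.of(40, 30, 20, 10), 3));
    }
}
```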
If your system cannot afford to do the filtering itself after the query, then you need to re-evaluate your HashKey/SortKey combinations. Dynamo works best with a schema built around access patterns: that is to say, I have X and I will need Y, so I make X the Hash Key and the Y values different Range Keys under that X.
As an example: user data. You might have a HashKey of "user_id".
Then you have several different patterns for Range Keys:
meta# (with attributes of email, username, hashed and salted passwords, etc.)
post#1
post#2
post#3
avatar#
So if you query on just the Hash Key of the user id, you get all the info. Or, if you have a page with just their posts, you can query on the hash key of the user id and a range key that begins with post#.
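To make the two access patterns concrete, here is a toy in-memory model of that layout. It is not DynamoDB and not the SDK: AccessPatternSketch, its methods, the "user-1" id and the item bodies are invented for illustration, and the prefix lookup assumes range keys never contain the character U+FFFF.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of a hash key + range key table; this is NOT DynamoDB.
class AccessPatternSketch {

    // hash key -> (range key -> item body), range keys kept in sorted order
    private final Map<String, TreeMap<String, String>> table = new HashMap<>();

    void put(String hashKey, String rangeKey, String body) {
        table.computeIfAbsent(hashKey, k -> new TreeMap<>()).put(rangeKey, body);
    }

    // Query on the hash key only: every range key stored under that user.
    List<String> queryAll(String hashKey) {
        return new ArrayList<>(table.getOrDefault(hashKey, new TreeMap<>()).keySet());
    }

    // Query on the hash key plus "range key begins with prefix".
    List<String> queryBeginsWith(String hashKey, String prefix) {
        TreeMap<String, String> partition = table.getOrDefault(hashKey, new TreeMap<>());
        // Keys starting with prefix sort between prefix and prefix + U+FFFF.
        return new ArrayList<>(partition.subMap(prefix, prefix + Character.MAX_VALUE).keySet());
    }

    // Hypothetical sample data following the range key patterns listed above.
    static AccessPatternSketch sample() {
        AccessPatternSketch t = new AccessPatternSketch();
        t.put("user-1", "meta#", "email, username, password hash");
        t.put("user-1", "post#1", "first post");
        t.put("user-1", "post#2", "second post");
        t.put("user-1", "post#3", "third post");
        t.put("user-1", "avatar#", "image reference");
        return t;
    }

    public static void main(String[] args) {
        AccessPatternSketch t = sample();
        System.out.println(t.queryAll("user-1"));
        System.out.println(t.queryBeginsWith("user-1", "post#"));
    }
}
```

Neither lookup needs a filter or a limit: the key layout alone decides which items come back.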
This is the most important aspect of a good Dynamo schema: the ability to query for anything you need with just a Hash Key, or a Hash Key and Range Key.
With a well-understood set of the access patterns you will need, and a Dynamo table that is set up appropriately, you should need no filters or limits, because your Hash/Range key queries will be enough.
(This does sometimes mean a duplication of data! You may have the same information in a post# item as you do in the meta# item, i.e. they both contain usernames. This is OK, because you need the username when you query for a post, as well as when you query for the password/username to see if they match. Dynamo, as a NoSQL store, handles this very well and very fast: a given Hash/Range key combination is basically treated as its own table in terms of access, making queries against it VERY fast.)