Azure Table Storage - Choosing a PartitionKey and RowKey for use with between queries
I am new to Azure. My goal is to return rows based on the timestamp stored in the RowKey. Since each query incurs a transaction cost, I want to minimize the number of transactions/queries while maintaining performance.
These are the proposed partition and row keys:
- Partition Key: TextCache_(AccountId)_(ParentMessageId)
- Row Key: (DateOfMessage)_(MessageId)
Legend:
- AccountId - an integer
- ParentMessageId - the parent message's ID if there is one; blank if the message is itself a parent
- DateOfMessage - the date the message was created, formatted as DateTime.Ticks.ToString("d19")
- MessageId - the unique ID of the message
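To make the key layout concrete, here is a small Python sketch of the proposed RowKey. It assumes the .NET definition of ticks (100-nanosecond intervals since 0001-01-01) and zero-pads to 19 digits as "d19" would; the function names and sample message IDs are hypothetical. The point of the fixed width is that string order then matches chronological order.

```python
from datetime import datetime, timedelta

# .NET DateTime.Ticks counts 100-nanosecond intervals from this instant.
DOTNET_EPOCH = datetime(1, 1, 1)


def ticks(dt: datetime) -> int:
    """Convert a naive datetime to a .NET-style tick count."""
    return (dt - DOTNET_EPOCH) // timedelta(microseconds=1) * 10


def row_key(dt: datetime, message_id: str) -> str:
    """Build (DateOfMessage)_(MessageId) with ticks zero-padded to 19 digits."""
    return f"{ticks(dt):019d}_{message_id}"


# Hypothetical messages, deliberately listed out of chronological order.
keys = [
    row_key(datetime(2012, 3, 1, 9, 30), "m3"),
    row_key(datetime(1970, 1, 1), "m1"),
    row_key(datetime(2011, 12, 31, 23, 59), "m2"),
]

# Sorting the strings also sorts the messages by date.
print(sorted(keys))
```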
From a single query, I would like to get back the rows, and any child rows, whose RowKey is > or < a given DateOfMessage_MessageId.
Can this be done with my proposed PartitionKeys and RowKeys?
i.e. (in pseudocode)
    var results = ctx.PartitionKey.StartsWith(TextCache_AccountId)
        && ctx.RowKey > (TimeStamp)_MessageId
Secondly, if I have a number of accounts and only want to return the first 10 results, could it be done with a single query?
i.e. (in pseudocode)
    var results = (
        (
            ctx.PartitionKey.StartsWith(TextCache_(AccountId1))
            && ctx.RowKey > (TimeStamp1)_MessageId1
        )
        ||
        (
            ctx.PartitionKey.StartsWith(TextCache_(AccountId2))
            && ctx.RowKey > (TimeStamp2)_MessageId2
        ) ...
    )
    .Take(10)
Comments (1)
The short answer to your questions is yes, but there are some things you need to watch for.
Azure table storage doesn't have a direct equivalent of .StartsWith(). If you're using the storage library in combination with LINQ you can use .CompareTo() (> and < don't translate properly). This means that if you run a search for account 1 and ask the query to return 1000 results, but there are only 600 results for account 1, the last 400 results will be for account 10 (the next account number lexically). So you'll need to be a bit smart about how you deal with your results.
If you padded the account ID with leading 0s, you could bound the PartitionKey with .CompareTo() on both sides so that the results stay within a single account.
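A minimal Python sketch of the padded-prefix idea, assuming the Table service compares key strings character by character (which Python's string comparison mirrors for ASCII keys). The helper names, the 10-digit pad width, and the sample IDs are illustrative assumptions rather than anything from the question. The filter string uses the OData comparison operators (ge, lt, gt) that table queries accept.

```python
from typing import Tuple


def partition_key(account_id: int, parent_message_id: str = "", width: int = 10) -> str:
    """Build a PartitionKey with the account ID zero-padded to a fixed width."""
    return f"TextCache_{account_id:0{width}d}_{parent_message_id}"


def prefix_bounds(prefix: str) -> Tuple[str, str]:
    """Return (lower, upper) so that lower <= key < upper matches keys starting with prefix.

    The upper bound is the prefix with its last character bumped to the next code point.
    """
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


def account_filter(account_id: int, after_row_key: str) -> str:
    """Compose an OData-style filter for one account's rows after a given RowKey."""
    lower, upper = prefix_bounds(partition_key(account_id))
    return (
        f"PartitionKey ge '{lower}' and PartitionKey lt '{upper}' "
        f"and RowKey gt '{after_row_key}'"
    )


def in_range(key: str, bounds: Tuple[str, str]) -> bool:
    """Local stand-in for the server-side comparison, for demonstration only."""
    lower, upper = bounds
    return lower <= key < upper


# Hypothetical partitions: account 1 (parent and child), account 10, account 2.
keys = [partition_key(1), partition_key(1, "42"), partition_key(10), partition_key(2)]
matches = [k for k in keys if in_range(k, prefix_bounds(partition_key(1)))]
print(matches)
print(account_filter(1, "0621355968000000000_m1"))
```

With the padding and the upper bound in place, account 10 no longer leaks into a query for account 1. A one-sided lower bound on an unpadded key such as "TextCache_1" would still admit "TextCache_10_".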
Something else to bear in mind is that queries to Azure Tables return their results in PartitionKey then RowKey order. So in your case messages without a ParentMessageId will be returned before messages with a ParentMessageId. If you're never going to query this table by ParentMessageId, I'd move it to a property.
If TextCache_ is just a string constant, it's not adding anything by being included in the PartitionKey unless it will actually mean something to your code when it's returned.
While your second query will run, I don't think it will produce what you're after. If you want the first ten rows in DateOfMessage order, then it won't work (see my point above about sort orders). If you ran this query as it is and account 1 had 11 messages, it would return only the first 10 messages related to account 1, regardless of whether account 2 had an earlier message.
While trying to minimise the number of transactions you use is good practice, don't be too concerned about it. The cost of running your worker/web roles will dwarf your transaction costs. 1,000,000 transactions will cost you $1, which is less than the cost of running one small instance for 9 hours.
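Returning to the multi-account case: one possible workaround, which is not part of the answer above, is to query each account separately and pick the earliest rows on the client. The Python sketch below assumes each per-account result set already contains all matching rows for that account (or at least its 10 earliest by RowKey). The rows are hypothetical (PartitionKey, RowKey) pairs with shortened keys for readability.

```python
import heapq
from itertools import chain
from typing import Iterable, List, Tuple

Row = Tuple[str, str]  # (PartitionKey, RowKey)


def first_n_by_row_key(result_sets: Iterable[List[Row]], n: int = 10) -> List[Row]:
    """Combine per-account results and keep the n rows with the smallest RowKey.

    Rows arrive in PartitionKey-then-RowKey order, so they are re-ranked by
    RowKey alone here; fixed-width tick prefixes make that chronological.
    """
    all_rows = chain.from_iterable(result_sets)
    return heapq.nsmallest(n, all_rows, key=lambda row: row[1])


# Hypothetical results: account 1 has three rows, account 2 has one earlier row.
account_1 = [("acct_0001", "0000000000000000200_a"),
             ("acct_0001", "0000000000000000300_b"),
             ("acct_0001", "0000000000000000400_c")]
account_2 = [("acct_0002", "0000000000000000100_d")]

earliest = first_n_by_row_key([account_1, account_2], n=2)
print(earliest)
```

This costs one query per account instead of one in total, which fits the point above that transaction charges are small next to instance costs.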