Azure Table Storage performance - REST vs. StorageClient

Posted on 2024-11-30 09:29:59

I am working with Azure Table Storage and trying to figure out the best way to increase performance. The queries I perform are very simple: either an exact select using partition key and row key, or a where clause with a list (e.g., WHERE x==1 or x==2 or x==3, etc). Once I get the data back, I don't track it in a data context (no need for change tracking, etc). Saving works the same way: I add an entity to the context only so that it can be saved.

At the moment, I am using the .NET library (storage client). As I don't use the change tracking and other features of the TableServiceContext, I am thinking about using the HTTP API directly. Has anyone tried both options? If so, what kind of performance difference did you see?

Thanks,
Erick

隐诗 2024-12-07 09:29:59

Table storage can be a bit of a fickle beast when it comes to optimizing performance. A variety of factors will affect it. Here are just a few off the top of my head:

  1. Using a Partition Key in every query is a must. If you are not doing this, you are doing it wrong. If you use a single PK and a single RK (and only those two), it is no longer a query but a resource GET, and it should be relatively instantaneous.
  2. Do not use OR-based queries. They will cause a full table scan and your performance will be horrible. Instead, split the OR statement into separate queries, one per term, and run them in parallel.
  3. Partitioning strategy will have a major impact. How many partitions you have and how often you hit them (to warm them up and cause the underlying partition servers to load balance) will make a dramatic difference. The size of the partition has a big impact here too. Sequential partition keys are often a bad idea.
  4. Small requests can benefit from turning off Nagle's algorithm (as previously mentioned).
  5. Turning off context tracking and Expect: 100-Continue can help as well.

There are many more I suppose that depend on your application. However, the ones I mention are generally the ones I start with.
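
As a rough illustration of points 1 and 2, here is a minimal Python sketch (not the .NET storage client) that turns a list of keys into separate point lookups and runs them concurrently instead of issuing one OR filter. It assumes the OR'd values are row keys inside a known partition. The account name `myaccount`, the table name `mytable`, the partition key `customers`, and the helpers `entity_url`, `fetch_all`, and `fake_fetch` are all made up for the example. The URL shape follows the Table service REST convention for addressing a single entity. Authentication headers and the real HTTP call are deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote


def entity_url(account, table, partition_key, row_key):
    """Build the address of one entity (a point lookup, not a query)."""
    # Single quotes inside a key are doubled, then the key is percent-encoded.
    pk = quote(partition_key.replace("'", "''"), safe="")
    rk = quote(row_key.replace("'", "''"), safe="")
    return (
        f"https://{account}.table.core.windows.net/"
        f"{table}(PartitionKey='{pk}',RowKey='{rk}')"
    )


def fetch_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def fake_fetch(url):
    # Hypothetical stand-in for an authenticated HTTP GET against the
    # Table service; a real version would sign and send the request.
    return {"url": url}


if __name__ == "__main__":
    # Instead of WHERE x==1 or x==2 or x==3, issue three point lookups.
    urls = [entity_url("myaccount", "mytable", "customers", str(x))
            for x in (1, 2, 3)]
    for result in fetch_all(urls, fake_fetch):
        print(result["url"])
```

Passing the fetch function in as a parameter keeps the fan-out logic independent of whichever HTTP client or SDK actually talks to the service. Whether this beats a single filtered query in your case still depends on the partitioning and request size issues mentioned above, so measure both.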
