How does Hibernate's batch-fetching algorithm work?

Posted 2024-09-14 05:46:35


I found this description of the batch-fetching algorithm in "Manning - Java Persistence with Hibernate":

What is the real batch-fetching algorithm? (...) Imagine a batch size of 20 and a total number of 119 uninitialized proxies that have to be loaded in batches. At startup time, Hibernate reads the mapping metadata and creates 11 batch loaders internally. Each loader knows how many proxies it can initialize: 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1. The goal is to minimize the memory consumption for loader creation and to create enough loaders that every possible batch fetch can be produced. Another goal is to minimize the number of SQL SELECTs, obviously. To initialize 119 proxies, Hibernate executes seven batches (you probably expected six, because 6 x 20 > 119). The batch loaders that are applied are five times 20, one time 10, and one time 9, automatically selected by Hibernate.

But I still don't understand how it works.

  1. Why 11 batch loaders?
  2. Why can batch loaders initialize 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 proxies?

If anybody could present a step-by-step algorithm... :)


Comments (2)

最佳男配角 2024-09-21 05:46:35


This helps avoid creating a large number of different prepared statements.

Each query (prepared statement) needs to be parsed and its execution plan needs to be calculated and cached by the database. This process may be much more expensive than the actual execution of the query for which the statement has already been cached.

A large number of different statements may lead to purging other cached statements out of the cache, thus degrading the overall application performance.

Also, since a hard parse is generally very expensive, it is usually faster to execute multiple cached prepared statements (including multiple database round trips) than to parse and execute a new one. So, besides the obvious benefit of reducing the number of different statements, it may actually be faster to retrieve all of the 119 entities by executing seven batches of already-cached statements than to create and execute a single new statement containing all 119 ids.

As already mentioned in the comments, Hibernate invokes the ArrayHelper.getBatchSizes method to determine the batch sizes for a given maximum batch size.
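As an illustration of what that computation might look like, here is a minimal Java sketch under the assumption that the sizes are derived by repeatedly halving the maximum batch size down to 10 and then stepping down to 1. It is a paraphrase, not Hibernate's actual source, and the class name BatchSizesSketch is invented; for a maximum batch size of 20 it yields the 11 sizes quoted in the book.

    import java.util.ArrayList;
    import java.util.List;

    public class BatchSizesSketch {

        // Derive the list of loader sizes from a maximum batch size:
        // keep halving while the size stays above 10 (never dropping below 10),
        // then fall through every size from 10 down to 1.
        static List<Integer> batchSizes(int maxBatchSize) {
            List<Integer> sizes = new ArrayList<>();
            int n = maxBatchSize;
            while (n > 10) {
                sizes.add(n);
                n = Math.max(n / 2, 10);
            }
            for (int i = Math.min(n, 10); i >= 1; i--) {
                sizes.add(i);
            }
            return sizes;
        }

        public static void main(String[] args) {
            // 11 loader sizes for a maximum batch size of 20:
            System.out.println(batchSizes(20)); // [20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
            // 12 loader sizes for a maximum batch size of 40:
            System.out.println(batchSizes(40)); // [40, 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
        }
    }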

爱格式化 2024-09-21 05:46:35


I couldn't find any information on the web about how hibernate handles batch loading, but judging from your information, one could guess the following:

Why 11 batch loaders?

With a batch size of 20, if you want to minimize the number of loaders required for any combination of proxies, there are basically two options:

  • create a loader for 1, 2, 3, 4, 5, 6, 7, ... 20, 21, 22, 23, ... N uninitialized proxies (stupid!) OR
  • create a loader for any N between 1..9 and then create more loaders for batch_size/2 (recursively)

Example: for batch size 40, you would end up with loaders for 40, 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1.

  1. If you have 33 uninitialized proxies, you can use the following loaders: 20, 10, 3
  2. If you have 119 uninitialized proxies, you can use the following loaders: 40 (x2), 20, 10, 9
  3. ...

Why can batch loaders initialize 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 proxies?

I think the Hibernate team chose this as a balance between the number of loaders required for loading a "common" number N of uninitialized proxies and memory consumption. They could have created a loader for every N between 0 and batch_size, but I suspect that the loaders have a considerable memory footprint, so this is a tradeoff. The algorithm could be something like this (educated guess; a sketch of how the resulting loaders might be combined follows the list):

  1. n = batch_size; while (n > 10):

    1.1. create loader(n); n = n / 2

  2. for n = 10 down to 1: create loader(n)
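To make that guess concrete, here is a small sketch of how the resulting loaders could then be combined to cover a given number of uninitialized proxies, by greedily picking the largest loader that still fits. It is only an illustration consistent with the numbers quoted above (5 x 20 + 10 + 9 = 119 and 20 + 10 + 3 = 33); the class and method names are invented and this is not Hibernate's actual selection code.

    import java.util.ArrayList;
    import java.util.List;

    public class LoaderSelectionSketch {

        // Loader sizes for a maximum batch size of 20, largest first.
        static final int[] SIZES = {20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1};

        // Greedily cover `count` uninitialized proxies: in each step pick the
        // largest loader that is not bigger than what is still left to load.
        static List<Integer> pickLoaders(int count) {
            List<Integer> batches = new ArrayList<>();
            while (count > 0) {
                int chosen = 1; // the smallest loader always fits
                for (int size : SIZES) {
                    if (size <= count) {
                        chosen = size;
                        break;
                    }
                }
                batches.add(chosen);
                count -= chosen;
            }
            return batches;
        }

        public static void main(String[] args) {
            System.out.println(pickLoaders(119)); // [20, 20, 20, 20, 20, 10, 9] -> seven batches
            System.out.println(pickLoaders(33));  // [20, 10, 3]
        }
    }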
