Improving performance when eagerly querying tree-like structures with JPA/Hibernate
We have four tables persisting tree-like structures in Oracle, and we are using Hibernate to handle the persistence dirty work. It works pretty well in almost all scenarios, except when querying a bunch of those structures to transfer them as JSON. It takes around 30 seconds to retrieve about 3k entities, so I tried to investigate further.
The entities look like this (a lot of code omitted for brevity):
@Entity
@Table(name = "ROOT")
public class Root {
@Id
private Long id;
@OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "root")
@Fetch(org.hibernate.annotations.FetchMode.SUBSELECT)
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
private List<Stem> stems;
@OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "root")
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
private List<Branch> branches;
}
@Entity
@Table(name = "STEM")
public class Stem {
@Id
private Long id;
@ManyToOne
@JoinColumn(name = "ROOT_ID")
private Root root;
}
@Entity
@Table(name = "BRANCH")
public class Branch {
@Id
private Long id;
@ManyToOne
@JoinColumn(name = "ROOT_ID")
private Root root;
@OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "branch")
private List<Leaf> leaves;
}
@Entity
@Table(name = "LEAF")
public class Leaf {
@Id
private Long id;
@ManyToOne
@JoinColumn(name = "BRANCH_ID")
private Branch branch;
}
The tests were performed with 25 Roots, a little fewer than 200 Branches, and a little more than 3k Leaves. Around 250 Stems were also present.
Reading the logs, it looks to me like Hibernate performs a DFS, which leads to a high number of queries (apparently, one for each entity) being fired.
So, I tried to implement a BFS (in Python, for brevity), firing one query for each of the four tables. Surprisingly, I had the same JSON ready in less than half a second (including the time to json.dump it all)! The script was something like this (repeated similarly for all the relations):
cur.execute('select * from PARENT_TABLE')
parents = parent_tuples_to_dict(cur.fetchmany(25))
# One query for all children of the parents loaded above
cur.execute(f'select * from CHILD_TABLE where PARENT_ID in ({",".join(str(p["id"]) for p in parents.values())})')
children = child_tuples_to_dict(cur.fetchall())
# Wire each child into its parent
for child in children.values():
    parent = next(p for p in parents.values() if p['id'] == child['parent_id'])
    parent['children'].append(child)
I was considering just changing the DAOs to perform a BFS too, using @SqlResultSetMapping
and native queries, but that looks like a pretty ugly solution. Besides, I believe I would need to manage the entities by myself to prevent memory leaks.
My questions are:
Question #1: could the cause of this performance abyss be something we are doing wrong? (I tried to summarize all the relevant details about the entities, but I can bring more if needed.)
Question #2: what is the right/pretty way to do this kind of query with JPA / Hibernate / named queries etc.?
Question #3: if handling this "by hand" really turns out to be the most performant solution, will I also need to manage the entities "by hand" to prevent memory leaks?
My solution
So... I found out, with the help of some colleagues from a Facebook group, that this kind of problem is called N+1 queries, and it's fairly common when working with ORMs.
My solution was to use the trick described in this article about the multiple bag fetch exception to take advantage of the persistence context behavior and achieve results similar to the ones I got with the Python script.
How I solved it
Basically, once I have the parent table loaded (with its children collections mapped as lazy), I select those same entities again, doing a join fetch with the one child collection I want to load.
For the example I posted in my question, the solution is something like this:
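A minimal sketch of that sequence, assuming a plain injected EntityManager named em (my reconstruction from the description above, not the exact code): each query fetches one List collection at a time, because join fetching two bags in a single query would raise MultipleBagFetchException. The persistence context returns the same managed Root instances each time, so every pass just populates one more collection.

```java
// Pass 1: load the Roots together with their stems.
List<Root> roots = em.createQuery(
        "select distinct r from Root r left join fetch r.stems", Root.class)
    .getResultList();

// Pass 2: same Root rows again; the persistence context hands back the
// already-managed instances, now with the branches collection populated.
roots = em.createQuery(
        "select distinct r from Root r left join fetch r.branches", Root.class)
    .getResultList();

// Pass 3: one level down, load the leaves of every fetched Branch.
List<Branch> branches = em.createQuery(
        "select distinct b from Branch b left join fetch b.leaves "
            + "where b.root in :roots", Branch.class)
    .setParameter("roots", roots)
    .getResultList();
```

Like the Python BFS, this fires a fixed, small number of queries instead of one per entity.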
Trying to answer the questions I left
Question #1: Yes. We were ignoring the N+1 queries scenario.
Question #2: it looks like (one of) the right ways to solve N+1 queries with Hibernate is to work with Hibernate::initialize and @Fetch(FetchMode.SUBSELECT), but I was not able to find out how.
Question #3: I didn't have time to find out, but I have a strong feeling that it's possible to take a lot of advantage of the persistence context by just using @SqlResultSetMapping. If I ever try it, I'll edit this answer with the results.
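For reference, a sketch of what the @Fetch(FetchMode.SUBSELECT) variant mentioned in Question #2 could look like on the mapping side; this is an assumption I have not tested, applying to Branch.leaves the same annotation the question's Root already uses for stems. With SUBSELECT, touching one lazy collection makes Hibernate load that collection for every owner currently in the persistence context with a single query.

```java
@Entity
@Table(name = "BRANCH")
public class Branch {
    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "ROOT_ID")
    private Root root;

    // SUBSELECT: when the leaves of one Branch are first accessed, Hibernate
    // issues a single "... where BRANCH_ID in (<subselect of loaded branches>)"
    // query instead of one query per Branch.
    @OneToMany(mappedBy = "branch", cascade = CascadeType.ALL, orphanRemoval = true)
    @Fetch(FetchMode.SUBSELECT)
    private List<Leaf> leaves;
}
```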