Improving performance when eagerly querying tree-like structures with JPA/Hibernate
We have four tables persisting tree-like structures in Oracle, and we are using Hibernate to handle the persistence dirty work. It works pretty well in almost all scenarios, except when querying a bunch of those structures to transfer them as JSON. It takes around 30 seconds to retrieve about 3k entities, so I tried to investigate further.
The entities look like this (a lot of code omitted for brevity):
@Entity
@Table(name = "ROOT")
public class Root {
@Id
private Long id;
@OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "root")
@Fetch(org.hibernate.annotations.FetchMode.SUBSELECT)
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
private List<Stem> stems;
@OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "root")
@Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
private List<Branch> branches;
}
@Entity
@Table(name = "STEM")
public class Stem {
@Id
private Long id;
@ManyToOne
@JoinColumn(name = "ROOT_ID")
private Root root;
}
@Entity
@Table(name = "BRANCH")
public class Branch {
@Id
private Long id;
@ManyToOne
@JoinColumn(name = "ROOT_ID")
private Root root;
@OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "branch")
private List<Leaf> leaves;
}
@Entity
@Table(name = "LEAF")
public class Leaf {
@Id
private Long id;
@ManyToOne
@JoinColumn(name = "BRANCH_ID")
private Branch branch;
}
The tests were performed with 25 Roots, a little fewer than 200 Branches, and a little more than 3k Leaves. Around 250 Stems were also present.
Reading the logs, it looks to me like Hibernate performs a DFS, which leads to a high number of queries (apparently, one for each entity) being fired.
So, I tried to implement a BFS (in Python, for brevity), firing one query for each of the four tables. Surprisingly, I had the same JSON ready in less than half a second (including the time to json.dump it all)! The script was something like this (repeated similarly for all the relations):
cur.execute('select * from PARENT_TABLE')
parents = parent_tuples_to_dict(cur.fetchmany(25))
# One query for all children of the parents loaded above
cur.execute(f'select * from CHILD_TABLE where PARENT_ID in ({",".join(str(p["id"]) for p in parents.values())})')
children = child_tuples_to_dict(cur.fetchall())
# Wire each child into its parent
for child in children.values():
    parent = next(p for p in parents.values() if p['id'] == child['parent_id'])
    parent['children'].append(child)
I was considering just changing the DAOs to perform a BFS too, using @SqlResultSetMapping
and native queries, but that looks like a pretty ugly solution. Besides, I believe I would need to manage the entities by myself to prevent memory leaks.
My questions are:
Question #1: could the cause of this performance abyss be something we are doing wrong? (I tried to summarize all the relevant details about the entities, but I can bring more if needed.)
Question #2: what is the right/pretty way to do this kind of query with JPA / Hibernate / named queries etc.?
Question #3: if handling this "by hand" really turns out to be the most performant solution, will I also need to manage the entities "by hand" to prevent memory leaks?
My solution
So... I found out, with the help of some colleagues from a Facebook group, that this kind of problem is called N+1 queries, and it's fairly common when working with ORMs.
My solution was to use the trick described in this article about the multiple bag fetch exception to take advantage of the persistence context behavior and achieve results similar to the ones I got with the Python script.
How I solved it
Basically, once I have the parent table loaded (with its children collections mapped as lazy), I select those same entities again, doing a join fetch with the one child collection I want to load.
For the example I posted in my question, the solution is something like this:
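A minimal sketch of that sequence, assuming a plain injected EntityManager named em (my reconstruction from the description above, not the exact code): each query fetches one List collection at a time, because join fetching two bags in a single query would raise MultipleBagFetchException. The persistence context returns the same managed Root instances each time, so every pass just populates one more collection.

```java
// Pass 1: load the Roots together with their stems.
List<Root> roots = em.createQuery(
        "select distinct r from Root r left join fetch r.stems", Root.class)
    .getResultList();

// Pass 2: same Root rows again; the persistence context hands back the
// already-managed instances, now with the branches collection populated.
roots = em.createQuery(
        "select distinct r from Root r left join fetch r.branches", Root.class)
    .getResultList();

// Pass 3: one level down, load the leaves of every fetched Branch.
List<Branch> branches = em.createQuery(
        "select distinct b from Branch b left join fetch b.leaves "
            + "where b.root in :roots", Branch.class)
    .setParameter("roots", roots)
    .getResultList();
```

Like the Python BFS, this fires a fixed, small number of queries instead of one per entity.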
Trying to answer the questions I left
Question #1: Yes. We were ignoring the N+1 queries scenario.
Question #2: it looks like (one of) the right ways to solve N+1 queries with Hibernate is to work with Hibernate::initialize and @Fetch(FetchMode.SUBSELECT), but I was not able to find out how.
Question #3: I didn't have time to find out, but I have a strong feeling that it's possible to take a lot of advantage of the persistence context by just using @SqlResultSetMapping. If I ever try it, I'll edit this answer with the results.
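For reference, a sketch of what the @Fetch(FetchMode.SUBSELECT) variant mentioned in Question #2 could look like on the mapping side; this is an assumption I have not tested, applying to Branch.leaves the same annotation the question's Root already uses for stems. With SUBSELECT, touching one lazy collection makes Hibernate load that collection for every owner currently in the persistence context with a single query.

```java
@Entity
@Table(name = "BRANCH")
public class Branch {
    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "ROOT_ID")
    private Root root;

    // SUBSELECT: when the leaves of one Branch are first accessed, Hibernate
    // issues a single "... where BRANCH_ID in (<subselect of loaded branches>)"
    // query instead of one query per Branch.
    @OneToMany(mappedBy = "branch", cascade = CascadeType.ALL, orphanRemoval = true)
    @Fetch(FetchMode.SUBSELECT)
    private List<Leaf> leaves;
}
```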