Improving performance when eagerly querying tree-like structures with JPA/Hibernate

Posted on 2025-02-11 16:21:39

We have these four tables persisting tree-like structures in Oracle, and we are using Hibernate to handle the dirty work of persistence. It works pretty well in almost all scenarios, except for querying a bunch of those structures to transfer them as JSON. It is taking something like 30 seconds to retrieve around 3k entities, so I tried to investigate further.

The entities look like this (a lot of code omitted for brevity):

@Table(name = "ROOT")
public class Root {
    @Id
    private Long id;

    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "root")
    @Fetch(org.hibernate.annotations.FetchMode.SUBSELECT)
    @Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
    private List<Stem> stems;

    @OneToMany(fetch = FetchType.LAZY, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "root")
    @Cache(usage = CacheConcurrencyStrategy.NONSTRICT_READ_WRITE)
    private List<Branch> branches;
}

@Table(name = "STEM")
public class Stem {
    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "ROOT_ID")
    private Root root;


}

@Table(name = "BRANCH")
public class Branch {
    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "ROOT_ID")
    private Root root;

    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "branch")
    private List<Leaf> leaves;
}

@Table(name = "LEAF")
public class Leaf {
    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "BRANCH_ID")
    private Branch branch;
}

The tests were performed with 25 Roots, a little less than 200 Branches, and a little more than 3k Leaves. Also, around 250 Stems were present.

Reading the logs, it looks to me like Hibernate performs a DFS, which leads to a high number of queries (apparently one for each entity) being fired.
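
To make that concrete, this is roughly the access pattern that triggers the query storm. It is a sketch only, and the getters (getBranches(), getLeaves(), getStems()) are placeholders for accessors omitted from the entities above:

// Hypothetical mapping loop (getter names are placeholders; our real code is similar in spirit).
// Every collection that has to be initialized can fire its own select, which is
// how the query count ends up roughly proportional to the number of entities (the N+1 pattern).
List<Root> roots = em.createQuery("select r from Root r", Root.class).getResultList(); // 1 query (+1 subselect for stems)
for (Root root : roots) {
    for (Branch branch : root.getBranches()) { // lazy: one select per Root for its branches
        branch.getLeaves().size();             // eager per Branch: one select per Branch for its leaves
    }
    // root.getStems() is already initialized here thanks to FetchMode.SUBSELECT
}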

So, I tried to implement a BFS (in Python, for brevity) by firing one query for each of the four tables. Surprisingly, I was able to have the same JSON ready in less than half a second (including the time to json.dump it all)! The script was something like this (repeated similarly for all the relations):

# Load the 25 parent rows first; parent_tuples_to_dict returns a dict keyed by id,
# each value being a row dict with an empty 'children' list
cur.execute('select * from PARENT_TABLE')
parents = parent_tuples_to_dict(cur.fetchmany(25))

# Then fetch all of their children in a single query
cur.execute(f'select * from CHILD_TABLE where PARENT_ID in ({",".join(str(p["id"]) for p in parents.values())})')
children = child_tuples_to_dict(cur.fetchall())

# Stitch each child into its parent in memory
for child in children.values():
  parent = next(p for p in parents.values() if p['id'] == child['parent_id'])
  parent['children'].append(child)

I was considering just changing the DAOs to perform a BFS too, using @SqlResultSetMapping and native queries, but this looks like a pretty ugly solution. Besides, I believe I would need to manage the entities by myself to prevent memory leaks.
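
Just to illustrate what I meant, here is a minimal sketch of that native-query route. The mapping name, the BranchRow DTO, and its columns are hypothetical, not taken from our real code, and the javax.persistence imports are omitted as in the snippets above:

// Hypothetical @SqlResultSetMapping declared on an existing entity class (e.g. Root);
// BranchRow is an illustrative DTO with a matching (Long id, Long rootId) constructor.
@SqlResultSetMapping(
    name = "BranchRowMapping",
    classes = @ConstructorResult(
        targetClass = BranchRow.class,
        columns = {
            @ColumnResult(name = "ID", type = Long.class),
            @ColumnResult(name = "ROOT_ID", type = Long.class)
        }))

// DAO side: one native query per table, BFS style, then stitch the DTOs together in Java.
// The real query would filter with "where ROOT_ID in (...)", as in the Python script above.
List<BranchRow> branchRows = em
        .createNativeQuery("select ID, ROOT_ID from BRANCH", "BranchRowMapping")
        .getResultList();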

My questions are:

Question #1: could the cause of this performance abyss be something we are doing wrong? (I tried to summarize all the relevant details about the entities, but I can bring more if needed.)

Question #2: what is the right/pretty way to do this kind of query with JPA / Hibernate / named queries, etc.?

Question #3: if handling this "by hand" really becomes the most performant solution, will I really need to manage the entities "by hand" too, to prevent memory leaks?

1 Answer

王权女流氓 2025-02-18 16:21:42

My solution

So... I found out, with the help of some colleagues from a Facebook group, that this kind of problem is called N+1 queries, and it's fairly common when working with an ORM.

My solution was to use the trick described in this article about the multiple bag fetch exception to take advantage of the persistence context behavior and achieve results similar to the ones I got with the Python script.

How I solved it

Basically, once I have the parent table loaded (with the children marked as lazy), I select those very items again while doing a join fetch with the child collection I want to load.

For the example I posted in my question, the solution is something like this:

void fetchWholeTree(List<Root> roots) {
    // These are inline JPQL strings, so em.createQuery() is used
    // (createNamedQuery() expects the name of a predefined query).
    em.createQuery("select distinct r " +
                   "from Root r " +
                   "left join fetch r.stems s " +
                   "where r in :roots")
            .setParameter("roots", roots)
            .setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
            .getResultList();
    em.createQuery("select distinct r " +
                   "from Root r " +
                   "left join fetch r.branches b " +
                   "where r in :roots")
            .setParameter("roots", roots)
            .setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
            .getResultList();
    em.createQuery("select distinct b " +
                   "from Root r " +
                   "left join r.branches b " +
                   "left join fetch b.leaves l " +
                   "where r in :roots")
            .setParameter("roots", roots)
            .setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
            .getResultList();
}
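
For context, this is roughly how I call it; the initial query and the getter names are simplified placeholders, not our real code:

// Hypothetical caller: load the roots first (branches still lazy), then hydrate the
// whole tree with the three bulk queries above before mapping it to JSON.
List<Root> roots = em.createQuery("select r from Root r", Root.class).getResultList();
fetchWholeTree(roots); // 3 extra queries in total, instead of one per entity
// After this, root.getStems(), root.getBranches() and branch.getLeaves() are all
// initialized inside the persistence context, so serialization fires no further selects.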

Trying to answer the questions I left

Question #1: Yes. We were ignoring the N+1 queries scenario.

Question #2: Looks like one of the right ways to solve N+1 queries with Hibernate is to work with Hibernate::initialize and @Fetch(FetchMode.SUBSELECT), but I was not able to figure out how.
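
In case it helps anyone, my untested understanding of that approach looks something like the sketch below, so treat it purely as an assumption (it needs org.hibernate.Hibernate, and the getters are placeholders):

// Untested sketch of the Hibernate.initialize + @Fetch(FetchMode.SUBSELECT) idea.
// The collection would be mapped LAZY with SUBSELECT, e.g.:
//   @OneToMany(fetch = FetchType.LAZY, mappedBy = "root")
//   @Fetch(org.hibernate.annotations.FetchMode.SUBSELECT)
//   private List<Branch> branches;
// Then, still inside the same persistence context:
List<Root> roots = em.createQuery("select r from Root r", Root.class).getResultList();
for (Root root : roots) {
    Hibernate.initialize(root.getBranches());
    // With SUBSELECT, the first initialization is supposed to load the branches of
    // every Root fetched by the original query in a single extra select.
}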

Question #3: Didn't have time to find out, but I have a strong feeling that it's possible to take a lot of advantage of the persistence context by just using @SqlResultSetMapping. If I ever try it out, I'll edit this answer with the results.
