如何调试这个 Hibernate/JPA 慢查询问题?

发布于 2024-12-07 11:12:00 字数 4434 浏览 1 评论 0原文

我有一个 Hibernate/JPA 查询,它根据数据库中的索引字段执行简单的查询。该表相当大,大约有2800万条记录,但直接使用数据库时查询会立即返回,使用Hibernate+JPA需要20多秒。数据库是PostgreSQL 9.0.4。

这是包括相关查询的日志摘录,从时间戳中您可以看到发出查询和打开结果集之间存在很长的延迟。

2011-09-28 06:26:38,019 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open PreparedStatement (open PreparedStatements: 0, globally: 0)
2011-09-28 06:26:38,019 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.ConnectionManager - opening JDBC connection
2011-09-28 06:26:38,020 [http-thread-pool-8080(5)] DEBUG org.hibernate.SQL - select pafaddress0_.address_id as address1_0_, pafaddress0_.admin1 as admin2_0_, pafaddress0_.admin2 as admin3_0_, pafaddress0_.admin3 as admin4_0_, pafaddress0_.barcode as barcode0_, pafaddress0_.building_name as building6_0_, pafaddress0_.building_number as building7_0_, pafaddress0_.country_name as country8_0_, pafaddress0_.dependent_locality as dependent9_0_, pafaddress0_.dependent_street as dependent10_0_, pafaddress0_.dept_name as dept11_0_, pafaddress0_.double_dep_locality as double12_0_, pafaddress0_.dps as dps0_, pafaddress0_.mailsort as mailsort0_, pafaddress0_.org_name as org15_0_, pafaddress0_.pca_id as pca16_0_, pafaddress0_.pobox as pobox0_, pafaddress0_.postcode as postcode0_, pafaddress0_.postcode1 as postcode19_0_, pafaddress0_.postcode_type as postcode20_0_, pafaddress0_.street as street0_, pafaddress0_.su_org_indicator as su22_0_, pafaddress0_.subbuilding_name as subbuil23_0_, pafaddress0_.posttown as posttown0_, pafaddress0_.trad_county as trad25_0_, pafaddress0_.type as type0_, pafaddress0_.udprn as udprn0_ from paf pafaddress0_ where pafaddress0_.pca_id=? limit ?
2011-09-28 06:27:01,897 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open ResultSet (open ResultSets: 0, globally: 0)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close ResultSet (open ResultSets: 1, globally: 1)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close PreparedStatement (open PreparedStatements: 1, globally: 1)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.ConnectionManager - aggressively releasing JDBC connection
201

我的实体没有复杂的关联,并且日志中没有任何迹象表明在这段长时间的延迟期间发生了什么。

@Entity
@Table(name = "paf")
@NamedQueries({
        @NamedQuery(name = "PafAddress.findByPcaId", query = "SELECT a FROM PafAddress a where a.pcaId=:pcaId ")
})
public class PafAddress implements Serializable, Identifiable<Long> {
    private static final long serialVersionUID = -5831210090000717285L;

    @Id
    @Column(name = "address_id")
    private Long id;

    private String postcode;

    @Column(name = "posttown")
    private String town;

    @Column(name = "dependent_locality")
    private String dependentLocality;

    @Column(name = "double_dep_locality")
    private String doubleDepLocality;

    @Column(name = "street")
    private String street;

    @Column(name = "dependent_street")
    private String dependentStreet;

    @Column(name = "building_number")
    private String buildingNumber;

    @Column(name = "building_name")
    private String buildingName;

    @Column(name = "subbuilding_name")
    private String subbuildingName;

    @Column(name = "pobox")
    private String pobox;

    @Column(name = "dept_name")
    private String deptName;

    @Column(name = "org_name")
    private String orgName;

    @Column(name = "udprn")
    private String udprn;

    @Column(name = "postcode_type")
    private String postcodeType;

    @Column(name = "su_org_indicator")
    private String suOrgIndicator;

    @Column(name = "dps")
    private String dps;

    @Column(name = "postcode1")
    private String postcode1;

    @Column(name = "admin1")
    private String admin1;

    @Column(name = "admin2")
    private String admin2;

    @Column(name = "admin3")
    private String admin3;

    @Column(name = "trad_county")
    private String tradCounty;

    @Column(name = "country_name")
    private String countryName;

    @Column(name = "mailsort")
    private String mailsort;

    @Column(name = "barcode")
    private String barcode;

    @Column(name = "type")
    private String type;

    @Column(name = "pca_id")
    private Double pcaId;

    public PafAddress() {
    }

关于为什么这个查询执行如此糟糕或者我如何确定原因有什么想法吗?

I have a Hibernate/JPA query which performs a simple query based on an indexed field in the database. The table is fairly large with around 28 million records but when using the database directly the query returns instantaneously, with Hibernate+JPA it takes over 20 seconds. The database is PostgreSQL 9.0.4.

This is an extract of the logs including the relevant query and from the timestamps you can see a long delay between the query getting issued and opening the ResultSet.

2011-09-28 06:26:38,019 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open PreparedStatement (open PreparedStatements: 0, globally: 0)
2011-09-28 06:26:38,019 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.ConnectionManager - opening JDBC connection
2011-09-28 06:26:38,020 [http-thread-pool-8080(5)] DEBUG org.hibernate.SQL - select pafaddress0_.address_id as address1_0_, pafaddress0_.admin1 as admin2_0_, pafaddress0_.admin2 as admin3_0_, pafaddress0_.admin3 as admin4_0_, pafaddress0_.barcode as barcode0_, pafaddress0_.building_name as building6_0_, pafaddress0_.building_number as building7_0_, pafaddress0_.country_name as country8_0_, pafaddress0_.dependent_locality as dependent9_0_, pafaddress0_.dependent_street as dependent10_0_, pafaddress0_.dept_name as dept11_0_, pafaddress0_.double_dep_locality as double12_0_, pafaddress0_.dps as dps0_, pafaddress0_.mailsort as mailsort0_, pafaddress0_.org_name as org15_0_, pafaddress0_.pca_id as pca16_0_, pafaddress0_.pobox as pobox0_, pafaddress0_.postcode as postcode0_, pafaddress0_.postcode1 as postcode19_0_, pafaddress0_.postcode_type as postcode20_0_, pafaddress0_.street as street0_, pafaddress0_.su_org_indicator as su22_0_, pafaddress0_.subbuilding_name as subbuil23_0_, pafaddress0_.posttown as posttown0_, pafaddress0_.trad_county as trad25_0_, pafaddress0_.type as type0_, pafaddress0_.udprn as udprn0_ from paf pafaddress0_ where pafaddress0_.pca_id=? limit ?
2011-09-28 06:27:01,897 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open ResultSet (open ResultSets: 0, globally: 0)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close ResultSet (open ResultSets: 1, globally: 1)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close PreparedStatement (open PreparedStatements: 1, globally: 1)
2011-09-28 06:27:01,900 [http-thread-pool-8080(5)] DEBUG org.hibernate.jdbc.ConnectionManager - aggressively releasing JDBC connection
201

My Entity has no complex associations and there are no indications in the logs as to what is going on during this long delay.

@Entity
@Table(name = "paf")
@NamedQueries({
        @NamedQuery(name = "PafAddress.findByPcaId", query = "SELECT a FROM PafAddress a where a.pcaId=:pcaId ")
})
public class PafAddress implements Serializable, Identifiable<Long> {
    private static final long serialVersionUID = -5831210090000717285L;

    @Id
    @Column(name = "address_id")
    private Long id;

    private String postcode;

    @Column(name = "posttown")
    private String town;

    @Column(name = "dependent_locality")
    private String dependentLocality;

    @Column(name = "double_dep_locality")
    private String doubleDepLocality;

    @Column(name = "street")
    private String street;

    @Column(name = "dependent_street")
    private String dependentStreet;

    @Column(name = "building_number")
    private String buildingNumber;

    @Column(name = "building_name")
    private String buildingName;

    @Column(name = "subbuilding_name")
    private String subbuildingName;

    @Column(name = "pobox")
    private String pobox;

    @Column(name = "dept_name")
    private String deptName;

    @Column(name = "org_name")
    private String orgName;

    @Column(name = "udprn")
    private String udprn;

    @Column(name = "postcode_type")
    private String postcodeType;

    @Column(name = "su_org_indicator")
    private String suOrgIndicator;

    @Column(name = "dps")
    private String dps;

    @Column(name = "postcode1")
    private String postcode1;

    @Column(name = "admin1")
    private String admin1;

    @Column(name = "admin2")
    private String admin2;

    @Column(name = "admin3")
    private String admin3;

    @Column(name = "trad_county")
    private String tradCounty;

    @Column(name = "country_name")
    private String countryName;

    @Column(name = "mailsort")
    private String mailsort;

    @Column(name = "barcode")
    private String barcode;

    @Column(name = "type")
    private String type;

    @Column(name = "pca_id")
    private Double pcaId;

    public PafAddress() {
    }

Any ideas as to why this query is performing so badly or how I can determine the cause?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

三岁铭 2024-12-14 11:12:00

我发现了问题:

我在 PostgreSQL 中启用了 auto_explain 来进行查询,这使得我能够在日志中看到查询计划。
请参阅http://www.postgresql.org/docs/current/static/ auto-explain.html

对于 Hibernate/JPA 查询:

Limit  (cost=0.00..13.12 rows=2 width=469)
  ->  Seq Scan on paf pafaddress0_  (cost=0.00..943043.74 rows=143753 width=469)
        Filter: ((pca_id)::double precision = 7912139::double precision)

本机数据库查询:

                                          QUERY PLAN                                              
 Limit  (cost=0.00..7.03 rows=2 width=469)
   ->  Index Scan using paf_pca_idx on paf pafaddress0_  (cost=0.00..505649.83 rows=143753 width=469)
         Index Cond: (pca_id = 7912139::numeric)

速度很慢,因为正在执行顺序扫描而不是使用索引,这似乎是我的实体和数据库字段之间类型不匹配的结果。实体属性是“双精度”,而数据库类型是“数字”,查询解释显示数据库字段的转换为双精度。

我将数据库字段类型更改为双精度,查询现在按预期执行。

I've discovered the problem:

I enabled auto_explain in PostgreSQL for queries taking >1 second which enabled me to see the query plan in the logs.
see http://www.postgresql.org/docs/current/static/auto-explain.html

For the Hibernate/JPA query:

Limit  (cost=0.00..13.12 rows=2 width=469)
  ->  Seq Scan on paf pafaddress0_  (cost=0.00..943043.74 rows=143753 width=469)
        Filter: ((pca_id)::double precision = 7912139::double precision)

Native database query:

                                          QUERY PLAN                                              
 Limit  (cost=0.00..7.03 rows=2 width=469)
   ->  Index Scan using paf_pca_idx on paf pafaddress0_  (cost=0.00..505649.83 rows=143753 width=469)
         Index Cond: (pca_id = 7912139::numeric)

It was slow because a sequential scan was being performed instead of using the index, seemingly as a result of a type mismatch between my entity and database field. The entity property was a 'Double', whilst the database type was 'numeric' and the query explanation shows a cast of the database field occurring to double double precision.

I changed my database field type to double precision and the query is now performing as expected.

左秋 2024-12-14 11:12:00

如果sql相同,那么瓶颈一定是为每条记录创建单独的java对象的过程。我假设它返回大量记录,您可以尝试限制这个数量。

If the sql is the same, the bottleneck must be the process of creating the individual java objects for each record. I assume it returns a large number of records, you could try to limit this number.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文