需要帮助提高 grails 中大型数据集的性能

发布于 2024-09-19 23:51:03 字数 1099 浏览 8 评论 0原文

该解决方案有效,但性能低于预期。返回 200K 行的查询需要几分钟的时间,并将 CPU 固定在我的开发盒上。在查询分析器中运行相同*查询将返回 << 中的所有结果。 1 分钟。

Class MyController { 

 def index = {...}
 ...
 def csv = {
   ...
   def rs = DomainClass.createCritera().scroll {}

   while(rs.next()){
    response.getOutputStream().print(rs.getString(1)\n)
   }
   ...
 }

DB = SQL Server 2005 服务器位于与我的开发机器分开的专用机箱上。

我还通过 SQL Server Profiler 注意到 gorm/hibernate 使用 sp_cursorprepexec 和 sp_cursorfetch 一次读取结果 128 行。如果可以的话,我想尝试不使用光标。

不确定这是否是问题所在,但只能提供帮助。在休眠状态下,可以将滚动设置为仅向前滚动,但我无法为 grails 找到类似的设置。

原始 hibernate 问题

解决方案:绕过休眠。从10分钟到15秒。

Class MyController { 
 def DataSource

 def index = {...}
 ...
 def csv = {
   ...
   def out = response.getOutoutStream()
   Sql sql = new Sql(dataSource)

   sql.eachRow("select c1, c2 from t1",{
     out.println( it.c1 + "," + it.c2 )
   })
   ...
 }

*same = 从 SQL Server Profiler 剪切并粘贴,但不包括包装 sp_cursorprepexec 存储过程。

This solution works but performance is lower than expected. A query returning 200K rows takes several minutes and pegs the CPU on my dev box. Running the same* query in query analyzer returns all results in < 1 minute.

Class MyController { 

 def index = {...}
 ...
 def csv = {
   ...
   def rs = DomainClass.createCritera().scroll {}

   while(rs.next()){
    response.getOutputStream().print(rs.getString(1)\n)
   }
   ...
 }

DB = SQL Server 2005 server on a dedicated box separate from my dev machine.

I've also noticed via SQL Server Profiler that gorm/hibernate is using sp_cursorprepexec and sp_cursorfetch to read the result 128 rows at a time. I'd like to try not using a cursor if it's an option.

Not sure if it's the problem but can only help. In hibernate it's possible to set the scroll as forward only but I'm having trouble finding a similar setting for grails.

original hibernate issue.

Solution: Bypass hibernate. From 10 minutes to 15 seconds.

Class MyController { 
 def DataSource

 def index = {...}
 ...
 def csv = {
   ...
   def out = response.getOutoutStream()
   Sql sql = new Sql(dataSource)

   sql.eachRow("select c1, c2 from t1",{
     out.println( it.c1 + "," + it.c2 )
   })
   ...
 }

*same = Cut and paste from the SQL Server Profiler, but excluding the wrapping sp_cursorprepexec sproc.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

放肆 2024-09-26 23:51:03

如果 GORM 不支持某些内容,可以很简单地直接下拉到 Hibernate:

import org.hibernate.ScrollMode

class MyController { 

   def index = {...}

   def csv = {
      DomainClass.withSession { session ->
         def rs = session.createCriteria(DomainClass).scroll(ScrollMode.FORWARD_ONLY)
         while (rs.next()) {
            response.outputStream.print rs.getString(1)
         }
      }
   }
}

您可以使用 session.createQuery(...) 对 HQL 查询执行相同的操作。

It's simple to drop down to Hibernate directly if something isn't supported by GORM:

import org.hibernate.ScrollMode

class MyController { 

   def index = {...}

   def csv = {
      DomainClass.withSession { session ->
         def rs = session.createCriteria(DomainClass).scroll(ScrollMode.FORWARD_ONLY)
         while (rs.next()) {
            response.outputStream.print rs.getString(1)
         }
      }
   }
}

You could do the same for an HQL query using session.createQuery(...) instead.

世界和平 2024-09-26 23:51:03

Hibernate 并不是真正为批量加载而设计的,但是您可以尝试一些操作(其中大多数操作要求您放弃 ScrollableResult 的使用,而只对对象结果进行常规查询)。

  1. 只需绕过 Hibernate/GORM 并直接使用 SQL 来查找(希望如此)少数您需要的地方。是的,我知道,但如果情况变得更糟...
  2. 调用 session.setReadOnly() 或 query.setReadOnly() 来禁用 Hibernate 的状态快照
  3. 尝试一下 Hibernate 的无状态会话。如果你所做的只是阅读,那么这可能会很有效。无状态会话的开销比常规 Hibernate 会话低得多,但您将放弃所有缓存和对象状态跟踪。您必须执行以下操作才能使用它:

    <块引用>

    def Session statelessSession = sessionFactory.openStatelessSession()
    statelessSession.beginTransaction()
    
    // ...
    
    statelessSession.getTransaction().commit()
    无状态Session.close()
    
  4. 以 25 或 50 为一批刷新会话。本质上,当您迭代带回的项目时,请执行 session.flush()。如果不这样做,会话将继续增长,直到内存耗尽并且垃圾收集器开始疯狂。这可能就是您的处理器被锁定的原因。

祝你好运!

Hibernate isn't really made for batch loading, but there are some things you can try (most of which require you to drop the ScrollableResult usage and just do regular queries with object results).

  1. Just bypass Hibernate/GORM and go to SQL directly for the (hopefully) handful of places where you need it. Ya, I know, but if worse comes to worse...
  2. Call session.setReadOnly() or query.setReadOnly() to disable Hibernate's state snapshots
  3. Give Hibernate's Stateless Session a try. If all you're doing is reading, this may work fine. The Stateless Session has a much lower overhead than the regular Hibernate session, but you'll give up all your caching and object state tracking. You'll have to do something like this to use it:

    def Session statelessSession = sessionFactory.openStatelessSession()
    statelessSession.beginTransaction()
    
    // ...
    
    statelessSession.getTransaction().commit()
    statelessSession.close()
    
  4. Flush the session in batches of 25 or 50. Essentially, as you're iterating over the items that you've brought back, do a session.flush(). If you don't, the session will keep growing until you run out of memory and your garbage collector starts going crazy. This might be why your processor is getting pegged.

Good luck!

伴随着你 2024-09-26 23:51:03

使用 Grails 标准和 ScrollMode 的另一种方法:

Criteria criteria = Domain.createCriteria().buildCriteria{
    eq('id', id)
}
ScrollableResults results = criteria.scroll(ScrollMode.FORWARD_ONLY)

int i = 0
while (results.next()){
    ...
    if (++i % 50 == 0){
        Domain.withSession { Session session ->
            session.flush()
            session.clear()
        }
    }
}

Another way to use Grails criteria and ScrollMode:

Criteria criteria = Domain.createCriteria().buildCriteria{
    eq('id', id)
}
ScrollableResults results = criteria.scroll(ScrollMode.FORWARD_ONLY)

int i = 0
while (results.next()){
    ...
    if (++i % 50 == 0){
        Domain.withSession { Session session ->
            session.flush()
            session.clear()
        }
    }
}
唐婉 2024-09-26 23:51:03

有几点值得注意:

A few things worth noting:

意中人 2024-09-26 23:51:03

使用批量插入,它比 gorm 清理方法和无状态会话方法更快。下面的示例帮助您如何在 grails 中实现批量插入。

    Date startTime   = new Date()
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();

    (1..50000).each {counter ->
        Person person           = new Person()
        person.firstName        = "abc"
        person.middleName       = "abc"
        person.lastName         = "abc"
        person.address          = "abc"
        person.favouriteGame    = "abc"
        person.favouriteActor   = "abc"

        session.save(person)
        if(counter.mod(100)==0) {
            session.flush();
            session.clear();
        }

        if(counter.mod(10000)==0) {
            Date endTime    =new Date()
            println "Total record insert Counter =>"+counter+" Time =>"+TimeCategory.minus(endTime,startTime)
        }
    }

    tx.commit();
    session.close();

Use batch insert and it faster than gorm cleanup method and stateless session method.Below example helps you how to implement batch insert in grails.

    Date startTime   = new Date()
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();

    (1..50000).each {counter ->
        Person person           = new Person()
        person.firstName        = "abc"
        person.middleName       = "abc"
        person.lastName         = "abc"
        person.address          = "abc"
        person.favouriteGame    = "abc"
        person.favouriteActor   = "abc"

        session.save(person)
        if(counter.mod(100)==0) {
            session.flush();
            session.clear();
        }

        if(counter.mod(10000)==0) {
            Date endTime    =new Date()
            println "Total record insert Counter =>"+counter+" Time =>"+TimeCategory.minus(endTime,startTime)
        }
    }

    tx.commit();
    session.close();
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文