需要帮助提高 grails 中大型数据集的性能

发布于 2024-09-19 23:51:03 字数 1099 浏览 8 评论 0原文

该解决方案有效，但性能低于预期。返回 200K 行的查询需要几分钟的时间，并将 CPU 固定在我的开发盒上。在查询分析器中运行相同*查询将返回 << 中的所有结果。 1 分钟。

Class MyController { 

 def index = {...}
 ...
 def csv = {
   ...
   def rs = DomainClass.createCritera().scroll {}

   while(rs.next()){
    response.getOutputStream().print(rs.getString(1)\n)
   }
   ...
 }

DB = SQL Server 2005 服务器位于与我的开发机器分开的专用机箱上。

我还通过 SQL Server Profiler 注意到 gorm/hibernate 使用 sp_cursorprepexec 和 sp_cursorfetch 一次读取结果 128 行。如果可以的话，我想尝试不使用光标。

不确定这是否是问题所在，但只能提供帮助。在休眠状态下，可以将滚动设置为仅向前滚动，但我无法为 grails 找到类似的设置。

原始 hibernate 问题。

解决方案：绕过休眠。从10分钟到15秒。

Class MyController { 
 def DataSource

 def index = {...}
 ...
 def csv = {
   ...
   def out = response.getOutoutStream()
   Sql sql = new Sql(dataSource)

   sql.eachRow("select c1, c2 from t1",{
     out.println( it.c1 + "," + it.c2 )
   })
   ...
 }

*same = 从 SQL Server Profiler 剪切并粘贴，但不包括包装 sp_cursorprepexec 存储过程。

原文

This solution works but performance is lower than expected. A query returning 200K rows takes several minutes and pegs the CPU on my dev box. Running the same* query in query analyzer returns all results in < 1 minute.

Class MyController { 

 def index = {...}
 ...
 def csv = {
   ...
   def rs = DomainClass.createCritera().scroll {}

   while(rs.next()){
    response.getOutputStream().print(rs.getString(1)\n)
   }
   ...
 }

DB = SQL Server 2005 server on a dedicated box separate from my dev machine.

I've also noticed via SQL Server Profiler that gorm/hibernate is using sp_cursorprepexec and sp_cursorfetch to read the result 128 rows at a time. I'd like to try not using a cursor if it's an option.

Not sure if it's the problem but can only help. In hibernate it's possible to set the scroll as forward only but I'm having trouble finding a similar setting for grails.

original hibernate issue.

Solution: Bypass hibernate. From 10 minutes to 15 seconds.

Class MyController { 
 def DataSource

 def index = {...}
 ...
 def csv = {
   ...
   def out = response.getOutoutStream()
   Sql sql = new Sql(dataSource)

   sql.eachRow("select c1, c2 from t1",{
     out.println( it.c1 + "," + it.c2 )
   })
   ...
 }

*same = Cut and paste from the SQL Server Profiler, but excluding the wrapping sp_cursorprepexec sproc.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

放肆 2024-09-26 23:51:03

如果 GORM 不支持某些内容，可以很简单地直接下拉到 Hibernate：

import org.hibernate.ScrollMode

class MyController { 

   def index = {...}

   def csv = {
      DomainClass.withSession { session ->
         def rs = session.createCriteria(DomainClass).scroll(ScrollMode.FORWARD_ONLY)
         while (rs.next()) {
            response.outputStream.print rs.getString(1)
         }
      }
   }
}

您可以使用 session.createQuery(...) 对 HQL 查询执行相同的操作。

It's simple to drop down to Hibernate directly if something isn't supported by GORM:

import org.hibernate.ScrollMode

class MyController { 

   def index = {...}

   def csv = {
      DomainClass.withSession { session ->
         def rs = session.createCriteria(DomainClass).scroll(ScrollMode.FORWARD_ONLY)
         while (rs.next()) {
            response.outputStream.print rs.getString(1)
         }
      }
   }
}

You could do the same for an HQL query using session.createQuery(...) instead.

回复收藏 0 原文

世界和平 2024-09-26 23:51:03

Hibernate 并不是真正为批量加载而设计的，但是您可以尝试一些操作（其中大多数操作要求您放弃 ScrollableResult 的使用，而只对对象结果进行常规查询）。

只需绕过 Hibernate/GORM 并直接使用 SQL 来查找（希望如此）少数您需要的地方。是的，我知道，但如果情况变得更糟...
调用 session.setReadOnly() 或 query.setReadOnly() 来禁用 Hibernate 的状态快照
尝试一下 Hibernate 的无状态会话。如果你所做的只是阅读，那么这可能会很有效。无状态会话的开销比常规 Hibernate 会话低得多，但您将放弃所有缓存和对象状态跟踪。您必须执行以下操作才能使用它：
<块引用>
```
def Session statelessSession = sessionFactory.openStatelessSession()
statelessSession.beginTransaction()

// ...

statelessSession.getTransaction().commit()
无状态Session.close()
```
以 25 或 50 为一批刷新会话。本质上，当您迭代带回的项目时，请执行 session.flush()。如果不这样做，会话将继续增长，直到内存耗尽并且垃圾收集器开始疯狂。这可能就是您的处理器被锁定的原因。

祝你好运！

Hibernate isn't really made for batch loading, but there are some things you can try (most of which require you to drop the ScrollableResult usage and just do regular queries with object results).

Just bypass Hibernate/GORM and go to SQL directly for the (hopefully) handful of places where you need it. Ya, I know, but if worse comes to worse...
Call session.setReadOnly() or query.setReadOnly() to disable Hibernate's state snapshots
Give Hibernate's Stateless Session a try. If all you're doing is reading, this may work fine. The Stateless Session has a much lower overhead than the regular Hibernate session, but you'll give up all your caching and object state tracking. You'll have to do something like this to use it:
```
def Session statelessSession = sessionFactory.openStatelessSession()
statelessSession.beginTransaction()

// ...

statelessSession.getTransaction().commit()
statelessSession.close()
```
Flush the session in batches of 25 or 50. Essentially, as you're iterating over the items that you've brought back, do a session.flush(). If you don't, the session will keep growing until you run out of memory and your garbage collector starts going crazy. This might be why your processor is getting pegged.

Good luck!

回复收藏 0 原文

伴随着你 2024-09-26 23:51:03

使用 Grails 标准和 ScrollMode 的另一种方法：

Criteria criteria = Domain.createCriteria().buildCriteria{
    eq('id', id)
}
ScrollableResults results = criteria.scroll(ScrollMode.FORWARD_ONLY)

int i = 0
while (results.next()){
    ...
    if (++i % 50 == 0){
        Domain.withSession { Session session ->
            session.flush()
            session.clear()
        }
    }
}

Another way to use Grails criteria and ScrollMode:

Criteria criteria = Domain.createCriteria().buildCriteria{
    eq('id', id)
}
ScrollableResults results = criteria.scroll(ScrollMode.FORWARD_ONLY)

int i = 0
while (results.next()){
    ...
    if (++i % 50 == 0){
        Domain.withSession { Session session ->
            session.flush()
            session.clear()
        }
    }
}

回复收藏 0 原文

唐婉 2024-09-26 23:51:03

有几点值得注意：

您不应该在控制器中运行查询，因为它可能会为每个调用打开一个新的数据库会话和事务。
如果您只需要几列，您可以根据您的标准使用投影。
记录 hibernate 正在触发的 sql 可能会有所帮助。到目前为止，hibernate 延迟加载每个对象的可为空的一对一关系，即使您不这样做使用它。不过，如果您使用投影，则不必担心。

回复收藏 0 原文

意中人 2024-09-26 23:51:03

使用批量插入，它比 gorm 清理方法和无状态会话方法更快。下面的示例帮助您如何在 grails 中实现批量插入。

    Date startTime   = new Date()
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();

    (1..50000).each {counter ->
        Person person           = new Person()
        person.firstName        = "abc"
        person.middleName       = "abc"
        person.lastName         = "abc"
        person.address          = "abc"
        person.favouriteGame    = "abc"
        person.favouriteActor   = "abc"

        session.save(person)
        if(counter.mod(100)==0) {
            session.flush();
            session.clear();
        }

        if(counter.mod(10000)==0) {
            Date endTime    =new Date()
            println "Total record insert Counter =>"+counter+" Time =>"+TimeCategory.minus(endTime,startTime)
        }
    }

    tx.commit();
    session.close();

Use batch insert and it faster than gorm cleanup method and stateless session method.Below example helps you how to implement batch insert in grails.

    Date startTime   = new Date()
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();

    (1..50000).each {counter ->
        Person person           = new Person()
        person.firstName        = "abc"
        person.middleName       = "abc"
        person.lastName         = "abc"
        person.address          = "abc"
        person.favouriteGame    = "abc"
        person.favouriteActor   = "abc"

        session.save(person)
        if(counter.mod(100)==0) {
            session.flush();
            session.clear();
        }

        if(counter.mod(10000)==0) {
            Date endTime    =new Date()
            println "Total record insert Counter =>"+counter+" Time =>"+TimeCategory.minus(endTime,startTime)
        }
    }

    tx.commit();
    session.close();

回复收藏 0 原文

~没有更多了~