为什么 NHibernate 拒绝批量插入?
针对 SQL Server 2008 使用 NHibernate 2.1.2.4000。目标表没有触发器或无关索引。很简单:
create table LogEntries (
Id INT IDENTITY NOT NULL,
HostName NVARCHAR(32) not null,
UserName NVARCHAR(64) not null,
LogName NVARCHAR(512) not null,
Timestamp DATETIME not null,
Level INT not null,
Thread NVARCHAR(64) not null,
Message NVARCHAR(MAX) not null,
primary key (Id)
)
我的实体映射是:
<class name="LogEntry" table="LogEntries">
<id name="Id" unsaved-value="0">
<generator class="native"/>
</id>
<property name="HostName" length="32" not-null="true"/>
<property name="UserName" length="64" not-null="true"/>
<property name="LogName" length="512" not-null="true"/>
<property name="Timestamp" type="utcdatetime" not-null="true"/>
<property name="Level" not-null="true"/>
<property name="Thread" length="64" not-null="true"/>
<property name="Message">
<column name="Message" sql-type="NVARCHAR(MAX)" not-null="true"/>
</property>
</class>
现在,考虑以下测试用例:
[Fact]
public void bulk_insert_test()
{
var batchSize = 100;
var numberItems = 10000;
var configuration = new NHibernate.Cfg.Configuration().Configure();
configuration.SetProperty("connection.connection_string", @"my_conn_string");
configuration.SetProperty("adonet.batch_size", batchSize.ToString());
var sessionFactory = configuration.BuildSessionFactory();
var ts = this.WriteWithNH(sessionFactory, numberItems);
////var ts = this.WriteWithBC(sessionFactory, numberItems, batchSize);
Console.WriteLine("Saving {0} items with batch size {1}: {2}", numberItems, batchSize, ts);
}
public TimeSpan WriteWithNH(ISessionFactory sessionFactory, int numberItems)
{
using (var session = sessionFactory.OpenStatelessSession())
using (var transaction = session.BeginTransaction())
{
session.Insert(new LogEntry()
{
HostName = "host",
UserName = "user",
LogName = "log",
Level = 0,
Thread = "thread",
Timestamp = DateTime.UtcNow,
Message = "Warm up"
});
transaction.Commit();
}
var sw = Stopwatch.StartNew();
using (var session = sessionFactory.OpenStatelessSession())
using (var transaction = session.BeginTransaction())
{
for (var i = 0; i < numberItems; ++i)
{
session.Insert(new LogEntry()
{
HostName = "host",
UserName = "user",
LogName = "log",
Level = 0,
Thread = "thread",
Timestamp = DateTime.UtcNow,
Message = "Message " + i
});
}
transaction.Commit();
}
return sw.Elapsed;
}
public TimeSpan WriteWithBC(ISessionFactory sessionFactory, int numberItems, int batchSize)
{
using (var session = sessionFactory.OpenStatelessSession())
using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
{
bulkCopy.BatchSize = batchSize;
bulkCopy.DestinationTableName = "LogEntries";
var table = new DataTable("LogEntries");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("HostName", typeof(string));
table.Columns.Add("UserName", typeof(string));
table.Columns.Add("LogName", typeof(string));
table.Columns.Add("Timestamp", typeof(DateTime));
table.Columns.Add("Level", typeof(int));
table.Columns.Add("Thread", typeof(string));
table.Columns.Add("Message", typeof(string));
var row = table.NewRow();
row["HostName"] = "host";
row["UserName"] = "user";
row["LogName"] = "log";
row["Timestamp"] = DateTime.UtcNow;
row["Level"] = 0L;
row["Thread"] = "thread";
row["Message"] = "Warm up";
table.Rows.Add(row);
bulkCopy.WriteToServer(table);
}
var sw = Stopwatch.StartNew();
using (var session = sessionFactory.OpenStatelessSession())
using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
{
bulkCopy.BatchSize = batchSize;
bulkCopy.DestinationTableName = "LogEntries";
var table = new DataTable("LogEntries");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("HostName", typeof(string));
table.Columns.Add("UserName", typeof(string));
table.Columns.Add("LogName", typeof(string));
table.Columns.Add("Timestamp", typeof(DateTime));
table.Columns.Add("Level", typeof(int));
table.Columns.Add("Thread", typeof(string));
table.Columns.Add("Message", typeof(string));
for (var i = 0; i < numberItems; ++i)
{
var row = table.NewRow();
row["HostName"] = "host";
row["UserName"] = "user";
row["LogName"] = "log";
row["Timestamp"] = DateTime.UtcNow;
row["Level"] = 0;
row["Thread"] = "thread";
row["Message"] = "Message " + i;
table.Rows.Add(row);
}
bulkCopy.WriteToServer(table);
}
return sw.Elapsed;
}
这是使用 NHibernate 执行插入时的一些示例输出:
Saving 10000 items with batch size 500: 00:00:12.3064936
Saving 10000 items with batch size 100: 00:00:12.3600981
Saving 10000 items with batch size 1: 00:00:12.8102670
作为比较点,您会看到我还实现了一个基于 BCP 的解决方案。以下是一些示例输出:
Saving 10000 items with batch size 500: 00:00:00.3142613
Saving 10000 items with batch size 100: 00:00:00.6757417
Saving 10000 items with batch size 1: 00:00:26.2509605
显然,BCP 解决方案比 NH 解决方案快数英里。同样明显的是,批处理会影响 BCP 解决方案的速度,但不会影响 NH 解决方案的速度。当使用 NHibernate 进行插入时,NHProf 显示以下内容:
alt 文本 http://img9. imageshack.us/img9/8407/screenshotac.png
只有 INSERT
,没有 SELECT
。有趣的是,NHProf 从来没有给过我这个警告。
我尝试根据上面的测试用例在我的配置文件和代码中指定 adonet.batch_size
。
现在,我并不期望 NH 解决方案能够达到 BCP 解决方案的速度,但我至少想知道为什么批处理不起作用。如果启用批处理功能足够好,那么我可能会使用 NH 解决方案而不是 BCP,只是为了保持代码库更简单。
谁能解释为什么 NH 拒绝遵守 ADO.NET 批处理,以及我可以采取什么措施来解决它?我读过的所有分散的 NH“文档”都指出,您所需要做的就是指定 adonet.batch_size 并(最好)使用无状态会话,但我正在做这两件事。
谢谢
Using NHibernate 2.1.2.4000 against SQL Server 2008. The target table has no triggers or extraneous indexes. It is simply:
create table LogEntries (
Id INT IDENTITY NOT NULL,
HostName NVARCHAR(32) not null,
UserName NVARCHAR(64) not null,
LogName NVARCHAR(512) not null,
Timestamp DATETIME not null,
Level INT not null,
Thread NVARCHAR(64) not null,
Message NVARCHAR(MAX) not null,
primary key (Id)
)
My entity mapping is:
<class name="LogEntry" table="LogEntries">
<id name="Id" unsaved-value="0">
<generator class="native"/>
</id>
<property name="HostName" length="32" not-null="true"/>
<property name="UserName" length="64" not-null="true"/>
<property name="LogName" length="512" not-null="true"/>
<property name="Timestamp" type="utcdatetime" not-null="true"/>
<property name="Level" not-null="true"/>
<property name="Thread" length="64" not-null="true"/>
<property name="Message">
<column name="Message" sql-type="NVARCHAR(MAX)" not-null="true"/>
</property>
</class>
Now, consider the following test case:
[Fact]
public void bulk_insert_test()
{
var batchSize = 100;
var numberItems = 10000;
var configuration = new NHibernate.Cfg.Configuration().Configure();
configuration.SetProperty("connection.connection_string", @"my_conn_string");
configuration.SetProperty("adonet.batch_size", batchSize.ToString());
var sessionFactory = configuration.BuildSessionFactory();
var ts = this.WriteWithNH(sessionFactory, numberItems);
////var ts = this.WriteWithBC(sessionFactory, numberItems, batchSize);
Console.WriteLine("Saving {0} items with batch size {1}: {2}", numberItems, batchSize, ts);
}
public TimeSpan WriteWithNH(ISessionFactory sessionFactory, int numberItems)
{
using (var session = sessionFactory.OpenStatelessSession())
using (var transaction = session.BeginTransaction())
{
session.Insert(new LogEntry()
{
HostName = "host",
UserName = "user",
LogName = "log",
Level = 0,
Thread = "thread",
Timestamp = DateTime.UtcNow,
Message = "Warm up"
});
transaction.Commit();
}
var sw = Stopwatch.StartNew();
using (var session = sessionFactory.OpenStatelessSession())
using (var transaction = session.BeginTransaction())
{
for (var i = 0; i < numberItems; ++i)
{
session.Insert(new LogEntry()
{
HostName = "host",
UserName = "user",
LogName = "log",
Level = 0,
Thread = "thread",
Timestamp = DateTime.UtcNow,
Message = "Message " + i
});
}
transaction.Commit();
}
return sw.Elapsed;
}
public TimeSpan WriteWithBC(ISessionFactory sessionFactory, int numberItems, int batchSize)
{
using (var session = sessionFactory.OpenStatelessSession())
using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
{
bulkCopy.BatchSize = batchSize;
bulkCopy.DestinationTableName = "LogEntries";
var table = new DataTable("LogEntries");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("HostName", typeof(string));
table.Columns.Add("UserName", typeof(string));
table.Columns.Add("LogName", typeof(string));
table.Columns.Add("Timestamp", typeof(DateTime));
table.Columns.Add("Level", typeof(int));
table.Columns.Add("Thread", typeof(string));
table.Columns.Add("Message", typeof(string));
var row = table.NewRow();
row["HostName"] = "host";
row["UserName"] = "user";
row["LogName"] = "log";
row["Timestamp"] = DateTime.UtcNow;
row["Level"] = 0L;
row["Thread"] = "thread";
row["Message"] = "Warm up";
table.Rows.Add(row);
bulkCopy.WriteToServer(table);
}
var sw = Stopwatch.StartNew();
using (var session = sessionFactory.OpenStatelessSession())
using (var bulkCopy = new SqlBulkCopy((SqlConnection)session.Connection))
{
bulkCopy.BatchSize = batchSize;
bulkCopy.DestinationTableName = "LogEntries";
var table = new DataTable("LogEntries");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("HostName", typeof(string));
table.Columns.Add("UserName", typeof(string));
table.Columns.Add("LogName", typeof(string));
table.Columns.Add("Timestamp", typeof(DateTime));
table.Columns.Add("Level", typeof(int));
table.Columns.Add("Thread", typeof(string));
table.Columns.Add("Message", typeof(string));
for (var i = 0; i < numberItems; ++i)
{
var row = table.NewRow();
row["HostName"] = "host";
row["UserName"] = "user";
row["LogName"] = "log";
row["Timestamp"] = DateTime.UtcNow;
row["Level"] = 0;
row["Thread"] = "thread";
row["Message"] = "Message " + i;
table.Rows.Add(row);
}
bulkCopy.WriteToServer(table);
}
return sw.Elapsed;
}
Here is some sample output when using NHibernate to perform the inserts:
Saving 10000 items with batch size 500: 00:00:12.3064936
Saving 10000 items with batch size 100: 00:00:12.3600981
Saving 10000 items with batch size 1: 00:00:12.8102670
As a point of comparison, you'll see I also implemented a BCP-based solution. Here is some sample output:
Saving 10000 items with batch size 500: 00:00:00.3142613
Saving 10000 items with batch size 100: 00:00:00.6757417
Saving 10000 items with batch size 1: 00:00:26.2509605
Clearly, the BCP solution is miles faster than the NH one. Also evident is that batching is affecting the speed of the BCP solution, but not the NH one. When using NHibernate to do inserts, NHProf shows the following:
alt text http://img9.imageshack.us/img9/8407/screenshotac.png
There are only INSERT
s, no SELECT
s. Interestingly, at no point does NHProf give me this warning.
I have tried specifying adonet.batch_size
in both my config file and in code as per the test case above.
Now, I'm not expecting the NH solution to ever reach the speed of the BCP solution, but I'd at least like to know why batching isn't working. If it's good enough with batching enabled, then I may use the NH solution over the BCP just to keep the code base simpler.
Can anyone explain why NH is refusing to honor ADO.NET batching, and what I can do to fix it? All the scattered NH "documentation" I've read states that all you need to do is specify adonet.batch_size
and (preferably) use a stateless session, but I'm doing both those things.
Thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
使用
identity
会破坏批处理。由 Fabio Maulo 此处 解释
最好的选择是切换到不同的生成器(我总是推荐
hilo
或guid.comb
)Using
identity
breaks batching.Explained by Fabio Maulo here
The best alternative is switching to a different generator (I always recommend
hilo
orguid.comb
)如果您使用身份作为主键,
ISession
或IStatelessSession
都无法批量插入。当进行插入时,Nhibernate 会将正确的值放入您的 Id 属性中。但是,当您使用 Identity 时,唯一可以获取此 Id 的地方就是数据库。使用 hilo 进行批量插入。
Neither
ISession
orIStatelessSession
will be able to batch inserts if you are using identity as primary key.When Insert is made Nhibernate will put the correct value in your Id property. But when you are using Identity the only place where this Id can be taken is the data base. Use hilo to make batch inserts.