Improving SQLite bulk insert performance with Dapper ORM

Posted 2024-12-28 19:46:34


I am working on a desktop application which uses SQLite to bulk insert tens of thousands of rows into a SQLite database. I would like help optimizing the bulk insert performance. It currently takes up to 50 seconds to insert 60 megs worth of data into the database.

  • What connection string parameters could I use to improve
    performance? Should I change the buffer size? Is this possible via a
    connection string parameter? Are there any other connection string
    parameters to improve performance? My current connection string is:

    Data Source=Batch.db;Version=3;Pooling=True;Max Pool
    Size=10;Synchronous=off;FailIfMissing=True;Journal Mode=Off;

  • I am using Dapper ORM (built by the guys at Stack Overflow). Is there a faster way to bulk insert into SQLite in .NET?

  • System.Data.SQLite is being used to insert into SQLite. What about getting a specially compiled version of SQLite that improves
    performance? Is one version of SQLite better than another? Currently
    using System.Data.SQLite from http://sqlite.phxsoftware.com

  • Currently, I am wrapping inserts inside a transaction to make them faster (this made a good improvement).

  • I am inserting into 17 tables, one table at a time. Could I parallelize this on different threads and make it faster?
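The transaction point above is usually where the biggest win comes from: one commit per batch instead of one implicit commit per row. A minimal sketch of that pattern with System.Data.SQLite, reusing the question's connection string — the `Item` table and its single column are made-up illustrations, not the original schema:

```csharp
// Sketch: parameterized inserts wrapped in one explicit transaction.
// The Item table and @Name column are illustrative, not from the question.
using System.Data.SQLite;

class BatchInsertSketch
{
    static void InsertNames(string[] names)
    {
        using (var cnn = new SQLiteConnection(
            "Data Source=Batch.db;Version=3;Synchronous=Off;Journal Mode=Off;"))
        {
            cnn.Open();
            using (var tx = cnn.BeginTransaction())
            using (var cmd = cnn.CreateCommand())
            {
                cmd.Transaction = tx;
                cmd.CommandText = "INSERT INTO Item (Name) VALUES (@Name)";
                cmd.Parameters.Add(new SQLiteParameter("@Name"));

                foreach (var name in names)
                {
                    cmd.Parameters["@Name"].Value = name;
                    cmd.ExecuteNonQuery();
                }

                tx.Commit(); // one commit (and at most one fsync) for the whole batch
            }
        }
    }
}
```

Without the explicit transaction, SQLite commits after every `ExecuteNonQuery`, which dominates the runtime for tens of thousands of rows.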

Current Performance.
Is this typical? Can I do better?

  • 55,000 rows into table with 19 columns: 2.25 sec to insert (24k inserts/sec)
  • 10,000 rows into table with 63 columns: 2.74 sec to insert (3.7k/sec)

I like SQLite, but I would love to make it a bit faster. Currently saving my objects to an XML file using XML serialization is faster than saving to a SQLite database, so my boss is asking: why switch to SQLite? Or should I be using MongoDB, or some other object database?

2 Answers

情绪失控 2025-01-04 19:46:34


So I finally found a trick to high performance bulk inserts in SQLite using .NET with System.Data.SQLite.

  • This trick improved insert performance by a factor of 4.1!
  • My total save time went from 27 seconds to 6.6 seconds. wow!

This article explains the fastest way to do bulk inserts into SQLite (archive.org link).

  • The key is reusing the same parameter objects
  • but for each record to insert, assigning a different value.

The time that .NET takes constructing all those DbParameter objects really adds up. For example with 100k rows and 30 columns = 3 million parameter objects which must be created. Instead, creating and reusing just 30 parameter objects is much faster.

Update New performance:

  • 55,000 rows (19 columns) in .53 seconds = 100k inserts/second
internal const string PeakResultsInsert = @"INSERT INTO PeakResult 
           VALUES(@Id,@PeakID,@QuanPeakID,@ISTDRetentionTimeDiff)";
                    
var command = cnn.CreateCommand();
command.CommandText = BatchConstants.PeakResultsInsert;
        
string[] parameterNames = new[]
{
    "@Id",
    "@PeakID",
    "@QuanPeakID",
    "@ISTDRetentionTimeDiff"
};
        
// Create each DbParameter once, up front, and attach it to the command.
DbParameter[] parameters = parameterNames.Select(pn =>
{
    DbParameter parameter = command.CreateParameter();
    parameter.ParameterName = pn;
    command.Parameters.Add(parameter);
    return parameter;
}).ToArray();
        
// Reuse the same four parameter objects for every row; only the
// values change between ExecuteNonQuery calls.
foreach (var peakResult in peakResults)
{
    parameters[0].Value = peakResult.Id;
    parameters[1].Value = peakResult.PeakID;
    parameters[2].Value = peakResult.QuanPeakID;
    parameters[3].Value = peakResult.ISTDRetentionTimeDiff;

    command.ExecuteNonQuery();
}

It turned out that I could not use Dapper for inserting into my large tables. (For my small tables, I still use Dapper.)
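For comparison, the Dapper path kept for the small tables is a one-liner: Dapper's `Execute` accepts a sequence and runs the statement once per element. A sketch, assuming the same `peakResults` list and an open `SQLiteConnection` named `cnn` — the convenience comes at exactly the cost measured above, since Dapper builds fresh parameter objects for every row:

```csharp
// Sketch: the same insert via Dapper. Execute runs the statement once
// per element of peakResults, binding each object's properties by name.
// Fresh parameter objects are created per row -- the overhead that the
// hand-rolled version above avoids by reusing four DbParameters.
using Dapper;
using System.Data.SQLite;

// assumes: SQLiteConnection cnn (open), IEnumerable<PeakResult> peakResults
using (var tx = cnn.BeginTransaction())
{
    cnn.Execute(
        @"INSERT INTO PeakResult
          VALUES(@Id, @PeakID, @QuanPeakID, @ISTDRetentionTimeDiff)",
        peakResults,       // one execution per PeakResult
        transaction: tx);  // still worth wrapping in a transaction
    tx.Commit();
}
```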

Some other things that I found:

  • I tried using multiple threads to insert data into the same database; this did not make any improvement.

  • Upgraded from System.Data.Sqlite 1.0.69 to 1.0.79. (didn't make a difference in performance that I could see)

  • I am not assigning a Type to the DbParameter, it doesn't seem to make a performance difference either way.

  • For reads, I could not improve on Dapper's performance.

简美 2025-01-04 19:46:34


Currently, I am wrapping inserts inside a transaction to make them
faster (this made a good improvement).

The biggest gain I've seen in bulk insert speed was to break inserts into smaller chunks. How small a chunk varies per platform/schema/etc., I'm sure. I believe during my tests it was near 1,000 or so.
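The chunking described above amounts to committing every N rows instead of holding one transaction open for everything. A minimal sketch with System.Data.SQLite — the `Item` table and the 1,000-row chunk size are illustrative placeholders, not measured values:

```csharp
// Sketch: commit in chunks of roughly 1000 rows rather than one giant
// transaction. Table name and chunk size are illustrative.
using System.Collections.Generic;
using System.Data.SQLite;

class ChunkedInsertSketch
{
    const int ChunkSize = 1000; // tune per platform/schema

    static void Insert(SQLiteConnection cnn, IEnumerable<string> names)
    {
        var tx = cnn.BeginTransaction();
        var cmd = cnn.CreateCommand();
        cmd.CommandText = "INSERT INTO Item (Name) VALUES (@Name)";
        var p = new SQLiteParameter("@Name");
        cmd.Parameters.Add(p);

        int count = 0;
        foreach (var name in names)
        {
            cmd.Transaction = tx;
            p.Value = name;
            cmd.ExecuteNonQuery();

            if (++count % ChunkSize == 0)
            {
                tx.Commit();                  // flush this chunk to disk
                tx = cnn.BeginTransaction();  // start the next one
            }
        }

        tx.Commit(); // commit the final partial chunk
        cmd.Dispose();
        tx.Dispose();
    }
}
```

Smaller chunks trade a few extra commits for bounded journal size and memory, which is presumably why an intermediate value like 1,000 beat both extremes in the tests mentioned.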
