ADO.NET libraries for fast bulk insert of data from a .csv file into a database?

Posted 2024-10-17 23:55:17

I'd like to know if you can recommend any advanced ADO.NET libraries for working with databases.

I've discovered that LINQ-to-Entities is great for pulling data out of databases, but not at all useful for inserting data into them. It's missing functionality such as fast bulk insert, culling of duplicates, and most of the advanced functionality you can achieve with pure SQL.

So: can you recommend some ADO.NET libraries that offer the sorts of advanced functionality that LINQ-to-Entities is missing?

Comments (2)

≈。彩虹 2024-10-24 23:55:18

The ADO.NET SqlBulkCopy class enables quick, mass upload of records into a table:

    // s_EmptyUploadTable is assumed to be an empty DataTable whose columns
    // match the destination table; Copy() gives a fresh table to fill.
    DataTable dt = s_EmptyUploadTable.Copy();
    foreach (var itm in yourList) {
        DataRow row = dt.NewRow();
        row["Field1"] = itm.Field1;
        row["Field2"] = itm.Field2;
        dt.Rows.Add(row);
    }
    using (SqlConnection cn = new SqlConnection(yourConnectionString)) {
        cn.Open();
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn)) {
            bulkCopy.DestinationTableName = "dbo.YourActualSQLServerTableName";
            bulkCopy.WriteToServer(dt);
        }
    } // Disposing the connection closes it, so no explicit Close() is needed.

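The snippet above relies on a pre-built `s_EmptyUploadTable` that the answer does not show. As a rough sketch of one way to get such a table, the helper below builds a `DataTable` from any list by reflecting over public properties. `DataTableSketch`, `ToDataTable`, and `UploadItem` are hypothetical names introduced here for illustration, not part of ADO.NET or of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Reflection;

public static class DataTableSketch
{
    // Hypothetical helper: creates one column per public readable property
    // of T, then adds one row per item.
    public static DataTable ToDataTable<T>(IEnumerable<T> items)
    {
        PropertyInfo[] props = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();

        var table = new DataTable(typeof(T).Name);
        foreach (PropertyInfo p in props)
        {
            // DataColumn cannot hold Nullable<T>, so use the underlying type.
            Type columnType = Nullable.GetUnderlyingType(p.PropertyType) ?? p.PropertyType;
            table.Columns.Add(p.Name, columnType);
        }

        foreach (T item in items)
        {
            DataRow row = table.NewRow();
            foreach (PropertyInfo p in props)
            {
                // A null property value is stored as DBNull.Value.
                row[p.Name] = p.GetValue(item, null) ?? DBNull.Value;
            }
            table.Rows.Add(row);
        }
        return table;
    }
}

// Hypothetical item type mirroring the Field1/Field2 columns used above.
public class UploadItem
{
    public string Field1 { get; set; }
    public int? Field2 { get; set; }
}
```

The resulting table can be passed to `SqlBulkCopy.WriteToServer(DataTable)`. By default SqlBulkCopy maps columns by ordinal position, so if the property order may differ from the destination table it is safer to add explicit entries to `bulkCopy.ColumnMappings`.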
想念有你 2024-10-24 23:55:18

You can use LINQ Entity Data Reader to write an IEnumerable list to a database using SQL Bulk Copy behind the scenes. You can use this library to bulk upload the results of a LINQ query straight into the database, because the results of a LINQ query are IEnumerable.

As there are LINQ-to-everything adapters, you can do tricks like use the LINQ to CSV library to grab the data out of a .csv file using a LINQ query, then the LINQ Entity Data Reader to bulk write this data directly into the database.
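To illustrate what "a .csv file as an IEnumerable" means in practice, here is a minimal hand-rolled stand-in for the LINQ to CSV step. It is not the LINQ to CSV API: `CsvSketch`, `ReadRows`, and `PriceRow` are hypothetical names, and the parser assumes a header line, a two-column `symbol,date` layout, and no quoted or embedded commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical row type for a two-column "symbol,date" file.
public class PriceRow
{
    public string Symbol { get; set; }
    public DateTime Date { get; set; }
}

public static class CsvSketch
{
    // Naive stand-in for a CSV library: no quoting or escaping support.
    public static IEnumerable<PriceRow> ReadRows(string path)
    {
        return File.ReadLines(path)                     // streams lines lazily
            .Skip(1)                                    // skip the header row
            .Where(line => line.Trim().Length > 0)      // ignore blank lines
            .Select(line => line.Split(','))
            .Select(f => new PriceRow
            {
                Symbol = f[0].Trim(),
                Date = DateTime.ParseExact(f[1].Trim(), "MM/dd/yyyy",
                                           CultureInfo.InvariantCulture)
            });
    }
}
```

Because the sequence is lazy, rows are parsed only as the consumer pulls them, which is what lets a data-reader wrapper feed them to a bulk writer without loading the whole file first. For real files with quoted fields, a proper CSV library is the better choice.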

Case study:

Problem: read a .csv file quickly into a database. The connection to the SQL database is via LINQ-to-Entities from C#.

Solution 1: Use the LINQ to CSV library, construct a LINQ query to pull out the data you want, then write it in using the standard LINQ-to-Entities calls (ctx.AddObject(), ctx.SaveChanges(), etc.). Time taken: 30 seconds for 20,000 records, as LINQ-to-Entities ends up generating a query for every single record (slow!).

Solution 2: Use the LINQ to CSV library, construct a LINQ query to pull out the data you want into an IEnumerable, then use LINQ Entity Data Reader to bulk write this data directly into the target data table. Time taken: 3 seconds for 20,000 records.

Solution 3: Use a stored procedure with SQL "bulk copy". Time taken: 2 seconds for 20,000 records. However, this solution is quite brittle as it relies on a stored procedure, and SQL bulk copy is just not compatible with some .csv file formats. This method also requires that you use a staging table between the actual target table and the .csv file, to deal with file formatting issues and to help with normalization.

And, here is the source code for solution #2:

    // Requires the "LINQtoCSV" class described at http://www.codeproject.com/KB/linq/LINQtoCSV.asp
    // and the "EntityDataReader" class described at http://archive.msdn.microsoft.com/LinqEntityDataReader
    // (also add a using directive for the namespace you place EntityDataReader in).
    using System;
    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Linq;
    using LINQtoCSV;

    static void WriteCSVtoSQLtable()
    {
        // Step 1: Read the .csv file into an IEnumerable using LINQ to CSV.
        string inputFilePath = @"T:\x.csv";

        CsvFileDescription inputFileDescription = new CsvFileDescription
        {
            SeparatorChar = ',',
            FirstLineHasColumnNames = true
        };

        // CsvContext is the LINQtoCSV entry point for reading files.
        CsvContext cc = new CsvContext();
        IEnumerable<MyCustomColumnMappingClass> csvChains =
            cc.Read<MyCustomColumnMappingClass>(inputFilePath, inputFileDescription);

        // Step 2: Now write into the target table on SQL Server.
        string dbSqlConnectionString = @"Data Source=(local);Initial Catalog=PhiEngine;Integrated Security=True;MultipleActiveResultSets=True";

        using (SqlConnection dbSql = new SqlConnection(dbSqlConnectionString))
        {
            dbSql.Open(); // BeginTransaction requires an open connection.

            using (SqlTransaction tran = dbSql.BeginTransaction())
            {
                var csvFile = from p in csvChains
                              select p;

                using (SqlBulkCopy bc = new SqlBulkCopy(dbSql,
                    SqlBulkCopyOptions.CheckConstraints |
                    SqlBulkCopyOptions.FireTriggers |
                    SqlBulkCopyOptions.KeepNulls, tran)
                {
                    BatchSize = 1000,
                    DestinationTableName = "TStagingTable" // Temporary staging table in database.
                })
                {
                    // Extension method .AsDataReader() comes from the EntityDataReader class (see above).
                    bc.WriteToServer(csvFile.AsDataReader());
                }

                tran.Commit();
            }
        }
    }

    // This class is used by LINQ to CSV to query the .csv file, see the "LINQtoCSV" website.
    public class MyCustomColumnMappingClass
    {
        [CsvColumn(Name = "symbol", FieldIndex = 1)]
        public string Symbol { get; set; }

        [CsvColumn(Name = "date", FieldIndex = 3, OutputFormat = @"MM/dd/yyyy")]
        public DateTime Date { get; set; }
    }
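The question also asks about culling duplicates, which neither answer shows. One possible approach, sketched here under the assumption that a row is identified by its symbol and date, is to filter the IEnumerable before it reaches the bulk writer. `DedupSketch`, `CullDuplicates`, and `Quote` are hypothetical names, and the case-insensitive symbol comparison is an illustrative choice rather than anything the answers specify.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type with the same two fields as the mapping class above.
public class Quote
{
    public string Symbol { get; set; }
    public DateTime Date { get; set; }
}

public static class DedupSketch
{
    // Streams rows through, keeping only the first row seen for each
    // (Symbol, Date) key. Only the keys are held in memory, not the rows.
    public static IEnumerable<Quote> CullDuplicates(IEnumerable<Quote> rows)
    {
        var seen = new HashSet<(string, DateTime)>();
        foreach (Quote row in rows)
        {
            // HashSet.Add returns false when the key is already present.
            if (seen.Add((row.Symbol.ToUpperInvariant(), row.Date)))
            {
                yield return row;
            }
        }
    }
}
```

This only removes duplicates within the incoming file. Rows that already exist in the target table still have to be handled on the database side, for example when moving data from the staging table into the final table.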