sqlbulkcopy 内存。管理
我正在使用 SQLBULKCOPY 将一些数据表复制到数据库表中,但是,因为我正在复制的文件大小有时超过 600mb,所以我总是耗尽内存。
我希望在将表提交到数据库之前获得一些有关管理表大小的建议,以便我可以释放一些内存以继续写入。
以下是我的代码的一些示例(为了简单起见,删除了一些列和行)
SqlBulkCopy sqlbulkCopy = new SqlBulkCopy(ServerConfiguration); //Define the Server Configuration
System.IO.StreamReader rdr = new System.IO.StreamReader(fileName);
Console.WriteLine("Counting number of lines...");
Console.WriteLine("{0}, Contains: {1} Lines", fileName, countLines(fileName));
DataTable dt = new DataTable();
sqlbulkCopy.DestinationTableName = "[dbo].[buy.com]"; //You need to define the target table name where the data will be copied
dt.Columns.Add("PROGRAMNAME");
dt.Columns.Add("PROGRAMURL");
dt.Columns.Add("CATALOGNAME");
string inputLine = "";
DataRow row; //Declare a row, which will be added to the above data table
while ((inputLine = rdr.ReadLine()) != null) //Read while the line is not null
{
i = 0;
string[] arr;
Console.Write("\rWriting Line: {0}", k);
arr = inputLine.Split('\t'); //splitting the line which was read by the stream reader object (tab delimited)
row = dt.NewRow();
row["PROGRAMNAME"] = arr[i++];
row["PROGRAMURL"] = arr[i++];
row["CATALOGNAME"] = arr[i++];
row["LASTUPDATED"] = arr[i++];
row["NAME"] = arr[i++];
dt.Rows.Add(row);
k++;
}
// Set the timeout, 600 secons (10 minutes) given table size--damn that's a lota hooch
sqlbulkCopy.BulkCopyTimeout = 600;
try
{
sqlbulkCopy.WriteToServer(dt);
}
catch (Exception e)
{
Console.WriteLine(e);
}
sqlbulkCopy.Close();//Release the resources
dt.Dispose();
Console.WriteLine("\nDB Table Written: \"{0}\" \n\n", sqlbulkCopy.DestinationTableName.ToString());
}
我在使用 SQLBulkCopy 时仍然遇到问题,并且我意识到在将每条记录输入数据库之前我需要对其进行更多工作,因此我开发了一个简单的 LinQ to Sql 方法按记录更新进行记录,因此我可以编辑其他信息并在运行时创建更多记录信息,
问题:此方法运行速度相当慢(即使在 Core i3 机器上),关于如何加快速度(线程?)——在单个处理器核心上,具有 1GB 内存时,它会崩溃,或者有时需要 6-8 小时才能写入与 SQLBulkCopy 需要一些时间的相同数量的数据。但它确实可以更好地管理内存。
while ((inputLine = rdr.ReadLine()) != null) //Read while the line is not null
{
Console.Write("\rWriting Line: {0}", k);
string[] arr;
arr = inputLine.Split('\t');
/* items */
if (fileName.Contains(",,"))
{
Item = Table(arr);
table.tables.InsertOnSubmit(Item);
/* Check to see if the item is in the db */
bool exists = table.tables.Where(u => u.ProductID == Item.ProductID).Any();
/* Commit */
if (!exists)
{
try
{
table.SubmitChanges();
}
catch (Exception e)
{
Console.WriteLine(e);
// Make some adjustments.
// ...
// Try again.
table.SubmitChanges();
}
}
}
使用辅助方法:
public static class extensionMethods
{
/// <summary>
/// Method that provides the T-SQL EXISTS call for any IQueryable (thus extending Linq).
/// </summary>
/// <remarks>Returns whether or not the predicate conditions exists at least one time.</remarks>
public static bool Exists<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)
{
return source.Where(predicate).Any();
}
}
I'm using SQLBULKCOPY to copy some data-tables into a database table, however, because the size of the files I'm copying run sometimes in excess of 600mb, I keep running out of memory.
I'm hoping to get some advice about managing the table size before I commit it to the database so I can free up some memory to continue writing.
Here are some examples of my code (some columns and rows eliminated for simplicity)
SqlBulkCopy sqlbulkCopy = new SqlBulkCopy(ServerConfiguration); //Define the Server Configuration
System.IO.StreamReader rdr = new System.IO.StreamReader(fileName);
Console.WriteLine("Counting number of lines...");
Console.WriteLine("{0}, Contains: {1} Lines", fileName, countLines(fileName));
DataTable dt = new DataTable();
sqlbulkCopy.DestinationTableName = "[dbo].[buy.com]"; //You need to define the target table name where the data will be copied
dt.Columns.Add("PROGRAMNAME");
dt.Columns.Add("PROGRAMURL");
dt.Columns.Add("CATALOGNAME");
string inputLine = "";
DataRow row; //Declare a row, which will be added to the above data table
while ((inputLine = rdr.ReadLine()) != null) //Read while the line is not null
{
i = 0;
string[] arr;
Console.Write("\rWriting Line: {0}", k);
arr = inputLine.Split('\t'); //splitting the line which was read by the stream reader object (tab delimited)
row = dt.NewRow();
row["PROGRAMNAME"] = arr[i++];
row["PROGRAMURL"] = arr[i++];
row["CATALOGNAME"] = arr[i++];
row["LASTUPDATED"] = arr[i++];
row["NAME"] = arr[i++];
dt.Rows.Add(row);
k++;
}
// Set the timeout, 600 secons (10 minutes) given table size--damn that's a lota hooch
sqlbulkCopy.BulkCopyTimeout = 600;
try
{
sqlbulkCopy.WriteToServer(dt);
}
catch (Exception e)
{
Console.WriteLine(e);
}
sqlbulkCopy.Close();//Release the resources
dt.Dispose();
Console.WriteLine("\nDB Table Written: \"{0}\" \n\n", sqlbulkCopy.DestinationTableName.ToString());
}
I continued to have problems getting SQLBulkCopy to work, and I realized I needed to do more work on each record before it was entered into the database, so I developed a simple LinQ to Sql method to do record by record updates, so I could edit other information and create more record information as it was being run,
Problem: This method's been running pretty slow (even on Core i3 machine), any ideas on how to speed it up (threading?) -- on a single processor core, with 1gb of memory it crashes or takes sometimes 6-8 hours to write the same amount of data as one SQLBulkCopy that takes a few moments. It does manage memory better though.
while ((inputLine = rdr.ReadLine()) != null) //Read while the line is not null
{
Console.Write("\rWriting Line: {0}", k);
string[] arr;
arr = inputLine.Split('\t');
/* items */
if (fileName.Contains(",,"))
{
Item = Table(arr);
table.tables.InsertOnSubmit(Item);
/* Check to see if the item is in the db */
bool exists = table.tables.Where(u => u.ProductID == Item.ProductID).Any();
/* Commit */
if (!exists)
{
try
{
table.SubmitChanges();
}
catch (Exception e)
{
Console.WriteLine(e);
// Make some adjustments.
// ...
// Try again.
table.SubmitChanges();
}
}
}
With helper method:
public static class extensionMethods
{
/// <summary>
/// Method that provides the T-SQL EXISTS call for any IQueryable (thus extending Linq).
/// </summary>
/// <remarks>Returns whether or not the predicate conditions exists at least one time.</remarks>
public static bool Exists<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)
{
return source.Where(predicate).Any();
}
}
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
尝试将 BatchSize 属性指定为 1000,这将以 1000 条记录批次而不是整批记录的形式对插入进行批处理。您可以调整该值以找到最佳值。我已经使用 sqlbulkcopy 来处理类似大小的数据,并且效果很好。
Try specifying the BatchSize property to 1000 which will batch up the insert in a 1000 record batch rather than the whole lot. You can tweak this value to find what is optimal. I have used sqlbulkcopy for similar size data and it works well.
面对同样的问题,发现OutOfMemory Exception的问题是在DataTable.Rows最大数量限制中。
通过重新创建表解决,最大行数限制为 500000 行。
希望我的解决方案会有所帮助:
Faced with the same issue, found that problem of OutOfMemory Exception was in DataTable.Rows maximum quantity limitations.
Solved with recreating table, with maximum 500000 rows limit.
Hope, my solution will be helpfull: