从文本文件加载数据然后将其存储到数据库的最快方法
我有问题。
我正在开发一个项目,但我陷入了这部分:
我想从文本文件加载数据并将其存储到数据库访问中 东西是每个文本文件中的数据,大约 12.000 行数据 每个文本文件大约需要 10 分钟来处理它。
注意:在存储数据之前,我将文本文件中的每一行数据分开并将其放入字符串中,然后检查数据是否已经在数据库中。 如果在数据库内部,我会更新它。如果没有,那么我使用插入语句..
我正在使用 C# 来开发这个程序? 有没有最快的方法来加载和存储这些数据?
更新:
这是我的代码,我希望它有助于理解我的问题:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.IO;
using System.Collections;
using System.Data.OleDb;
namespace DAF
{
public partial class FrontForm : Form
{
public Boolean status;
public FrontForm()
{
InitializeComponent();
//define location of the database
string connection = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Users\PC\Desktop\Graduation Project\Research\DAF\Data\DAFLogisticDepartment.mdb";
//define location of the text file data
DirectoryInfo di = new DirectoryInfo(@"C:\Users\PC\Desktop\Graduation Project\Research\DAF\Data\");
FileInfo[] fiarr = di.GetFiles("*.txt");
//define connection to database
OleDbConnection con = new OleDbConnection(connection);
String query;
OleDbDataReader rdr = null;
con.Open();
//get all table from database
OleDbCommand cmd = con.CreateCommand();
DataTable dt = con.GetSchema("tables");
DataRow[] dttable = dt.Select();
con.Close();
//read each new textfile inside the folder
foreach (FileInfo fri in fiarr)
{
StreamReader sr = new StreamReader(fri.FullName, System.Text.Encoding.Default);
String line;
String tabledbs, dbsName;
while ((line = sr.ReadLine()) != null)
{
String VRSD, locationID, truckID, yearIn, yearOut, weekIn, weekOut, dayIn, dayOut, timeIn, timeOut, route;
int plantID;
//process each line of data and put into each variable
VRSD = line.Substring(0, 4).Trim();
plantID = Convert.ToInt32(line.Substring(4, 1).Trim());
locationID = line.Substring(5, 4).Trim();
truckID = line.Substring(24, 5).Trim();
yearIn = line.Substring(32, 4).Trim();
weekIn = line.Substring(36, 2).Trim();
dayIn = line.Substring(38, 1).Trim();
timeIn = line.Substring(39, 8).Trim();
yearOut = line.Substring(47, 4).Trim();
weekOut = line.Substring(51, 2).Trim();
dayOut = line.Substring(53, 1).Trim();
timeOut = line.Substring(54, 8).Trim();
route = line.Substring(64, 2).Trim();
//make database name
dbsName = plantID + locationID;
con.Open();
//check if the table exist in database
for (int i = 0; i < dttable.Length - 9; i++)
{
tabledbs = dttable[i]["TABLE_NAME"].ToString();
ArrayList indexlist = new ArrayList();
if (tabledbs == dbsName)
{
//if the table exist, status = true
status = true;
break;
}
}
con.Close();
con.Open();
if (status == true)
{
try
{
//if the data not in the system, insert statement
query = @"insert into " + plantID + locationID + " values('" + VRSD.ToString() + "'," + plantID + ",'" + locationID + "','" + truckID + "','" + yearIn + "','" + weekIn + "','" + dayIn + "','" + timeIn + "','" + yearOut + "','" + weekOut + "','" + dayOut + "','" + timeOut + "')";
cmd = new OleDbCommand(query, con);
rdr = cmd.ExecuteReader();
con.Close();
}
catch
{
//if the data in the system, update statement
query = @"update " + dbsName + " set YearIn='" + yearIn + "', YearOut='" + yearOut + "', WeekIn='" + weekIn + "', WeekOut='" + weekOut + "', DayIn='" + dayIn + "', DayOut='" + dayOut + "', TimeIn='" + timeIn + "', TimeOut='" + timeOut + "' where LocationID='" + locationID + "' and PlantID=" + plantID + "";
cmd = new OleDbCommand(query, con);
rdr = cmd.ExecuteReader();
con.Close();
}
}
else
{
//create new table
string attribute = "VRSD String,PlantID Integer, LocationID String, TruckID String," +
"YearIn String, WeekIn String, DayIn String, TimeIn String," +
"YearOut String, WeekOut String, DayOut String, TimeOut String";
query = @"CREATE TABLE " + plantID + locationID + "(" + attribute + ")";
cmd = new OleDbCommand(query, con);
cmd.ExecuteNonQuery();
//insert the data
query = @"insert into " + plantID + locationID + " values('" + VRSD.ToString() + "'," + plantID + ",'" + locationID + "','" + truckID + "','" + yearIn + "','" + weekIn + "','" + dayIn + "','" + timeIn + "','" + yearOut + "','" + weekOut + "','" + dayOut + "','" + timeOut + "')";
cmd = new OleDbCommand(query, con);
rdr = cmd.ExecuteReader();
con.Close();
}
status = false;
}
sr.Close();
//after the text file load into database, the text file moved to history folder
MessageBox.Show(fri.FullName.ToString(), "File Manager", MessageBoxButtons.OK);
fri.MoveTo(@"C:\Users\PC\Desktop\Graduation Project\Research\DAF\Data\History\" + fri.Name.ToString() + ".txt");
}
}
private void button2_Click(object sender, EventArgs e)
{
StandardReport sr = new StandardReport();
sr.Show();
}
private void FrontForm_Load(object sender, EventArgs e)
{
}
}
}
I have problem.
I'm developing a project but I'm stuck in this part:
I want to load a data from text file and store it into database access
the things is the data inside each text file about 12.000 lines of data
and each text file it takes about 10 minute to process it..
NOTE : before store the data, I separate each line of data from text file and put it into string then I check whether the data is already inside database or not.
if inside the database I update it. If not then I use insert statement..
I'm using C# to develop this program?
is there any fastest way to load and store this data?
UPDATED:
This is my code I hope it will help to understand my problems:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using System.IO;
using System.Collections;
using System.Data.OleDb;
namespace DAF
{
public partial class FrontForm : Form
{
public Boolean status;
public FrontForm()
{
InitializeComponent();
//define location of the database
string connection = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Users\PC\Desktop\Graduation Project\Research\DAF\Data\DAFLogisticDepartment.mdb";
//define location of the text file data
DirectoryInfo di = new DirectoryInfo(@"C:\Users\PC\Desktop\Graduation Project\Research\DAF\Data\");
FileInfo[] fiarr = di.GetFiles("*.txt");
//define connection to database
OleDbConnection con = new OleDbConnection(connection);
String query;
OleDbDataReader rdr = null;
con.Open();
//get all table from database
OleDbCommand cmd = con.CreateCommand();
DataTable dt = con.GetSchema("tables");
DataRow[] dttable = dt.Select();
con.Close();
//read each new textfile inside the folder
foreach (FileInfo fri in fiarr)
{
StreamReader sr = new StreamReader(fri.FullName, System.Text.Encoding.Default);
String line;
String tabledbs, dbsName;
while ((line = sr.ReadLine()) != null)
{
String VRSD, locationID, truckID, yearIn, yearOut, weekIn, weekOut, dayIn, dayOut, timeIn, timeOut, route;
int plantID;
//process each line of data and put into each variable
VRSD = line.Substring(0, 4).Trim();
plantID = Convert.ToInt32(line.Substring(4, 1).Trim());
locationID = line.Substring(5, 4).Trim();
truckID = line.Substring(24, 5).Trim();
yearIn = line.Substring(32, 4).Trim();
weekIn = line.Substring(36, 2).Trim();
dayIn = line.Substring(38, 1).Trim();
timeIn = line.Substring(39, 8).Trim();
yearOut = line.Substring(47, 4).Trim();
weekOut = line.Substring(51, 2).Trim();
dayOut = line.Substring(53, 1).Trim();
timeOut = line.Substring(54, 8).Trim();
route = line.Substring(64, 2).Trim();
//make database name
dbsName = plantID + locationID;
con.Open();
//check if the table exist in database
for (int i = 0; i < dttable.Length - 9; i++)
{
tabledbs = dttable[i]["TABLE_NAME"].ToString();
ArrayList indexlist = new ArrayList();
if (tabledbs == dbsName)
{
//if the table exist, status = true
status = true;
break;
}
}
con.Close();
con.Open();
if (status == true)
{
try
{
//if the data not in the system, insert statement
query = @"insert into " + plantID + locationID + " values('" + VRSD.ToString() + "'," + plantID + ",'" + locationID + "','" + truckID + "','" + yearIn + "','" + weekIn + "','" + dayIn + "','" + timeIn + "','" + yearOut + "','" + weekOut + "','" + dayOut + "','" + timeOut + "')";
cmd = new OleDbCommand(query, con);
rdr = cmd.ExecuteReader();
con.Close();
}
catch
{
//if the data in the system, update statement
query = @"update " + dbsName + " set YearIn='" + yearIn + "', YearOut='" + yearOut + "', WeekIn='" + weekIn + "', WeekOut='" + weekOut + "', DayIn='" + dayIn + "', DayOut='" + dayOut + "', TimeIn='" + timeIn + "', TimeOut='" + timeOut + "' where LocationID='" + locationID + "' and PlantID=" + plantID + "";
cmd = new OleDbCommand(query, con);
rdr = cmd.ExecuteReader();
con.Close();
}
}
else
{
//create new table
string attribute = "VRSD String,PlantID Integer, LocationID String, TruckID String," +
"YearIn String, WeekIn String, DayIn String, TimeIn String," +
"YearOut String, WeekOut String, DayOut String, TimeOut String";
query = @"CREATE TABLE " + plantID + locationID + "(" + attribute + ")";
cmd = new OleDbCommand(query, con);
cmd.ExecuteNonQuery();
//insert the data
query = @"insert into " + plantID + locationID + " values('" + VRSD.ToString() + "'," + plantID + ",'" + locationID + "','" + truckID + "','" + yearIn + "','" + weekIn + "','" + dayIn + "','" + timeIn + "','" + yearOut + "','" + weekOut + "','" + dayOut + "','" + timeOut + "')";
cmd = new OleDbCommand(query, con);
rdr = cmd.ExecuteReader();
con.Close();
}
status = false;
}
sr.Close();
//after the text file load into database, the text file moved to history folder
MessageBox.Show(fri.FullName.ToString(), "File Manager", MessageBoxButtons.OK);
fri.MoveTo(@"C:\Users\PC\Desktop\Graduation Project\Research\DAF\Data\History\" + fri.Name.ToString() + ".txt");
}
}
private void button2_Click(object sender, EventArgs e)
{
StandardReport sr = new StandardReport();
sr.Show();
}
private void FrontForm_Load(object sender, EventArgs e)
{
}
}
}
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
这里最大的时间杀手将是您正在使用的数据库连接的绝对数量 - 尝试构建内存中的命令列表(与从文件中读取数据相比,每个对象几乎不需要时间),并且一旦您构建您的列表,通过单个连接执行它们。打开每个连接都需要时间,而且您执行此操作的次数远远超出了所需的次数。编辑 - 实际上注意到您正在打开/关闭每个文件每行 2 个连接!
目前(为了清晰起见,伪代码):
建议:(强烈建议您考虑动态添加表的影响,这通常不是一个好的解决方案,但如果它强加给您,也许您别无选择)
The big time killer here will be the sheer number of database connections you are using - try building an in-memory list of commands (which will take almost no time per object compared with reading the data from the file), and once you've built your list, execute them all over a single connection. It takes time to open each connection and you're doing that far more times than needed. Edit - actually noticed you are opening/closing 2 connections per line per file!
Currently (pseudo code for clarity):
Suggestion: (and strongly suggest you think about the implications of adding tables dynamically, it's not normally a good solution, but if it's imposed on you maybe you have no choice)
为什么不尝试创建和使用 SSIS 包呢?它非常擅长这类事情,具有出色的工具,并且从代码中使用起来非常简单
http://msdn.microsoft.com/en-us/library/ms141026.aspx
http://blogs.msdn.com/b/michen/archive/2007/03/22/running-ssis-package-programmatically.aspx
Why don't you try creating and using an SSIS package? It's very good at this sort of thing, has excellent tooling and quite simple to use from code
http://msdn.microsoft.com/en-us/library/ms141026.aspx
http://blogs.msdn.com/b/michen/archive/2007/03/22/running-ssis-package-programmatically.aspx
您可以使用 Jet 文本驱动程序通过查询来插入不匹配的记录。
编辑
我想知道为什么你没有一个包含所有植物和位置的主表。然后,您可以将所有文件插入到临时表中,并相应地从临时表中追加或更新。
您似乎使用的是固定长度格式,因此
DSN=Test Spec
是通过以固定宽度格式导出文件然后使用“高级”按钮保存规范而创建的 Access 规范。You can use a query to insert unmatched records using the Jet driver for text.
EDIT
I wonder why you do not have a main table containing all plants and locations. You could then insert all the files into a temp table and either append or update from temp accordingly.
You seem to be using a fixed length format, so
DSN=Test Spec
is an Access specification create by exporting the file in fixed-width format and then saving the specification using the Advanced button.这里的一个问题可能是您使用 SINGLE SQL EXECUTE 语句逐行插入每条记录。
另一个解决方案是:
One problem here might be that you are inserting each record line-by-line with SINGLE SQL EXECUTE statements.
Another solution would be: