将 csv 加载到 oleDB 中并强制所有推断的数据类型为字符串

发布于 2024-08-10 21:43:05 字数 428 浏览 2 评论 0原文

我正在尝试使用 oledb 将 csv 文件加载到数据表中。

这没问题,但不幸的是,其中一个看似数字的字段在大约 3% 的字段中具有字符串值,因此未填充。

因为我将 csv 转换为 xml,所以我真的不关心推断数据类型,只需要字符串中的数据,因为我可以稍后在 Linq2XMl 阶段中对其进行转换。

我希望能够在连接字符串中做到这一点。

我不想只是复制表,用我想要的数据类型设置新列,然后将数据写入其中,因为这将涉及加载 csv 文件两次。

有什么想法吗?

我当前的连接字符串是

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + thefile.DirectoryName + ";Extended Properties='text;HDR=Yes;FMT=Delimited'";

Im trying to load a csv file into a datatable using oledb.

This is no problem but unfortunately one of the fields which looks numeric has a string value in about 3% of the fields and so is not being populated.

because im converting the csv into xml i really don't care about inferring datatypes and simply need the data in a string as i can cast it later in a Linq2XMl phase.

I am hoping to be able to do this in the connection string.

I don't want to just copy the table, set it up with new columns with the datatype I want and then write the data into it because that would involve loading the csv file twice.

any ideas?

my current connection string is

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + thefile.DirectoryName + ";Extended Properties='text;HDR=Yes;FMT=Delimited'";

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

玻璃人 2024-08-17 21:43:05

做了一些研究,答案是使用 schema.ini,但为您的数据集动态生成它。

http://msdn.microsoft.com/en-us/ Library/ms709353(VS.85).aspx

包含所需的信息。
构建架构:

   public static void ConstructSchema(FileInfo theFile)
    {
        StringBuilder schema = new StringBuilder(); 
        DataTable data = LoadCSV(theFile); 
        schema.AppendLine("[" + theFile.Name + "]");
        schema.AppendLine("ColNameHeader=True"); 
        for (int i = 0; i < data.Columns.Count; i++)
        {
            schema.AppendLine("col" + (i + 1).ToString() + "=" + data.Columns[i].ColumnName + " Text");
        }   
        string schemaFileName = theFile.DirectoryName + @"\Schema.ini";
        TextWriter tw = new StreamWriter(schemaFileName);   
        tw.WriteLine(schema.ToString());
        tw.Close();  
    }

将 csv 加载为数据表

public static DataTable LoadCSV(FileInfo theFile)
    {   
        string sqlString = "Select * FROM [" + theFile.Name + "];";
        string conStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
            + theFile.DirectoryName + ";" + "Extended Properties='text;HDR=YES;'";
        DataTable theCSV = new DataTable();

        using (OleDbConnection conn = new OleDbConnection(conStr))
        {
            using (OleDbCommand comm = new OleDbCommand(sqlString, conn))
            {
                using (OleDbDataAdapter adapter = new OleDbDataAdapter(comm))
                {
                    adapter.Fill(theCSV);
                }
            }
        }
        return theCSV;
    }

以转换为 xml

 public static XElement GetXMLFromCSV(FileInfo theFile, string rootNodeName, string itemName)
    {
        XElement retVal;
        DataTable data;
        data = CrateCsvAndSchema(theFile); 
        DataSet ds = new DataSet(rootNodeName);
        data.TableName = itemName;
        ds.Tables.Add(data); 
        retVal = XElement.Parse(ds.GetXml());  
        return retVal;
    }

Did some researchand the answer is use a schema.ini but generate it on the fly for your dataset.

http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx

contains the info required.
to construct the schema:

   public static void ConstructSchema(FileInfo theFile)
    {
        StringBuilder schema = new StringBuilder(); 
        DataTable data = LoadCSV(theFile); 
        schema.AppendLine("[" + theFile.Name + "]");
        schema.AppendLine("ColNameHeader=True"); 
        for (int i = 0; i < data.Columns.Count; i++)
        {
            schema.AppendLine("col" + (i + 1).ToString() + "=" + data.Columns[i].ColumnName + " Text");
        }   
        string schemaFileName = theFile.DirectoryName + @"\Schema.ini";
        TextWriter tw = new StreamWriter(schemaFileName);   
        tw.WriteLine(schema.ToString());
        tw.Close();  
    }

to load the csv as datatable

public static DataTable LoadCSV(FileInfo theFile)
    {   
        string sqlString = "Select * FROM [" + theFile.Name + "];";
        string conStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
            + theFile.DirectoryName + ";" + "Extended Properties='text;HDR=YES;'";
        DataTable theCSV = new DataTable();

        using (OleDbConnection conn = new OleDbConnection(conStr))
        {
            using (OleDbCommand comm = new OleDbCommand(sqlString, conn))
            {
                using (OleDbDataAdapter adapter = new OleDbDataAdapter(comm))
                {
                    adapter.Fill(theCSV);
                }
            }
        }
        return theCSV;
    }

to convert to xml

 public static XElement GetXMLFromCSV(FileInfo theFile, string rootNodeName, string itemName)
    {
        XElement retVal;
        DataTable data;
        data = CrateCsvAndSchema(theFile); 
        DataSet ds = new DataSet(rootNodeName);
        data.TableName = itemName;
        ds.Tables.Add(data); 
        retVal = XElement.Parse(ds.GetXml());  
        return retVal;
    }
吖咩 2024-08-17 21:43:05

为了将 CSV 读入数据表,我推荐使用这个 CSV 解析器

它真的很容易使用。以下是如何使用它用来自逗号分隔、引号限定的 CSV 的数据填充数据表:

    DataTable dt = null;
    using (GenericParserAdapter gp = new GenericParser.GenericParserAdapter(yourCsvFullname)) {
        dt = gp.GetDataTable();
    }

您可以设置许多选项:分隔符、文本限定符、是否是 CSV 显示列中的第一行标题(如果为 true,DataTable 中的每个 DataColumn 都会相应地命名)等。

有许多快速、灵活的 CSV 解析器,但对于简单的要求,这个是无可匹敌的。

For reading a CSV into a DataTable I recommend this CSV parser.

It's really easy to use. Here's how you can use it to fill a DataTable with data from a comma delimited, quote qualified CSV:

    DataTable dt = null;
    using (GenericParserAdapter gp = new GenericParser.GenericParserAdapter(yourCsvFullname)) {
        dt = gp.GetDataTable();
    }

There are a number of options you can set: the delimiter, the text qualifer character(s)whether the first line in the CSV show column headers (if true, each DataColumn in your DataTable will be named accordingly), etc.

There are a number of fast, flexible CSV parsers out there but for simple requirements this one can't be beat.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文