使用 DataReader 和 OLEDB Jet 数据提供程序读取 CSV 文件时,如何控制列数据类型?
在我的 C# 应用程序中,我使用 Microsoft Jet OLEDB 数据提供程序来读取 CSV 文件。 连接字符串如下所示:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\Data;Extended Properties="text;HDR=Yes;FMT=Delimited
我使用该连接字符串打开 ADO.NET OleDbConnection,并使用以下命令从 CSV 文件中选择所有行:
select * from Data.csv
当我打开 OleDbDataReader 并检查它返回的列的数据类型时,我发现堆栈中的某些内容尝试根据文件中的第一行数据猜测数据类型。 例如,假设 CSV 文件包含:
House,Street,Town
123,Fake Street,Springfield
12a,Evergreen Terrace,Springfield
调用 House 列的 OleDbDataReader.GetDataTypeName 方法将显示该列已被赋予数据类型“DBTYPE_I4”,因此从该列读取的所有值都被解释为整数。 我的问题是 House 应该是一个字符串 - 当我尝试从第二行读取 House 值时,OleDbDataReader 返回 null。
我如何告诉 Jet 数据库提供程序或 OleDbDataReader 将列解释为字符串而不是数字?
In my C# application I am using the Microsoft Jet OLEDB data provider to read a CSV file. The connection string looks like this:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\Data;Extended Properties="text;HDR=Yes;FMT=Delimited
I open an ADO.NET OleDbConnection using that connection string and select all the rows from the CSV file with the command:
select * from Data.csv
When I open an OleDbDataReader and examine the data types of the columns it returns, I find that something in the stack has tried to guess at the data types based on the first row of data in the file. For example, suppose the CSV file contains:
House,Street,Town
123,Fake Street,Springfield
12a,Evergreen Terrace,Springfield
Calling the OleDbDataReader.GetDataTypeName method for the House column will reveal that the column has been given the data type "DBTYPE_I4", so all values read from it are interpreted as integers. My problem is that House should be a string - when I try to read the House value from the second row, the OleDbDataReader returns null.
How can I tell either the Jet database provider or the OleDbDataReader to interpret a column as strings instead of numbers?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
您可以创建一个架构文件,它会告诉 ADO.NET 如何解释 CSV - 实际上给它一个结构。
试试这个: http://www .aspdotnetcodes.com/Importing_CSV_Database_Schema.ini.aspx
或最新的 MS 文档
There's a schema file you can create that would tell ADO.NET how to interpret the CSV - in effect giving it a structure.
Try this: http://www.aspdotnetcodes.com/Importing_CSV_Database_Schema.ini.aspx
Or the most recent MS Documentation
请检查
http://kbcsv.codeplex.com/
Please check
http://kbcsv.codeplex.com/
为了扩展 Marc 的答案,我需要创建一个名为 Schema.ini 的文本文件,并将其放在与 CSV 文件相同的目录中。 除了列类型之外,此文件还可以指定文件格式、日期时间格式、区域设置和列名称(如果文件中未包含这些内容)。
为了使我在问题中给出的示例起作用,架构文件应该如下所示:
我还可以尝试让数据提供者在尝试猜测数据类型之前检查文件中的所有行:
在现实生活中,我的应用程序从具有动态名称的文件导入数据,因此我必须动态创建一个 Schema.ini 文件,并将其写入与 CSV 文件相同的目录,然后再打开连接。
更多详细信息可以在这里找到 - http://msdn.microsoft .com/en-us/library/ms709353(VS.85).aspx - 或通过在 MSDN 库中搜索“Schema.ini 文件”。
To expand on Marc's answer, I need to create a text file called Schema.ini and put it in the same directory as the CSV file. As well as column types, this file can specify the file format, date time format, regional settings, and the column names if they're not included in the file.
To make the example I gave in the question work, the Schema file should look like this:
I could also try this to make the data provider examine all the rows in the file before it tries to guess the data types:
In real life, my application imports data from files with dynamic names, so I have to create a Schema.ini file on the fly and write it to the same directory as the CSV file before I open my connection.
Further details can be found here - http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx - or by searching the MSDN Library for "Schema.ini file".
您需要告诉驱动程序扫描所有行以确定架构。 否则,如果前几行是数字,其余行是字母数字,则字母数字单元格将为空白。
就像Rory一样,我发现我需要动态创建一个schema.ini文件,因为没有办法以编程方式告诉驱动程序扫描所有行。 (Excel 文件不是这种情况)
您的 schema.ini 中必须有
MaxScanRows=0
这是一个代码示例:
如果您在同一目录上运行它,则需要进行一些修改同时在多个线程中。
You need to tell the driver to scan all rows to determine the schema. Otherwise if the first few rows are numeric and the rest are alphanumeric, the alphanumeric cells will be blank.
Like Rory, I found that I needed to create a schema.ini file dynamically because there is no way to programatically tell the driver to scan all rows. (this is not the case for excel files)
You must have
MaxScanRows=0
in your schema.iniHere's a code example:
You'll need to do some modification if you are running this on the same directory in multiple threads at the same time.