When importing CSV data, quotes inside a column's text cause the remaining columns to be skipped
I am using the following code to fetch data from a csv file:
public DataTable GetCSVData(string CSVFileName)
{
    string CSVConnectionString = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" + ConfigurationSettings.AppSettings["CSVFolder"].ToString() + ";Extensions=asc,csv,tab,txt;Persist Security Info=False;";
    using (OdbcConnection Connection = new OdbcConnection(CSVConnectionString))
    {
        DataTable CSVDataTable = new DataTable();
        string SelectQuery = string.Format(@"SELECT * FROM [{0}]", CSVFileName);
        OdbcDataAdapter Adapter = new OdbcDataAdapter(SelectQuery, Connection);
        Adapter.Fill(CSVDataTable);
        return CSVDataTable;
    }
}
The problem occurs when a CSV column contains data that is only partly quoted, such as the value "cdwdf" dsdfs in Row1 below:
Row1-> col1,"cdwdf" dsdfs,col2,col3
col2 and col3 (the columns after the partly quoted value) are skipped when fetching the data with the code above, and fetching continues from the next row.
If the column text in Row1 is fully enclosed in quotes ("cdwdf dsdfs"), the data is fetched correctly.
Can anyone tell me how to fetch data from a CSV file in this situation?
Comments (2)
Double quotes are part of the CSV specification. If your data contains double quotes, then the entire field (or column) must be enclosed in double quotes, and any double quotes inside the field must be escaped by doubling them.
So your line should read like this:
col1,"""cdwdf"" dsdfs",col2,col3
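If you control the code that writes the file, the quoting rule above can be applied when the row is produced. The following is only a minimal sketch: the class and method names (CsvQuoting, QuoteField, JoinRow) are made up for illustration and are not part of any library mentioned in this thread.

```csharp
using System;

public static class CsvQuoting
{
    // Hypothetical helper: wraps a field in double quotes and doubles any
    // embedded quotes when the field contains a quote, a comma, or a line break.
    public static string QuoteField(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { '"', ',', '\r', '\n' }) >= 0;
        if (!needsQuotes)
        {
            return field;
        }
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Hypothetical helper: quotes each field as needed and joins them with commas.
    public static string JoinRow(params string[] fields)
    {
        return string.Join(",", Array.ConvertAll<string, string>(fields, QuoteField));
    }
}
```

Calling CsvQuoting.JoinRow("col1", "\"cdwdf\" dsdfs", "col2", "col3") should produce the corrected Row1 with the second field fully quoted and its inner quotes doubled.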
I haven't used any CSV libraries, so I can't recommend one, but you could easily parse the file yourself: read the file line by line and split each line on ','. The catch with this approach is fields that span multiple lines (or that contain a comma inside quotes).
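As a rough illustration of parsing a line yourself, here is one possible forgiving splitter. It is a sketch under several assumptions: the names LenientCsv and SplitLine are invented for this example, each record is assumed to fit on one line (so multi-line fields are not handled), and quote characters are dropped from the output rather than preserved.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LenientCsv
{
    // Hypothetical helper: splits one CSV line into fields.
    // Commas inside quotes do not split, "" inside quotes is a literal quote,
    // and text that follows a closing quote is kept as part of the same field.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted section
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false; // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening quote, tolerated anywhere in the field
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // field separator
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field on the line
        return fields;
    }
}
```

With this sketch, the problematic Row1 yields four fields (col1, cdwdf dsdfs, col2, col3), and the correctly escaped form of the same row yields the second field with its inner quotes intact. Each returned list could then be added to a DataTable row by row.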
EDIT: To sum up, you'll need to either modify your CSV input file or find a parser that is more forgiving, or one that at least throws an exception when it finds a malformed CSV record. At first glance, the Fast CSV Reader that others suggested seems like a good place to start, as it claims that malformed CSV causes it to fail with a meaningful exception.
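To show what "fail with a meaningful exception" could look like without relying on any particular library, here is a small hand-written check. It is not Fast CSV Reader's API; StrictCsvCheck and EnsureWellFormed are hypothetical names, and the check assumes one record per line.

```csharp
using System;

public static class StrictCsvCheck
{
    // Hypothetical helper: throws FormatException when a line breaks the quoting rule,
    // i.e. a quote appears in the middle of an unquoted field, a closing quote is
    // followed by something other than a comma, or a quote is never closed.
    public static void EnsureWellFormed(string line, int lineNumber)
    {
        bool inQuotes = false;
        bool atFieldStart = true;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    continue; // ordinary character inside a quoted field
                }
                if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    i++; // escaped quote, skip its second half
                    continue;
                }
                inQuotes = false; // closing quote
                if (i + 1 < line.Length && line[i + 1] != ',')
                {
                    throw new FormatException(
                        "Line " + lineNumber + ": unexpected text after closing quote at column " + (i + 2));
                }
            }
            else if (c == '"')
            {
                if (!atFieldStart)
                {
                    throw new FormatException(
                        "Line " + lineNumber + ": quote inside unquoted field at column " + (i + 1));
                }
                inQuotes = true;
                atFieldStart = false;
            }
            else
            {
                atFieldStart = (c == ','); // a comma starts the next field
            }
        }

        if (inQuotes)
        {
            throw new FormatException("Line " + lineNumber + ": unterminated quoted field");
        }
    }
}
```

Running this over each line before handing the file to the ODBC text driver would flag the Row1 from the question instead of letting columns disappear silently, while the correctly escaped version of the row passes.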
I would use Fast CSV Reader, as it is quite fast and good at identifying CSV file structure.