ASP.NET 中的 Excel 读取:如果列具有不同的数据格式,则不会读取数据

发布于 2024-08-10 07:07:09 字数 727 浏览 9 评论 0原文

我有一个 asp.net C# 应用程序,我正在其中使用 OLEDBConnection 读取电子表格的内容。我正在使用下面的代码行从 Excel 电子表格中读取数据。

 OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fullFilePath + ";Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'");

我的一列在各行中包含各种格式的数据,如字符串、数字、日期等。运行此程序时,当数据格式不同时,它不会从 Excel 文件中读取该值。我在网上搜索了很多,发现我们需要在连接字符串中提及IMEX属性。我添加了这一点,但没有积极的回应!

经过大量浏览后,我发现任何内置 Excel 驱动程序都会查询工作表的前 8 行,然后确定(未经您的许可或不知情)它是什么类型的列,从而忽略任何不符合的列稍后将在工作表中介绍此数据类型。

http://www.mattjwilson.com /blog/2009/02/13/microsoft-excel-drivers-and-imex/

有没有办法解决这个问题?

I have an asp.net C# application where i am reading the contents of a spreadsheet using OLEDBConnection. I am using the below line of code to read from the excel spreadsheet.

 OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fullFilePath + ";Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'");

One of my column has data in various formats like strings,numbers,date etc in various rows.When running this ,When the data format is different,its not reading that value fromthe excel file. I searched in net a lot and found that we need to mention IMEX proprety in connection string.I added that,but no positive response !.

After surfine a lot, ifound that Any built-in Excel driver will query the first 8 rows of a sheet and then make a determination (without your permission or knowledge) as to what type of column it is, thereby ignoring anything that doesn’t meet this data type later in the sheet.

http://www.mattjwilson.com/blog/2009/02/13/microsoft-excel-drivers-and-imex/

Is there anyway to get rid of this problem ?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

压抑⊿情绪 2024-08-17 07:07:09

您正在遇到 JET 引擎的众多有趣功能之一。这个基本上会对单列的每行中的所有数据进行采样,并尝试猜测数据格式。如果您希望您的代码“正常工作”,那么有一个注册表设置可以帮助您实现这一点。但请注意,此注册表设置将影响 JET 如何处理系统上的所有导入,而不仅仅是您的特定导入。

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]
"ImportMixedTypes"="Text"
"TypeGuessRows"=dword:00000000

此注册表设置将告诉 JET 在猜测格式之前检查列中每一行的格式。如果发现混合内容,它将将该行导入为文本。

默认情况下
JET 在类型猜测时测试前 25 行。

或者,您可以将 TypeGuessRows 更改为 1,JET 将仅在类型猜测时检查第一行。这意味着,如果第一行是数字,第二行是字符串,JET 将假定所有行都是数字,并且您将无法使用 ADO.NET 读取它们。

另一个警告:请确保在编辑注册表时小心。如果不小心的话,您的系统可能会很快遭到破坏。

You are running into one of the many fun features of the JET engine. This one will basically sample all the data in each row for a single column and it will try to guess the data format. If you want your code to "just work" then there is a registry setting that will help with this. However be forewarned that this registry setting will affect how JET works with all imports on a system, not just your particular import.

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]
"ImportMixedTypes"="Text"
"TypeGuessRows"=dword:00000000

This registry setting will tell JET to check the format of every row in a column before guessing a format. If it finds mixed content it will import the row as text.

By default
JET tests the first 25 rows when type guessing.

Alternatively you can change TypeGuessRows to 1 and JET will check the first row only when type guessing. That means if the first row is a number and the second row is a string JET will assume all rows are numbers and you will not be able to read them using ADO.NET

Another caveat: Make sure you are careful when editing your registry. You can decimate your system very quickly if you do not use care.

爱,才寂寞 2024-08-17 07:07:09

更新:看来微软确实不推荐 在服务器上使用 Excel COM 服务。尽管如此,许多开发人员仍然这样做,无论是在非 .NET(正如我的雇主所做的)还是在 .NET 上(请参阅 此处)环境,因为替代品成本高昂。所有问题大多都是可以解决的(除了大容量应用程序中潜在的可扩展性和性能问题以及某些情况下的许可证问题)。昂贵的替代方案是使用第三方解决方案,例如 这个

当一列中有不同数据类型的数据时,不应使用 OleDbConnection。例如,您可以尝试使用 Excel COM/OLE API 从 Excel 中读取数据(从 此处,可能包含错误):

将以下引用包含到项目中:

Microsoft Excel 10.0 对象库

Microsoft Office 10.0 对象库

包含名称空间 Excel。

  using Excel;
  ...
      Excel.ApplicationClass xl = new Excel.Application();
      xl.Visible = false;
      xl.UserControl = false;
      Excel.Workbook theWorkbook = xl.Workbooks.Open(
         fileName, 0, true, 5,
          "", "", true, Excel.XlPlatform.xlWindows, "\t", false, false,
          0, true); 
     Excel.Sheets sheets = theWorkbook.Worksheets;
     Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);
     System.Array myvalues;
     Excel.Range range = worksheet.get_Range("A1", "E1".ToString());
     myvalues = (System.Array)range.Cells.Value;

重要的!您应该释放所使用的资源。来自此处

// Need all following code to clean up and extingush all references!!!
theWorkbook.Close(null,null,null);
xl.Workbooks.Close();
xl.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject (range);
System.Runtime.InteropServices.Marshal.ReleaseComObject (sheets);
System.Runtime.InteropServices.Marshal.ReleaseComObject (xl);
System.Runtime.InteropServices.Marshal.ReleaseComObject (worksheet);
System.Runtime.InteropServices.Marshal.ReleaseComObject (theWorkbook);
worksheet=null;
sheets=null;
theWorkbook=null;
xl = null;
GC.Collect(); // force final cleanup!

Update: it seems like Microsoft really does not recommend using Excel COM services on servers. Still, many developers do, both on non-.NET (as my employer does) and .NET (see here) enviroment, as alternatives are costly. All problems are mostly solvable (aside from potential scalability and performance problems in high-volume applications and in some cases licence problems). Costly alternatives are using third-party solutions like this.

You should not use OleDbConnection when you have data of different datatypes in one column. You can try to read from Excel using Excel COM/OLE API, for example (compiled from here, may contain errors):

Include the following reference into the project :

Microsoft Excel 10.0 Object Library

Microsoft Office 10.0 Object Library

Include the name space Excel.

  using Excel;
  ...
      Excel.ApplicationClass xl = new Excel.Application();
      xl.Visible = false;
      xl.UserControl = false;
      Excel.Workbook theWorkbook = xl.Workbooks.Open(
         fileName, 0, true, 5,
          "", "", true, Excel.XlPlatform.xlWindows, "\t", false, false,
          0, true); 
     Excel.Sheets sheets = theWorkbook.Worksheets;
     Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);
     System.Array myvalues;
     Excel.Range range = worksheet.get_Range("A1", "E1".ToString());
     myvalues = (System.Array)range.Cells.Value;

Important! You should free the resources used. From here:

// Need all following code to clean up and extingush all references!!!
theWorkbook.Close(null,null,null);
xl.Workbooks.Close();
xl.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject (range);
System.Runtime.InteropServices.Marshal.ReleaseComObject (sheets);
System.Runtime.InteropServices.Marshal.ReleaseComObject (xl);
System.Runtime.InteropServices.Marshal.ReleaseComObject (worksheet);
System.Runtime.InteropServices.Marshal.ReleaseComObject (theWorkbook);
worksheet=null;
sheets=null;
theWorkbook=null;
xl = null;
GC.Collect(); // force final cleanup!
寻找一个思念的角度 2024-08-17 07:07:09

SpreadsheetGear for .NET 可以读取、写入、计算等...Excel 工作簿,并允许您访问使用 IWorksheet.Cells[rowIndex, colIndex].Value 或 IWorksheet.Cells[rowIndex, colIndex].Text 等 API 的任何单元格的基础数据(数字、文本、逻辑、错误)或任何单元格的格式化文本。每个列/单元格中的数据类型没有限制。 SpreadsheetGear 是 100% 安全的 .NET 代码(没有 COM 互操作、没有不安全的本机调用等),因此它比其他选项更容易部署 - 特别是在服务器场景中。

您可以此处查看实时示例并下载免费试用版此处

免责声明:我拥有 SpreadsheetGear LLC

SpreadsheetGear for .NET can read, write, calculate, etc... Excel workbooks and allows you to access the underlying data (number, text, logical, error) of any cell or the formatted text of any cell using APIs such as IWorksheet.Cells[rowIndex, colIndex].Value or IWorksheet.Cells[rowIndex, colIndex].Text. There is no limitation based on the type of data in each column / cell. SpreadsheetGear is 100% safe .NET code (no COM interop, no unsafe native calls, etc...) so it is easier to deploy than other options - especially in server scenarios.

You can see live samples here and download the free trial here.

Disclaimer: I own SpreadsheetGear LLC

始终不够爱げ你 2024-08-17 07:07:09

当其他一切都失败时,这就是我所做的...从 Excel 导入时,我在连接字符串中指定了 HDR = NO 。这会将标题导入为第一行,从而使所有列数据类型都为文本。之后是一个简单的函数来提及数据表的列名。像下面的代码一样......

private DataTable NameHeaderRows(DataTable dt)
{
    for (int i = 0; i < dt.Columns.Count; i++)
    {
        dt.Columns[i].ColumnName = dt.Rows[0][i].ToString();

    }
    dt.Rows.RemoveAt(0);
    return dt;
}

我知道这很乏味,但没有找到任何可行的解决方案。任何其他建议都是受欢迎的。

When everything else failed this is what I did... While importing from excel I specified HDR = NO in connection string. This imported the header as the first row, thus making all the column datatypes as text. After that a simple function to mention the columnname for the datatable. Something like the below code...

private DataTable NameHeaderRows(DataTable dt)
{
    for (int i = 0; i < dt.Columns.Count; i++)
    {
        dt.Columns[i].ColumnName = dt.Rows[0][i].ToString();

    }
    dt.Rows.RemoveAt(0);
    return dt;
}

I know it is tedious, but did not find any feasible solution. Any other suggestion is a welcome.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文