有没有更好的方法来指示“null”? Excel 中的值?
我有一个 Excel 2007 工作簿,其中包含我使用 ADO.NET 导入到 DataTable
对象中的数据表。
通过一些实验,我设法找到两种不同的方法来指示 ADO.NET 应将单元格视为“空”:
- 单元格完全空白。
- 该单元格包含
#N/A
。
不幸的是,这两个问题都是有问题的:
我在 Excel 中的大部分数据列都是通过公式生成的,但在 Excel 中不可能生成导致完全空白单元格的公式。只有完全空白的单元格才会被视为 null(空字符串不起作用)。
任何计算结果为
<块引用>#N/A
的公式(由于实际查找错误或因为使用了NA()
函数)都会 被视为 null。这似乎是理想的解决方案,直到我发现必须打开 Excel 工作簿才能实现此目的。一旦您关闭工作簿,OLEDB 就会突然开始将所有这些#N/A
视为字符串。这会导致在填充 DataTable 时抛出如下异常:输入字符串的格式不正确。无法存储<#N/A>在值列中。预期类型为 Int32。
问题:如何通过 Excel 公式指示空值,而无需在填充DataTable
时打开工作簿?或者如何才能使 #N/A
值即使在工作簿关闭时也被视为 null?
如果很重要,我的连接字符串是使用以下方法构建的:(
var builder = new OleDbConnectionStringBuilder
{
Provider = "Microsoft.ACE.OLEDB.12.0",
DataSource = _workbookPath
};
builder.Add("Extended Properties", "Excel 12.0 Xml;HDR=Yes;IMEX=0");
return builder.ConnectionString;
_workbookPath
是工作簿的完整路径)。
我尝试过 IMEX=0
和 IMEX=1
但没有什么区别。
I have an Excel 2007 workbook that contains tables of data that I'm importing into DataTable
objects using ADO.NET.
Through some experimentation, I've managed to find two different ways to indicate that a cell should be treated as "null" by ADO.NET:
- The cell is completely blank.
- The cell contains
#N/A
.
Unfortunately, both of these are problematic:
Most of my columns of data in Excel are generated via formulas, but it's not possible in Excel to generate a formula that results in a completely blank cell. And only a completely blank cell will be considered null (an empty string will not work).
Any formula that evaluates to
#N/A
(either due to an actual lookup error or because theNA()
function was used) will be considered null. This seemed like the ideal solution until I discovered that the Excel workbook must be open for this to work. As soon as you close the workbook, OLEDB suddenly starts seeing all those#N/A
s as strings. This causes exceptions like the following to be thrown when filling the DataTable:Input string was not in a correct format. Couldn't store <#N/A> in Value Column. Expected type is Int32.
Question: How can I indicate a null value via an Excel formula without having to have the workbook open when I fill the DataTable
? Or what can be done to make #N/A
values be considered null even when the workbook is closed?
In case it's important, my connection string is built using the following method:
var builder = new OleDbConnectionStringBuilder
{
Provider = "Microsoft.ACE.OLEDB.12.0",
DataSource = _workbookPath
};
builder.Add("Extended Properties", "Excel 12.0 Xml;HDR=Yes;IMEX=0");
return builder.ConnectionString;
(_workbookPath
is the full path to the workbook).
I've tried both IMEX=0
and IMEX=1
but it makes no difference.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您遇到了许多非常沮丧的 Excel 用户所经历的障碍。不幸的是,Excel 作为一种公司工具很普遍,而且看起来相当强大,不幸的是,因为每个单元格/列/行都有不同的数据类型,所以使用其他工具(例如 MySQL、SQL Server、R、RapidMiner、SPSS 和名单还在继续。似乎 Excel 2007/2010 没有得到很好的支持,考虑到 32/64 位版本时更是如此,这在当今时代是可耻的。
主要问题是,当 ACE/Jet 访问 Excel 中的每个字段时,它们使用注册表设置“TypeGuessRows”来确定用于评估数据类型的行数。 “要扫描的行数”的默认值为 8 行。注册表设置“TypeGuessRows”可以指定从一 (1) 到十六 (16) 行的整数值,或者您可以指定零 (0) 以扫描所有现有行。如果您无法更改注册表设置(例如在 90% 的办公环境中),则事情会变得困难,因为要猜测的行仅限于前 8 行。
例如,不更改注册表
如果#N/A 第一次出现在前 8 行内,则 IMEX = 1 将以字符串“#N/A”的形式返回错误。如果 IMEX = 0,则 #N/A 将返回“Null”。
如果 #N/A 第一次出现超出前 8 行,则 IMEX = 0 & IMEX = 1 都返回“Null”(假设所需的数据类型是数字)。
通过更改注册表(TypeGuessRows = 0),一切都应该没问题。
也许有 4 个选项:
更改注册表设置 TypeGuessRows = 0
将前 8 行中所有可能的类型变体列为“虚拟数据”(例如备注字段/ nchar(max)/errors#N/A 等)
更正 Excel 中的所有数据类型异常
不要使用 Excel - 真的值得考虑!
不要
编辑:
只是为了把靴子放进去:)另外两件事真正让我烦恼的是;如果工作表上的第一个字段在前 8 行中为空白,并且您无法编辑注册表设置,则整个工作表将返回为空白(许多有趣的对话告诉经理他们合并单元格是愚蠢的!)。另外,如果在 Excel 2007/2010 中,您的部门返回的工作表包含 >255 列/字段,那么如果您需要非连续导入(例如,在第 1 列中键入数据,在第 255 列以上中键入数据),那么您会遇到很大的问题
You're hitting the brickwall that many very frustrated users of Excel are experiencing. Unfortunately Excel as a company tool is widespread and seems quite robust, unfortunately because each cell/column/row has a variant data type it makes it a nightmare to handle with other tools such as MySQL, SQL Server, R, RapidMiner, SPSS and the list goes on. It seems that Excel 2007/2010 is not very well supported and even more so when taking 32/64 bit versions into account, which is scandalous in this day and age.
The main problem is that when ACE/Jet access each field in Excel they use a registry setting 'TypeGuessRows' to determine how many rows to use to assess the datatype. The default for "Rows to Scan" is 8 rows. The registry setting 'TypeGuessRows' can specify an integer value from one (1) to sixteen (16) rows, or you can specify zero (0) to scan all existing rows. If you can't change the registry setting (such as in 90% of office environments) it makes life difficult as the rows to guess are limited to the first 8.
For example, without the registry change
If the first occurrence of #N/A is within the first 8 rows then IMEX = 1 will return the error as a string "#N/A". If IMEX = 0 then an #N/A will return 'Null'.
If the first occurrence of #N/A is beyond the first 8 rows then both IMEX = 0 & IMEX = 1 both return 'Null' (assuming required data type is numeric).
With the registry change (TypeGuessRows = 0) then all should be fine.
Perhaps there are 4 options:
Change the registry setting TypeGuessRows = 0
List all possible type variations in the first 8 rows as 'dummy data' (eg memo fields/nchar(max)/ errors #N/A etc)
Correct ALL data type anomalies in Excel
Don't use Excel - Seriously worth considering!
Edit:
Just to put the boot in :) another 2 things that really annoy me are; if the first field on a sheet is blank over the first 8 rows and you can't edit the registry setting then the whole sheet is returned as blank (Many fun conversations telling managers they're fools for merging cells!). Also, if in Excel 2007/2010 you have a department return a sheet with >255 columns/fields then you have huge problems if you need non-contiguous import (eg key in col 1 and data in cols 255+)