Accessing an Excel spreadsheet with C# sometimes returns blank values for some cells

Posted on 2024-07-08 09:30:04

I need to access an Excel spreadsheet and insert the data from the spreadsheet into a SQL database. However, the primary keys are mixed: most are numeric and some are alphanumeric.

The problem I have is that when the numeric and alpha-numeric Keys are in the same spreadsheet the alpha-numeric cells return blank values, whereas all the other cells return their data without problems.

I am using the OleDb method to access the Excel file. After retrieving the data with a Command string I put the data into a DataAdapter and then I fill a DataSet. I iterate through all the rows (dr) in the first DataTable in the DataSet.

I reference the columns using dr["..."].ToString().

If I debug the project in Visual Studio 2008 and I view the "extended properties", by holding my mouse over the "dr" I can view the values of the DataRow, but the Primary Key that should be alpha-numeric is {}. The other values are enclosed in quotes, but the blank value has braces.

Is this a C# problem or an Excel problem?

Has anyone ever encountered this problem before, or maybe found a workaround/fix?

Thanks in advance.

Comments (10)

幸福还没到 2024-07-15 09:30:04

Solution:

Connection String:

Provider=Microsoft.Jet.OLEDB.4.0;Data Source=FilePath;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1";

  1. HDR=Yes; indicates that the first row contains column names, not data. HDR=No; indicates the opposite.

  2. IMEX=1; tells the driver to always read "intermixed" (numbers, dates, strings, etc.) data columns as text. Note that this option might negatively affect write access to the Excel sheet.

SQL syntax: SELECT * FROM [sheet1$], i.e. the Excel worksheet name followed by a $ and wrapped in [ ] brackets.

Important:

  • Check the REG_DWORD value "TypeGuessRows" under the registry key [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]. That's the key to not letting the driver use only the first 8 rows to guess each column's data type. Set this value to 0 to scan all rows. This might hurt performance.

  • If the Excel workbook is protected by a password, you cannot open it for data access, even by supplying the correct password with your connection string. If you try, you receive the following error message: "Could not decrypt file."
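
A minimal sketch of assembling that connection string in code, in case it helps. The class and method names (ExcelConnectionStrings, ForXls) are made up for illustration and are not part of any library; the only thing taken from the answer above is the string itself. The point it shows is that the Extended Properties value must be wrapped in double quotes because it contains semicolons.

```csharp
using System;

public static class ExcelConnectionStrings
{
    // Hypothetical helper (an assumption, not from the original answer):
    // builds a Jet 4.0 connection string for an .xls file using the
    // HDR and IMEX options described above.
    public static string ForXls(string filePath, bool hasHeaderRow)
    {
        if (string.IsNullOrEmpty(filePath))
        {
            throw new ArgumentException("A file path is required.", nameof(filePath));
        }

        string hdr = hasHeaderRow ? "Yes" : "No";

        // The Extended Properties value contains semicolons, so it is
        // wrapped in double quotes inside the connection string.
        return string.Format(
            "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR={1};IMEX=1\";",
            filePath,
            hdr);
    }
}
```

The result can be passed straight to the OleDbConnection constructor. Whether IMEX=1 is enough on its own still depends on the TypeGuessRows setting mentioned above.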

潜移默化 2024-07-15 09:30:04

The Excel data source picks a column type for the entire column. If one of the cells doesn't match that type exactly, it leaves blanks like that. We had issues where our typist entered a " 8" (a space before the number, so Excel converted it to a string for that cell) in a numeric column. It would make sense to me for it to try the .NET Parse methods, as they are more robust, but I guess that's not how the Excel driver works.

Our fix, since we were using database import services, was to log all the rows that 'failed' this way. Then we went back to the XLS document and re-typed those cells to ensure the underlying type was correct. (We found that just deleting the space didn't fix it; we had to clear the whole cell first, then re-type the '8'.) It feels hacky and isn't elegant, but that was the best method we found. If the Excel driver can't read the value in correctly by itself, there's nothing you can do to get that data out once you're in .NET.

Just another case where Office hides the important details from users in the name of simplicity, which makes things more difficult when power users have to be exact.

凯凯我们等你回来 2024-07-15 09:30:04

The {} means this is some sort of empty object and not a string. When you hover over the object you should be able to see its type. Likewise, when you use QuickWatch to view dr["..."] you should see the object type. What type is the object you receive?

故事灯 2024-07-15 09:30:04

The ItemArray is an Object Array. So I assume that the "column" in the DataRow, that I am trying to reference, is of type object.
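
One way to answer the type question is to inspect the cell at run time. The sketch below assumes (nobody in this thread has confirmed it) that the blank cells are DBNull.Value, which is what the debugger typically shows as {} and whose ToString() returns an empty string. It uses an in-memory DataTable instead of a real Excel file, and the names CellInspector and Describe are made up for illustration.

```csharp
using System;
using System.Data;

public static class CellInspector
{
    // Reports what is actually stored in a cell: either the DBNull marker
    // or the runtime type name followed by the value.
    public static string Describe(DataRow row, string columnName)
    {
        object value = row[columnName];
        if (value == DBNull.Value)
        {
            return "<DBNull>";
        }

        return value.GetType().Name + ": " + value;
    }

    // Builds a stand-in for a sheet whose key column was typed as numeric,
    // so that an alphanumeric key arrives as DBNull (the assumed failure mode).
    public static DataTable BuildSample()
    {
        var table = new DataTable("Sheet1");
        table.Columns.Add("Key", typeof(double));
        table.Rows.Add(1001d);          // numeric key read normally
        table.Rows.Add(DBNull.Value);   // alphanumeric key lost by the driver
        return table;
    }
}
```

Calling Describe on each row of the real DataTable should show quickly whether the column came back as Double with DBNull gaps or as String.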

花想c 2024-07-15 09:30:04

For Vista compatibility you can use the Excel 12.0 driver in the connection string. This should resolve your issue. It did mine.

迷途知返 2024-07-15 09:30:04

Solution:

  1. Set HDR=No so that the first row is not treated as the column header.
    Connection String: Provider=Microsoft.Jet.OLEDB.4.0;Data Source=FilePath;Extended Properties="Excel 8.0;HDR=No;IMEX=1";
  2. Ignore the first row and access the data by any means you want (DataTable, DataReader, etc.). Access the columns by numeric index instead of by column name.

It worked for me. This way you don't have to modify the registry!
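
A rough sketch of step 2, assuming the sheet has already been loaded into a DataTable with HDR=No. Presumably this works because the header text in row 0 makes every column look mixed, so IMEX=1 reads them all as text. The class and method names (HeaderlessSheet, ReadFirstColumn) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class HeaderlessSheet
{
    // Reads column 0 of a sheet loaded with HDR=No, skipping the first row
    // because it holds the header text rather than data.
    public static List<string> ReadFirstColumn(DataTable sheet)
    {
        var keys = new List<string>();

        for (int i = 1; i < sheet.Rows.Count; i++)
        {
            // Access by numeric index; with HDR=No there are no real column names.
            object cell = sheet.Rows[i][0];
            keys.Add(cell == DBNull.Value ? string.Empty : cell.ToString());
        }

        return keys;
    }
}
```

With HDR=No the driver names the columns F1, F2 and so on, which is why indexing by position is the simpler option here.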

相守太难 2024-07-15 09:30:04

I answered a similar question here. Here I've copied and pasted the same answer for your convenience:

I had this same problem, but was able to work around it without resorting to the Excel COM interface or 3rd party software. It involves a little processing overhead, but appears to be working for me.

  1. First read in the data to get the column names.
  2. Then create a new DataSet with each of these columns, setting each of their DataTypes to string.
  3. Read the data in again into this new dataset. Voila - the scientific notation is now gone and everything is read in as a string.

Here's some code that illustrates this, and as an added bonus, it's even StyleCopped!

// Requires: using System.Data; using System.Data.OleDb; using System.Globalization;
public void ImportSpreadsheet(string path)
{
    string extendedProperties = "Excel 12.0;HDR=YES;IMEX=1";
    string connectionString = string.Format(
        CultureInfo.CurrentCulture,
        "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"{1}\"",
        path,
        extendedProperties);

    using (OleDbConnection connection = new OleDbConnection(connectionString))
    {
        using (OleDbCommand command = connection.CreateCommand())
        {
            command.CommandText = "SELECT * FROM [Worksheet1$]";
            connection.Open();

            using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
            using (DataSet columnDataSet = new DataSet())
            using (DataSet dataSet = new DataSet())
            {
                columnDataSet.Locale = CultureInfo.CurrentCulture;
                adapter.Fill(columnDataSet);

                if (columnDataSet.Tables.Count == 1)
                {
                    var worksheet = columnDataSet.Tables[0];

                    // Now that we have a valid worksheet read in, with column names, we can create a
                    // new DataSet with a table that has preset columns that are all of type string.
                    // This fixes a problem where the OLEDB provider is trying to guess the data types
                    // of the cells and strange data appears, such as scientific notation on some cells.
                    dataSet.Tables.Add("WorksheetData");
                    DataTable tempTable = dataSet.Tables[0];

                    foreach (DataColumn column in worksheet.Columns)
                    {
                        tempTable.Columns.Add(column.ColumnName, typeof(string));
                    }

                    adapter.Fill(dataSet, "WorksheetData");

                    if (dataSet.Tables.Count == 1)
                    {
                        worksheet = dataSet.Tables[0];

                        foreach (DataRow row in worksheet.Rows)
                        {
                            // TODO: Consume some data.
                        }
                    }
                }
            }
        }
    }
}

花桑 2024-07-15 09:30:04

Order the records in the xls file by ASCII code in descending order so that the alphanumeric fields appear at the top, just below the header row. This ensures that the first rows of data the driver reads define the column's data type as text ("varchar" or "nvarchar").

ぶ宁プ宁ぶ 2024-07-15 09:30:04

Hi all, this code also retrieves the alphanumeric values:

using System.Data.OleDb;

string ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source=" + filepath + ";" + "Extended Properties="+(char)34+"Excel 8.0;IMEX=1;"+(char)34;

string CommandText = "select * from [Sheet1$]";

OleDbConnection myConnection = new OleDbConnection(ConnectionString);
myConnection.Open();

OleDbDataAdapter myAdapter = new OleDbDataAdapter(CommandText, myConnection);

DataSet ds = new DataSet();
myAdapter.Fill(ds);

月亮坠入山谷 2024-07-15 09:30:04

This isn't completely right! Apparently, Jet/ACE ALWAYS assumes a string type if the first 8 rows are blank, regardless of IMEX=1. Even when I set TypeGuessRows to 0 in the registry, I still had the same problem. This was the only surefire way to get it to work:

try
{
    Console.Write(wsReader.GetDouble(j).ToString());
}
catch   //Lame unfixable bug
{
    Console.Write(wsReader.GetString(j));
}
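
For what it's worth, the same fallback can probably be written without relying on an exception. The sketch below is an assumption rather than a tested fix for the Jet/ACE behaviour: it asks the reader for the boxed value and formats whatever type comes back. CellReader and ReadCellAsText are made-up names, and the code targets the IDataRecord interface, which data readers implement.

```csharp
using System;
using System.Data;
using System.Globalization;

public static class CellReader
{
    // Returns the cell as text whatever type the driver guessed for the column.
    // DBNull becomes an empty string instead of triggering a cast exception.
    public static string ReadCellAsText(IDataRecord record, int ordinal)
    {
        if (record.IsDBNull(ordinal))
        {
            return string.Empty;
        }

        // GetValue returns the boxed value (double, string, DateTime, ...),
        // so no typed getter such as GetDouble or GetString is needed.
        object value = record.GetValue(ordinal);
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

Inside the existing read loop this would be Console.Write(CellReader.ReadCellAsText(wsReader, j)), assuming wsReader is an OleDbDataReader.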
