使用 C# 和 Open XML 从 Excel 到 DataTable

发布于 2024-09-11 00:47:05 字数 161 浏览 8 评论 0原文

我使用的是 Visual Studio 2008,需要使用 Open XML SDK 2.0 从 Excel 工作表创建一个 DataTable。我需要使用工作表第一行的 DataTable 列创建它,并使用其余值完成它。

有谁有示例代码或链接可以帮助我做到这一点?

I'm using Visual Studio 2008 and I need create a DataTable from a Excel Sheet using the Open XML SDK 2.0. I need to create it with the DataTable columns with the first row of the sheet and complete it with the rest of values.

Does anyone have a example code or a link that can help me to do this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

铜锣湾横着走 2024-09-18 00:47:06
 Public Shared Function ExcelToDataTable(filename As String) As DataTable
        Try

            Dim dt As New DataTable()

            Using doc As SpreadsheetDocument = SpreadsheetDocument.Open(filename, False)

                Dim workbookPart As WorkbookPart = doc.WorkbookPart
                Dim sheets As IEnumerable(Of Sheet) = doc.WorkbookPart.Workbook.GetFirstChild(Of Sheets)().Elements(Of Sheet)()
                Dim relationshipId As String = sheets.First().Id.Value
                Dim worksheetPart As WorksheetPart = DirectCast(doc.WorkbookPart.GetPartById(relationshipId), WorksheetPart)
                Dim workSheet As Worksheet = worksheetPart.Worksheet
                Dim sheetData As SheetData = workSheet.GetFirstChild(Of SheetData)()
                Dim rows As IEnumerable(Of Row) = sheetData.Descendants(Of Row)()

                For Each cell As Cell In rows.ElementAt(0)
                    dt.Columns.Add(GetCellValue(doc, cell))
                Next

                For Each row As Row In rows
                    'this will also include your header row...
                    Dim tempRow As DataRow = dt.NewRow()

                    For i As Integer = 0 To row.Descendants(Of Cell)().Count() - 1
                        tempRow(i) = GetCellValue(doc, row.Descendants(Of Cell)().ElementAt(i))
                    Next

                    dt.Rows.Add(tempRow)
                Next
            End Using

            dt.Rows.RemoveAt(0)

            Return dt

        Catch ex As Exception
            Throw ex
        End Try
    End Function


    Public Shared Function GetCellValue(document As SpreadsheetDocument, cell As Cell) As String
        Try

            If IsNothing(cell.CellValue) Then
                Return ""
            End If

            Dim value As String = cell.CellValue.InnerXml

            If cell.DataType IsNot Nothing AndAlso cell.DataType.Value = CellValues.SharedString Then
                Dim stringTablePart As SharedStringTablePart = document.WorkbookPart.SharedStringTablePart
                Return stringTablePart.SharedStringTable.ChildElements(Int32.Parse(value)).InnerText
            Else
                Return value
            End If

        Catch ex As Exception
            Return ""
        End Try
    End Function
 Public Shared Function ExcelToDataTable(filename As String) As DataTable
        Try

            Dim dt As New DataTable()

            Using doc As SpreadsheetDocument = SpreadsheetDocument.Open(filename, False)

                Dim workbookPart As WorkbookPart = doc.WorkbookPart
                Dim sheets As IEnumerable(Of Sheet) = doc.WorkbookPart.Workbook.GetFirstChild(Of Sheets)().Elements(Of Sheet)()
                Dim relationshipId As String = sheets.First().Id.Value
                Dim worksheetPart As WorksheetPart = DirectCast(doc.WorkbookPart.GetPartById(relationshipId), WorksheetPart)
                Dim workSheet As Worksheet = worksheetPart.Worksheet
                Dim sheetData As SheetData = workSheet.GetFirstChild(Of SheetData)()
                Dim rows As IEnumerable(Of Row) = sheetData.Descendants(Of Row)()

                For Each cell As Cell In rows.ElementAt(0)
                    dt.Columns.Add(GetCellValue(doc, cell))
                Next

                For Each row As Row In rows
                    'this will also include your header row...
                    Dim tempRow As DataRow = dt.NewRow()

                    For i As Integer = 0 To row.Descendants(Of Cell)().Count() - 1
                        tempRow(i) = GetCellValue(doc, row.Descendants(Of Cell)().ElementAt(i))
                    Next

                    dt.Rows.Add(tempRow)
                Next
            End Using

            dt.Rows.RemoveAt(0)

            Return dt

        Catch ex As Exception
            Throw ex
        End Try
    End Function


    Public Shared Function GetCellValue(document As SpreadsheetDocument, cell As Cell) As String
        Try

            If IsNothing(cell.CellValue) Then
                Return ""
            End If

            Dim value As String = cell.CellValue.InnerXml

            If cell.DataType IsNot Nothing AndAlso cell.DataType.Value = CellValues.SharedString Then
                Dim stringTablePart As SharedStringTablePart = document.WorkbookPart.SharedStringTablePart
                Return stringTablePart.SharedStringTable.ChildElements(Int32.Parse(value)).InnerText
            Else
                Return value
            End If

        Catch ex As Exception
            Return ""
        End Try
    End Function
岁月打碎记忆 2024-09-18 00:47:06

我知道这个帖子开始已经很久了。然而,上述解决方案对我来说都不起作用。空单元格问题等。

我在 GitHub 上找到了一个非常好的带有“MIT”许可证的解决方案:
https://github.com/ExcelDataReader/ExcelDataReader
这对于 C# 和 VBnet 应用程序都适用。
来自 VBNET 的示例调用(C# 的示例代码位于 GitHub 上):

        Using stream As FileStream = New FileStream(DataPath & "\" & fName.Name, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)

            Using reader As IExcelDataReader = ExcelReaderFactory.CreateReader(stream)
                ds = reader.AsDataSet(New ExcelDataSetConfiguration() With {
                    .UseColumnDataType = False,
                    .ConfigureDataTable = Function(tableReader) New ExcelDataTableConfiguration() With {
                        .UseHeaderRow = True
                    }
                })
            End Using

        End Using

结果是一个数据集,工作簿中的每个工作表都有一个表。

我真的很喜欢自己编译用 C# 编写的 dll,而不是使用现成的 dll。这样我就可以控制向客户交付的内容。

I know it is a long time ago since this thread started. However, none of the solutions above did not really work for me. Empty cells issue and others.

I found a very good solution with 'MIT' license on GitHub:
https://github.com/ExcelDataReader/ExcelDataReader
This worked for me for both C# and VBnet applications.
Sample call from VBNET (the sample code for c# is on GitHub) :

        Using stream As FileStream = New FileStream(DataPath & "\" & fName.Name, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)

            Using reader As IExcelDataReader = ExcelReaderFactory.CreateReader(stream)
                ds = reader.AsDataSet(New ExcelDataSetConfiguration() With {
                    .UseColumnDataType = False,
                    .ConfigureDataTable = Function(tableReader) New ExcelDataTableConfiguration() With {
                        .UseHeaderRow = True
                    }
                })
            End Using

        End Using

The result was a dataset with one table for each sheet in the workbook.

An I really like to compile the dll made in C# by myself rather then using a ready dll. So I can control what I am delivering to customers.

人生戏 2024-09-18 00:47:06

根据我的要求,我从 DLMAN 的最佳答案中修改了“ExcelUtility”Read() 的几部分代码。

还添加了 saveDataTablesToExcel() 和 ExportDataSet() 方法以在 xlsx 文件中保存多个 DataTable。

以下是新“ExcelUtility”类的完整代码及其用法。

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;

    namespace myNamespace
    {
        static class ExcelUtility
        {
            // SS Note: isHeaderOnTopRow functionality is to set column names as the first row of 'sheet'
            public static DataTable[] Read(string path, bool isHeaderOnTopRow = false)
            {
                try
                {
                    using (var ssDoc = SpreadsheetDocument.Open(path, false))
                    {
                        var sheets = ssDoc.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                        DataTable[] dtArray = new DataTable[sheets.ToList().Count];
                        int counti = 0;
                        foreach (Sheet sheet in sheets)
                        {
                            var dt = new DataTable();

                            var relationshipId = sheet.Id.Value;
                            var worksheetPart = (WorksheetPart)ssDoc.WorkbookPart.GetPartById(relationshipId);
                            var workSheet = worksheetPart.Worksheet;
                            var sheetData = workSheet.GetFirstChild<SheetData>();
                            var rows = sheetData.Descendants<Row>().ToList();

                            int rowIndex = 0;
                            foreach (var row in rows) //this will also include your header row...
                            {
                                var tempRow = dt.NewRow();

                                var colCount = row.Descendants<Cell>().Count();
                                int colIndex = 0;
                                foreach (var cell in row.Descendants<Cell>())
                                {
                                    var index = GetIndex(cell.CellReference);
                                    // SS Note: ADDED next line as we were getting cell.CellReference (or index) as -1 in our provided xlsx file.
                                    index = (index < 0 ? colIndex++ : index);

                                    // Add Columns
                                    for (var i = dt.Columns.Count; i <= index; i++)
                                        dt.Columns.Add();

                                    if (isHeaderOnTopRow && rowIndex == 0)
                                    {
                                        string heading = GetCellValue(ssDoc, cell);
                                        heading = (heading.Length > 0 ? heading : $"Column{index + 1}");
                                        dt.Columns[index].ColumnName = heading;
                                    }
                                    else
                                    {
                                        tempRow[index] = GetCellValue(ssDoc, cell);
                                    }
                                }
                                if (rowIndex > 0 || isHeaderOnTopRow == false)
                                {
                                    dt.Rows.Add(tempRow);
                                }
                                rowIndex++;
                            }
                            dtArray[counti++] = dt;
                        }
                        return dtArray;
                    }
                }
                catch (Exception e)
                {
                    Console.WriteLine(e);
                }
                return null;
            }

            private static string GetCellValue(SpreadsheetDocument document, Cell cell)
            {
                var stringTablePart = document.WorkbookPart.SharedStringTablePart;
                var value = cell.CellValue.InnerXml;

                if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                    return stringTablePart.SharedStringTable.ChildElements[int.Parse(value)].InnerText;

                return value;
            }

            public static int GetIndex(string name)
            {
                if (string.IsNullOrWhiteSpace(name))
                    return -1;

                int index = 0;
                foreach (var ch in name)
                {
                    if (char.IsLetter(ch))
                    {
                        int value = ch - 'A' + 1;
                        index = value + index * 26;
                    }
                    else
                        break;
                }

                return index - 1;
            }

            public static void ExportDataSet(DataSet ds, string destination)
            {
                try
                {
                    using (var workbook = SpreadsheetDocument.Create(destination, DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook))
                    {
                        var workbookPart = workbook.AddWorkbookPart();

                        workbook.WorkbookPart.Workbook = new DocumentFormat.OpenXml.Spreadsheet.Workbook();

                        workbook.WorkbookPart.Workbook.Sheets = new DocumentFormat.OpenXml.Spreadsheet.Sheets();

                        foreach (System.Data.DataTable table in ds.Tables)
                        {

                            var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
                            var sheetData = new DocumentFormat.OpenXml.Spreadsheet.SheetData();
                            sheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet(sheetData);

                            DocumentFormat.OpenXml.Spreadsheet.Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<DocumentFormat.OpenXml.Spreadsheet.Sheets>();
                            string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);

                            uint sheetId = 1;
                            if (sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Count() > 0)
                            {
                                sheetId =
                                    sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Select(s => s.SheetId.Value).Max() + 1;
                            }

                            DocumentFormat.OpenXml.Spreadsheet.Sheet sheet = new DocumentFormat.OpenXml.Spreadsheet.Sheet() { Id = relationshipId, SheetId = sheetId, Name = table.TableName };
                            sheets.Append(sheet);

                            DocumentFormat.OpenXml.Spreadsheet.Row headerRow = new DocumentFormat.OpenXml.Spreadsheet.Row();

                            List<String> columns = new List<string>();
                            foreach (System.Data.DataColumn column in table.Columns)
                            {
                                columns.Add(column.ColumnName);

                                DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
                                cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
                                cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(column.ColumnName);
                                headerRow.AppendChild(cell);
                            }


                            sheetData.AppendChild(headerRow);

                            foreach (System.Data.DataRow dsrow in table.Rows)
                            {
                                DocumentFormat.OpenXml.Spreadsheet.Row newRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
                                foreach (String col in columns)
                                {
                                    DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
                                    cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
                                    cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(dsrow[col].ToString()); //
                                    newRow.AppendChild(cell);
                                }

                                sheetData.AppendChild(newRow);
                            }

                        }
                    }
                } 
                catch (Exception e)
                {
                    Console.WriteLine(e);
                }
            }
            
            public static void saveDataTablesToExcel(DataTable[] dataTables, string saveToFilePath)
            {
                // Create a DataSet
                DataSet dataSet = new DataSet("Tables");
                // We can add multiple DataTable to DataSet
                foreach (DataTable dt in dataTables)
                {
                    dataSet.Tables.Add(dt);
                }

                ExportDataSet(dataSet, saveToFilePath);
            }

        }
    }
    

用法 :

    // save three datatables in xlsx file
    DataTable[] dataTables = new DataTable[3];
    dataTables[0] = firstDataTable;
    dataTables[1] = secondDataTable;
    dataTables[2] = thirdDataTable;
    string fileName = "saved.xlsx";
    saveDataTablesToExcel(dataTables, $"{ExcelFileSaveFolder}{fileName}");

    // retrieve data from first sheet and set it to 'returnTable'
    DataTable returnTable = null;
    var path = $"{ExcelFileSaveFolder}{fileName}";
    DataTable[] getDataTables = ExcelUtility.Read(path, true);
    if (getDataTables != null && getDataTables.Length > 0)
        returnTable = getDataTables[0];

As per my requirements, I have modified few part of code of 'ExcelUtility' Read() from the best answer by D.L.MAN.

Also added saveDataTablesToExcel() and ExportDataSet() method to save multiple DataTables in xlsx file.

Following is the full code of new 'ExcelUtility' class and it's usage.

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;

    namespace myNamespace
    {
        static class ExcelUtility
        {
            // SS Note: isHeaderOnTopRow functionality is to set column names as the first row of 'sheet'
            public static DataTable[] Read(string path, bool isHeaderOnTopRow = false)
            {
                try
                {
                    using (var ssDoc = SpreadsheetDocument.Open(path, false))
                    {
                        var sheets = ssDoc.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                        DataTable[] dtArray = new DataTable[sheets.ToList().Count];
                        int counti = 0;
                        foreach (Sheet sheet in sheets)
                        {
                            var dt = new DataTable();

                            var relationshipId = sheet.Id.Value;
                            var worksheetPart = (WorksheetPart)ssDoc.WorkbookPart.GetPartById(relationshipId);
                            var workSheet = worksheetPart.Worksheet;
                            var sheetData = workSheet.GetFirstChild<SheetData>();
                            var rows = sheetData.Descendants<Row>().ToList();

                            int rowIndex = 0;
                            foreach (var row in rows) //this will also include your header row...
                            {
                                var tempRow = dt.NewRow();

                                var colCount = row.Descendants<Cell>().Count();
                                int colIndex = 0;
                                foreach (var cell in row.Descendants<Cell>())
                                {
                                    var index = GetIndex(cell.CellReference);
                                    // SS Note: ADDED next line as we were getting cell.CellReference (or index) as -1 in our provided xlsx file.
                                    index = (index < 0 ? colIndex++ : index);

                                    // Add Columns
                                    for (var i = dt.Columns.Count; i <= index; i++)
                                        dt.Columns.Add();

                                    if (isHeaderOnTopRow && rowIndex == 0)
                                    {
                                        string heading = GetCellValue(ssDoc, cell);
                                        heading = (heading.Length > 0 ? heading : 
quot;Column{index + 1}");
                                        dt.Columns[index].ColumnName = heading;
                                    }
                                    else
                                    {
                                        tempRow[index] = GetCellValue(ssDoc, cell);
                                    }
                                }
                                if (rowIndex > 0 || isHeaderOnTopRow == false)
                                {
                                    dt.Rows.Add(tempRow);
                                }
                                rowIndex++;
                            }
                            dtArray[counti++] = dt;
                        }
                        return dtArray;
                    }
                }
                catch (Exception e)
                {
                    Console.WriteLine(e);
                }
                return null;
            }

            private static string GetCellValue(SpreadsheetDocument document, Cell cell)
            {
                var stringTablePart = document.WorkbookPart.SharedStringTablePart;
                var value = cell.CellValue.InnerXml;

                if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                    return stringTablePart.SharedStringTable.ChildElements[int.Parse(value)].InnerText;

                return value;
            }

            public static int GetIndex(string name)
            {
                if (string.IsNullOrWhiteSpace(name))
                    return -1;

                int index = 0;
                foreach (var ch in name)
                {
                    if (char.IsLetter(ch))
                    {
                        int value = ch - 'A' + 1;
                        index = value + index * 26;
                    }
                    else
                        break;
                }

                return index - 1;
            }

            public static void ExportDataSet(DataSet ds, string destination)
            {
                try
                {
                    using (var workbook = SpreadsheetDocument.Create(destination, DocumentFormat.OpenXml.SpreadsheetDocumentType.Workbook))
                    {
                        var workbookPart = workbook.AddWorkbookPart();

                        workbook.WorkbookPart.Workbook = new DocumentFormat.OpenXml.Spreadsheet.Workbook();

                        workbook.WorkbookPart.Workbook.Sheets = new DocumentFormat.OpenXml.Spreadsheet.Sheets();

                        foreach (System.Data.DataTable table in ds.Tables)
                        {

                            var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
                            var sheetData = new DocumentFormat.OpenXml.Spreadsheet.SheetData();
                            sheetPart.Worksheet = new DocumentFormat.OpenXml.Spreadsheet.Worksheet(sheetData);

                            DocumentFormat.OpenXml.Spreadsheet.Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<DocumentFormat.OpenXml.Spreadsheet.Sheets>();
                            string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);

                            uint sheetId = 1;
                            if (sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Count() > 0)
                            {
                                sheetId =
                                    sheets.Elements<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Select(s => s.SheetId.Value).Max() + 1;
                            }

                            DocumentFormat.OpenXml.Spreadsheet.Sheet sheet = new DocumentFormat.OpenXml.Spreadsheet.Sheet() { Id = relationshipId, SheetId = sheetId, Name = table.TableName };
                            sheets.Append(sheet);

                            DocumentFormat.OpenXml.Spreadsheet.Row headerRow = new DocumentFormat.OpenXml.Spreadsheet.Row();

                            List<String> columns = new List<string>();
                            foreach (System.Data.DataColumn column in table.Columns)
                            {
                                columns.Add(column.ColumnName);

                                DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
                                cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
                                cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(column.ColumnName);
                                headerRow.AppendChild(cell);
                            }


                            sheetData.AppendChild(headerRow);

                            foreach (System.Data.DataRow dsrow in table.Rows)
                            {
                                DocumentFormat.OpenXml.Spreadsheet.Row newRow = new DocumentFormat.OpenXml.Spreadsheet.Row();
                                foreach (String col in columns)
                                {
                                    DocumentFormat.OpenXml.Spreadsheet.Cell cell = new DocumentFormat.OpenXml.Spreadsheet.Cell();
                                    cell.DataType = DocumentFormat.OpenXml.Spreadsheet.CellValues.String;
                                    cell.CellValue = new DocumentFormat.OpenXml.Spreadsheet.CellValue(dsrow[col].ToString()); //
                                    newRow.AppendChild(cell);
                                }

                                sheetData.AppendChild(newRow);
                            }

                        }
                    }
                } 
                catch (Exception e)
                {
                    Console.WriteLine(e);
                }
            }
            
            public static void saveDataTablesToExcel(DataTable[] dataTables, string saveToFilePath)
            {
                // Create a DataSet
                DataSet dataSet = new DataSet("Tables");
                // We can add multiple DataTable to DataSet
                foreach (DataTable dt in dataTables)
                {
                    dataSet.Tables.Add(dt);
                }

                ExportDataSet(dataSet, saveToFilePath);
            }

        }
    }
    

Usage :

    // save three datatables in xlsx file
    DataTable[] dataTables = new DataTable[3];
    dataTables[0] = firstDataTable;
    dataTables[1] = secondDataTable;
    dataTables[2] = thirdDataTable;
    string fileName = "saved.xlsx";
    saveDataTablesToExcel(dataTables, 
quot;{ExcelFileSaveFolder}{fileName}");

    // retrieve data from first sheet and set it to 'returnTable'
    DataTable returnTable = null;
    var path = 
quot;{ExcelFileSaveFolder}{fileName}";
    DataTable[] getDataTables = ExcelUtility.Read(path, true);
    if (getDataTables != null && getDataTables.Length > 0)
        returnTable = getDataTables[0];
世界和平 2024-09-18 00:47:06

如果行值为 null 或空,则获取值错误。

如果工作正常,所有列都会填充数据。但也许所有行都不是

if rows value is null or empty get values wrong work.

all columns filled with data if it is working true. but maybe all rows not

痴意少年 2024-09-18 00:47:05

我认为这应该满足你的要求。另一个函数只是为了处理您是否有共享字符串,我假设您在列标题中这样做。不确定这是否完美,但我希望它有所帮助。

static void Main(string[] args)
{
    DataTable dt = new DataTable();

    using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(@"..\..\example.xlsx", false))
    {

        WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
        IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
        string relationshipId = sheets.First().Id.Value;
        WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
        Worksheet workSheet = worksheetPart.Worksheet;
        SheetData sheetData = workSheet.GetFirstChild<SheetData>();
        IEnumerable<Row> rows = sheetData.Descendants<Row>();

        foreach (Cell cell in rows.ElementAt(0))
        {
            dt.Columns.Add(GetCellValue(spreadSheetDocument, cell));
        }

        foreach (Row row in rows) //this will also include your header row...
        {
            DataRow tempRow = dt.NewRow();

            for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
            {
                tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));
            }

            dt.Rows.Add(tempRow);
        }

    }
    dt.Rows.RemoveAt(0); //...so i'm taking it out here.

}


public static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
    SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
    string value = cell.CellValue.InnerXml;

    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
    }
    else
    {
        return value;
    }
}

I think this should do what you're asking. The other function is there just to deal with if you have shared strings, which I assume you do in your column headers. Not sure this is perfect, but I hope it helps.

static void Main(string[] args)
{
    DataTable dt = new DataTable();

    using (SpreadsheetDocument spreadSheetDocument = SpreadsheetDocument.Open(@"..\..\example.xlsx", false))
    {

        WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
        IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
        string relationshipId = sheets.First().Id.Value;
        WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
        Worksheet workSheet = worksheetPart.Worksheet;
        SheetData sheetData = workSheet.GetFirstChild<SheetData>();
        IEnumerable<Row> rows = sheetData.Descendants<Row>();

        foreach (Cell cell in rows.ElementAt(0))
        {
            dt.Columns.Add(GetCellValue(spreadSheetDocument, cell));
        }

        foreach (Row row in rows) //this will also include your header row...
        {
            DataRow tempRow = dt.NewRow();

            for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
            {
                tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));
            }

            dt.Rows.Add(tempRow);
        }

    }
    dt.Rows.RemoveAt(0); //...so i'm taking it out here.

}


public static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
    SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
    string value = cell.CellValue.InnerXml;

    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
    {
        return stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
    }
    else
    {
        return value;
    }
}
梦幻之岛 2024-09-18 00:47:05

您好上面的代码工作正常,除了一个更改

将下面的代码行替换

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));

如果您使用 (i-1) 它将引发异常:

specified argument was out of the range of valid values. parameter name index.

Hi The above code is working fine except one change

replace the below line of code

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));

with

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));

If you use (i-1) it will throw an exception:

specified argument was out of the range of valid values. parameter name index.
还在原地等你 2024-09-18 00:47:05

此解决方案适用于没有空单元格的电子表格。

要处理空单元格,您需要将此行:替换

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));

为类似以下内容:

Cell cell = row.Descendants<Cell>().ElementAt(i);
int index = CellReferenceToIndex(cell);
tempRow[index] = GetCellValue(spreadSheetDocument, cell);

并添加此方法:

private static int CellReferenceToIndex(Cell cell)
{
    int index = -1;
    string reference = cell.CellReference.ToString().ToUpper();
    foreach (char ch in reference)
    {
        if (Char.IsLetter(ch))
        {
            int value = (int)ch - (int)'A';
            index = (index + 1) * 26 + value;
        }
        else
            return index;
    }
    return index;
}

This solution works for spreadsheets without empty cells.

To handle empty cells, you will need to replace this line:

tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i-1));

with something like this:

Cell cell = row.Descendants<Cell>().ElementAt(i);
int index = CellReferenceToIndex(cell);
tempRow[index] = GetCellValue(spreadSheetDocument, cell);

And add this method:

private static int CellReferenceToIndex(Cell cell)
{
    int index = -1;
    string reference = cell.CellReference.ToString().ToUpper();
    foreach (char ch in reference)
    {
        if (Char.IsLetter(ch))
        {
            int value = (int)ch - (int)'A';
            index = (index + 1) * 26 + value;
        }
        else
            return index;
    }
    return index;
}
老子叫无熙 2024-09-18 00:47:05

这是我的完整解决方案,其中还考虑了空单元格。

public static class ExcelHelper
        {
            //To get the value of the cell, even it's empty. Unable to use loop by index
            private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, string cellColumnReference)
            {
                Cell theCell = null;
                string value = "";
                foreach (Cell cell in theCells)
                {
                    if (cell.CellReference.Value.StartsWith(cellColumnReference))
                    {
                        theCell = cell;
                        break;
                    }
                }
                if (theCell != null)
                {
                    value = theCell.InnerText;
                    // If the cell represents an integer number, you are done. 
                    // For dates, this code returns the serialized value that represents the date. The code handles strings and 
                    // Booleans individually. For shared strings, the code looks up the corresponding value in the shared string table. For Booleans, the code converts the value into the words TRUE or FALSE.
                    if (theCell.DataType != null)
                    {
                        switch (theCell.DataType.Value)
                        {
                            case CellValues.SharedString:
                                // For shared strings, look up the value in the shared strings table.
                                var stringTable = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
                                // If the shared string table is missing, something is wrong. Return the index that is in the cell. Otherwise, look up the correct text in the table.
                                if (stringTable != null)
                                {
                                    value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
                                }
                                break;
                            case CellValues.Boolean:
                                switch (value)
                                {
                                    case "0":
                                        value = "FALSE";
                                        break;
                                    default:
                                        value = "TRUE";
                                        break;
                                }
                                break;
                        }
                    }
                }
                return value;
            }

            private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, int index)
            {
                return GetCellValue(wbPart, theCells, GetExcelColumnName(index));
            }

            private static string GetExcelColumnName(int columnNumber)
            {
                int dividend = columnNumber;
                string columnName = String.Empty;
                int modulo;
                while (dividend > 0)
                {
                    modulo = (dividend - 1) % 26;
                    columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
                    dividend = (int)((dividend - modulo) / 26);
                }
                return columnName;
            }

            //Only xlsx files
            public static DataTable GetDataTableFromExcelFile(string filePath, string sheetName = "")
            {
                DataTable dt = new DataTable();
                try
                {
                    using (SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, false))
                    {
                        WorkbookPart wbPart = document.WorkbookPart;
                        IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                        string sheetId = sheetName != "" ? sheets.Where(q => q.Name == sheetName).First().Id.Value : sheets.First().Id.Value;
                        WorksheetPart wsPart = (WorksheetPart)wbPart.GetPartById(sheetId);
                        SheetData sheetdata = wsPart.Worksheet.Elements<SheetData>().FirstOrDefault();
                        int totalHeaderCount = sheetdata.Descendants<Row>().ElementAt(0).Descendants<Cell>().Count();
                        //Get the header                    
                        for (int i = 1; i <= totalHeaderCount; i++)
                        {
                            dt.Columns.Add(GetCellValue(wbPart, sheetdata.Descendants<Row>().ElementAt(0).Elements<Cell>().ToList(), i));
                        }
                        foreach (Row r in sheetdata.Descendants<Row>())
                        {
                            if (r.RowIndex > 1)
                            {
                                DataRow tempRow = dt.NewRow();

                                //Always get from the header count, because the index of the row changes where empty cell is not counted
                                for (int i = 1; i <= totalHeaderCount; i++)
                                {
                                    tempRow[i - 1] = GetCellValue(wbPart, r.Elements<Cell>().ToList(), i);
                                }
                                dt.Rows.Add(tempRow);
                            }
                        }                    
                    }
                }
                catch (Exception ex)
                {

                }
                return dt;
            }
        }

This is my complete solution where empty cell is also taken into consideration.

public static class ExcelHelper
        {
            //To get the value of the cell, even it's empty. Unable to use loop by index
            private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, string cellColumnReference)
            {
                Cell theCell = null;
                string value = "";
                foreach (Cell cell in theCells)
                {
                    if (cell.CellReference.Value.StartsWith(cellColumnReference))
                    {
                        theCell = cell;
                        break;
                    }
                }
                if (theCell != null)
                {
                    value = theCell.InnerText;
                    // If the cell represents an integer number, you are done. 
                    // For dates, this code returns the serialized value that represents the date. The code handles strings and 
                    // Booleans individually. For shared strings, the code looks up the corresponding value in the shared string table. For Booleans, the code converts the value into the words TRUE or FALSE.
                    if (theCell.DataType != null)
                    {
                        switch (theCell.DataType.Value)
                        {
                            case CellValues.SharedString:
                                // For shared strings, look up the value in the shared strings table.
                                var stringTable = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
                                // If the shared string table is missing, something is wrong. Return the index that is in the cell. Otherwise, look up the correct text in the table.
                                if (stringTable != null)
                                {
                                    value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
                                }
                                break;
                            case CellValues.Boolean:
                                switch (value)
                                {
                                    case "0":
                                        value = "FALSE";
                                        break;
                                    default:
                                        value = "TRUE";
                                        break;
                                }
                                break;
                        }
                    }
                }
                return value;
            }

            private static string GetCellValue(WorkbookPart wbPart, List<Cell> theCells, int index)
            {
                return GetCellValue(wbPart, theCells, GetExcelColumnName(index));
            }

            private static string GetExcelColumnName(int columnNumber)
            {
                int dividend = columnNumber;
                string columnName = String.Empty;
                int modulo;
                while (dividend > 0)
                {
                    modulo = (dividend - 1) % 26;
                    columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
                    dividend = (int)((dividend - modulo) / 26);
                }
                return columnName;
            }

            //Only xlsx files
            public static DataTable GetDataTableFromExcelFile(string filePath, string sheetName = "")
            {
                DataTable dt = new DataTable();
                try
                {
                    using (SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, false))
                    {
                        WorkbookPart wbPart = document.WorkbookPart;
                        IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                        string sheetId = sheetName != "" ? sheets.Where(q => q.Name == sheetName).First().Id.Value : sheets.First().Id.Value;
                        WorksheetPart wsPart = (WorksheetPart)wbPart.GetPartById(sheetId);
                        SheetData sheetdata = wsPart.Worksheet.Elements<SheetData>().FirstOrDefault();
                        int totalHeaderCount = sheetdata.Descendants<Row>().ElementAt(0).Descendants<Cell>().Count();
                        //Get the header                    
                        for (int i = 1; i <= totalHeaderCount; i++)
                        {
                            dt.Columns.Add(GetCellValue(wbPart, sheetdata.Descendants<Row>().ElementAt(0).Elements<Cell>().ToList(), i));
                        }
                        foreach (Row r in sheetdata.Descendants<Row>())
                        {
                            if (r.RowIndex > 1)
                            {
                                DataRow tempRow = dt.NewRow();

                                //Always get from the header count, because the index of the row changes where empty cell is not counted
                                for (int i = 1; i <= totalHeaderCount; i++)
                                {
                                    tempRow[i - 1] = GetCellValue(wbPart, r.Elements<Cell>().ToList(), i);
                                }
                                dt.Rows.Add(tempRow);
                            }
                        }                    
                    }
                }
                catch (Exception ex)
                {

                }
                return dt;
            }
        }
吃素的狼 2024-09-18 00:47:05

首先将ExcelUtility.cs添加到您的项目中:

ExcelUtility.cs

using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

namespace Core_Excel.Utilities
{
    static class ExcelUtility
    {
        public static DataTable Read(string path)
        {
            var dt = new DataTable();

            using (var ssDoc = SpreadsheetDocument.Open(path, false))
            {
                var sheets = ssDoc.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                var relationshipId = sheets.First().Id.Value;
                var worksheetPart = (WorksheetPart) ssDoc.WorkbookPart.GetPartById(relationshipId);
                var workSheet = worksheetPart.Worksheet;
                var sheetData = workSheet.GetFirstChild<SheetData>();
                var rows = sheetData.Descendants<Row>().ToList();

                foreach (var row in rows) //this will also include your header row...
                {
                    var tempRow = dt.NewRow();

                    var colCount = row.Descendants<Cell>().Count();
                    foreach (var cell in row.Descendants<Cell>())
                    {
                        var index = GetIndex(cell.CellReference);

                        // Add Columns
                        for (var i = dt.Columns.Count; i <= index; i++)
                            dt.Columns.Add();

                        tempRow[index] = GetCellValue(ssDoc, cell);
                    }

                    dt.Rows.Add(tempRow);
                }
            }

            return dt;
        }

        private static string GetCellValue(SpreadsheetDocument document, Cell cell)
        {
            var stringTablePart = document.WorkbookPart.SharedStringTablePart;
            var value = cell.CellValue.InnerXml;

            if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                return stringTablePart.SharedStringTable.ChildElements[int.Parse(value)].InnerText;

            return value;
        }

        public static int GetIndex(string name)
        {
            if (string.IsNullOrWhiteSpace(name))
                return -1;

            int index = 0;
            foreach (var ch in name)
            {
                if (char.IsLetter(ch))
                {
                    int value = ch - 'A' + 1;
                    index = value + index * 26;
                }
                else
                    break;
            }

            return index - 1;
        }
    }
}

用法:

var path = "D:\\Documents\\test.xlsx";
var dt = ExcelUtility.Read(path);

然后享受它!

First Add ExcelUtility.cs to your project :

ExcelUtility.cs

using System.Data;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

namespace Core_Excel.Utilities
{
    static class ExcelUtility
    {
        public static DataTable Read(string path)
        {
            var dt = new DataTable();

            using (var ssDoc = SpreadsheetDocument.Open(path, false))
            {
                var sheets = ssDoc.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
                var relationshipId = sheets.First().Id.Value;
                var worksheetPart = (WorksheetPart) ssDoc.WorkbookPart.GetPartById(relationshipId);
                var workSheet = worksheetPart.Worksheet;
                var sheetData = workSheet.GetFirstChild<SheetData>();
                var rows = sheetData.Descendants<Row>().ToList();

                foreach (var row in rows) //this will also include your header row...
                {
                    var tempRow = dt.NewRow();

                    var colCount = row.Descendants<Cell>().Count();
                    foreach (var cell in row.Descendants<Cell>())
                    {
                        var index = GetIndex(cell.CellReference);

                        // Add Columns
                        for (var i = dt.Columns.Count; i <= index; i++)
                            dt.Columns.Add();

                        tempRow[index] = GetCellValue(ssDoc, cell);
                    }

                    dt.Rows.Add(tempRow);
                }
            }

            return dt;
        }

        private static string GetCellValue(SpreadsheetDocument document, Cell cell)
        {
            var stringTablePart = document.WorkbookPart.SharedStringTablePart;
            var value = cell.CellValue.InnerXml;

            if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
                return stringTablePart.SharedStringTable.ChildElements[int.Parse(value)].InnerText;

            return value;
        }

        public static int GetIndex(string name)
        {
            if (string.IsNullOrWhiteSpace(name))
                return -1;

            int index = 0;
            foreach (var ch in name)
            {
                if (char.IsLetter(ch))
                {
                    int value = ch - 'A' + 1;
                    index = value + index * 26;
                }
                else
                    break;
            }

            return index - 1;
        }
    }
}

Usage :

var path = "D:\\Documents\\test.xlsx";
var dt = ExcelUtility.Read(path);

then enjoy it!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文