如何使用 OpenXML Format SDK 从电子表格读取数据?

发布于 2024-08-28 18:19:30 字数 134 浏览 11 评论 0原文

我需要使用 Open XML SDK 2.0 从 Excel 2007 工作簿中的单个工作表读取数据。我花了很多时间寻找执行此操作的基本指南,但我只找到了创建电子表格的帮助。

如何使用此 SDK 迭代工作表中的行,然后迭代每行中的单元格?

I need to read data from a single worksheet in an Excel 2007 workbook using the Open XML SDK 2.0. I have spent a lot of time searching for basic guidelines to doing this, but I have only found help on creating spreadsheets.

How do I iterate rows in a worksheet and then iterate the cells in each row, using this SDK?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

指尖上得阳光 2024-09-04 18:19:30

另一个答案似乎更像是元答案。我一直在努力解决这个问题,因为使用 LINQ 确实可以处理分离的文档部分。以下代码包含一个包装函数,用于从 Cell 获取值,从而解析任何可能的字符串查找。

public void ExcelDocTest()
{
    Debug.WriteLine("Running through sheet.");
    int rowsComplete = 0;

    using (SpreadsheetDocument spreadsheetDocument =
                    SpreadsheetDocument.Open(@"path\to\Spreadsheet.xlsx", false))
    {
        WorkbookPart workBookPart = spreadsheetDocument.WorkbookPart;

        foreach (Sheet s in workBookPart.Workbook.Descendants<Sheet>())
        {
            WorksheetPart wsPart = workBookPart.GetPartById(s.Id) as WorksheetPart;
            Debug.WriteLine("Worksheet {1}:{2} - id({0}) {3}", s.Id, s.SheetId, s.Name,
                wsPart == null ? "NOT FOUND!" : "found.");

            if (wsPart == null)
            {
                continue;
            }

            Row[] rows = wsPart.Worksheet.Descendants<Row>().ToArray();

            //assumes the first row contains column names 
            foreach (Row row in wsPart.Worksheet.Descendants<Row>())
            {
                rowsComplete++;

                bool emptyRow = true;
                List<object> rowData = new List<object>();
                string value;

                foreach (Cell c in row.Elements<Cell>())
                {
                    value = GetCellValue(c);
                    emptyRow = emptyRow && string.IsNullOrWhiteSpace(value);
                    rowData.Add(value);
                }

                Debug.WriteLine("Row {0}: {1}", row,
                    emptyRow ? "EMPTY!" : string.Join(", ", rowData));
            }
        }

    }
    Debug.WriteLine("Done, processed {0} rows.", rowsComplete);
}

public static string GetCellValue(Cell cell)
{
    if (cell == null)
        return null;
    if (cell.DataType == null)
        return cell.InnerText;

    string value = cell.InnerText;
    switch (cell.DataType.Value)
    {
        case CellValues.SharedString:
            // For shared strings, look up the value in the shared strings table.
            // Get worksheet from cell
            OpenXmlElement parent = cell.Parent;
            while (parent.Parent != null && parent.Parent != parent
                    && string.Compare(parent.LocalName, "worksheet", true) != 0)
            {
                parent = parent.Parent;
            }
            if (string.Compare(parent.LocalName, "worksheet", true) != 0)
            {
                throw new Exception("Unable to find parent worksheet.");
            }

            Worksheet ws = parent as Worksheet;
            SpreadsheetDocument ssDoc = ws.WorksheetPart.OpenXmlPackage as SpreadsheetDocument;
            SharedStringTablePart sstPart = ssDoc.WorkbookPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();

            // lookup value in shared string table
            if (sstPart != null && sstPart.SharedStringTable != null)
            {
                value = sstPart.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
            }
            break;

        //this case within a case is copied from msdn. 
        case CellValues.Boolean:
            switch (value)
            {
                case "0":
                    value = "FALSE";
                    break;
                default:
                    value = "TRUE";
                    break;
            }
            break;
    }
    return value;
}

编辑:感谢 @Nitin-Jadhav 对 GetCellValue() 的更正。

The other answer seemed more like a meta-answer. I have been struggling with this since using LINQ does work with separated document parts. The following code includes a wrapper function to get the value from a Cell, resolving any possible string lookups.

public void ExcelDocTest()
{
    Debug.WriteLine("Running through sheet.");
    int rowsComplete = 0;

    using (SpreadsheetDocument spreadsheetDocument =
                    SpreadsheetDocument.Open(@"path\to\Spreadsheet.xlsx", false))
    {
        WorkbookPart workBookPart = spreadsheetDocument.WorkbookPart;

        foreach (Sheet s in workBookPart.Workbook.Descendants<Sheet>())
        {
            WorksheetPart wsPart = workBookPart.GetPartById(s.Id) as WorksheetPart;
            Debug.WriteLine("Worksheet {1}:{2} - id({0}) {3}", s.Id, s.SheetId, s.Name,
                wsPart == null ? "NOT FOUND!" : "found.");

            if (wsPart == null)
            {
                continue;
            }

            Row[] rows = wsPart.Worksheet.Descendants<Row>().ToArray();

            //assumes the first row contains column names 
            foreach (Row row in wsPart.Worksheet.Descendants<Row>())
            {
                rowsComplete++;

                bool emptyRow = true;
                List<object> rowData = new List<object>();
                string value;

                foreach (Cell c in row.Elements<Cell>())
                {
                    value = GetCellValue(c);
                    emptyRow = emptyRow && string.IsNullOrWhiteSpace(value);
                    rowData.Add(value);
                }

                Debug.WriteLine("Row {0}: {1}", row,
                    emptyRow ? "EMPTY!" : string.Join(", ", rowData));
            }
        }

    }
    Debug.WriteLine("Done, processed {0} rows.", rowsComplete);
}

public static string GetCellValue(Cell cell)
{
    if (cell == null)
        return null;
    if (cell.DataType == null)
        return cell.InnerText;

    string value = cell.InnerText;
    switch (cell.DataType.Value)
    {
        case CellValues.SharedString:
            // For shared strings, look up the value in the shared strings table.
            // Get worksheet from cell
            OpenXmlElement parent = cell.Parent;
            while (parent.Parent != null && parent.Parent != parent
                    && string.Compare(parent.LocalName, "worksheet", true) != 0)
            {
                parent = parent.Parent;
            }
            if (string.Compare(parent.LocalName, "worksheet", true) != 0)
            {
                throw new Exception("Unable to find parent worksheet.");
            }

            Worksheet ws = parent as Worksheet;
            SpreadsheetDocument ssDoc = ws.WorksheetPart.OpenXmlPackage as SpreadsheetDocument;
            SharedStringTablePart sstPart = ssDoc.WorkbookPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();

            // lookup value in shared string table
            if (sstPart != null && sstPart.SharedStringTable != null)
            {
                value = sstPart.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
            }
            break;

        //this case within a case is copied from msdn. 
        case CellValues.Boolean:
            switch (value)
            {
                case "0":
                    value = "FALSE";
                    break;
                default:
                    value = "TRUE";
                    break;
            }
            break;
    }
    return value;
}

Edit: Thanks @Nitin-Jadhav for the correction to GetCellValue().

轻拂→两袖风尘 2024-09-04 18:19:30

我这样做的方法是使用 Linq。关于这个主题有很多示例,从使用 SDK 到仅使用纯 Open XML(无 SDK)。请查看:

The way I do this is with Linq. There are lots of sample around on this subject from using the SDK to just going with pure Open XML (no SDK). Take a look at:

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文