Is there a way to get the names of the worksheets out of a 20MB+ Excel file with PHPExcel?
I am using PHPExcel to read data out of Excel files.
With the following code, I am able to read one particular worksheet out of a 3MB Excel file in just a couple seconds. Works nicely.
However, I now have 27MB and 88MB Excel files which I need to get data out of. They are so large that even OpenOffice cannot open them.
I have found that I can use an index number instead of a name when I load a sheet, but this seems inconsistent: in one particular Excel file, setLoadSheetsOnly(0) gave me the third sheet, while setLoadSheetsOnly(1) gave me an error even though there were four worksheets in the file. So that seems unreliable for some reason.
Is there a way I can read out the names of the worksheets from a large file so that I can access only one of its worksheets at a time?
$objReader = PHPExcel_IOFactory::createReaderForFile("data/" . $file_name);
$objReader->setLoadSheetsOnly(array($sheet_name));
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load("data/" . $file_name);
echo '<table border="1">';
for ($row = 1; $row < $number_of_rows; $row++) {
    echo '<tr>';
    for ($column = 0; $column < $number_of_columns; $column++) {
        $value = $objPHPExcel->setActiveSheetIndex(0)->getCellByColumnAndRow($column, $row)->getValue();
        echo '<td>';
        echo $value . ' ';
        echo '</td>';
    }
    echo '</tr>';
}
echo '</table>';
die;
ADDENDUM:
I found some code that gets close, but it doesn't seem to always be accurate, e.g. here it missed the second worksheet in a 27MB file:
and here it only got the third worksheet and missed 3 others:
$objReader = PHPExcel_IOFactory::createReaderForFile("data/" . $file_name);
$objReader->setLoadSheetsOnly(0);
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load("data/" . $file_name);
echo $objPHPExcel->getSheetCount(), ' worksheets<hr/>';
$loadedSheetNames = $objPHPExcel->getSheetNames();
foreach ($loadedSheetNames as $sheetIndex => $loadedSheetName) {
    echo $sheetIndex, ' -> ', $loadedSheetName, '<br />';
}
die;
Answers (2)
Unfortunately, it isn't possible to read the names of the worksheets without loading the whole file.
Using an index number rather than a name when calling setLoadSheetsOnly() won't give a predictable result: the code logic that performs that check uses in_array() to test if the sheetname that it is about to read is in the array of sheetnames to read. e.g.
I would suspect that the comparison of a string against a numeric value will be giving a true result for 0 == "mySheetName" when executing this test (based on PHP's loose typing and comparison casting rules).
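To illustrate the suspicion above, here is a minimal sketch of how a loose in_array() check treats a numeric index. The helper sheetIsSelected() is hypothetical and only mimics the kind of check described; it is not PHPExcel's actual code. The loose result also depends on the PHP version, because PHP 8 changed string-to-number comparison so that 0 == "mySheetName" is false there.

```php
<?php
// Hypothetical helper that mimics a "is this sheet in the list to load?" check.
// It is an illustration only, not code taken from PHPExcel.
function sheetIsSelected($sheetName, array $loadSheetsOnly, $strict = false)
{
    // With $strict = false, in_array() compares with ==, so types are juggled.
    return in_array($sheetName, $loadSheetsOnly, $strict);
}

// An index was passed where sheet names are expected.
$requested = array(0);

// Loose check: evaluates "mySheetName" == 0.
// On PHP 5 and 7 the string is cast to 0, so every non-numeric name matches.
// On PHP 8 and later the comparison is false.
var_dump(sheetIsSelected('mySheetName', $requested));

// Strict check: an integer is never identical to a string, on any PHP version.
var_dump(sheetIsSelected('mySheetName', $requested, true));

// Comparing names with names gives the same answer in both modes.
var_dump(sheetIsSelected('mySheetName', array('mySheetName'), true));
```

On older PHP versions this would explain why an index of 0 appears to match whichever non-numeric sheet name is tested, so passing real sheet names is the safer choice.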
I could probably provide a Reader method that would return a list of worksheet names without actually loading the entire file, although there would be a performance hit involved.
EDIT
If you add the following method to Classes/PHPExcel/Reader/Excel2007.php
You can call it by using:
The returned $worksheetNames should contain an array of all worksheet names as UTF-8 strings. Because it only reads the absolute minimum from the .xlsx to retrieve these names, it should be fairly fast. I'll do some more tests before checking it into the PHPExcel SVN, but (for now) it seems to do what you need.
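As a rough sketch of how the name list could then be combined with setLoadSheetsOnly(), assuming PHPExcel is already loaded and that the reader exposes a listWorksheetNames($filename) method as described above. The wrapper loadOneWorksheet() and its parameters are hypothetical names used only for illustration.

```php
<?php
// Assumption: PHPExcel is already included, and the reader returned by
// createReaderForFile() provides listWorksheetNames() as described above.
// loadOneWorksheet() is a hypothetical wrapper, not part of PHPExcel.
function loadOneWorksheet($path, $sheetIndex)
{
    $reader = PHPExcel_IOFactory::createReaderForFile($path);

    // Cheap pass: fetch only the worksheet names.
    $names = $reader->listWorksheetNames($path);
    if (!isset($names[$sheetIndex])) {
        throw new OutOfRangeException("No worksheet at index {$sheetIndex} in {$path}");
    }

    // Expensive pass: load a single sheet, selected by name rather than by index.
    $reader->setLoadSheetsOnly(array($names[$sheetIndex]));
    $reader->setReadDataOnly(true);

    return $reader->load($path);
}
```

Translating the index to a name first sidesteps the unpredictable behaviour of passing a number to setLoadSheetsOnly().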
EDIT2
Equivalent method for the Excel5 Reader
Not as efficient as the Excel2007 Reader version, but should still be faster than parsing the whole .xls file just for the sheet names because I'm only parsing the global stream.
I didn't want to amend phpexcel so I went with this:
It only works on Excel 2007 or later files, but it did what I needed.
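The snippet from this answer is not reproduced here. As a minimal sketch of one way to do this without touching PHPExcel, the function below reads the sheet names straight out of the .xlsx zip container using PHP's ZipArchive and SimpleXML extensions. The function name listXlsxSheetNames() is hypothetical, and the code assumes the workbook part sits at the usual xl/workbook.xml path rather than resolving it through _rels/.rels.

```php
<?php
// Hypothetical standalone helper, not part of PHPExcel.
// Assumption: the workbook part is stored at xl/workbook.xml (the usual location).
function listXlsxSheetNames($path)
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Could not open {$path} as a zip archive");
    }

    // Only this one small XML part is read; the sheet data is never touched.
    $xml = $zip->getFromName('xl/workbook.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException("No xl/workbook.xml found in {$path}");
    }

    $workbook = simplexml_load_string($xml);
    if ($workbook === false) {
        throw new RuntimeException("xl/workbook.xml in {$path} is not valid XML");
    }

    // Collect the name attribute of each <sheet> element, in workbook order.
    $names = array();
    if (isset($workbook->sheets)) {
        foreach ($workbook->sheets->sheet as $sheet) {
            $names[] = (string) $sheet['name'];
        }
    }

    return $names;
}
```

A call such as listXlsxSheetNames("data/" . $file_name) would give an array of names that can be passed one at a time to setLoadSheetsOnly(). Older .xls files are not zip archives, so this approach does not apply to them.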