将 Excel 文件导入 R、xlsx 或 xls

发布于 2024-11-29 09:44:13 字数 507 浏览 1 评论 0原文

请有人帮助我找到将 excel 2007 (.xlsx) 文件导入 R 的最佳方法。我尝试了多种方法,但似乎都不起作用。我已经升级到2.13.1,windows XP,xlsx 0.3.0,我不知道为什么错误不断出现。我尝试过:

AB<-read.xlsx("C:/AB_DNA_Tag_Numbers.xlsx","DNA_Tag_Numbers")

或者

AB<-read.xlsx("C:/AB_DNA_Tag_Numbers.xlsx",1)

但收到错误:

 Error in .jnew("java/io/FileInputStream", file) : 
  java.io.FileNotFoundException: C:\AB_DNA_Tag_Numbers.xlsx (The system cannot find the file specified)

谢谢。

Please can someone help me on the best way to import an excel 2007 (.xlsx) file into R. I have tried several methods and none seems to work. I have upgraded to 2.13.1, windows XP, xlsx 0.3.0, I don't know why the error keeps coming up. I tried:

AB<-read.xlsx("C:/AB_DNA_Tag_Numbers.xlsx","DNA_Tag_Numbers")

OR

AB<-read.xlsx("C:/AB_DNA_Tag_Numbers.xlsx",1)

but I get the error:

 Error in .jnew("java/io/FileInputStream", file) : 
  java.io.FileNotFoundException: C:\AB_DNA_Tag_Numbers.xlsx (The system cannot find the file specified)

Thank you.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(17

萌无敌 2024-12-06 09:44:13

对于没有繁琐的外部依赖项*的解决方案,现在有 readxl

readxl 包可以轻松地将数据从 Excel 中取出并输入 R。
与许多现有的软件包(例如 gdata、xlsx、
xlsReadWrite) readxl 没有外部依赖项,因此很容易
在所有操作系统上安装和使用。它旨在与
表格数据存储在单个工作表中。

Readxl 支持传统的 .xls 格式和现代的基于 xml 的格式
.xlsx 格式。通过 libxls C 库可以实现 .xls 支持,
它抽象了底层二进制文件的许多复杂性
格式。为了解析 .xlsx,我们使用 RapidXML C++ 库。

它可以像这样安装:

install.packages("readxl") # CRAN version

devtools::install_github("hadley/readxl") # development version

用法

library(readxl)

# read_excel reads both xls and xlsx files
read_excel("my-old-spreadsheet.xls")
read_excel("my-new-spreadsheet.xlsx")

# Specify sheet with a number or name
read_excel("my-spreadsheet.xls", sheet = "data")
read_excel("my-spreadsheet.xls", sheet = 2)

# If NAs are represented by something other than blank cells,
# set the na argument
read_excel("my-spreadsheet.xls", na = "NA")

* 不严格正确,它需要 Rcpp,这又需要 Rtools(适用于 Windows)或 Xcode(适用于 OSX),它们是 R 外部的依赖项。但它们不需要任何路径等摆弄,因此这比 Java 和Perl 依赖项。

更新 现在有 rexcel 包。这有望从 Excel 文件中获取 Excel 格式、函数和许多其他类型的信息并输入 R。

For a solution that is free of fiddly external dependencies*, there is now readxl:

The readxl package makes it easy to get data out of Excel and into R.
Compared to many of the existing packages (e.g. gdata, xlsx,
xlsReadWrite) readxl has no external dependencies so it's easy to
install and use on all operating systems. It is designed to work with
tabular data stored in a single sheet.

Readxl supports both the legacy .xls format and the modern xml-based
.xlsx format. .xls support is made possible the with libxls C library,
which abstracts away many of the complexities of the underlying binary
format. To parse .xlsx, we use the RapidXML C++ library.

It can be installed like so:

install.packages("readxl") # CRAN version

or

devtools::install_github("hadley/readxl") # development version

Usage

library(readxl)

# read_excel reads both xls and xlsx files
read_excel("my-old-spreadsheet.xls")
read_excel("my-new-spreadsheet.xlsx")

# Specify sheet with a number or name
read_excel("my-spreadsheet.xls", sheet = "data")
read_excel("my-spreadsheet.xls", sheet = 2)

# If NAs are represented by something other than blank cells,
# set the na argument
read_excel("my-spreadsheet.xls", na = "NA")

* not strictly true, it requires the Rcpp package, which in turn requires Rtools (for Windows) or Xcode (for OSX), which are dependencies external to R. But they don't require any fiddling with paths, etc., so that's an advantage over Java and Perl dependencies.

Update There is now the rexcel package. This promises to get Excel formatting, functions and many other kinds of information from the Excel file and into R.

人生戏 2024-12-06 09:44:13

您可能还想尝试 XLConnect 软件包。我的运气比 xlsx 好(而且它也可以读取 .xls 文件)。

library(XLConnect)
theData <- readWorksheet(loadWorkbook("C:/AB_DNA_Tag_Numbers.xlsx"),sheet=1)

另外,如果您遇到找不到文件的问题,请尝试使用 file.choose() 选择它。

You may also want to try the XLConnect package. I've had better luck with it than xlsx (plus it can read .xls files too).

library(XLConnect)
theData <- readWorksheet(loadWorkbook("C:/AB_DNA_Tag_Numbers.xlsx"),sheet=1)

also, if you are having trouble with your file not being found, try selecting it with file.choose().

我很坚强 2024-12-06 09:44:13

已过时:自 2024 年起,read.xls 已从 gdata 中删除。

来自 NEWS 文件:“首先支持 Excel 文件早在 2004 年就在 gregmisc/gdata 1.11.0 中引入,但今天我们有 openxlsx、readxl、XLConnect 和 xlsx 等软件包,提供专用的 Excel 文件支持和更多功能。”


我肯定会尝试 gdata 包中的 read.xls 函数,它比 xlsx 包成熟得多。它可能需要 Perl ...

obsolete: read.xls has been removed from gdata as of 2024.

from the NEWS file: "Excel file support was first introduced in gregmisc/gdata 1.11.0 back in 2004, but today we have packages such as openxlsx, readxl, XLConnect, and xlsx offering dedicated Excel file support with more features."


I would definitely try the read.xls function in the gdata package, which is considerably more mature than the xlsx package. It may require Perl ...

清晨说晚安 2024-12-06 09:44:13

更新

由于下面的答案现在有些过时,我只是提请注意 readxl 包。如果 Excel 工作表格式正确/布局良好,那么我现在将使用 readxl 从工作簿中读取内容。如果工作表格式不佳/布局不当,那么我仍然会导出到 CSV,然后通过 read.csv() 或普通的旧 readLines() 处理 R 中的问题。

原始

我的首选方法是将单个 Excel 工作表保存在逗号分隔值 (CSV) 文件中。在 Windows 上,这些文件与 Excel 关联,因此您不会失去在 Excel 中双击打开的“功能”。

CSV 文件可以使用 read.csv() 读入 R,或者,如果您位于某个位置或使用设置了某些欧洲设置的计算机(其中 , 为用作小数位),使用 read.csv2()

这些函数具有合理的默认值,使读取适当格式的文件变得简单。只需将样本或变量的标签保留在第一行或第一列即可。

将文件存储在 CSV 中的额外好处是,由于文件是纯文本,因此可以非常轻松地传递它们,并且您可以确信它们可以在任何地方打开;人们不需要 Excel 来查看或编辑数据。

Update

As the Answer below is now somewhat outdated, I'd just draw attention to the readxl package. If the Excel sheet is well formatted/lain out then I would now use readxl to read from the workbook. If sheets are poorly formatted/lain out then I would still export to CSV and then handle the problems in R either via read.csv() or plain old readLines().

Original

My preferred way is to save individual Excel sheets in comma separated value (CSV) files. On Windows, these files are associated with Excel so you don't loose the double-click-open-in-Excel "feature".

CSV files can be read into R using read.csv(), or, if you are in a location or using a computer set up with some European settings (where , is used as the decimal place), using read.csv2().

These functions have sensible defaults that makes reading appropriately formatted files simple. Just keep any labels for samples or variables in the first row or column.

Added benefits of storing files in CSV are that as the files are plain text they can be passed around very easily and you can be confident they will open anywhere; one doesn't need Excel to look at or edit the data.

病女 2024-12-06 09:44:13

示例 2012:

library("xlsx")
FirstTable <- read.xlsx("MyExcelFile.xlsx", 1 , stringsAsFactors=F)
SecondTable <- read.xlsx("MyExcelFile.xlsx", 2 , stringsAsFactors=F)
  • 我会尝试“xlsx”包,因为它很容易处理,而且看起来足够成熟,
  • 对我来说工作得很好,不需要任何额外的东西,比如 Perl 或其他

示例 2015:

library("readxl")
FirstTable  <- read_excel("MyExcelFile.xlsx", 1)
SecondTable <- read_excel("MyExcelFile.xlsx", 2)
  • 现在我使用 readxl 并获得了很好的体验。
  • 不需要额外的东西
  • 良好的性能

Example 2012:

library("xlsx")
FirstTable <- read.xlsx("MyExcelFile.xlsx", 1 , stringsAsFactors=F)
SecondTable <- read.xlsx("MyExcelFile.xlsx", 2 , stringsAsFactors=F)
  • I would try 'xlsx' package for it is easy to handle and seems mature enough
  • worked fine for me and did not need any additionals like Perl or whatever

Example 2015:

library("readxl")
FirstTable  <- read_excel("MyExcelFile.xlsx", 1)
SecondTable <- read_excel("MyExcelFile.xlsx", 2)
  • nowadays I use readxl and have made good experience with it.
  • no extra stuff needed
  • good performance
天冷不及心凉 2024-12-06 09:44:13

这个新包看起来不错 http://cran.r-project.org/ web/packages/openxlsx/openxlsx.pdf
它不需要 rJava,并使用“Rcpp”来提高速度。

This new package looks nice http://cran.r-project.org/web/packages/openxlsx/openxlsx.pdf
It doesn't require rJava and is using 'Rcpp' for speed.

攒一口袋星星 2024-12-06 09:44:13

如果您遇到同样的问题并且 R 给您一个错误 - 找不到函数“.jnew” - 只需安装 rJava 库即可。或者,如果您已经有了它,只需运行行库(rJava)。这应该是问题所在。

另外,每个人都应该清楚,csv 和 txt 文件更容易使用,但生活并不容易,有时您只需打开 xlsx。

If you are running into the same problem and R is giving you an error -- could not find function ".jnew" -- Just install the library rJava. Or if you have it already just run the line library(rJava). That should be the problem.

Also, it should be clear to everybody that csv and txt files are easier to work with, but life is not easy and sometimes you just have to open an xlsx.

蓝眸 2024-12-06 09:44:13

对我来说 openxlx 包以最简单的方式工作。

install.packages("openxlsx")
library(openxlsx)
rawData<-read.xlsx("your.xlsx");

For me the openxlx package worked in the easiest way.

install.packages("openxlsx")
library(openxlsx)
rawData<-read.xlsx("your.xlsx");
维持三分热 2024-12-06 09:44:13

在意识到 xlxs 包尚未针对 R 3.1.0 进行更新后,我最近发现了 Schaun Wheeler 的将 excel 文件导入到 R 中的功能。

https://gist.github.com/schaunwheeler/5825002

文件名需要包含“ .xlsx”扩展名,并且运行该函数时无法打开该文件。

这个功能对于访问其他人的工作非常有用。与使用 read.csv 函数相比,主要优点是

  • 导入多个 Excel 文件、
  • 导入大文件
  • 、定期更新的文件

使用 read.csv 函数需要手动打开和保存每个 Excel 文档,这非常耗时且非常无聊。因此,使用 Schaun 函数来自动化工作流程是一个巨大的帮助。

对 Schaun 的这个解决方案表示大力支持。

I recently discovered Schaun Wheeler's function for importing excel files into R after realising that the xlxs package hadn't been updated for R 3.1.0.

https://gist.github.com/schaunwheeler/5825002

The file name needs to have the ".xlsx" extension and the file can't be open when you run the function.

This function is really useful for accessing other peoples work. The main advantages over using the read.csv function are when

  • Importing multiple excel files
  • Importing large files
  • Files that are updated regularly

Using the read.csv function requires manual opening and saving of each Excel document which is time consuming and very boring. Using Schaun's function to automate the workflow is therefore a massive help.

Big props to Schaun for this solution.

最单纯的乌龟 2024-12-06 09:44:13

你的操作系统是什么?您运行的是哪个版本的 R:32 位还是 64 位?您安装了哪个版本的 Java?

当我第一次开始使用 read.xlsx() 函数时,我遇到了类似的错误,并发现我的问题(可能与您的问题相关,也可能不相关;至少,此响应应被视为“也试试这个”)与 .xlsx pacakge 与 64 位 Java 的不兼容有关。我相当确定 .xlsx 包需要 32 位 Java。

使用 32 位 R 并确保安装了 32 位 Java。这可能会解决您的问题。

What's your operating system? What version of R are you running: 32-bit or 64-bit? What version of Java do you have installed?

I had a similar error when I first started using the read.xlsx() function and discovered that my issue (which may or may not be related to yours; at a minimum, this response should be viewed as "try this, too") was related to the incompatability of .xlsx pacakge with 64-bit Java. I'm fairly certain that the .xlsx package requires 32-bit Java.

Use 32-bit R and make sure that 32-bit Java is installed. This may address your issue.

怀里藏娇 2024-12-06 09:44:13

您已经检查过 R 实际上能够找到该文件,例如 file.exists("C:/AB_DNA_Tag_Numbers.xlsx") 吗? – Ben Bolker,2011 年 8 月 14 日 23:05

以上评论应该已经解决了您的问题:

require("xlsx")
read.xlsx("filepath/filename.xlsx",1) 

之后应该可以正常工作。

You have checked that R is actually able to find the file, e.g. file.exists("C:/AB_DNA_Tag_Numbers.xlsx") ? – Ben Bolker Aug 14 '11 at 23:05

Above comment should've solved your problem:

require("xlsx")
read.xlsx("filepath/filename.xlsx",1) 

should work fine after that.

浪荡不羁 2024-12-06 09:44:13

我已经非常努力地尝试了上面的所有答案。然而,他们实际上并没有帮助,因为我使用的是 Mac。 rio库有这个导入功能,基本上可以将任何类型的数据文件导入Rstudio,甚至是那些使用非英语语言的文件!

尝试以下代码:

    library(rio)
    AB <- import("C:/AB_DNA_Tag_Numbers.xlsx")
    AB <- AB[,1]

希望这会有所帮助。
更详细的参考:https://cran.r-project .org/web/packages/rio/vignettes/rio.html

I have tried very hard on all the answers above. However, they did not actually help because I used a mac. The rio library has this import function which can basically import any type of data file into Rstudio, even those file using languages other than English!

Try codes below:

    library(rio)
    AB <- import("C:/AB_DNA_Tag_Numbers.xlsx")
    AB <- AB[,1]

Hope this help.
For more detailed reference: https://cran.r-project.org/web/packages/rio/vignettes/rio.html

半葬歌 2024-12-06 09:44:13

如果导出为 OpenDocument 电子表格文件 (ods) 或较旧的 Excel 格式并使用 ODS 阅读器 或您上面提到的 Excel 阅读器。

You may be able to keep multiple tabs and more formatting information if you export to an OpenDocument Spreadsheet file (ods) or an older Excel format and import it with the ODS reader or the Excel reader you mentioned above.

握住你手 2024-12-06 09:44:13

正如这里许多人所说,我正在写同样的事情,但还有一点!

首先,我们需要确保我们的 R Studio 安装了这两个软件包:

  1. “readxl”
  2. “XLConnect”

为了在 R 中加载软件包,您可以使用以下功能:

install.packages("readxl/XLConnect")
library(XLConnect)
search()

搜索将显示您的当前可用软件包的列表R工作室。

现在另一个问题是,即使您可能拥有这两个包,但在读取“xlsx”文件时仍然可能会遇到问题,并且错误可能类似于“错误:列数多于列名”

要解决此问题,您只需重新保存 Excel 工作表即可“xlsx”到

“CSV(逗号分隔)”

,您的生活将变得超级轻松......

玩得开心!

As stated by many here, I am writing the same thing but with an additional point!

At first we need to make sure that our R Studio has these two packages installed:

  1. "readxl"
  2. "XLConnect"

In order to load a package in R you can use the below function:

install.packages("readxl/XLConnect")
library(XLConnect)
search()

search will display the list of current packages being available in your R Studio.

Now another catch, even though you might have these two packages but still you may encounter problem while reading "xlsx" file and the error could be like "error: more columns than column name"

To solve this issue you can simply resave your excel sheet "xlsx" in to

"CSV (Comma delimited)"

and your life will be super easy....

Have fun!!

贪恋 2024-12-06 09:44:13

xlsx 软件包的安装需要 rJava 和 xlsxjars。它们间接需要系统上特定的(32 或 64 位)java 运行时环境。

read.xlsx 的优点:在同一个包中,有 read.xlsxwrite.xlsx

缺点:速度非常慢

正如所建议的,简单方法是从 Excel 中保存为 .csv 格式。

5800x15 数据集(中值)的简单基准

  1. read.xlsx:>10000ms
  2. read_xlsx:70ms
  3. read.csv:15ms

The installation of xlsx package require rJava and xlsxjars. Indirectly they require the specific (32 or 64 bit) java runtime environment on the system.

Pro of read.xlsx: In the same package there are read.xlsx and write.xlsx

Con: Very low speed

As suggested, the easy way is to save in .csv format from excel.

Simple benchmark on a 5800x15 dataset (median)

  1. read.xlsx: >10000ms
  2. read_xlsx: 70ms
  3. read.csv: 15ms
美人骨 2024-12-06 09:44:13

您可以使用以下方法读取 xlsx 或 xls 文件。此方法仅适用于 Windows。

library(RDCOMClient)

path_Excel_File <- "D:/empty_Excel_File.xls"

xlApp <- COMCreate("Excel.Application")
xlApp[["Visible"]] <- TRUE
xlWbk <- xlApp$Workbooks()$Open(path_Excel_File)
Sheets <- xlWbk$Sheets()

Mat_Val <- matrix(NA, nrow = 2, ncol = 2)

for(i in 1 : 2)
{
  for(j in 1 : 2)
  {
    Mat_Val[i, j] <- Sheets[[1]]$Cells(i, j)$Value()
  }
}

You can read a xlsx or xls file with the following approach. This approach only works on Windows.

library(RDCOMClient)

path_Excel_File <- "D:/empty_Excel_File.xls"

xlApp <- COMCreate("Excel.Application")
xlApp[["Visible"]] <- TRUE
xlWbk <- xlApp$Workbooks()$Open(path_Excel_File)
Sheets <- xlWbk$Sheets()

Mat_Val <- matrix(NA, nrow = 2, ncol = 2)

for(i in 1 : 2)
{
  for(j in 1 : 2)
  {
    Mat_Val[i, j] <- Sheets[[1]]$Cells(i, j)$Value()
  }
}
攒一口袋星星 2024-12-06 09:44:13

openxlsx2

对于仅 xlsx 文件,您可以使用 openxlsx2 包,它旨在简化/增强原始 openxlsx 包。它于 2022 年添加到 CRAN:

library(openxlsx2)

df <- read_xlsx("Book1.xlsx")

# OR 

df <- wb_read("Book1.xlsx")

您可以使用 wb_load() 保留工作簿格式、样式等:

wb <- wb_load("Book1.xlsx")
df <- wb_to_df(wb) # specify importing options

因此,在您的特定情况下,其中任何一个都可以工作。您可以通过名称或索引指定工作表(此处仅使用工作表名称):

AB <- read_xlsx("C:/AB_DNA_Tag_Numbers.xlsx","DNA_Tag_Numbers")
AB <- wb_read("C:/AB_DNA_Tag_Numbers.xlsx", "DNA_Tag_Numbers")

wb <- wb_load("C:/AB_DNA_Tag_Numbers.xlsx","DNA_Tag_Numbers")
AB <- wb_to_df(wb)

openxlsx2

For just xlsx files you can use the openxlsx2 package, which is meant to simplify/enhance the original openxlsx package. It was added to the CRAN in 2022:

library(openxlsx2)

df <- read_xlsx("Book1.xlsx")

# OR 

df <- wb_read("Book1.xlsx")

You can preserve workbook formatting, styles, etc. by using wb_load():

wb <- wb_load("Book1.xlsx")
df <- wb_to_df(wb) # specify importing options

So in your specific case any of these would work. You can specify the sheet by either name or index (only using sheet name here):

AB <- read_xlsx("C:/AB_DNA_Tag_Numbers.xlsx","DNA_Tag_Numbers")
AB <- wb_read("C:/AB_DNA_Tag_Numbers.xlsx", "DNA_Tag_Numbers")

wb <- wb_load("C:/AB_DNA_Tag_Numbers.xlsx","DNA_Tag_Numbers")
AB <- wb_to_df(wb)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文