Reading files from Google Drive

Posted on 2025-02-13 19:52:05

I have a spreadsheet uploaded as a CSV file to Google Drive, shared so that anyone with the link can read it.
This is the link to the CSV file:
https://docs.google.com/spreadsheets/d/170235QwbmgQvr0GWmT-8yBsC7Vk6p_dmvYxrZNfsKqk/edit?usp=sharing

I am trying to read it from R but I am getting a long list of error messages. I am using:

id = "170235QwbmgQvr0GWmT-8yBsC7Vk6p_dmvYxrZNfsKqk"
read.csv(sprintf("https://docs.google.com/spreadsheets/d/uc?id=%s&export=download",id))

Could someone suggest how to read files from Google Drive directly into R?


面犯桃花 2025-02-20 19:52:05

I would publish the sheet to the web as a CSV file (doc) and then read it from there.

It seems like your file is already published as a CSV. So, this should work. (Note that the URL ends with /pub?output=csv)

read.csv("https://docs.google.com/spreadsheets/d/170235QwbmgQvr0GWmT-8yBsC7Vk6p_dmvYxrZNfsKqk/pub?output=csv")
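The URL used above can also be built from the share link rather than typed by hand. The following is only a minimal sketch, assuming the sheet has already been published to the web; the helper names `sheet_id_from_url` and `published_csv_url` are hypothetical and not part of any package.

```r
# Hypothetical helper: pull the spreadsheet id out of a share link.
sheet_id_from_url <- function(url) {
  m <- regmatches(url, regexec("/spreadsheets/d/([A-Za-z0-9_-]+)", url))[[1]]
  if (length(m) < 2) stop("no spreadsheet id found in URL")
  m[2]
}

# Hypothetical helper: build the published-CSV URL pattern shown in this answer.
published_csv_url <- function(id) {
  sprintf("https://docs.google.com/spreadsheets/d/%s/pub?output=csv", id)
}

share_link <- "https://docs.google.com/spreadsheets/d/170235QwbmgQvr0GWmT-8yBsC7Vk6p_dmvYxrZNfsKqk/edit?usp=sharing"
csv_url <- published_csv_url(sheet_id_from_url(share_link))

# The download itself needs network access, so it is left commented out:
# df <- read.csv(csv_url)
```

Keeping the id extraction separate from the URL construction makes it easy to reuse the same id with a different endpoint if the sheet is shared in another way.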


疑心病 2025-02-20 19:52:05

To read the CSV file faster, you can use vroom, which is even faster than data.table's fread(). See here.

Now using vroom,

library(vroom)

vroom("https://docs.google.com/spreadsheets/d/170235QwbmgQvr0GWmT-8yBsC7Vk6p_dmvYxrZNfsKqk/pub?output=csv")

#> Rows: 387048 Columns: 14
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (6): StationCode, SampleID, WeatherCode, OrganismCode, race, race2
#> dbl  (7): WaterTemperature, Turbidity, Velocity, ForkLength, Weight, Count, ...
#> date (1): SampleDate
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 387,048 × 14
#>    StationCode SampleDate SampleID WeatherCode WaterTemperature Turbidity
#>    <chr>       <date>     <chr>    <chr>                  <dbl>     <dbl>
#>  1 Gate 11     2000-04-25 116_00   CLD                    13.1       2   
#>  2 Gate 5      1995-04-26 117_95   CLR                    NA         2   
#>  3 Gate 2      1995-04-21 111_95   W                      10.4      12   
#>  4 Gate 6      2008-12-13 348_08   CLR                    49.9       1.82
#>  5 Gate 5      1999-12-10 344_99   CLR                     7.30      1.5 
#>  6 Gate 6      2012-05-25 146_12   CLR                    55.5       1.60
#>  7 Gate 10     2011-06-28 179_11   RAN                    57.3       3.99
#>  8 Gate 11     1996-04-25 116_96   CLR                    13.8      21   
#>  9 Gate 9      2007-07-02 183_07   CLR                    56.6       2.09
#> 10 Gate 6      2009-06-04 155_09   CLR                    58.6       3.08
#> # … with 387,038 more rows, and 8 more variables: Velocity <dbl>,
#> #   OrganismCode <chr>, ForkLength <dbl>, Weight <dbl>, Count <dbl>,
#> #   race <chr>, year <dbl>, race2 <chr>

Created on 2022-07-08 by the reprex package (v2.0.1)
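The message in the output suggests specifying the column types to avoid the guessing step. A rough sketch of what that could look like is below. To stay runnable offline it reads a one-row in-memory stand-in (three of the columns from the output above, wrapped in `I()`) instead of the real URL, so the data here is illustrative only.

```r
library(vroom)

# In-memory stand-in for the published CSV (illustrative, not the real file).
sample_csv <- I("StationCode,SampleDate,WaterTemperature\nGate 11,2000-04-25,13.1\n")

# Declaring every column type up front skips type guessing and
# suppresses the "Column specification" message.
df <- vroom(
  sample_csv,
  delim = ",",
  col_types = cols(
    StationCode = col_character(),
    SampleDate = col_date(),
    WaterTemperature = col_double()
  )
)
```

With the real sheet, the same `col_types` argument would be passed alongside the URL. If guessing is acceptable and only the message is unwanted, `show_col_types = FALSE` is the lighter option.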


少女情怀诗 2025-02-20 19:52:05

googlesheets4 works well for me.
I don't remember the specifics, but I believe you have to give the library permission to access your Google account. Something to consider.

library("googlesheets4")

sheetURL <- "https://docs.google.com/spreadsheets/d/1owVDl0dZDT4jijtA9GsKu8TO0lDHH14PgPWMqmfvaQk/edit?usp=sharing"

file <- googlesheets4::read_sheet(as_id(sheetURL),sheet = "theSheet")

You can also specify the particular sheet you're after with the sheet argument. By default it grabs the first sheet in the document. The file object is an R data frame.
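For a sheet that is readable by anyone with the link, the permission step may not be needed: googlesheets4 provides `gs4_deauth()`, which tells the package to send requests without a user token. A minimal sketch follows, assuming the sheet is shared publicly; the wrapper name `read_public_sheet` is hypothetical.

```r
# Hypothetical wrapper: read a publicly shared sheet without signing in.
# Assumes the googlesheets4 package is installed and the sheet is
# readable by anyone with the link.
read_public_sheet <- function(url, sheet = NULL) {
  # Switch the session to unauthenticated requests (affects later calls too).
  googlesheets4::gs4_deauth()
  # sheet = NULL keeps the default of reading the first sheet.
  googlesheets4::read_sheet(url, sheet = sheet)
}

# Needs network access, so the call is left commented out:
# file <- read_public_sheet(sheetURL, sheet = "theSheet")
```

Note that `gs4_deauth()` changes state for the whole R session, so private sheets will need authentication to be set up again afterwards.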
