Converting a data frame to tidy format in R

Posted 2025-02-05 02:59:23 · 2171 words · 1 view · 0 comments

I have a data frame with the following structure:
> ls.str(df)

attachments : 'data.frame': 1103947 obs. of  2 variables:
 $ media_keys:List of 1103947
 $ poll_ids  :List of 1103947
author_id :  chr [1:1103947] "21572351" "21572351" "21572351" "21572351" "21572351" "21572351" "21572351" "21572351" "21572351" "21572351" ...
conversation_id :  chr [1:1103947] "1006266341341519872" "1006265425791987715" "1006251577747869696" "1006246236171722753" "1006246168991600642" ...
created_at :  chr [1:1103947] "2018-06-11T20:06:05.000Z" "2018-06-11T20:02:27.000Z" "2018-06-11T19:07:26.000Z" "2018-06-11T18:46:12.000Z" ...
entities : 'data.frame':    1103947 obs. of  5 variables:
 $ mentions   :List of 1103947
 $ annotations:List of 1103947
 $ hashtags   :List of 1103947
 $ urls       :List of 1103947
 $ cashtags   :List of 1103947
geo : 'data.frame': 1103947 obs. of  2 variables:
 $ place_id   : chr  NA NA NA NA ...
 $ coordinates:'data.frame':    1103947 obs. of  2 variables:
id :  chr [1:1103947] "1006266341341519872" "1006265425791987715" "1006251577747869696" "1006246236171722753" "1006246168991600642" ...
in_reply_to_user_id :  chr [1:1103947] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ...

I want to convert it to a tidy format. Is there a neat little function for this that I don't know of? Google hasn't been much help. Thanks in advance!
By tidy format, I mean something like this:
#> # A tibble: 25 × 31
#>    tweet_id       user_username text  conversation_id author_id lang  created_at
#>    <chr>          <chr>         <chr> <chr>           <chr>     <chr> <chr>     
#>  1 1406007405180… Phardiga      "RT … 14060074051803… 58755490  de    2021-06-1…
#>  2 1405617386405… dorothee_goe… "RT … 14056173864058… 97759337… de    2021-06-1…
#>  3 1405616047990… dejools       "RT … 14056160479909… 13065071… de    2021-06-1…
#>  4 1405615055555… LenaOetzel    "RT … 14056150555557… 97897581… de    2021-06-1…
#>  5 1405613064968… jenniferhenk… "RT … 14056130649684… 114774406 de    2021-06-1…
#>  6 1405610724026… Tobias_Schul… "Ihr… 14056107240266… 47919307  de    2021-06-1…
#>  7 1405393033558… HTMIBerlin    "…    14053930335589… 94052353… und   2021-06-1…
#>  8 1404808751857… Tobias_Schul… ".@j… 14048087518576… 47919307  de    2021-06-1…
#>  9 1404440929881… ASattelmacher "Oka… 14044409298812… 11508518… de    2021-06-1…
#> 10 1404393457427… dr_john_aus_b "#Ic… 14043934574273… 30635588… und   2021-06-1…


              
              
              
              
              
  
    
      
        
Comments (1)

寻找一个思念的角度, 2025-02-12 02:59:23
With tidyr, we can first use unpack() to turn the data.frame columns into regular columns, and then unnest() to convert the list columns into regular columns:
library(dplyr)
library(tidyr)
df %>% 
  unpack(where(is.data.frame)) %>%  # flatten data.frame columns (attachments, entities)
  unnest(where(is.list))            # expand list columns into regular columns

Output:
# A tibble: 3 × 6
  media_keys poll_ids author_id conversation_id mentions annotations
       <int>    <int>     <int>           <int>    <int>       <int>
1          1        4         1               1        1           4
2          2        5         2               2        2           5
3          3        6         3               3        3           6
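
In the toy data every list element holds exactly one value, so unnest() keeps three rows. In the real tweet data, list columns such as hashtags presumably hold zero or several entries per tweet. Unnesting then changes the row count, and unnesting several such columns at once may fail if their lengths differ within a row. The sketch below uses an invented `tweets` tibble (not the asker's data) to show one list column being unnested with `keep_empty = TRUE`, so tweets without hashtags are not dropped:

```r
library(tidyr)
library(tibble)

# Invented toy data: one list column with 2, 0 and 1 entries per row
tweets <- tibble(
  id = c("1", "2", "3"),
  hashtags = list(c("rstats", "tidyr"), NULL, "dataviz")
)

# One row per hashtag; keep_empty = TRUE keeps tweet "2" with an NA hashtag
hashtags_long <- unnest(tweets, hashtags, keep_empty = TRUE)
hashtags_long
```

Unnesting one list column at a time into separate long tables (one per entity type, keyed by `id`) is one way to keep each result tidy. Whether that layout suits the analysis is a judgment call.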

Data:
df <- structure(list(attachments = structure(list(media_keys = structure(list(
    1L, 2L, 3L), class = "AsIs"), poll_ids = structure(list(4L, 
    5L, 6L), class = "AsIs")), class = "data.frame", row.names = c(NA, 
-3L)), author_id = 1:3, conversation_id = 1:3, entities = structure(list(
    mentions = structure(list(1L, 2L, 3L), class = "AsIs"), 
annotations = structure(list(
        4L, 5L, 6L), class = "AsIs")), 
class = "data.frame", row.names = c(NA, 
-3L))), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-3L))
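
The structure in the question also has a data frame nested inside another data frame (`geo` contains `coordinates`), which a single unpack() call would leave as a data.frame column. A minimal sketch, assuming the nesting looks like the `ls.str()` output above: the helper `unpack_all()` is hypothetical (it is not part of tidyr) and the `toy` tibble is invented for illustration. It repeats unpack() until no data.frame columns remain, prefixing inner names with the outer column name to avoid clashes:

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not a tidyr function): unpack repeatedly until
# no data.frame columns are left, e.g. for geo -> coordinates nesting.
unpack_all <- function(data, sep = "_") {
  while (any(vapply(data, is.data.frame, logical(1)))) {
    data <- unpack(data, where(is.data.frame), names_sep = sep)
  }
  data
}

# Invented toy data mimicking the geo/coordinates nesting
toy <- tibble(
  id = c("a", "b"),
  geo = tibble(
    place_id = c(NA, "p1"),
    coordinates = tibble(type = c(NA, "Point"))
  )
)

flat <- unpack_all(toy)
names(flat)
```

The `names_sep` argument is optional. Without it, inner column names are kept as-is, which only works if they are unique across all unpacked columns.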


    
    
    
    
    
      
        
About the author: 雨后咖啡店
    
  
  

  
  

  
  
  
  
  



  
    
  

  
  
    
      