Reading a text file with irregular headers (in R)
I am trying to read a flat file into R.
It is separated by ';' and has 12 leading lines of comments to describe the content.
I want to read the file and exclude the comments.
The problem, however, is that comment line 11 contains the data headers, as follows:
# Fields: labno; name; dob; sex; location; date
Is there a way to extract the headers from the comments and apply them to the data? The way I thought of doing it was to read the first 11 lines only and store everything from labno onwards as a vector. Then I would read everything from line 13 and use the stored vector as column names for the data.
Is there a way to read the first 11 lines and remove everything before labno?
Thanks.
Comments (1)
Step 1: Read only the eleventh line, which contains the column names.
Step 2: Read the rest of the file, skipping the comment lines and allowing the default automatic column names.
Step 3: Remove the extraneous characters from the line read in step 1, then apply the resulting fields as column names.
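The code that originally accompanied these steps is not shown here, so the following is only a sketch of how they might look. The temporary file, its sample rows, and the variable names `path`, `fields` and `dat` are assumptions made for illustration; the layout (12 comment lines, field names on line 11, data from line 13) follows the question.

```r
# Build a small stand-in file (hypothetical data): 12 comment lines,
# with the field names on line 11 and the data starting on line 13.
path <- tempfile(fileext = ".txt")
writeLines(c(
  paste("# comment line", 1:10),
  "# Fields: labno; name; dob; sex; location; date",
  "# comment line 12",
  "101;Ann;1980-01-02;F;Leeds;2011-05-01",
  "102;Bob;1975-07-30;M;York;2011-05-02"
), path)

# Step 1: read only the eleventh line, which holds the column names.
fields <- readLines(path, n = 11)[11]

# Step 2: skip the 12 comment lines and read the data; the columns
# receive the default automatic names V1, V2, ...
dat <- read.table(path, sep = ";", skip = 12, stringsAsFactors = FALSE)

# Step 3: drop the "# Fields: " prefix, split the remainder on ';'
# (stripping surrounding spaces), and use the result as column names.
con <- textConnection(sub("^# Fields: *", "", fields))
names(dat) <- scan(con, what = "character", sep = ";",
                   strip.white = TRUE, quiet = TRUE)
close(con)

str(dat)
```

With a real file, `path` would simply be the file's location, and the `skip` value should match the actual number of leading comment lines.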
Later versions of R allow 'scan' to take a 'text' argument, rather than requiring the awkward textConnection function.
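As a sketch of that variant, assuming an R version whose 'scan' accepts 'text', the header line can be parsed directly from a string. The `fields` value below is the header line from the question, and `col_names` is an illustrative name.

```r
# The header line as it would be returned by readLines() in step 1.
fields <- "# Fields: labno; name; dob; sex; location; date"

# Strip the "# Fields: " prefix and let scan() split on ';' directly
# from the string, without opening a textConnection.
col_names <- scan(text = sub("^# Fields: *", "", fields),
                  what = "character", sep = ";",
                  strip.white = TRUE, quiet = TRUE)

print(col_names)
```

The resulting vector can then be assigned with names(dat) <- col_names in the same way as before.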