Reading a dataset in R where the comma is used as both the field separator and the decimal point
How can I read this dataset in R? The problem is that the numbers are floats written with a decimal comma, like 4,000000059604644E+16, and the fields are also separated by a comma:
4,000000059604644E-16 , 7,999997138977056E-16, 9,000002145767216E-16
4,999999403953552E-16 , 6,99999988079071E-16 , 0,099999904632568E-16
9,999997615814208E-16 , 4,30000066757202E-16 , 3,630000114440918E-16
0,69999933242798E-16 , 0,099999904632568E-16, 55,657576767799999E-16
3,999999761581424E-16, 1,9900000095367432E-16, 0,199999809265136E-16
How would I load this kind of dataset in R so that it has 3 columns?
If I do
dataset <- read.csv("C:\\data.txt",header=T,row.names=NULL)
it returns 6 columns instead of 3...
Comments (3)
It might be best to transform the input data so that the floating-point numbers use decimal points rather than commas. One way to do this is with sed (it looks like you are using Windows, so you would likely need to install sed to use this approach). The converted file, data2, then has a period as the decimal mark in every number, so R can read it as an ordinary comma-separated file.
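The commands for this answer were not preserved, so here is a minimal sketch of the same substitution done inside R instead of sed. It is not the answerer's code. It assumes the file has no header (the sample shows none) and that a decimal comma always sits directly between two digits, while a field separator is always followed by a space, as in the sample rows.

```r
# Sample rows from the question, inlined so the sketch runs without a file.
# With a real file one could use: lines <- readLines("C:\\data.txt")
lines <- c(
  "4,000000059604644E-16 , 7,999997138977056E-16, 9,000002145767216E-16",
  "4,999999403953552E-16 , 6,99999988079071E-16 , 0,099999904632568E-16",
  "0,69999933242798E-16 , 0,099999904632568E-16, 55,657576767799999E-16"
)

# A comma directly between two digits is treated as a decimal mark.
# The field separators are followed by a space, so they are left alone.
fixed <- gsub("([0-9]),([0-9])", "\\1.\\2", lines)

# The text is now ordinary CSV; strip.white drops the padding around fields.
dataset <- read.csv(text = fixed, header = FALSE, strip.white = TRUE)
str(dataset)
```

If the real file ever contains a separator with no space after it (for example `1,5E-16,2,5E-16`), this pattern cannot tell the two kinds of comma apart, and the file would need to be fixed at the source.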
Here is an all-R solution that uses three read.table calls. The first read.table statement reads each data row as 6 fields; the second read.table statement puts the fields back together properly and reads them; and the third grabs the names from the header.
Note: The read.csv statement in the question implies that there is a header, but the sample data does not show one. I assumed that there is a header; if there is not, remove the skip= and col.names= arguments.
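The code for this answer did not survive either, so the following is only a sketch of the approach it describes, not the author's original. It assumes no header, as in the sample data, so it needs just the first two read.table calls; with a header, the skip= and col.names= arguments mentioned in the note would come back in. The names `lines`, `parts`, and `rows` are illustrative.

```r
# Sample rows from the question, inlined so the sketch runs without a file.
lines <- c(
  "4,000000059604644E-16 , 7,999997138977056E-16, 9,000002145767216E-16",
  "4,999999403953552E-16 , 6,99999988079071E-16 , 0,099999904632568E-16",
  "9,999997615814208E-16 , 4,30000066757202E-16 , 3,630000114440918E-16",
  "0,69999933242798E-16 , 0,099999904632568E-16, 55,657576767799999E-16",
  "3,999999761581424E-16, 1,9900000095367432E-16, 0,199999809265136E-16"
)

# First call: split on every comma, so each number arrives as two fields.
# The fields are kept as character; reading "000000059604644E-16" as a
# number would silently drop the leading zeros of the fractional part.
parts <- read.table(text = lines, sep = ",", colClasses = "character",
                    strip.white = TRUE)

# Glue each pair of halves back together with a decimal point.
rows <- sprintf("%s.%s %s.%s %s.%s",
                parts[[1]], parts[[2]], parts[[3]],
                parts[[4]], parts[[5]], parts[[6]])

# Second call: the rebuilt rows are whitespace-separated numbers.
dataset <- read.table(text = rows)
str(dataset)
```

Keeping the first pass as character is the important design choice here: the integer and fractional halves are only meaningful as text until they are joined again.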
It's not pretty, but it should work:
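The code that followed this line was not preserved. As a stand-in, and not a reconstruction of what the answerer wrote, here is one more sketch in the same "not pretty" spirit: split each line on the separators only, then convert each piece by hand. It assumes every field separator is a comma followed by at least one space, which holds for the sample rows; `to_num` is a hypothetical helper.

```r
# Sample rows from the question, inlined so the sketch runs without a file.
lines <- c(
  "4,000000059604644E-16 , 7,999997138977056E-16, 9,000002145767216E-16",
  "9,999997615814208E-16 , 4,30000066757202E-16 , 3,630000114440918E-16",
  "3,999999761581424E-16, 1,9900000095367432E-16, 0,199999809265136E-16"
)

# Split only where a comma is followed by one or more spaces; the decimal
# commas are followed by a digit, so they stay inside their field.
fields <- strsplit(lines, " *, +")

# Hypothetical helper: swap the decimal comma for a point and convert.
to_num <- function(x) as.numeric(sub(",", ".", x, fixed = TRUE))

# One numeric row per input line, bound into a 3-column data frame.
dataset <- as.data.frame(do.call(rbind, lapply(fields, to_num)))
str(dataset)
```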