How do I read numbers that use a comma as the decimal separator?

Posted on 2024-11-09 18:32:11

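I have a series of CSV files in which numbers are written in the European style, with a comma instead of a decimal point, i.e. 0,5 instead of 0.5.

There are too many of these files to edit before importing them into R. I am hoping for a simple argument to the read.csv() function, or a method that can be applied to the imported dataset afterwards, so that R treats the data as numbers rather than strings.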

风追烟花雨 2024-11-16 18:32:11

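If you check ?read.table, you will probably find all the answers you need.

(Continental) European csv files raise two issues:

1. What does the "c" in csv stand for? For standard csv it is a ",", for European csv it is a ";". sep is the corresponding argument in read.table.
2. What is the character for the decimal point? For standard csv it is a ".", for European csv it is a ",". dec is the corresponding argument in read.table.

To read standard csv use read.csv; to read European csv use read.csv2. These two functions are just wrappers around read.table that set the appropriate arguments.

If your file follows neither of these standards, set the arguments manually.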
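
As a minimal sketch of the two settings described above (the sample data and variable names are made up for illustration), the same European-style text can be read either with read.csv2 or with read.table and explicit sep and dec arguments:

```r
# Hypothetical sample data: ";" separates columns, "," marks decimals.
csv_text <- "id;amount\n1;0,5\n2;1,25\n3;10,0"

# read.csv2 already sets sep = ";" and dec = ",".
a <- read.csv2(text = csv_text)

# The same file read with read.table and explicit arguments.
b <- read.table(text = csv_text, header = TRUE, sep = ";", dec = ",")

str(a)            # amount is read as a numeric column
all.equal(a, b)   # compare the two results
```

For a real file, pass its path as the first argument instead of using text =.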

围归者 2024-11-16 18:32:11

From ?read.table:

dec     the character used in the file for decimal points.

And yes, you can use it with read.csv as well: read.csv has its own dec argument (default ".").

Alternatively, you can use

read.csv2

which assumes "," as the decimal separator and ";" as the column separator.
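
Since the question involves many files, here is a rough sketch of applying read.csv2 to a whole folder. The folder name, file names and contents are invented for the example, and it assumes every file has the same columns:

```r
# Hypothetical setup: write two small European-style CSV files to a temp folder.
csv_dir <- file.path(tempdir(), "eu_csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
writeLines(c("id;amount", "1;0,5", "2;1,25"), file.path(csv_dir, "a.csv"))
writeLines(c("id;amount", "3;2,75"), file.path(csv_dir, "b.csv"))

# Read every .csv file in the folder with the same settings.
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv2)
names(tables) <- basename(files)

# Stack the pieces into one data frame (assumes identical columns).
all_data <- do.call(rbind, tables)
str(all_data)
```

If the files use "," as the column separator as well, read.csv2 is not the right wrapper; in that case set sep and dec explicitly.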

雨后咖啡店 2024-11-16 18:32:11

read.csv(... , sep=";")

Suppose the imported field is called "amount". If your numbers are being read in as character, you can fix the type this way:

d$amount <- sub(",", ".", d$amount, fixed = TRUE)
d$amount <- as.numeric(d$amount)

This happens to me frequently, along with a bunch of other little annoyances, when importing from Excel or Excel-generated csv files. Since there seems to be no consistent way to ensure you get what you expect when you import into R, post-hoc fixes seem to be the best method. By that I mean: LOOK at what you imported, make sure it is what you expected, and fix it if it is not.
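
A small sketch of that look-then-fix routine, using a made-up data frame d in which amount arrived as text:

```r
# Hypothetical data frame where "amount" was imported as character.
d <- data.frame(id = 1:3, amount = c("0,5", "1,25", "10"),
                stringsAsFactors = FALSE)

# LOOK at what was imported first.
str(d)

# Replace the decimal comma, then convert. as.numeric() yields NA
# (with a warning) for anything that still is not a number.
d$amount <- as.numeric(sub(",", ".", d$amount, fixed = TRUE))

# Check that the fix did what was expected.
stopifnot(is.numeric(d$amount), !anyNA(d$amount))
```

The final stopifnot() is just one way of making the "make sure it is what you expected" step explicit.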

慢慢从新开始 2024-11-16 18:32:11

The dec argument of read.table can be used as follows:

mydata <- read.table(fileIn, dec=",")

Input file (fileIn):

D:\TEST>more  input2.txt
06-05-2014 09:19:38     3,182534        0
06-05-2014 09:19:51     4,2311          0
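
To try this without a file on disk, the two sample rows can be passed through the text argument of read.table. This is only a sketch; the column names V1 to V4 are the defaults R assigns when there is no header:

```r
# The two sample rows from above, supplied inline instead of via a file.
txt <- "06-05-2014 09:19:38     3,182534        0
06-05-2014 09:19:51     4,2311          0"

# Whitespace is the default separator, so date and time become two columns.
mydata <- read.table(text = txt, dec = ",")

str(mydata)
mydata$V3   # the decimal-comma column, now numeric
```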

醉生梦死 2024-11-16 18:32:11

Problems may also be solved by indicating how your missing values are represented (na.strings=...). For example, V1 and V2 below have the same format in the csv file (decimals separated by ","), but because V1 contains missing-value markers it is read as a factor and its commas are left as they are:

dat <- read.csv2("...csv", header=TRUE)
head(dat)

> ID x    time    V1    V2
> 1  1   0:01:00 0,237 0.621
> 2  1   0:02:00 0,242 0.675
> 3  1   0:03:00 0,232 0.398


dat <- read.csv2("...csv", header=TRUE, na.strings="---")
head(dat)

> ID x    time    V1    V2
> 1  1   0:01:00 0.237 0.621
> 2  1   0:02:00 0.242 0.675
> 3  1   0:03:00 0.232 0.398
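
The effect can be reproduced with a few inline rows. The data below is invented, and "---" is assumed to be the missing-value marker, as in the call above:

```r
# Hypothetical data: "---" marks a missing value in V1.
csv_text <- "ID;V1;V2\n1;0,237;0,621\n2;---;0,675\n3;0,232;0,398"

# Without na.strings, "---" forces V1 to be read as text
# (character, or a factor in older R versions).
raw <- read.csv2(text = csv_text)
class(raw$V1)

# Declaring the marker lets V1 be parsed as numeric, with NA for the gap.
clean <- read.csv2(text = csv_text, na.strings = "---")
class(clean$V1)
clean$V1
```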

高速公鹿 2024-11-16 18:32:11

Maybe also try

as.is=TRUE

which prevents character columns from being converted to factors.
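
A short sketch of why that helps, using invented data in which a stray unit keeps the column from being parsed as numeric. With as.is = TRUE the column stays plain character and can be cleaned directly:

```r
# Hypothetical data with a stray unit that blocks numeric conversion.
csv_text <- "id;amount\n1;0,5\n2;1,25 EUR"

d <- read.csv2(text = csv_text, as.is = TRUE)
class(d$amount)   # character, not factor

# Strip everything except digits, comma and minus sign, then convert.
cleaned <- gsub("[^0-9,-]", "", d$amount)
d$amount <- as.numeric(sub(",", ".", cleaned, fixed = TRUE))
d$amount
```

In R 4.0.0 and later, stringsAsFactors defaults to FALSE, so text columns are no longer turned into factors by default; as.is = TRUE mainly matters on older versions.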

月牙弯弯 2024-11-16 18:32:11

You can pass the decimal character as a parameter (dec = ","):

# Semicolon as separator and comma as decimal point by default
read.csv2(file, header = TRUE, sep = ";", quote = "\"", dec = ",",
          fill = TRUE, comment.char = "", encoding = "unknown", ...)

More info on https://r-coder.com/read-csv-r/

陈甜 2024-11-16 18:32:11

Just to add to Brandon's answer above, which worked well for me (I don't have enough rep to comment):

If you're using

    d$amount <- sub(",",".",d$amount)
    d$amount <- as.numeric(d$amount)

don't forget that you may first need gsub("[.]", "", d$amount) to strip any "." characters (for example thousands separators) before replacing the comma.
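
As a sketch of that order of operations, assuming (this is a guess at the intended case) that "." is used as a thousands separator and "," as the decimal mark; the sample values are made up:

```r
# Hypothetical amounts with "." as thousands separator and "," as decimal mark.
amount <- c("1.234,56", "12,5", "1.000.000,01")

# 1) Drop every "." (gsub, not sub, because there may be several).
no_thousands <- gsub(".", "", amount, fixed = TRUE)

# 2) Turn the single decimal comma into a point and convert.
values <- as.numeric(sub(",", ".", no_thousands, fixed = TRUE))
values
```

Doing the steps in the opposite order would delete the decimal point that was just inserted, so the dots have to go first.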
