How to sum a column conditionally on a categorical column in R
I have a data frame in R called house_expenses that looks like this (2 columns: DESCRIPTION and AMOUNT):
DESCRIPTION AMOUNT
----------- ---------
COUCH $801.713
TV $4999.996
TV_MOUNT $575.867
ENTERTAINMENT_SYSTEM $1102.392
MATTRESS $1225.893
BEDFRAME $356.789
PILLOWS $528.989
I would like to add two columns to the data frame that hold the sums, rounded to 2 decimal places:
- LIVING_ROOM_COSTS = sum(round(COUCH, TV, TV_MOUNT, ENTERTAINMENT_SYSTEM), =2)
- BEDROOM_COSTS = sum(round(MATTRESS, BEDFRAME, PILLOWS), =2)
I have tried:
house_expenses <- house_expenses %>%
  group_by(DESCRIPTION) %>%
  mutate(LIVING_ROOM_COSTS = sum(round(DESCRIPTION == "COUCH" &
                                       DESCRIPTION == "TV" &
                                       DESCRIPTION == "TV_MOUNT" &
                                       DESCRIPTION == "ENTERTAINMENT_SYSTEM", digits = 2)),
  mutate(BEDROOM_COSTS = sum(round(DESCRIPTION == "MATTRESS" &
                                   DESCRIPTION == "BEDFRAME" &
                                   DESCRIPTION == "PILLOWS", digits = 2)))
Unfortunately, this hasn't worked. Has anyone come across this before, and does anyone know how to approach it?
To get the solution you want, you have to do some subsetting:
DESCRIPTION %in% c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM")
This gets you TRUE or FALSE for each row; then you subset AMOUNT:
AMOUNT[DESCRIPTION %in% c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM")]
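To see why %in% is the right tool here rather than chaining == with &, as the attempt in the question does, here is a small sketch. The vector below just copies the DESCRIPTION values from the question; the names chained_and and in_set are made up for illustration.

```r
# DESCRIPTION values copied from the question's table.
DESCRIPTION <- c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM",
                 "MATTRESS", "BEDFRAME", "PILLOWS")

# Chaining == with & asks for a value that equals "COUCH" and "TV" at once.
# No single value can do that, so every element comes back FALSE.
chained_and <- DESCRIPTION == "COUCH" & DESCRIPTION == "TV"

# %in% asks whether each value matches any element of the set.
in_set <- DESCRIPTION %in% c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM")

print(any(chained_and))  # FALSE
print(which(in_set))     # 1 2 3 4
```

Note also that the attempt never mentions AMOUNT, so round() and sum() were being applied to the TRUE/FALSE values rather than to the costs.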
Then you wrap the values in a sum and round it:
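As a minimal sketch of that step on bare vectors: this assumes AMOUNT is stored as plain numbers (if it really is text with a leading "$", it would need converting first), and it rounds each amount before summing, as the question's pseudo-code suggests. The names living_room_items and living_room_total are made up for illustration.

```r
# Values copied from the question; AMOUNT is assumed to be numeric.
DESCRIPTION <- c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM",
                 "MATTRESS", "BEDFRAME", "PILLOWS")
AMOUNT <- c(801.713, 4999.996, 575.867, 1102.392,
            1225.893, 356.789, 528.989)

living_room_items <- c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM")

# Subset the matching amounts, round each to 2 decimals, then add them up.
living_room_total <- sum(round(AMOUNT[DESCRIPTION %in% living_room_items],
                               digits = 2))

print(living_room_total)  # 7479.97
```

If you would rather round the total than the individual amounts, swap the order to round(sum(...), digits = 2); for this data both give the same figure.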
This gives us the data.frame of:
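A sketch of how both columns could be attached, again assuming numeric amounts. The helper room_total is hypothetical, not something from the question or a package. Because each total is a single number, R recycles it into every row of the new column.

```r
# Hypothetical reconstruction of the question's data; AMOUNT assumed numeric.
house_expenses <- data.frame(
  DESCRIPTION = c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM",
                  "MATTRESS", "BEDFRAME", "PILLOWS"),
  AMOUNT = c(801.713, 4999.996, 575.867, 1102.392,
             1225.893, 356.789, 528.989),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sum of the rounded amounts for a set of items.
room_total <- function(df, items) {
  sum(round(df$AMOUNT[df$DESCRIPTION %in% items], digits = 2))
}

# Assigning a single value fills the whole column with it.
house_expenses$LIVING_ROOM_COSTS <- room_total(
  house_expenses, c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM"))
house_expenses$BEDROOM_COSTS <- room_total(
  house_expenses, c("MATTRESS", "BEDFRAME", "PILLOWS"))

print(house_expenses)
```

Every row ends up with 7479.97 in LIVING_ROOM_COSTS and 2111.67 in BEDROOM_COSTS.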
Using with allows us to refer to column names without using $.
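To make the difference concrete, here is a small sketch comparing the two spellings on a cut-down, made-up version of the data. The variable names with_dollar and with_with are only for illustration.

```r
# Cut-down illustrative data; AMOUNT assumed numeric.
house_expenses <- data.frame(
  DESCRIPTION = c("MATTRESS", "BEDFRAME", "PILLOWS", "COUCH"),
  AMOUNT = c(1225.893, 356.789, 528.989, 801.713),
  stringsAsFactors = FALSE
)
bedroom_items <- c("MATTRESS", "BEDFRAME", "PILLOWS")

# Spelled out with $ on every column reference.
with_dollar <- sum(round(
  house_expenses$AMOUNT[house_expenses$DESCRIPTION %in% bedroom_items],
  digits = 2))

# Same computation; with() looks the column names up inside house_expenses.
with_with <- with(house_expenses,
                  sum(round(AMOUNT[DESCRIPTION %in% bedroom_items], digits = 2)))

print(identical(with_dollar, with_with))  # TRUE
```

Variables that are not columns, such as bedroom_items here, are still found in the calling environment.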
The reason there wasn't an answer sooner is that the formatting given required extra work, and humans are generally lazy.
If you had formatted your data.frame like this:
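One pasteable form might look like the following. It assumes AMOUNT is meant to be numeric, with the "$" in the question being display formatting only.

```r
# Anyone can copy this straight into a session and have the same data.
house_expenses <- data.frame(
  DESCRIPTION = c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM",
                  "MATTRESS", "BEDFRAME", "PILLOWS"),
  AMOUNT = c(801.713, 4999.996, 575.867, 1102.392,
             1225.893, 356.789, 528.989),
  stringsAsFactors = FALSE
)

# str() shows the column types, which the plain-text table cannot.
str(house_expenses)
```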
Or if you had shared it using the function dput, it would have been answered swiftly.
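For the dput route, a rough sketch on a shortened, made-up version of the data: dput prints R code that rebuilds the object, so pasting its output into a question gives readers the exact structure. The round trip through capture.output and parse below is only there to show that the printed text really does recreate the data.

```r
# Shortened illustrative data; AMOUNT assumed numeric.
house_expenses <- data.frame(
  DESCRIPTION = c("COUCH", "TV", "MATTRESS"),
  AMOUNT = c(801.713, 4999.996, 1225.893),
  stringsAsFactors = FALSE
)

# Prints a structure(list(...), ...) call; this is what you paste into a post.
dput(house_expenses)

# Capture that text and evaluate it to rebuild the data frame.
dput_text <- capture.output(dput(house_expenses))
rebuilt <- eval(parse(text = dput_text))

print(all.equal(rebuilt, house_expenses))
```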