R: reset cumsum to zero at the start of each year

Posted on 2024-12-22 01:22:10

I have a dataframe with a bunch of donations data. I take the data and arrange it in time order from oldest to most recent gifts. Next I add a column containing a cumulative sum of the gifts over time. The data has multiple years of data and I was looking for a good way to reset the cumsum to 0 at the start of each year (the year starts and ends July 1st for fiscal purposes).

This is how it currently is:

id        date          giftamt      cumsum()
005       01-05-2001     20.00        20.00
007       06-05-2001     25.00        45.00
009       12-05-2001     20.00        65.00
012       02-05-2002     30.00        95.00
015       08-05-2002     50.00       145.00
025       12-05-2002     25.00       170.00
...          ...          ...         ...
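For reference, the running total in the table above can be reproduced with a short base R sketch. The data frame name gifts and the column name running are illustrative only, and the explicit format string assumes the dates are stored as month-day-year strings exactly as displayed.

```r
# Sample rows copied from the question; dates assumed to be month-day-year strings.
gifts <- data.frame(
  id      = c("005", "007", "009", "012", "015", "025"),
  date    = c("01-05-2001", "06-05-2001", "12-05-2001",
              "02-05-2002", "08-05-2002", "12-05-2002"),
  giftamt = c(20, 25, 20, 30, 50, 25),
  stringsAsFactors = FALSE
)

# Parse with an explicit format so the strings are not misread.
gifts$date <- as.Date(gifts$date, format = "%m-%d-%Y")

# Oldest gift first, then a running total that never resets.
gifts <- gifts[order(gifts$date), ]
gifts$running <- cumsum(gifts$giftamt)
```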

This is how I would like it to look:

id        date          giftamt      cumsum()
005       01-05-2001     20.00        20.00
007       06-05-2001     25.00        45.00
009       12-05-2001     20.00        20.00
012       02-05-2002     30.00        50.00
015       08-05-2002     50.00        50.00
025       12-05-2002     25.00        75.00
...          ...          ...          ...

Any suggestions?

UPDATE:

Here's the code that finally worked, courtesy of Seb:

#tweak for changing the calendar year to fiscal year
df$year <- as.numeric(format(as.Date(df$giftdate), format="%Y"))
df$month <- as.numeric(format(as.Date(df$giftdate), format="%m"))
df$year <- ifelse(df$month<=6, df$year, df$year+1)

#cum-summing :)
library(plyr)
finalDf <- ddply(df, .(year), summarize, cumsum(as.numeric(as.character(giftamt))))
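The as.numeric(as.character(...)) wrapper in the last line suggests that giftamt had been read in as a factor; that is an assumption, since the question does not say. A small sketch of why the extra as.character step matters in that case (the vector amt is made up for illustration):

```r
# Hypothetical amounts stored as a factor, as can happen when reading text files.
amt <- factor(c("20.00", "25.00", "20.00", "30.00"))

# Converting the factor directly returns the internal level codes.
codes <- as.numeric(amt)

# Going through character first recovers the actual amounts.
values <- as.numeric(as.character(amt))
```

If giftamt is already numeric, the wrapper is harmless and cumsum(giftamt) gives the same result.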

Comments (3)

深海里的那抹蓝 2024-12-29 01:22:10

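I would try it this way (df being the data frame):

#tweak for changing the calendar year to fiscal year
#format() returns character strings, so convert to numbers before comparing
df$year <- as.numeric(format(as.Date(df$date), format="%Y"))
df$month <- as.numeric(format(as.Date(df$date), format="%m"))
#months 1-6 stay in the current fiscal year, months 7-12 move to the next one
df$year <- ifelse(df$month<=6, df$year, df$year+1)

#cum-summing :)
library(plyr)
ddply(df, .(year), summarize, cumsum(giftamt))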
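Note that summarize returns only the grouping column and the computed value, so id, date and giftamt are dropped from the result. One base R sketch that keeps every column uses ave instead. The column names fy and cumgift are made up for illustration, and the dates are assumed to be Date objects already.

```r
# Sample rows from the question, with dates already stored as Date objects.
df <- data.frame(
  id      = c("005", "007", "009", "012", "015", "025"),
  date    = as.Date(c("2001-01-05", "2001-06-05", "2001-12-05",
                      "2002-02-05", "2002-08-05", "2002-12-05")),
  giftamt = c(20, 25, 20, 30, 50, 25),
  stringsAsFactors = FALSE
)

# Fiscal year: January to June stay in the calendar year, July onwards roll over.
yr <- as.numeric(format(df$date, "%Y"))
mo <- as.numeric(format(df$date, "%m"))
df$fy <- ifelse(mo <= 6, yr, yr + 1)

# ave() applies cumsum within each fiscal year and returns the values in row order.
df$cumgift <- ave(df$giftamt, df$fy, FUN = cumsum)
```

With the sample rows, cumgift comes out as 20, 45, 20, 50, 50, 75, matching the table the question asks for.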
遮了一弯 2024-12-29 01:22:10

There are two tasks: create a column in the data frame representing each year, then split the data, apply the cumsum, and recombine. R has lots of ways of doing both parts.

Probably the most readable way of doing the first task is with year from the lubridate package.

library(lubridate)
df$year <- year(df$date)

Note that R has lots of date formats, so it's worth checking to see whether you are currently using POSIXct or Date or chron or zoo or xts or one of the other formats.
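A rough sketch of such a check follows. The helper name to_date and its default format are hypothetical, chosen to match the month-day-year strings shown in the question; only a few of the possible classes are handled.

```r
# Hypothetical helper: coerce a few common representations to Date.
to_date <- function(x, format = "%m-%d-%Y") {
  if (inherits(x, "Date")) return(x)               # already a Date
  if (inherits(x, "POSIXt")) return(as.Date(x))    # date-time; as.Date uses UTC by default
  if (is.factor(x)) x <- as.character(x)           # factors hold codes, not text
  as.Date(x, format = format)                      # plain strings
}

class(to_date("12-05-2001"))
```

Calling class(df$date) on the real data first shows which branch applies.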

Seb's choice of ddply for the second task is the one I'd recommend. For completeness, you can also use tapply or aggregate.

with(df, tapply(giftamt, year, cumsum))
aggregate(giftamt ~ year, df, cumsum)
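Both calls return one vector of running totals per year rather than a column aligned with the original rows. A sketch of putting the tapply result back into the data frame with unsplit, using a small made-up data frame with a year column already in place (the column name cumgift is illustrative):

```r
# Made-up data with the year column already computed.
df <- data.frame(
  year    = c(2001, 2001, 2002, 2002, 2003, 2003),
  giftamt = c(20, 25, 20, 30, 50, 25)
)

# One numeric vector of running totals per year.
by_year <- with(df, tapply(giftamt, year, cumsum))

# unsplit() reverses the grouping and restores the original row order.
df$cumgift <- unsplit(by_year, df$year)
```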

With the new info that you want years to change on 1st July, update the year column to

df$year <- with(df, year(date) + (month(date) >= 7))
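If lubridate is not available, roughly the same rule can be written in base R. The function name fiscal_year is hypothetical, and the sketch assumes its input is a Date vector.

```r
# Hypothetical base R equivalent: July to December count towards the next year.
fiscal_year <- function(d) {
  lt <- as.POSIXlt(d)
  calendar_year <- lt$year + 1900   # POSIXlt years count from 1900
  month_number  <- lt$mon + 1       # POSIXlt months run from 0 to 11
  calendar_year + (month_number >= 7)
}

# The boundary falls between 30 June and 1 July.
fiscal_year(as.Date(c("2001-06-30", "2001-07-01")))
# 2001 2002
```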
浮生未歇 2024-12-29 01:22:10
gifts <- read.table("gifts.txt", header=T, quote="\"")
NbGifts <- nrow(gifts)

# Determination of the relevant fiscal year ending dates
CalYear <- as.numeric(substr(gifts$date,7,10)) # calendar years
TCY <- as.numeric(names(table(CalYear))) # list of calendar years
MDFY <- "07-01-" # ending date for the current fiscal year
EFY <- paste(MDFY,TCY,sep="") # list of fiscal year ending dates
EFYplus <- cbind(TCY,EFY) # table of fiscal year ending dates
colnames(EFYplus) <- c("CalYear","EndDate")

# Manipulation of data frames in order to match
# the fiscal year end dates to the relevant dates
giftsPlusYear <- data.frame(CalYear, gifts, stringsAsFactors = FALSE)
giftsPlusEFY <- merge(giftsPlusYear,EFYplus) # using the CalYear

# Date comparison in order to associate a gift to its fiscal year
DateGift <- as.Date(giftsPlusEFY$date,"%m-%d-%Y") # date conversion for comparison (%Y = four-digit year)
DateEFY <- as.Date(giftsPlusEFY$EndDate,"%m-%d-%Y")
FiscYear <- ifelse(DateGift<DateEFY,giftsPlusEFY$CalYear,giftsPlusEFY$CalYear+1)

# Computation of cumulative totals per fiscal year
LastFY <- 0
CumGift <- rep(0,NbGifts)
for (g in 1:NbGifts){
  if (LastFY==FiscYear[g]){
    CumGift[g] <- CumGift[g-1] + gifts$giftamt[g]
    } else {
      CumGift[g] <- gifts$giftamt[g]
      LastFY <- FiscYear[g]
    }
}
(CumGifts <- cbind(gifts,CumGift))