当前位置：文江博客话题详情

新列：日期=日期时的传输值

发布于 2025-01-26 06:11:26 字数 3234 浏览 2 评论 0 原文

我有两个数据帧。两者都有一个日期列。 DF A有大约14000行，并包含每日回报。 DF B有大约600行，并包含各个月份的平均每日CPI。因此，在DF B中，日期列的仅百分比是决定性的。这是一个简短的示例，数据帧的样子是：

DF A：

   Date       Returns
   <date>       <dbl>
y-m-d
1965-01-06 1.0049728
1965-01-07 1.0081001
1965-01-08 1.0094975
1965-01-12 1.0016388
1965-01-13 0.9946541
1965-01-14 1.0060906
1965-01-18 0.9860844
1965-01-19 0.9989457
1965-01-20 0.9988501
1965-01-21 1.0012444
1965-01-22 0.9931840
1965-01-25 0.9903565
1965-01-26 0.9986629
1965-01-27 1.0081077
1965-01-28 0.9912110
1965-01-29 0.9960394
1965-02-01 1.0004506
1965-02-02 0.9967118
1965-02-03 1.0074635
1965-02-04 1.0108144
# … with 14,065 more rows

DF B

     DATE      CPI
    y-m-d
1   1965-01-01 1.000275
2   1965-02-01 1.000271
3   1965-03-01 1.000278
4   1965-04-01 1.000318
5   1965-05-01 1.000274
6   1965-06-01 1.000274
7   1965-07-01 1.000276
8   1965-08-01 1.000252
9   1965-09-01 1.000257
10  1965-10-01 1.000236
11  1965-11-01 1.000229
# … with 676 more rows

我想彼此计算两个DF。为此，我想在DF A中创建一个新的列，其中XY中的所有天数中都输入了DF B的相应值。如下所示：

输出

   Date       Returns  OUTPUT = CPI from data frame B
   <date>       <dbl>  <dbl>
y-m-d
1965-01-06 1.0049728   1.000275
1965-01-07 1.0081001   1.000275
1965-01-08 1.0094975   1.000275
1965-01-12 1.0016388   1.000275
1965-01-13 0.9946541   1.000275
1965-01-14 1.0060906   1.000275
1965-01-18 0.9860844   1.000275
1965-01-19 0.9989457   1.000275
1965-01-20 0.9988501   1.000275
1965-01-21 1.0012444   1.000275
.............................
1965-02-01 1.0004506   1.000271
1965-02-02 0.9967118   1.000271
1965-02-03 1.0074635   1.000271
1965-02-04 1.0108144   1.000271
.............................
1965-03-01 0.9945527   1.000278
1965-03-02 0.9967290   1.000278
1965-03-03 1.0043810   1.000278
1965-03-04 1.0043043   1.000278

# … with 14,065 more rows

我认为这是通过参考日期列的月份和年份可行的。但是，我是R的新手，不知道如何实施此功能。

到目前为止，这是我的代码。

#DF A 
library(quantmod)
library(mosaic)
library(dplyr)
library(xts)

data.N225 <- getSymbols("^N225",from="1965-01-01", to="2022-02-30", auto.assign=FALSE, src='yahoo') # funktion getSymbols wenn wir Kapitalmarkt haben wollten 
class(data.N225)
data.N225[c(1:3, nrow(data.N225)),]
#View(data.N225)
data.N225<- na.omit(data.N225)
N225 <- data.N225[,6]

N225$Returns<- diff(log(N225$N225.Adjusted))+1
N225[c(1:3,nrow(N225)),]
options(digits=5)
N225.stetig <- N225[-1,2]
N225.stetig[c(1:3,nrow(N225.stetig)),]

DF_A <- as_tibble(data.frame(Date = index(N225.stetig), coredata(N225.stetig)))
View(DF_A)

#################
#DF B

setwd("enter file path")
CPI<- read.csv("InflationJapan.csv")
#View(CPI)

CPI$percent <- CPI[,2]/100
CPI$CPI.umsk <- (CPI[,3]/250)
CPI$CPI <- CPI[,4]+1
class(CPI$DATE)
CPI$DATE <- as.Date.character(CPI$DATE, tryFormats =  c("%Y-%m-%d"))
class(CPI$DATE)

DF<- as_tibble(data.frame(Date = index(CPI$DATE), coredata(CPI)))
DF_B <- DF[,c(2,6)]
options(digits=7)
View(DF_B)

可以从以下链接下载CSV文件。

我希望有人可以帮助我。预先感谢！

原文

I have two data frames. Both have a date column. DF A has about 14000 rows and contains daily returns. DF B has about 600 rows and contains the average daily CPI in the respective month. So in DF B only %m-%Y of the date column are decisive. Here is a short example how the data frames look like:

DF A:

   Date       Returns
   <date>       <dbl>
y-m-d
1965-01-06 1.0049728
1965-01-07 1.0081001
1965-01-08 1.0094975
1965-01-12 1.0016388
1965-01-13 0.9946541
1965-01-14 1.0060906
1965-01-18 0.9860844
1965-01-19 0.9989457
1965-01-20 0.9988501
1965-01-21 1.0012444
1965-01-22 0.9931840
1965-01-25 0.9903565
1965-01-26 0.9986629
1965-01-27 1.0081077
1965-01-28 0.9912110
1965-01-29 0.9960394
1965-02-01 1.0004506
1965-02-02 0.9967118
1965-02-03 1.0074635
1965-02-04 1.0108144
# … with 14,065 more rows

DF B

     DATE      CPI
    y-m-d
1   1965-01-01 1.000275
2   1965-02-01 1.000271
3   1965-03-01 1.000278
4   1965-04-01 1.000318
5   1965-05-01 1.000274
6   1965-06-01 1.000274
7   1965-07-01 1.000276
8   1965-08-01 1.000252
9   1965-09-01 1.000257
10  1965-10-01 1.000236
11  1965-11-01 1.000229
# … with 676 more rows

I would like to compute both DFs with each other. To do so, I want to create a new column in DF A, in which the corresponding value from DF B is entered for all days in month XY. As follows:

OUTPUT

   Date       Returns  OUTPUT = CPI from data frame B
   <date>       <dbl>  <dbl>
y-m-d
1965-01-06 1.0049728   1.000275
1965-01-07 1.0081001   1.000275
1965-01-08 1.0094975   1.000275
1965-01-12 1.0016388   1.000275
1965-01-13 0.9946541   1.000275
1965-01-14 1.0060906   1.000275
1965-01-18 0.9860844   1.000275
1965-01-19 0.9989457   1.000275
1965-01-20 0.9988501   1.000275
1965-01-21 1.0012444   1.000275
.............................
1965-02-01 1.0004506   1.000271
1965-02-02 0.9967118   1.000271
1965-02-03 1.0074635   1.000271
1965-02-04 1.0108144   1.000271
.............................
1965-03-01 0.9945527   1.000278
1965-03-02 0.9967290   1.000278
1965-03-03 1.0043810   1.000278
1965-03-04 1.0043043   1.000278

# … with 14,065 more rows

I assume that this is feasible by referring to the month and year of the Date columns. However, I am new to R and do not know how to implement this.

Here is my code so far.

#DF A 
library(quantmod)
library(mosaic)
library(dplyr)
library(xts)

data.N225 <- getSymbols("^N225",from="1965-01-01", to="2022-02-30", auto.assign=FALSE, src='yahoo') # funktion getSymbols wenn wir Kapitalmarkt haben wollten 
class(data.N225)
data.N225[c(1:3, nrow(data.N225)),]
#View(data.N225)
data.N225<- na.omit(data.N225)
N225 <- data.N225[,6]

N225$Returns<- diff(log(N225$N225.Adjusted))+1
N225[c(1:3,nrow(N225)),]
options(digits=5)
N225.stetig <- N225[-1,2]
N225.stetig[c(1:3,nrow(N225.stetig)),]

DF_A <- as_tibble(data.frame(Date = index(N225.stetig), coredata(N225.stetig)))
View(DF_A)

#################
#DF B

setwd("enter file path")
CPI<- read.csv("InflationJapan.csv")
#View(CPI)

CPI$percent <- CPI[,2]/100
CPI$CPI.umsk <- (CPI[,3]/250)
CPI$CPI <- CPI[,4]+1
class(CPI$DATE)
CPI$DATE <- as.Date.character(CPI$DATE, tryFormats =  c("%Y-%m-%d"))
class(CPI$DATE)

DF<- as_tibble(data.frame(Date = index(CPI$DATE), coredata(CPI)))
DF_B <- DF[,c(2,6)]
options(digits=7)
View(DF_B)

The CSV file can be downloaded from the following link.

https://drive.google.com/drive/folders/1GUlce4wR_lWiCbJivwDadgDYaz2ir0_L?usp=sharing

I hope someone can help me. Thanks a lot in advance!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

七婞 2025-02-02 06:11:26

在下面，我提供了一个示例，说明了如何在每个数据范围内添加YM的列，然后在循环中将CPI输出与DataFrame B的CPI输出匹配。

希望它有帮助！

library(lubridate)

# read in a sample of dataframe A data
dfA <- read.csv("DFA.csv")
# add column with y-m, add empty column for Output
dfA <- dfA %>%
  mutate(YearMonth = format(as.Date(dfA$Date, format = "%Y-%m-%d"), "%Y-%m")) %>%
  mutate(Output = NA)
dfA

         Date   Returns YearMonth Output
1  1965-01-06 1.0049728   1965-01     NA
2  1965-01-07 1.0081001   1965-01     NA
3  1965-01-08 1.0094975   1965-01     NA
4  1965-01-12 1.0016388   1965-01     NA
5  1965-01-13 0.9946541   1965-01     NA
6  1965-01-14 1.0060906   1965-01     NA
7  1965-02-01 1.0004506   1965-02     NA
8  1965-02-02 0.9967118   1965-02     NA
9  1965-02-03 1.0074635   1965-02     NA
10 1965-02-04 1.0108144   1965-02     NA
11 1965-03-01 0.9945527   1965-03     NA
12 1965-03-02 0.9967290   1965-03     NA


dfB <- read.csv("InflationJapan.csv")
# add column with y-m
dfB <- dfB %>%
  mutate(YearMonth = format(as.Date(dfB$Date, format = "%m/%d/%Y"), "%Y-%m")) 
head(dfB)
 Date      CPI YearMonth
1 1/1/1965 6.870229   1965-01
2 2/1/1965 6.768559   1965-02
3 3/1/1965 6.956522   1965-03
4 4/1/1965 7.948443   1965-04
5 5/1/1965 6.844920   1965-05
6 6/1/1965 6.859593   1965-06


# Add CPI per month/ year to dataframe A
for (i in 1:nrow(dfA)) { # for each row in dataframe A
  for (j in 1:nrow(dfB)) { # for each row in dataframe B
    if (dfA$YearMonth[i] == dfB$YearMonth[j]) { # if year and month match
      dfA$Output[i] <- dfB$CPI[j] # add CPI value from dataframe B to dataframe A 
    }
  }
}

dfA
         Date   Returns YearMonth   Output
1  1965-01-06 1.0049728   1965-01 6.870229
2  1965-01-07 1.0081001   1965-01 6.870229
3  1965-01-08 1.0094975   1965-01 6.870229
4  1965-01-12 1.0016388   1965-01 6.870229
5  1965-01-13 0.9946541   1965-01 6.870229
6  1965-01-14 1.0060906   1965-01 6.870229
7  1965-02-01 1.0004506   1965-02 6.768559
8  1965-02-02 0.9967118   1965-02 6.768559
9  1965-02-03 1.0074635   1965-02 6.768559
10 1965-02-04 1.0108144   1965-02 6.768559
11 1965-03-01 0.9945527   1965-03 6.956522
12 1965-03-02 0.9967290   1965-03 6.956522

Below I've provided an example of how you can add a column for Y-M to each of the dataframes, and then match the CPI output from dataframe B to dataframe A in a loop.

Hope it helps!

library(lubridate)

# read in a sample of dataframe A data
dfA <- read.csv("DFA.csv")
# add column with y-m, add empty column for Output
dfA <- dfA %>%
  mutate(YearMonth = format(as.Date(dfA$Date, format = "%Y-%m-%d"), "%Y-%m")) %>%
  mutate(Output = NA)
dfA

         Date   Returns YearMonth Output
1  1965-01-06 1.0049728   1965-01     NA
2  1965-01-07 1.0081001   1965-01     NA
3  1965-01-08 1.0094975   1965-01     NA
4  1965-01-12 1.0016388   1965-01     NA
5  1965-01-13 0.9946541   1965-01     NA
6  1965-01-14 1.0060906   1965-01     NA
7  1965-02-01 1.0004506   1965-02     NA
8  1965-02-02 0.9967118   1965-02     NA
9  1965-02-03 1.0074635   1965-02     NA
10 1965-02-04 1.0108144   1965-02     NA
11 1965-03-01 0.9945527   1965-03     NA
12 1965-03-02 0.9967290   1965-03     NA


dfB <- read.csv("InflationJapan.csv")
# add column with y-m
dfB <- dfB %>%
  mutate(YearMonth = format(as.Date(dfB$Date, format = "%m/%d/%Y"), "%Y-%m")) 
head(dfB)
 Date      CPI YearMonth
1 1/1/1965 6.870229   1965-01
2 2/1/1965 6.768559   1965-02
3 3/1/1965 6.956522   1965-03
4 4/1/1965 7.948443   1965-04
5 5/1/1965 6.844920   1965-05
6 6/1/1965 6.859593   1965-06


# Add CPI per month/ year to dataframe A
for (i in 1:nrow(dfA)) { # for each row in dataframe A
  for (j in 1:nrow(dfB)) { # for each row in dataframe B
    if (dfA$YearMonth[i] == dfB$YearMonth[j]) { # if year and month match
      dfA$Output[i] <- dfB$CPI[j] # add CPI value from dataframe B to dataframe A 
    }
  }
}

dfA
         Date   Returns YearMonth   Output
1  1965-01-06 1.0049728   1965-01 6.870229
2  1965-01-07 1.0081001   1965-01 6.870229
3  1965-01-08 1.0094975   1965-01 6.870229
4  1965-01-12 1.0016388   1965-01 6.870229
5  1965-01-13 0.9946541   1965-01 6.870229
6  1965-01-14 1.0060906   1965-01 6.870229
7  1965-02-01 1.0004506   1965-02 6.768559
8  1965-02-02 0.9967118   1965-02 6.768559
9  1965-02-03 1.0074635   1965-02 6.768559
10 1965-02-04 1.0108144   1965-02 6.768559
11 1965-03-01 0.9945527   1965-03 6.956522
12 1965-03-02 0.9967290   1965-03 6.956522

回复收藏 0 原文

~没有更多了~

关于作者

嘦怹

暂无简介

文章

621 人气

关注发私信

友情链接

文江博客

新列：日期=日期时的传输值

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（1）

关于作者

相关话题

热门标签

推荐作者

alipaysp_snBf0MSZIv

梦断已成空

瞎闹

凯凯我们等你回来

寄意

似梦非梦

友情链接

新列：日期=日期时的传输值

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（1）

关于作者

相关话题

热门标签

推荐作者

alipaysp_snBf0MSZIv

梦断已成空

瞎闹

凯凯我们等你回来

寄意

似梦非梦

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。