Getting the value for a specific date in a date column

Posted on 2025-01-19 12:12:32

The problem starts with a DataFrame that has only two columns: date and value.

From there, the idea is to get, for each row, the value from one month earlier and from one year earlier. The final output would look something like this:

date        value  m1          m12         m1_val  m12_val
2022-02-27  100    2022-01-27  2021-02-27  nan     nan
2022-03-27  300    2022-02-27  2021-03-27  100     nan
2022-03-30  500    2022-02-30  2022-03-30  nan     nan
2023-02-27  800    2023-01-27  2022-02-27  nan     100

I have already done this, but without vectorization. I would like to replace the final apply function with something vectorized, so that it does not need to go row by row.

For example, the columns m1 and m12 can be created like this:

import pandas as pd
# Split each date into its components
d['year'] = d['date'].dt.year
d['month'] = d['date'].dt.month
d['day'] = d['date'].dt.day
# Previous month; January rolls back to December of the previous year
d['month-1'] = d['month'] - 1
d['year-1'] = d['year']
d.loc[d['month-1'] == 0, 'year-1'] = d.loc[d['month-1'] == 0, 'year-1'] - 1
d.loc[d['month-1'] == 0, 'month-1'] = 12
# Same month and day, one year earlier
d['year-12'] = d['year'] - 1
# Reassemble the dates; impossible ones (e.g. February 30) become NaT
d['m1'] = pd.to_datetime(d[['year-1', 'month-1', 'day']].rename({'year-1':'year', 'month-1':'month'}, axis=1), errors='coerce')
d['m12'] = pd.to_datetime(d[['year-12', 'month', 'day']].rename({'year-12':'year'}, axis=1), errors='coerce')
# Drop the helper columns
d = d.drop(['year', 'month', 'day', 'month-1', 'year-1', 'year-12'], axis=1)
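As an aside, the same two columns can probably be built without the helper columns. The following is only a sketch, assuming date is already a datetime64 column. The helper name shift_months and the sample frame are made up for illustration. pd.DateOffset clips to the end of the month (March 30 minus one month gives February 28), so the sketch blanks out those rows to mimic what errors='coerce' does above.

```python
import pandas as pd

# Sample input mirroring the table in the question (illustrative data only).
d = pd.DataFrame({
    "date": pd.to_datetime(["2022-02-27", "2022-03-27", "2022-03-30", "2023-02-27"]),
    "value": [100, 300, 500, 800],
})

def shift_months(dates: pd.Series, months: int) -> pd.Series:
    """Hypothetical helper: shift dates back by `months`, NaT if the day does not exist."""
    shifted = dates - pd.DateOffset(months=months)
    # DateOffset clips to the month end; keep only rows whose day was preserved.
    return shifted.where(shifted.dt.day == dates.dt.day)

d["m1"] = shift_months(d["date"], 1)
d["m12"] = shift_months(d["date"], 12)
```

Whether clipped dates should become NaT or stay at the month end depends on what the lookup is meant to do. The where() line is the only part to change for the second behavior.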

With these columns in place, I use the following apply function to fill the columns m1_val and m12_val. It searches the date column for each target date and returns the matching value.

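import numpy as np

def test(x, col):
    # Select the values of all rows whose date equals the target date in x[col]
    value = d.loc[d['date'] == x[col], 'value']
    # No match (this also covers a NaT target): nothing to return
    if len(value) == 0:
        return np.nan
    # Otherwise return the first match
    return value.iloc[0]

# Row-by-row lookup; this is the part I would like to vectorize
d['m1_val'] = d.apply(lambda x: test(x, 'm1'), axis=1)
d['m12_val'] = d.apply(lambda x: test(x, 'm12'), axis=1)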

Is there a better way to get the value for m1 from the date column without looping? I thought about using np.where or d.loc, perhaps something like d.loc[d['date'].isin(d['m1'])] followed by a groupby(), but I could not work out how, and in terms of performance it looks similar to using apply().
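One direction that may avoid the row-by-row search is to turn the date column into a lookup Series and pass it to Series.map. This is a minimal sketch, not a benchmarked solution. It assumes that when a date appears more than once the first match should win, as value.iloc[0] does in test. The sample frame and the name lookup are illustrative only.

```python
import pandas as pd

# Sample frame with m1 and m12 already computed (illustrative data only).
d = pd.DataFrame({
    "date": pd.to_datetime(["2022-02-27", "2022-03-27", "2022-03-30", "2023-02-27"]),
    "value": [100, 300, 500, 800],
})
d["m1"] = pd.to_datetime(["2022-01-27", "2022-02-27", None, "2023-01-27"])
d["m12"] = pd.to_datetime(["2021-02-27", "2021-03-27", "2021-03-30", "2022-02-27"])

# date -> value lookup; drop duplicate dates so the index is unique (first match wins).
lookup = d.drop_duplicates("date").set_index("date")["value"]

# map() aligns each target date with the lookup index; missing dates and NaT give NaN.
d["m1_val"] = d["m1"].map(lookup)
d["m12_val"] = d["m12"].map(lookup)
```

A left merge of d with its own date and value columns, keyed on m1 and then on m12, should give the same result. The map version simply keeps the original row order and index without extra bookkeeping.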
