Apply a specific filter condition to each group in Spark
I have a data frame as below:
id |original_date |date1 |date2 |name
1 |03-30-2022 |03-29-2022 |04-02-2022 | John
1 |03-27-2022 |03-29-2022 |04-02-2022 | Mary
2 |04-01-2022 |03-29-2022 |04-02-2022 | Joe
2 |03-30-2022 |04-02-2022 |04-08-2022 | Susan
3 |04-03-2022 |04-02-2022 |04-08-2022 | Mallory
I am looking to get the following resultant dataframe: for each group of id, I want to keep only the rows that satisfy the filter condition date1 < original_date <= date2.
id |original_date |date1 |date2 |name
1 |03-30-2022 |03-29-2022 |04-02-2022 | John
2 |04-01-2022 |03-29-2022 |04-02-2022 | Joe
3 |04-03-2022 |04-02-2022 |04-08-2022 | Mallory
How can I do this?
Comments (1)
Isn't that just a straightforward filter condition, or am I missing something?