Python: find the most recent date in one column that has no matching date in another column
I have two date columns representing clients' entries into and exits from facilities.
ID | entry_date | exit_date | original_entrydate |
---|---|---|---|
003246 | 2022-03-22 | NaN | 2012-10-01 |
003246 | 2015-07-24 | 2022-03-22 | 2012-10-01 |
003246 | 2012-10-01 | 2015-07-24 | 2012-10-01 |
003246 | 2001-02-02 | 2010-04-05 | 2001-02-02 |
For all instances of an ID in the table, I need to match entry_date to exit_date to find the most recent entry date that represents the beginning of an uninterrupted span of time in which that ID was moving between facilities but not leaving care, and return it in a column, original_entrydate.
In the example, the value for original_entrydate for the first three rows would be 2012-10-01, because that entry_date does not match an exit_date, indicating a separation from care, which the dates show lasted for two years and some months. If there were additional records for that ID, that process would reset and find the original_entrydate for any records preceding that separation from care, up to the next separation.
Comments (1)
I solved my problem in the most clunky way imaginable--by creating nested if-else statements:
This worked for two reasons: the dataframe was sorted by 'Individual_ID' ascending and 'Admit_Date' descending--and the nested if-else statements allowed for comparison of the 'Individual_ID' at index[i] with subsequent rows, until all possibilities were exhausted.
I also knew there was a maximum of 4 rows per ID.
BUT--please show me a better, more pythonic way of doing this!
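One possible vectorized alternative, offered as a sketch rather than a definitive answer: sort each ID's rows by entry date ascending, flag a row as the start of a new span whenever its entry_date differs from the previous row's exit_date, and take the earliest entry_date within each span. The column names follow the question's table (`ID`, `entry_date`, `exit_date`, `original_entrydate`); the function name `add_original_entrydate` is made up for illustration. It assumes a transfer always shows up as an exact date match and that no two rows for the same ID share an entry_date.

```python
import pandas as pd


def add_original_entrydate(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: label each row with the first entry_date of its
    uninterrupted span of care."""
    out = df.copy()
    out["entry_date"] = pd.to_datetime(out["entry_date"])
    out["exit_date"] = pd.to_datetime(out["exit_date"])

    # Oldest stay first within each ID, so "previous row" means "previous stay".
    out = out.sort_values(["ID", "entry_date"], ignore_index=True)

    # Exit date of the previous stay for the same ID (NaT for the first stay).
    prev_exit = out.groupby("ID")["exit_date"].shift()

    # A span starts wherever the entry date does not equal the previous exit.
    # The first stay of every ID compares against NaT, so it always starts one.
    new_span = out["entry_date"].ne(prev_exit)

    # A running count of span starts gives every span its own label.
    span_id = new_span.cumsum().rename("span_id")

    # The earliest entry date inside a span is that span's original entry date.
    out["original_entrydate"] = out.groupby(span_id)["entry_date"].transform("min")

    # Restore the ordering used in the question: ID ascending, newest stay first.
    return out.sort_values(
        ["ID", "entry_date"], ascending=[True, False], ignore_index=True
    )


# The example rows from the question (ID kept as a string to preserve zeros).
df = pd.DataFrame(
    {
        "ID": ["003246"] * 4,
        "entry_date": ["2022-03-22", "2015-07-24", "2012-10-01", "2001-02-02"],
        "exit_date": [None, "2022-03-22", "2015-07-24", "2010-04-05"],
    }
)

result = add_original_entrydate(df)
print(result)
```

For the sample rows, the three most recent stays get 2012-10-01 and the oldest stay gets 2001-02-02, as in the question's table. Unlike the nested if-else version, this does not depend on a maximum number of rows per ID.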