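How to convert ISO durations to minutes in PySpark or Python
I have a dataframe in Python with a column that holds ISO duration values.
The output should be:
- 15 minutes
- 90 minutes
- 5 minutes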
import glob
import pandas as pd
from datetime import datetime

currentdate = datetime.today().strftime('%Y/%m/%d')
absolutepath = '/project/sniper/' + currentdate + '/*.json'

# pd.read_json does not expand wildcards, so read each matching file and concatenate.
df = pd.concat([pd.read_json(path, lines=True) for path in glob.glob(absolutepath)])
# .copy() avoids a SettingWithCopyWarning when the new column is added below.
df_sugar = df.loc[df['ingredients'].str.contains("Sugar|sugar", case=True)].copy()

def convertToInteger(my_str):
    # Strip the "PT" prefix, then split the hours part from the minutes part.
    body = my_str.replace("PT", "")
    hours, minutes = 0, 0
    if 'H' in body:
        hours_part, body = body.split('H')
        hours = int(hours_part)
    if 'M' in body:
        minutes = int(body.replace('M', ''))
    return hours * 60 + minutes

df_sugar["new_col_2"] = df_sugar["prepTime"].apply(convertToInteger)
Assuming your data is a column of ISO 8601 duration strings (you probably have more columns, but you get the point), I'd use the isodate package for a more robust approach to the problem. The conversion can also be applied as a one-liner.
If you don't want to use isodate, you could apply a custom approach. Depending on your requirements you may need to generalize it, but if all of your strings are in the format "PT[<hours>H]<minutes>M", you could simply extract the hours and minutes and compute hours * 60 + minutes. To generalize it, I'd suggest taking a look at the isodate source anyway.
There are many other ways to do the same thing; I hope this gives you some hints on how to proceed :)
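For the custom approach mentioned above, one possible sketch uses a regular expression from the standard library. The function name pt_to_minutes is hypothetical, and the pattern deliberately accepts only the narrow "PT[<hours>H][<minutes>M]" form, not the full ISO 8601 duration grammar.

```python
import re

# Accepts only strings such as "PT15M", "PT1H30M" or "PT2H".
_DURATION_RE = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?")


def pt_to_minutes(value):
    """Return the total minutes encoded in a "PT[<hours>H][<minutes>M]" string."""
    match = _DURATION_RE.fullmatch(value)
    # Reject anything outside the narrow format, including a bare "PT".
    if match is None or value == "PT":
        raise ValueError(f"unsupported duration format: {value!r}")
    hours, minutes = (int(group) if group else 0 for group in match.groups())
    return hours * 60 + minutes
```

With pandas this can be applied column-wise, for example df["prepTime"].apply(pt_to_minutes). Raising on unexpected input is a design choice here, so that values with seconds or days are noticed rather than silently converted to a wrong number.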