Apply a lambda or define a function that returns 1 else 0 in a dask dataframe
Probably easy, but I am still learning.
I am creating a new column in a dask dataframe. Its value should be derived from the last four characters of a date column stored as a string in ddmmyyyy format.
What I did:
- have a list of inv_years
- extract the last four characters of the date string
- tried to define a function that returns 1 in a new column if the extracted year is in the inv_years list, and 0 otherwise.
Issue: How do I write a working function or, better, a lambda function in fewer lines?
def valid_yr(x):
    inv_years = ['1921','1969','2026','2030','2041','2060','2062']
    validity_year = ddf['string_ddmmyyyy'].str[-4:]  # extract the last four to get the year
    if validity_year.isin(inv_years):
        x = 1
    else:
        x = 0
    return x

# create a new column and apply function
ddf['validity_year'] = ???  # what to write here?
Comments (1)
A very grumpy way I could come up with is:
Or, to try and get your approach working, we first modify your function a bit so that its argument is a single row.
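A hedged sketch of how the function could be reshaped to take one row, offered as an illustration and not as the answer's original code. The module-level name `INV_YEARS` is an assumption introduced here; the column name comes from the question. A plain dict stands in for a row in the quick check, since the function only needs key lookup.

```python
# Assumption: the list is moved to module level so it is not rebuilt on every call.
INV_YEARS = ['1921', '1969', '2026', '2030', '2041', '2060', '2062']

def valid_yr(row):
    """Return 1 if the year part of the row's ddmmyyyy string is in INV_YEARS, else 0."""
    year = row['string_ddmmyyyy'][-4:]  # last four characters are the year
    return 1 if year in INV_YEARS else 0

# Quick check with dicts standing in for dataframe rows (made-up dates).
flag_listed = valid_yr({'string_ddmmyyyy': '31122026'})
flag_unlisted = valid_yr({'string_ddmmyyyy': '15062020'})
print(flag_listed, flag_unlisted)
```

The key change from the question's version is that the function no longer touches `ddf` as a whole: it looks only at the row it is given and returns a plain 1 or 0.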
Now we can apply this function to all rows.