Pandas: find multiple words from a list and assign a Boolean value
I have a DataFrame like this:
import pandas as pd

data = {
    "properties": ["FinancialOffice", "Gas Station", "Office", "K-12 School", "Commercial, Office"],
}
df = pd.DataFrame(data)
This is my list:
proplist = ["Office", "Other - Mall", "Gym"]
Using the list, I am trying to find which words exactly match values in the DataFrame column, and for each row I need to assign a Boolean true/false value (or 0/1). It has to be an exact match.
The expected output looks like this:
properties            flag
FinancialOffice       FALSE
Gas Station           FALSE
Office                TRUE
K-12 School           FALSE
Commercial, Office    TRUE
So it returns TRUE for "Office" because that is an exact match from the list. "FinancialOffice" is FALSE because it is not in the list. The last row, "Commercial, Office", is TRUE because Office is in the list even though Commercial is not. If any one of the comma-separated values is present, the flag should be TRUE.
df["flag"] = df["properties"].isin(proplist)
The code above assigns a Boolean correctly for single values, but it returns FALSE for the last row ("Commercial, Office") because isin() compares the whole string against the list and looks for an exact match.
Any help is appreciated.
Comments (3)
Use a crafted regex with word delimiters.
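A minimal sketch of the regex idea, not necessarily the code this answer originally used. It assumes the terms in proplist should be matched literally, so each one is passed through re.escape before being joined into an alternation wrapped in \b word boundaries. The variable name pattern is introduced here for illustration.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "properties": ["FinancialOffice", "Gas Station", "Office",
                   "K-12 School", "Commercial, Office"],
})
proplist = ["Office", "Other - Mall", "Gym"]

# Assumption: terms are literal text, so escape regex metacharacters such as "-".
# The non-capturing group (?:...) keeps str.contains from warning about match groups.
pattern = r"\b(?:" + "|".join(map(re.escape, proplist)) + r")\b"

# True when any listed term appears as a whole word in the string.
df["flag"] = df["properties"].str.contains(pattern, regex=True)
print(df)
```

Note that a word boundary is looser than a strict per-item match: a hypothetical value such as "Office Park" would also be flagged, because "Office" appears in it as a whole word. If only complete comma-separated items should count, splitting on commas is the safer route.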
You can define an external function to do the check.
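One way such a function could look is sketched below. The name has_match and its terms parameter are hypothetical, chosen for this illustration, and the sketch assumes that multiple properties in one cell are always separated by commas.

```python
import pandas as pd

df = pd.DataFrame({
    "properties": ["FinancialOffice", "Gas Station", "Office",
                   "K-12 School", "Commercial, Office"],
})
proplist = ["Office", "Other - Mall", "Gym"]


def has_match(value, terms):
    """Hypothetical helper: True if any comma-separated item in value is in terms."""
    # Split on commas and trim surrounding whitespace from each item.
    items = [item.strip() for item in value.split(",")]
    # Each item must equal a listed term exactly; substrings do not count.
    return any(item in terms for item in items)


# Extra positional arguments for the function are passed through args=.
df["flag"] = df["properties"].apply(has_match, args=(proplist,))
print(df)
```

If 0/1 is preferred over True/False, the resulting column can be converted with df["flag"].astype(int).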
You can use split() and strip() to convert each properties string of comma-delimited properties to a list of strings, then use the Python set intersection operator & to test whether any of the properties match those in proplist.
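The split/strip/set-intersection approach described above might be sketched as follows. The name propset is introduced here for illustration, and the sketch assumes every cell in the column is a string.

```python
import pandas as pd

df = pd.DataFrame({
    "properties": ["FinancialOffice", "Gas Station", "Office",
                   "K-12 School", "Commercial, Office"],
})
proplist = ["Office", "Other - Mall", "Gym"]

# Build the set once so it is not recreated for every row.
propset = set(proplist)

# For each cell: split on commas, strip whitespace, and intersect with propset.
# A non-empty intersection means at least one item matched exactly.
df["flag"] = df["properties"].apply(
    lambda s: bool({item.strip() for item in s.split(",")} & propset)
)
print(df)
```

Because the comparison is made on whole items, "FinancialOffice" does not match "Office", while "Commercial, Office" does.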