From a list of bigrams, I need to remove every bigram that does not contain at least one term exactly matching a term in a list of unigrams.
The Two Lists
bigram_list = ['computer vision', 'data excellence', 'data visualization']
unigram_list = ['excel', 'tableau', 'visio', 'visualization']
The Objective
cleaned_bigrams = ['data visualization']
What I've Tried
I tried adapting this approach, but failed: "Removing separate list of items from another list in Python 3.x"
I also tried this, but couldn't get it to work: "Get rid of unigrams in a list if contained within bigrams or trigrams python"
I tried to adapt the answer to a previous question I asked, but couldn't get that going: "Create new boolean fields based on specific bigrams appearing in a tokenized pandas dataframe"
Thanks in advance for any help you can provide, and I would appreciate an upvote if you think this is a good question!
Here is one way to do it:
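A minimal sketch of one such approach, not necessarily the answerer's original code. It assumes the terms in each bigram are separated by whitespace and that matching is case-sensitive. Each bigram is split into its terms and kept only if at least one term is exactly equal to a unigram, so 'data excellence' is dropped even though 'excel' is a substring of 'excellence'.

```python
bigram_list = ['computer vision', 'data excellence', 'data visualization']
unigram_list = ['excel', 'tableau', 'visio', 'visualization']

# A set gives fast exact-membership checks (no substring matching).
unigrams = set(unigram_list)

# Keep a bigram only if at least one of its whitespace-separated
# terms is exactly equal to one of the unigrams.
cleaned_bigrams = [
    bigram for bigram in bigram_list
    if any(term in unigrams for term in bigram.split())
]

print(cleaned_bigrams)  # ['data visualization']
```

Splitting before comparing is what makes the match exact: testing `'visio' in 'computer vision'` on the raw string would wrongly keep 'computer vision'.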