How do I apply a function (in this case, keyword extraction) to each row of a column in a df and put the result in a new column?
I have a dataframe of homologous proteins (almost 3000 of them!), which includes a description of each protein's function. From this description I want to grab a keyword from each cell and put it in a separate column. This is in order to create a classification of the proteins.
I am creating a function to extract keywords from the text of each individual row of the 'description' column, using yake:
def generate_keyword():
    kw_extractor = yake.KeywordExtractor(n=2, top=40)
    keywords = kw_extractor.extract_keywords(data["description"])
    for kw in keywords:
        print(kw)
And then I am trying to put this information into a new column ('keyword') in the dataframe like so:
data["keyword"] = data["description"].apply(generate_keyword())
It then gives these two messages when I try to run it:
Warning! Exception: 'Series' object has no attribute 'split' generated by the following text: '0 Mitochondrial malate dehydrogenase;catalyzes i...
.......
TypeError: 'NoneType' object is not callable
I think the mistake is somewhere in how I'm labelling the parameters for my function, but I have no clue how to fix it. Any help is greatly appreciated!
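For what it's worth, both messages seem to trace back to how the function is written and called rather than to yake itself. The warning appears because `extract_keywords` is handed the whole `data["description"]` Series instead of one string, and the `TypeError` appears because `generate_keyword()` is called right away, returns `None` (it only prints), and `apply` then tries to call that `None`. A minimal sketch of one possible fix is below. It assumes that `extract_keywords` returns `(keyword, score)` pairs with the best candidate first; `StandInExtractor` and the sample descriptions are made up for illustration so the snippet runs without yake installed.

```python
import pandas as pd


class StandInExtractor:
    """Hypothetical stand-in with the call shape assumed for yake.KeywordExtractor."""

    def extract_keywords(self, text):
        # Keep words longer than three characters, in order of appearance.
        words = [w for w in text.replace(";", " ").split() if len(w) > 3]
        # Assumption: (keyword, score) pairs, best candidate first.
        return [(word.lower(), float(rank)) for rank, word in enumerate(words)]


def generate_keyword(text, extractor):
    """Return the top-ranked keyword for ONE description, or None if there is none."""
    if not isinstance(text, str) or not text.strip():
        return None
    keywords = extractor.extract_keywords(text)
    # Return the value instead of printing it, so apply() has something to store.
    return keywords[0][0] if keywords else None


# Made-up sample rows, not taken from the real dataset.
data = pd.DataFrame(
    {
        "description": [
            "Putative zinc transporter of the plasma membrane",
            "Cytosolic heat shock protein",
            "",
        ]
    }
)

# Build the extractor once, outside the function, and reuse it for every row.
extractor = StandInExtractor()

# Pass the function itself (no parentheses); apply() calls it once per cell
# and forwards the extra keyword argument to it.
data["keyword"] = data["description"].apply(generate_keyword, extractor=extractor)
print(data)
```

With yake installed, replacing `StandInExtractor()` with `yake.KeywordExtractor(n=2, top=40)` from the question should be the only change needed, provided the return shape assumed above holds for the installed version. Creating the extractor once outside the function also avoids rebuilding it for each of the roughly 3000 rows.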