Error when converting a pandas dataframe to corpus files via a function with arguments

Posted 2025-02-11 19:13:21

I want to prepare text data stored in a pandas dataframe for sentiment analysis with nltk. To do that, I'm using a function that converts each row of the dataframe into a corpus file.

import nltk
# convert each row of the pandas dataframe of tweets into corpus files
def CreateCorpusFromDataFrame(corpusfolder,df):
    for index, r in df.iterrows():
        date=r['Date']
        tweet=r['Text']
        place=r['Place']
        fname=str(date)+'_'+'.txt'
        corpusfile=open(corpusfolder+'/'+fname,'a')
        corpusfile.write(str(tweet) +" " +str(date))
        corpusfile.close()
CreateCorpusFromDataFrame(myfolder,mydf)

The problem is that I keep getting this message:

NameError: name 'myfolder' is not defined

This happens even though I have a folder called 'myfolder' in the same directory as the Jupyter notebook that contains my code.

UPDATE:

I can see now that the issue was simply that I needed to pass the folder name as a string. I've done that and amended my code. The problem I have now is that the contents of the text file created by the function are not being written into a corpus, and the variable assigned from the function call has type 'NoneType'.

import nltk
# convert each row of the pandas dataframe of tweets into corpus files
def CreateCorpusFromDataFrame(corpusfolder,df):
    for index, r in df.iterrows():
        id=r['Date']
        tweet=r['Text']
        #place=r['Place']
        #fname=str(date)+'_'+'.txt'
        fname='tweets'+'.txt'
        corpusfile=open(corpusfolder+'/'+fname,'a')
        corpusfile.write(str(tweet) +" ")
        corpusfile.close()
corpusdf = CreateCorpusFromDataFrame('myfolder',mydf)
type(corpusdf)
NoneType

倾城花音 2025-02-18 19:13:21

Problem

You are passing myfolder to your function as a variable, but you have not defined it anywhere in your code, so Python raises a NameError.
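To illustrate the point, here is a minimal sketch (not taken from the question's notebook) of how Python treats a bare name versus a quoted one. A folder on disk does not create a Python variable, so the bare name fails no matter what exists in the working directory. The variables `message` and `folder` are only for illustration.

```python
# A bare name is looked up as a variable; a quoted name is a string literal.
try:
    folder = myfolder  # no variable called myfolder exists, so this raises NameError
except NameError as err:
    message = str(err)

folder = 'myfolder'  # a string literal needs no prior definition

print(message)
print(folder)
```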

Solution

Just replace it with 'myfolder' [pass it as a string].

CreateCorpusFromDataFrame('myfolder',mydf)
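On the update in the question: the function there has no return statement, and a Python function without one returns None, which is why the assigned variable is a NoneType. Below is a rough sketch of one way to get something back from the call, assuming it is enough to return the path of the file that was written. The names `create_corpus_file`, `sample_texts` and `demo_folder` are hypothetical, and a plain list stands in for the dataframe column so the sketch runs without extra dependencies.

```python
import os
import tempfile

def create_corpus_file(corpusfolder, texts, fname='tweets.txt'):
    """Append each text to one file in corpusfolder and return the file path."""
    os.makedirs(corpusfolder, exist_ok=True)  # create the folder if it is missing
    path = os.path.join(corpusfolder, fname)
    with open(path, 'a', encoding='utf-8') as corpusfile:
        for text in texts:
            corpusfile.write(str(text) + ' ')
    return path  # without this line the call would evaluate to None

# Stand-in data (assumption); with a dataframe this could be mydf['Text']
sample_texts = ['first tweet', 'second tweet']
demo_folder = os.path.join(tempfile.mkdtemp(), 'myfolder')
corpus_path = create_corpus_file(demo_folder, sample_texts)

# Read the file back to confirm the texts were written
with open(corpus_path, encoding='utf-8') as f:
    corpus_text = f.read()

print(type(corpus_path).__name__)
print(corpus_text)
```

The returned path is only a string pointing at the file. Turning the file into an nltk corpus object would be a separate step, which this sketch does not cover.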

About the author

萝莉病
