'list' object has no attribute 'lower' in TfidfVectorizer
I have a CSV file of Arabic tweets. I tokenized, stemmed, and cleaned the text:

    import pandas as pd
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize
    from sklearn.feature_extraction.text import TfidfVectorizer
    from tashaphyne.stemming import ArabicLightStemmer  # assumed source of ArabicLightStemmer

    data = pd.read_csv(r"tweet.csv", dtype=str, encoding="utf-8")

    def preprocessing(text):
        # remove stop words
        stop_words = set(stopwords.words('arabic'))
        # tokenization
        tokens = word_tokenize(text.lower())
        result = [i for i in tokens if i not in stop_words]
        # stemming
        als = ArabicLightStemmer()
        word_list = [als.light_stem(w) for w in result]
        # note: this returns a list of tokens, not a string
        return word_list

    data['text'] = data['text'].apply(preprocessing)
    vectorizer = TfidfVectorizer(binary=False, norm='l2', use_idf=True, smooth_idf=True, lowercase=True, min_df=1,
                                 max_df=1.0, max_features=None, ngram_range=(1, 1))
    vectorizer.fit(data["text"])
    x = vectorizer.transform(data["text"])

TfidfVectorizer raises an error saying that 'list' object has no attribute 'lower'.
I want to vectorize my text.
Comments (1)
If 'text' is a list, the lower function does not work on it, because lower is a string method.
It works on a string that is an element of a list, but not on the list itself.
So you should pass a string, not the list, as the argument to the function.
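
To make the suggestion above concrete, here is a minimal sketch of two ways to give TfidfVectorizer input it can handle. It assumes scikit-learn is installed; `token_lists` is a hypothetical toy stand-in for the output of `preprocessing()`, not the real tweet data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy data: one list of tokens per tweet,
# shaped like what preprocessing() returns in the question.
token_lists = [
    ["كتاب", "جديد"],
    ["كتاب", "قديم"],
]

# Option 1: join each token list back into one string per document,
# so the vectorizer's default lowercasing and tokenization can run.
joined_docs = [" ".join(tokens) for tokens in token_lists]
vectorizer = TfidfVectorizer()
x_joined = vectorizer.fit_transform(joined_docs)

# Option 2: keep the lists and pass a callable analyzer.
# A callable analyzer replaces the built-in preprocessing and
# tokenization, so .lower() is never called on a list.
list_vectorizer = TfidfVectorizer(analyzer=lambda tokens: tokens)
x_lists = list_vectorizer.fit_transform(token_lists)

print(x_joined.shape, x_lists.shape)
```

In the question's code, the first option would probably amount to returning `" ".join(word_list)` from `preprocessing` instead of `word_list`. The second option leaves the column as lists and may suit cases where the tokens are needed elsewhere.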