R sentiment function
I have a problem with this piece of code. I can't find a bug, but the results are clearly incorrect. After calling the function:
data = sentimentfun(My_tweettext, positive_war, negative_war,
                    .progress='text')
I get this:
(screenshot of the result)
The result is a data frame with the downloaded tweets (cleaning has already been done), in which every second score from the sentiment function is the maximum value. For 3 identical tweets, we get a score of 0 twice and a score of 4771 once.
Could someone smarter than me look at this code and check it for correctness? Could you suggest how I can get the correct results? I want to use the "tweettext" that I have already obtained.
sentimentfun = function(tweettext, pos, neg, .progress='none')
{
  # Parameters
  # tweettext: vector of text to score
  # pos: vector of words of positive sentiment
  # neg: vector of words of negative sentiment
  # .progress: passed to laply() to control the progress bar
  scores = laply(tweettext,
    function(singletweet, pos, neg)
    {
      # strip punctuation, control characters and digits
      singletweet = gsub("[[:punct:]]", "", singletweet)
      singletweet = gsub("[[:cntrl:]]", "", singletweet)
      singletweet = gsub("\\d+", "", singletweet)
      # tolower() that returns NA instead of failing on bad input
      tryTolower = function(x)
      {
        y = NA
        try_error = tryCatch(tolower(x), error=function(e) e)
        if (!inherits(try_error, "error"))
          y = tolower(x)
        return(y)
      }
      singletweet = sapply(singletweet, tryTolower)
      # split the tweet into words on whitespace
      word.list = str_split(singletweet, "\\s+")
      words = unlist(word.list)
      # TRUE where a word appears in the positive / negative list
      pos.matches = !is.na(match(words, pos))
      neg.matches = !is.na(match(words, neg))
      # score = number of positive words minus number of negative words
      score = sum(pos.matches) - sum(neg.matches)
      return(score)
    }, pos, neg, .progress=.progress)
  sentiment.df = data.frame(text=tweettext, score=scores)
  return(sentiment.df)
}
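As a sanity check on the scoring logic itself, here is a minimal base-R sketch of the same steps (clean, lowercase, split, count matches). The function name `score_tweets` and the example tweets and word lists are made up for illustration and do not come from the post's files. It assumes all three inputs are plain character vectors.

```r
# Hypothetical base-R version of the scoring step, for checking only.
score_tweets <- function(tweettext, pos, neg) {
  # Assumption: all three arguments are plain character vectors.
  stopifnot(is.character(tweettext), is.character(pos), is.character(neg))
  vapply(tweettext, function(singletweet) {
    # strip punctuation, control characters and digits
    cleaned <- gsub("[[:punct:]]|[[:cntrl:]]|\\d+", "", singletweet)
    # lowercase and split on whitespace
    words <- unlist(strsplit(tolower(cleaned), "\\s+"))
    # positive matches minus negative matches
    sum(words %in% pos) - sum(words %in% neg)
  }, numeric(1), USE.NAMES = FALSE)
}

# Made-up example data (not from the post's CSV or word lists).
tweets <- c("I love this great plan",
            "I love this great plan",
            "What a terrible, awful idea!")
pos_words <- c("love", "great")
neg_words <- c("terrible", "awful")

scores <- score_tweets(tweets, pos_words, neg_words)
print(data.frame(text = tweets, score = scores))
```

With character-vector inputs like these, identical tweets should always receive identical scores, so a result where identical tweets score differently points at the inputs rather than at the counting logic.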
Sorry if this question is stupid, but I need this function to get data for my research.
Edit:
I use Windows 10, and my RStudio version is 1.4.1103.
tweettext: (Trudeau_tweettext.csv)
pos: (positive-words.txt)
neg: (negative-words.txt)
library(stringr)
library(plyr)
library(dplyr)
library(tm)
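One thing that may be worth checking, although the post does not confirm it, is whether the three inputs are character vectors. If they were read with `read.csv()` they would be data frames, and as far as I understand plyr, `laply()` would then walk over columns rather than individual tweets. The sketch below writes small temporary files to stay self-contained. The file contents and the column name `text` are made up, so the real files may need different arguments.

```r
# Made-up stand-ins for Trudeau_tweettext.csv and positive-words.txt.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("text", "great day", "great day", "bad day"), csv_path)
words_path <- tempfile(fileext = ".txt")
writeLines(c("great", "good"), words_path)

# read.csv() returns a data frame, not a character vector.
tweets_df <- read.csv(csv_path, stringsAsFactors = FALSE)
str(tweets_df)

# Extract the text column (hypothetical name "text") as a character vector.
tweet_vec <- as.character(tweets_df$text)

# Read the word list as a character vector, one word per line.
pos_vec <- scan(words_path, what = character(), quiet = TRUE)

# Both checks should hold before the vectors go into the scoring function.
stopifnot(is.character(tweet_vec), is.character(pos_vec))
length(tweet_vec)  # one element per tweet
```

Running `str()` on `My_tweettext`, `positive_war` and `negative_war` in the same way would show whether they are vectors or data frames.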
I wish you all a lovely day (or night)!