What are the most challenging problems in sentiment analysis (opinion mining)?

Posted on 2024-10-14 12:57:50

Comments (5)

淡淡離愁欲言轉身 2024-10-21 12:57:50

The key challenges for sentiment analysis are:

1) Named Entity Recognition - What is the person actually talking about? For example, is "300 Spartans" a group of Greeks or a movie?

2) Anaphora Resolution - Resolving what a pronoun or a noun phrase refers to. "We watched the movie and went to dinner; it was awful." What does "it" refer to?

3) Parsing - What are the subject and object of the sentence, and which one does the verb and/or adjective actually refer to?

4) Sarcasm - If you don't know the author, you have no idea whether 'bad' means bad or good.

5) Twitter - abbreviations, lack of capitals, poor spelling, poor punctuation, poor grammar, ...
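
To make point 5 concrete, here is a minimal Python sketch of the kind of normalization noisy social-media text usually needs before any sentiment model sees it. The abbreviation table, the placeholder tokens `<url>` and `<user>`, and the function name `normalize_tweet` are all hypothetical choices for illustration, not part of any particular library.

```python
import re

# Hypothetical, tiny abbreviation table; a real system would need a far larger one.
ABBREVIATIONS = {"u": "you", "gr8": "great", "thx": "thanks", "idk": "i do not know"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
ELONGATED_RE = re.compile(r"(.)\1{2,}")  # three or more repeats of one character
TOKEN_RE = re.compile(r"<\w+>|\w+|[^\w\s]")  # placeholders, words, single punctuation marks


def normalize_tweet(text: str) -> list[str]:
    """Lowercase, mask URLs and mentions, squeeze elongations, expand abbreviations."""
    text = text.lower()
    text = URL_RE.sub("<url>", text)
    text = MENTION_RE.sub("<user>", text)
    text = ELONGATED_RE.sub(r"\1\1", text)  # "sooooo" -> "soo", "!!!" -> "!!"
    expanded = []
    for token in TOKEN_RE.findall(text):
        # An abbreviation may expand to several words, so split the replacement.
        expanded.extend(ABBREVIATIONS.get(token, token).split())
    return expanded


print(normalize_tweet("@Bob u r gr8!!! sooooo good http://t.co/x"))
```

This only handles the easy cases. Misspellings, missing punctuation, and slang that is not in the table still pass through unchanged.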

仄言 2024-10-21 12:57:50

I agree with Hightechrider that those are areas where sentiment analysis accuracy can improve. I would also add that sentiment analysis is mostly done on closed-domain text. Attempts to do it on open-domain text usually wind up with very poor accuracy, F1 measure, or whatever metric you use; otherwise the approach is only pseudo-open-domain, because it looks at just certain grammatical constructions. So I would say that topic-sensitive sentiment analysis, which can identify context and make decisions based on it, is an exciting area for research (and industry products).

I'd also expand his 5th point from Twitter to other social media sites (e.g. Facebook, YouTube), where short, ungrammatical utterances are commonplace.
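
For readers unfamiliar with the F1 measure mentioned above, here is a small Python sketch of how precision, recall, and F1 are computed for one class. The function name and the `"pos"`/`"neg"` labels are assumptions made for the example; the toy labels are invented and are not results from any real system.

```python
def precision_recall_f1(gold, predicted, positive_label="pos"):
    """Compute precision, recall, and F1 for a single class from parallel label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    fp = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    fn = sum(1 for g, p in pairs if g == positive_label and p != positive_label)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented toy labels, only to show the call.
gold = ["pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos"]
print(precision_recall_f1(gold, predicted))
```

Comparing this score on in-domain test data against data from a different domain is one simple way to see the closed-domain versus open-domain gap described above.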

峩卟喜欢 2024-10-21 12:57:50

I think the answer is the complexity of language, along with mistakes in grammar and spelling. There are a vast number of ways people express their opinions; for example, sarcasm could be wrongly interpreted as extremely positive sentiment.

墨小沫ゞ 2024-10-21 12:57:50

The question may be too generic, because there are several types of sentiment analysis (document-level, sentence-level, comparative sentiment analysis, etc.) and each type has its own specific problems.

Generally speaking, I agree with the answer by @Ian Mercer, and I would add three other issues:

  • How to detect more in-depth sentiment or emotion. Positive versus negative is a very simple analysis; one of the challenges is extracting emotions, such as how much hate, happiness, or sadness there is in the opinion.
  • How to detect the object the opinion is positive toward and the object it is negative toward. For example, "She won him!" expresses a positive sentiment toward her and a negative sentiment toward him at the same time.
  • How to analyze very subjective sentences or paragraphs. Sometimes even humans find it very hard to agree on the sentiment of such highly subjective texts. Imagine how hard it is for a computer...
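
As a rough illustration of the first bullet, going beyond positive/negative means producing a profile over several emotion categories. The Python sketch below does this with a toy word list; the lexicon, its categories, and the function name `emotion_profile` are hypothetical and invented for the example.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to emotion categories.
EMOTION_LEXICON = {
    "hate": "anger", "furious": "anger",
    "happy": "joy", "delighted": "joy", "love": "joy",
    "sad": "sadness", "miserable": "sadness",
}


def emotion_profile(text):
    """Return each emotion's share of the lexicon hits (empty dict if there are none)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(EMOTION_LEXICON[t] for t in tokens if t in EMOTION_LEXICON)
    total = sum(counts.values())
    return {emotion: n / total for emotion, n in counts.items()}


print(emotion_profile("I love the cast, but I hate the ending and it left me sad."))
```

Plain word counting like this ignores negation, sarcasm, and who the emotion is directed at, which is exactly why the other two bullets are hard.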

泪之魂 2024-10-21 12:57:50

Although this is a somewhat old question, let me add some notes specific to Arabic sentiment analysis. Arabic has morphological complexity and dialectal variety that require more advanced preprocessing and lexicon-building than English needs.

Please refer to:

  1. https://www.researchgate.net/publication/280042139_Survey_on_Arabic_Sentiment_Analysis_in_Twitter
  2. https://link.springer.com/chapter/10.1007/978-3-642-35326-0_14
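
To give a flavor of the extra preprocessing, one step commonly applied to Arabic text is orthographic normalization: stripping optional diacritics and folding letter variants that writers use inconsistently. The Python sketch below is an assumption about what such a step might look like, not something taken from the linked surveys; the function name and the exact set of foldings are illustrative choices.

```python
import re

# Arabic diacritic marks (U+064B to U+0652) plus tatweel, the elongation character (U+0640).
DIACRITICS_RE = re.compile("[\u064B-\u0652\u0640]")

# Illustrative letter foldings; which ones to apply is a design choice, not a standard.
LETTER_MAP = str.maketrans({
    "\u0622": "\u0627",  # alef with madda above -> bare alef
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0649": "\u064A",  # alef maksura -> yeh
})


def normalize_arabic(text):
    """Strip diacritics and tatweel, then fold common letter variants."""
    text = DIACRITICS_RE.sub("", text)
    return text.translate(LETTER_MAP)


# A vocalized word with hamza, written with escapes so the code points are explicit.
print(normalize_arabic("\u0623\u064E\u062D\u0652\u0645\u064E\u062F"))
```

Normalization of this kind only addresses spelling variation. Morphological segmentation and dialect handling are separate, harder problems.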