Another Porter stemming algorithm implementation question?
I am trying to implement the Porter stemming algorithm, but I am having difficulty understanding this point:
Step 1c
(*v*) Y -> I    happy -> happi    sky -> sky
Isn't that the opposite of what we want to do? Why does the algorithm convert the Y into an I?
The complete algorithm is here: http://tartarus.org/~martin/PorterStemmer/def.txt
Thanks
Comments (1)
The Porter stemmer and other stemming algorithms don't always return words; they return word stems. The goal is that related words should have the same stem. As long as "happiness", "happy", and "happily" all reduce to the same stem, then your stemmer is doing its job, even if the stem isn't a real word.
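To make the rule concrete, here is a minimal sketch of Step 1c on its own, not a full stemmer. The function names (`is_consonant`, `contains_vowel`, `step_1c`) are made up for illustration, and it assumes lowercase input. It follows the definition in def.txt, where a Y counts as a vowel only when it is preceded by a consonant, and where the (*v*) condition applies to the part of the word before the final Y.

```python
def is_consonant(word, i):
    """Porter's definition: any letter other than a, e, i, o, u,
    and other than a 'y' that is preceded by a consonant."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        # 'y' is a consonant at the start of the word or after a vowel,
        # and a vowel when it follows a consonant.
        return i == 0 or not is_consonant(word, i - 1)
    return True


def contains_vowel(stem):
    """The (*v*) condition: the stem contains at least one vowel."""
    return any(not is_consonant(stem, i) for i in range(len(stem)))


def step_1c(word):
    """Step 1c: (*v*) Y -> I. Assumes a lowercase word."""
    if word.endswith("y") and contains_vowel(word[:-1]):
        return word[:-1] + "i"
    return word


print(step_1c("happy"))  # stem "happ" contains the vowel "a"
print(step_1c("sky"))    # stem "sk" contains no vowel, so no change
```

The point of the Y -> I rewrite is that "happy" ends up as "happi", which is the same string that is left once "-ness" is stripped from "happiness", so the two words can share a stem.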