How would you group articles by context? - Natural language
I have a list of articles, each consisting of a title, a subtitle, and a body.
I need to parse all of these articles and group them under context categories or subcategories based on the keywords they are likely to contain.
For example, an article that is likely about sports cars would be associated with the car and/or vehicle context.
I understand that this is a vast ocean, which is exactly why I am asking: the range of possible solutions may be too large for me, and I would most likely get lost and adopt a poorly thought-out one.
There are probably popular, standardized ways of doing this that I do not know about, so it would be very useful if someone pointed me in the right direction.
Any help would be great. =)
Comments (1)
Try the Natural Language Toolkit (NLTK), but don't expect to find a magic bullet in there that will keep you from having to learn a fair bit about linguistics, as the problem you describe cannot be solved wholly mechanically.
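As a starting point before reaching for a full NLP library, the keyword idea from the question can be sketched with the Python standard library alone. This is only an illustration of keyword-overlap scoring, not NLTK's approach: the `TAXONOMY` table, the `FIELD_WEIGHTS` values, the `threshold` default, and the crude plural stripping are all assumptions made up for the example, and a real stemmer or a trained classifier would likely do better.

```python
import re
from collections import Counter

# Hypothetical taxonomy: category -> (parent category, keywords).
# The categories and keywords are invented for illustration only.
TAXONOMY = {
    "vehicle": (None, {"vehicle", "engine", "wheel", "driver"}),
    "car": ("vehicle", {"car", "sedan", "coupe", "horsepower"}),
    "cooking": (None, {"recipe", "oven", "bake", "flour"}),
}

# Assumption: a keyword in the title says more than one in the body.
FIELD_WEIGHTS = {"title": 3, "subtitle": 2, "body": 1}


def tokenize(text):
    """Lowercase, split into alphabetic tokens, and crudely strip plurals."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Crude stand-in for a real stemmer: "cars" -> "car".
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


def score_article(article):
    """Return a Counter mapping each category to its weighted keyword hits."""
    scores = Counter()
    for field, weight in FIELD_WEIGHTS.items():
        for token in tokenize(article.get(field, "")):
            for category, (_parent, keywords) in TAXONOMY.items():
                if token in keywords:
                    scores[category] += weight
    return scores


def categorize(article, threshold=2):
    """Return categories scoring at least `threshold`, plus their ancestors."""
    matched = {c for c, s in score_article(article).items() if s >= threshold}
    result = set()
    for category in matched:
        # Walk up the parent chain so "car" also implies "vehicle".
        while category is not None:
            result.add(category)
            category = TAXONOMY[category][0]
    return result


article = {
    "title": "New sports cars unveiled",
    "subtitle": "More horsepower than ever",
    "body": "The coupe has a new engine.",
}
print(sorted(categorize(article)))  # ['car', 'vehicle']
```

The parent links are what let a sports car article land in both the car and the vehicle context. Hand-written keyword lists break down quickly on synonyms and ambiguous words, which is where the linguistics mentioned above comes in.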