How to categorize websites based on keyword content
I'm writing a web robot which categorizes sites based on their keywords/meta tags/links into a predefined list of categories.
I've been looking at various ontology approaches, including WordNet (for hypernyms/hyponyms), ResearchCyc, and WebKb, and was wondering whether this is as hard a problem as I think it is, or whether it has already been solved somewhere else.
Essentially I have large stacks of sorted keyword values and would like to use them to match against a category name. My current thoughts are to check against the category name in some kind of ontology hierarchy.
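To make the idea concrete, here is a minimal sketch of matching weighted keywords against category names by walking up a hypernym chain. Everything in it is an assumption for illustration: the `HYPERNYMS` map is a tiny hand-written stand-in for a real ontology such as WordNet, and the `CATEGORIES` set, the `ancestors` and `categorize` helpers, and the distance-based discount are hypothetical choices, not an established method.

```python
from collections import Counter

# Hypothetical toy hypernym map (child -> parent). A real system might
# derive these links from WordNet or another ontology instead.
HYPERNYMS = {
    "espresso": "coffee",
    "coffee": "beverage",
    "tea": "beverage",
    "beverage": "food",
    "laptop": "computer",
    "computer": "electronics",
    "smartphone": "electronics",
}

# Hypothetical predefined category names.
CATEGORIES = {"food", "electronics"}


def ancestors(term):
    """Yield (term, distance) pairs: the term itself at distance 0,
    then each successive hypernym until the chain ends."""
    distance = 0
    seen = set()  # guards against accidental cycles in the map
    while term is not None and term not in seen:
        seen.add(term)
        yield term, distance
        term = HYPERNYMS.get(term)
        distance += 1


def categorize(keyword_weights):
    """Score each category by summing keyword weights, discounted by the
    number of hypernym steps between the keyword and the category name."""
    scores = Counter()
    for keyword, weight in keyword_weights.items():
        for term, distance in ancestors(keyword.lower()):
            if term in CATEGORIES:
                scores[term] += weight / (1 + distance)
                break  # stop at the nearest matching category
    return scores.most_common()


if __name__ == "__main__":
    site_keywords = {"espresso": 5, "tea": 3, "laptop": 1}
    for category, score in categorize(site_keywords):
        print(f"{category}: {score:.2f}")
```

Keywords that never reach a category name simply contribute nothing, so a site whose keywords fall outside the hierarchy comes back with an empty result rather than a forced match.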
Has anyone else approached an ontology-based problem like this?
Cheers!
Comments (1)
You might want to look at text mining research, specifically keyword mining or subject indexing.