Implementing semantic search in a web application
This web application lets users publish different kinds of posts, which other users can then find through a text-based search. A post has the following attributes:
Title
Description
Category
Budget
Submit_date
End_date
Currently, post content is stored in a SQL Server database in the 'Post' table. I want to implement semantic search over the posts published in the application. For example, when a user types 'Education' in the search field, the system should search by the meaning of the word 'Education' rather than its literal text. I would like to go for an RDF/OWL-based solution with the Jena framework, but I don't really know where to start since I'm new to the Semantic Web. Please help me build this search (a sample application or code would be even better). Thanks in advance.
Comments (2)
Before jumping into a non-trivial task, I'd say it would be wise to get a bit more acquainted with semantic web technologies, the problems they set out to solve, and so on. You could start by reading or skimming the book "Programming the Semantic Web", for example.
With a high-level understanding of what's what, you could then restate your question more specifically, maybe splitting it into several less general questions. OWL and Jena are implementation details; first you need a clear overall idea of exactly how your semantic search will work. Will your post descriptions be semantically annotated by human authors or by machines? Will you also use categories to aid your search? Will you use external systems to look up information like "what terms/concepts/resources are related to 'Education'", or will your system maintain this information itself? And so on.
Unless you are dead serious about going semantic, I'd recommend improving your search by starting with simple things like stemming, so that a search for "Education" would also return posts mentioning "educate", "educated" and the like. Add some simple tricks like this and maybe you'll realize that's all you really need... :-)
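To make the stemming suggestion concrete, here is a minimal sketch in Python using only the standard library. The suffix list, the `crude_stem` and `search` helpers, and the sample posts are all made up for illustration; a real application would more likely use an established stemmer (Porter, Snowball) or the full-text search features of the database.

```python
import re

# Hypothetical, deliberately crude suffix list (longest suffixes first).
# This is NOT the Porter algorithm, just an illustration of the idea.
SUFFIXES = ("ations", "ation", "ating", "ated", "ates", "ate",
            "ing", "ed", "es", "s")


def crude_stem(word):
    """Lower-case a word and strip the first matching suffix, keeping >= 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Split text into lower-case alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def search(query, posts):
    """Return posts whose title or description shares a stem with the query."""
    query_stems = {crude_stem(token) for token in tokenize(query)}
    hits = []
    for post in posts:
        text = post["title"] + " " + post["description"]
        post_stems = {crude_stem(token) for token in tokenize(text)}
        if query_stems & post_stems:
            hits.append(post)
    return hits


# Made-up sample data standing in for rows of the 'Post' table.
posts = [
    {"title": "Tutor wanted", "description": "Help educating two children at home"},
    {"title": "Logo design", "description": "Need a logo for a bakery"},
    {"title": "Grant writing", "description": "Proposal for an education charity"},
]

results = search("Education", posts)
for post in results:
    print(post["title"])
```

Here "Education" and "educating" both reduce to the same stem, so the first and third posts match. Suffix stripping alone does not connect prefixed forms such as "uneducated", and it knows nothing about meaning: a post about "scholarships" would still be missed.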
One way of doing it could be to classify each post based on an
ontology. Clearly the ontology would need to evolve over time, and
you might want to keep several such ontologies for searching. The
way I would do it is to generate tags for a post by
analyzing the text inside the post. Posts usually have only the tags
defined by their authors; if you could somehow add more tags,
it would make the post more visible when searching and a lot more
useful. Once you have the tags, you can classify the post based on the
ontologies you have and then build on the relationships using these
ontologies. I can suggest using the OpenCalais web service (several others are available, feel free to choose) for
generating more tags. Use a few standard ontologies you can find on
the web, and add to them based on the new tags you find. The more posts you
have, the more relationships you will have, and hence better results.
Hope it gives you a start.
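The tag-and-ontology idea can be sketched in a few lines of plain Python. Everything here is an assumption for illustration: the `NARROWER` hierarchy is a hand-written stand-in for a real ontology (in an RDF/OWL setup it would typically be expressed with relations such as `rdfs:subClassOf` or `skos:narrower` and queried through Jena), and the tags are hard-coded rather than fetched from a tagging service.

```python
# Hypothetical mini-ontology: each concept maps to its narrower concepts.
NARROWER = {
    "education": ["school", "university", "tutoring"],
    "university": ["scholarship"],
    "health": ["nursing"],
}


def expand(concept, narrower=NARROWER):
    """Return the concept plus every concept below it in the hierarchy."""
    seen = set()
    stack = [concept.lower()]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(narrower.get(current, []))
    return seen


# Made-up posts with tags; in practice the tags could come from the
# authors plus an automatic tagging step.
post_tags = {
    "Scholarship fund for engineering students": {"scholarship", "engineering"},
    "Night-shift nursing cover": {"nursing"},
    "Maths tutoring for teenagers": {"tutoring", "maths"},
}


def semantic_search(query, tagged_posts):
    """Return titles of posts tagged with the query concept or any narrower one."""
    concepts = expand(query)
    return sorted(title for title, tags in tagged_posts.items() if tags & concepts)


matches = semantic_search("Education", post_tags)
for title in matches:
    print(title)
```

A search for "Education" finds the scholarship and tutoring posts even though neither contains the word, because their tags sit below "education" in the hierarchy. The quality of the results depends entirely on how good the tags and the hierarchy are, which is why the answer stresses growing the ontology as new tags appear.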