How to visually design mashup queries for programmatic extraction
I'm developing an application that fetches various inputs from web pages, where each snippet of information comes from a different location (a mashup).
I would like to generate the mashup building blocks (snippets) through a visual tool.
Do you know of anything similar that could be used for such a project? (A ready-made control, sample code, an article, etc.)
The preferred development environment is .NET, but that is not mandatory.
Comments (1)
IMO the major challenge will be to extract the appropriate information from each feed in semantic form. Wikipedia describes mashups as:
The classic mashup - Chicago crime - works because key information such as dates and geolocations is available semantically. Other common types of information are persons, organisations, and domain-specific identifiers.
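To illustrate why shared semantic keys matter, here is a minimal Python sketch of joining two feeds on the fields they have in common. The function name `join_on_keys`, the field names, and the sample records are all made up for illustration; real coordinates and dates would first need normalising to a common format before they could be compared like this.

```python
from collections import defaultdict


def join_on_keys(left, right, keys):
    """Pair records from two feeds whose values agree on every field in `keys`."""
    # Index the right-hand feed by its key values for fast lookup.
    index = defaultdict(list)
    for record in right:
        index[tuple(record[k] for k in keys)].append(record)

    merged = []
    for record in left:
        # Every right-hand record with the same key values yields one merged row.
        for match in index.get(tuple(record[k] for k in keys), []):
            merged.append({**match, **record})
    return merged


# Hypothetical feeds: one lists incidents, the other names map areas.
incidents = [
    {"date": "2009-06-01", "lat": 41.88, "lon": -87.63, "type": "theft"},
    {"date": "2009-06-02", "lat": 41.79, "lon": -87.60, "type": "burglary"},
]
areas = [
    {"lat": 41.88, "lon": -87.63, "area": "Loop"},
]

# Only the first incident shares coordinates with a named area.
print(join_on_keys(incidents, areas, ["lat", "lon"]))
```

Without a common key such as a location or a date, there is nothing to join on, which is why the extraction step is the hard part.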
When you have identified these, you may wish to consider the RDF-based tools that the semantic web community is developing. Note that governments are starting to publish their data in RDF, so I would see this as a key technology.
If your web pages do not carry semantic information directly, you will probably have to write screen scrapers and HTML parsers. That's not very glamorous: there are no special tools, and it tends to be just hard work.
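As a rough idea of what such a hand-written scraper looks like, here is a small Python sketch using only the standard library's `html.parser`. It collects the text of every element carrying a given CSS class. The class name `headline`, the helper `extract_snippets`, and the sample markup are assumptions for illustration; a real page will need its own selectors and more robust handling of malformed HTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SnippetExtractor(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.snippets = []   # finished snippets, in document order
        self._depth = 0      # > 0 while inside a matching element
        self._buffer = []    # text pieces of the snippet being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1             # nested tag inside a match
        elif self.target_class in classes:
            self._depth = 1              # entering a matching element

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:             # matching element just closed
            text = " ".join("".join(self._buffer).split())
            self.snippets.append(text)
            self._buffer = []

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_snippets(html, target_class):
    """Return the normalised text of each element with class `target_class`."""
    parser = SnippetExtractor(target_class)
    parser.feed(html)
    parser.close()
    return parser.snippets


SAMPLE = """
<html><body>
  <div class="headline">Crime report <b>updated</b></div>
  <p class="other">ignore me</p>
  <span class="headline big">Second item</span>
</body></html>
"""

print(extract_snippets(SAMPLE, "headline"))
# ['Crime report updated', 'Second item']
```

Each source page would need its own variant of this, which is where the hard work comes in.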