Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 11 years ago.
Comments (2)
In general, automatic conversion is not possible, because HTML describes what something looks like, not what it means. If the HTML already contains semantic markup, you could use Anything2Triples (http://developers.any23.org/) to get RDF out.
If it's just plain HTML, you have to write your own extraction rules somehow. GRDDL would work, but I would probably simply use Python + BeautifulSoup. It depends on what technology/language you already know!
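To make "write your own extraction rules" a bit more concrete, here is a minimal sketch. It uses Python's standard-library `html.parser` rather than BeautifulSoup so that it runs without extra dependencies; the same rule would likely be shorter with BeautifulSoup. The page layout (`<li class="person">` wrapping a link), the class name `PersonLinkExtractor`, the helper names, and the choice of `foaf:name` as the predicate are all assumptions made up for illustration, not something taken from the question.

```python
from html.parser import HTMLParser

# Assumed vocabulary choice for this illustration: FOAF's "name" property.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


class PersonLinkExtractor(HTMLParser):
    """Hypothetical rule: each <a href> inside <li class="person"> names a person."""

    def __init__(self):
        super().__init__()
        self.in_person = False    # currently inside <li class="person">?
        self.current_href = None  # href of the <a> being read, if any
        self.text_parts = []      # text collected inside that <a>
        self.triples = []         # (subject, predicate, literal) tuples

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and "person" in (attrs.get("class") or "").split():
            self.in_person = True
        elif tag == "a" and self.in_person and attrs.get("href"):
            self.current_href = attrs["href"]
            self.text_parts = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            name = "".join(self.text_parts).strip()
            if name:
                self.triples.append((self.current_href, FOAF_NAME, name))
            self.current_href = None
        elif tag == "li":
            self.in_person = False


def to_ntriples(triples):
    """Serialize (subject IRI, predicate IRI, plain literal) tuples as N-Triples."""
    lines = []
    for subject, predicate, literal in triples:
        escaped = (literal.replace("\\", "\\\\")
                          .replace('"', '\\"')
                          .replace("\n", "\\n"))
        lines.append(f'<{subject}> <{predicate}> "{escaped}" .')
    return "\n".join(lines)


def html_to_ntriples(html):
    """Apply the extraction rule to an HTML string and return N-Triples text."""
    parser = PersonLinkExtractor()
    parser.feed(html)
    parser.close()
    return to_ntriples(parser.triples)


if __name__ == "__main__":
    sample = """
    <ul>
      <li class="person"><a href="http://example.org/alice">Alice</a></li>
      <li class="person"><a href="http://example.org/bob">Bob</a></li>
      <li><a href="http://example.org/about">About this site</a></li>
    </ul>
    """
    print(html_to_ntriples(sample))
```

The point of the sketch is that the "meaning" lives in the rule you write (here: a link inside a `person` list item is a person with that name), not in the HTML itself, so every site layout needs its own rule.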
If the HTML contains embedded RDFa, then you can use an RDFa parser to extract the information. Parsers are available for various platforms and languages, so the choice depends on your development environment.