
Which format should I use to expose structured metadata (Dublin Core, RDF, Atom)?

Posted 2024-10-24 04:15:19

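I want to expose as much structured data about my site as possible, in a selfless way. I wouldn't mind an SEO boost either, but that is secondary.

There seem to be a few options:

Full-blown RDF (XML, kill me now)
Atom with your own custom tags (like)
RDFa in the web page (might help with SEO)
Dublin Core meta tags
Dublin Core using RDFa
Atom with RDFa

I am just looking to make it easy for people to grab data from my site.

Which one would you suggest I use?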


悲念泪 2024-10-31 04:15:19

RDF is not just XML. It is a data model that relies on sets of triples (subject, predicate, object) and on URIs to refer to things unambiguously. In practice, people working with RDF tend to run away from RDF/XML, and we prefer Turtle or N-Triples, or even RDF in JSON format. These serializations are more readable, easier to construct and easier to parse. Moreover, many tools let you convert between the whole range of RDF flavors (e.g. rapper or Jena).
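To make the triple model concrete, here is a minimal sketch in plain Python that serializes a couple of triples as N-Triples. The review URI under example.org is made up for illustration, and the `("uri", ...)` / `("literal", ...)` term encoding is just a convention of this sketch; a real project would more likely use an RDF library such as Jena.

```python
def ntriples_term(term):
    """Serialize one term given as ("uri", value) or ("literal", value)."""
    kind, value = term
    if kind == "uri":
        return "<" + value + ">"
    # Escape backslashes first, then quotes and line breaks.
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) triples, one statement per line."""
    lines = []
    for s, p, o in triples:
        lines.append(" ".join(ntriples_term(t) for t in (s, p, o)) + " .")
    return "\n".join(lines) + "\n"


# Hypothetical review URI; the class and property come from the RDF Review Vocabulary.
REVIEW = ("uri", "http://example.org/review/1")
RDF_TYPE = ("uri", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

triples = [
    (REVIEW, RDF_TYPE, ("uri", "http://purl.org/stuff/rev#Review")),
    (REVIEW, ("uri", "http://purl.org/stuff/rev#text"), ("literal", 'Smoky, "medium" heat')),
]

print(to_ntriples(triples), end="")
```

Each line of the output is one self-contained statement, which is why this format is so easy to generate and parse compared with RDF/XML.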

When it comes to publishing information in RDF, you generally have three different choices:

To provide RDF dumps of your data.
To publish RDF following the Linked Data rules.
To add metadata to your existing Web pages with RDFa.

... these are not exclusive. You can go for any combination of them; the most important thing is choosing the right URI structure (see Cool URIs don't change).

Following your SO profile I see that you're working on a social taste recommendation website (http://evocatus.com/). I assume that you might want to expose information about those reviews. So for a review like http://evocatus.com/sauce/cholula-chipolte-hot-sauce/272645/ you can provide different serializations and give back not just HTML but also:

.../cholula-chipolte-hot-sauce/272645/rdf-turtle
.../cholula-chipolte-hot-sauce/272645/rdf-xml
.../cholula-chipolte-hot-sauce/272645/rdf-json
and one for any other type of format you want to expose.

In addition, the HTML version could be enhanced with RDFa. Depending on the type of client that consumes your data, following content negotiation rules, you'll redirect the HTTP request to whichever format is accepted by the client. This is established by the HTTP header Accept. So a request like the one below with curl would be redirected by your application giving back the RDF/XML version:

curl -H 'Accept: application/rdf+xml' .../cholula-chipolte-hot-sauce/272645/
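On the server side, the redirect decision could look roughly like the sketch below. It is only an illustration: the function names are invented here, the suffixes mirror the rdf-turtle / rdf-xml / rdf-json URLs above, and `application/rdf+json` is assumed as the media type for the JSON serialization. Wildcards such as `*/*` are not handled and simply fall back to the HTML page.

```python
# Hypothetical mapping from media types to the URL suffixes listed above.
SUFFIX_BY_MEDIA_TYPE = {
    "text/html": "",
    "text/turtle": "rdf-turtle",
    "application/rdf+xml": "rdf-xml",
    "application/rdf+json": "rdf-json",  # assumed media type for RDF in JSON
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q value."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def redirect_target(base_url, accept_header):
    """Pick the best supported media type and build the URL to redirect to."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUFFIX_BY_MEDIA_TYPE:
            return base_url + SUFFIX_BY_MEDIA_TYPE[media_type]
    # Nothing matched: fall back to the plain HTML page.
    return base_url
```

With this in place, a request carrying `Accept: application/rdf+xml` for a review URL would be redirected to that URL plus `rdf-xml`, while a browser sending `text/html` stays on the HTML page.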

In the future, people would be able to say things about existing reviews in your site by just reusing your URIs in their RDF data. That's the power of RDF and Linked Data.

As for Dublin Core, you can use it with either RDF or RDFa. In your case, though, there are some other interesting ontologies to consider, and the right thing would be to use a mix of all of them:

FOAF: Friend Of A Friend, to express user personal information and relations between users.
Tag Ontology: A very simple ontology to express tag information.
RDF Review Vocabulary: Vocabulary for expressing reviews and ratings using RDF.
GoodRelations: An ontology to express product information and eCommerce.
Vcard/RDF: for addresses, normally used in combination with FOAF.

There is one site called http://revyu.com/ that uses all these ontologies (except GoodRelations), so you could use it as a guideline. See for instance:

... these are HTML and RDF versions of the same review.

As you can see, unlike with Atom, with RDF you can reuse existing ontologies, and since RDF is based on URIs, everything ends up interlinked.
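As a rough sketch of what mixing vocabularies can look like in practice, the Python snippet below renders one review as an HTML fragment annotated with RDFa 1.1, combining Dublin Core terms, FOAF and the RDF Review Vocabulary. The function name, the markup layout and the example.org URI are assumptions made for illustration; only the namespace URIs and property names come from the vocabularies themselves.

```python
from html import escape

# Namespace URIs of the vocabularies; the prefix names are arbitrary.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rev": "http://purl.org/stuff/rev#",
}


def review_rdfa(review_uri, title, reviewer_name, rating, text):
    """Render one review as an HTML fragment carrying RDFa 1.1 attributes."""
    prefix_attr = " ".join("{}: {}".format(p, uri) for p, uri in PREFIXES.items())
    lines = [
        '<div prefix="{}" about="{}" typeof="rev:Review">'.format(
            escape(prefix_attr), escape(review_uri)
        ),
        '  <h2 property="dcterms:title">{}</h2>'.format(escape(title)),
        # property + typeof on one element creates a typed resource as the value.
        '  <p property="rev:reviewer" typeof="foaf:Person">',
        '    <span property="foaf:name">{}</span>'.format(escape(reviewer_name)),
        "  </p>",
        '  <p>Rating: <span property="rev:rating">{}</span></p>'.format(int(rating)),
        '  <p property="rev:text">{}</p>'.format(escape(text)),
        "</div>",
    ]
    return "\n".join(lines)


print(review_rdfa("http://example.org/review/1", "Hot & smoky", "Alice", 4, "Great heat."))
```

The same page still reads as ordinary HTML for visitors, while an RDFa-aware client can extract the review, its title, its rating and its author from the attributes.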

Linked Data Added Value

What would happen if you invested some time in linking your products and reviews to other data sources (e.g. dbpedia.org or freebase.com)? Let's imagine that you start linking all your beer reviews (http://evocatus.com/beer/) to whatever brewery manufactures the product (http://dbpedia.org/page/Alcoholic_beverage). By following the links you would be able to find out, for instance, where the preferred beers are manufactured. DBpedia holds that information.

Freebase also provides RDF versions, so you could link to manufacturers there as well. For instance, see http://rdf.freebase.com/rdf/en.budweiser in RDF or http://www.freebase.com/view/en/budweiser in HTML.
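A link of this kind is just one more triple. The sketch below, again in plain Python, builds an N-Triples line that points from a local URI to a DBpedia resource. It relies on the DBpedia convention that http://dbpedia.org/page/... is the HTML view of the resource identified by http://dbpedia.org/resource/...; the choice of rdfs:seeAlso as the predicate and the function names are assumptions of this example, and a more specific property from one of the ontologies above may fit better.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def dbpedia_resource_uri(page_url):
    """Turn a DBpedia HTML page URL into the URI of the resource it describes."""
    prefix = "http://dbpedia.org/page/"
    if not page_url.startswith(prefix):
        raise ValueError("not a DBpedia page URL: " + page_url)
    return "http://dbpedia.org/resource/" + page_url[len(prefix):]


def link_triple(local_uri, page_url, predicate=RDFS_SEE_ALSO):
    """Return one N-Triples line linking a local URI to a DBpedia resource."""
    return "<{}> <{}> <{}> .".format(
        local_uri, predicate, dbpedia_resource_uri(page_url)
    )


print(link_triple("http://evocatus.com/beer/", "http://dbpedia.org/page/Alcoholic_beverage"))
```

Once such links exist, a client that follows them can combine your reviews with whatever the remote source says about the linked resource.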


梦里°也失望 2024-10-31 04:15:19

The Dublin Core Schema is a small set of vocabulary terms that can be used to describe web resources (video, images, web pages, etc.).

Example of Dublin Core code:

 <meta name="DC.Format" content="video/mpeg; 10 minutes">
 <meta name="DC.Language" content="en">
 <meta name="DC.Publisher" content="publisher-name">
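If the pages are generated by code, tags like these can be produced from a simple mapping. The helper below is a minimal sketch; the function name is invented for this example, and it does nothing beyond escaping the values and emitting one tag per element.

```python
from html import escape


def dublin_core_meta(fields):
    """Render a mapping of Dublin Core element names to values as <meta> tags."""
    tags = []
    for element, value in fields.items():
        tags.append(
            '<meta name="DC.{}" content="{}">'.format(escape(element), escape(value))
        )
    return "\n".join(tags)


print(dublin_core_meta({
    "Format": "video/mpeg; 10 minutes",
    "Language": "en",
    "Publisher": "publisher-name",
}))
```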

Link to generate DC meta tags: http://www.dublincoregenerator.com/generator_nq.html

As for DC in meta tags for SEO purposes: they are obsolete.

It was found that "using Dublin Core elements did not improve the retrieval rank of the web pages" and that "Dublin Core metadata, as a well-known metadata schema, is not widely accepted and used by search engine designers and the spiders do not consider its elements while ranking the web pages."

Google does NOT use them in its indexing, and neither Google's site nor any other search engine's site mentions Dublin Core in connection with indexing.

In the UK, government organisations use DC to provide standardised access to tags.

That's not to say Google, Bing, Yahoo, etc. will never implement them. Google is using more metadata and rich snippets these days.
