What is an ontology (database?)?

Posted 2024-08-26 11:16:03 · 356 words · 5 views · 0 comments

I was just reading this article and it mentions that some organization had an Ontology as(?) their database(?) layer, and that the decision to do this was bad. Problem is I hadn't heard about this before, so I can't understand why it's bad.

So I tried googling databases and ontology, and came across quite a few PDFs from 2006 that were full of incomprehensible content (to my mind). I read a few of these and at this point still have absolutely no idea what they are talking about.

My current impression is that it was some crazy fad of 2006 that some academics were trying to sell us, but that failed miserably because of how their ideas were worded. But I'm still curious whether anyone actually knows what this is all about.

Comments (9)

作死小能手 2024-09-02 11:16:03

Karussell already provided the Wikipedia definition:

"a formal representation of the
knowledge by a set of concepts within
a domain and the relationships between
those concepts".

In order to implement such a representation, several languages have been developed. The one that currently gets the most attention is probably the Web Ontology Language (OWL).

In a traditional relational database, concepts can be stored using tables, but the system does not contain any information about what the concepts mean and how they relate to each other. Ontologies do provide the means to store such information, which allows for a much richer way to store information. This also means that one can construct fairly advanced and intelligent queries. Query languages such as SPARQL have been developed specifically for this purpose.

For my master's thesis, I worked with OWL ontologies, but this was part of fairly academic research. I don't know whether any of this technology is currently used much in practice, but I'm sure the potential is there.

Update: example

An example of 'meaning' and reasoning over the ontologies: say you define in your ontology a class Pizza, and a class Vegetarian Pizza, which is a Pizza that has no Ingredients that belong to the class Meat. If you now create some instance of a Pizza that just happens not to have any meat ingredients, the system can automatically infer that your pizza is also a Vegetarian Pizza, even if you did not explicitly specify it.
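
The inference described above can be imitated in a few lines of plain Python. This is only a rough sketch, not how an OWL reasoner works internally: the `INGREDIENT_CLASS` table, the `Pizza` dataclass and the `inferred_classes` function are names made up for illustration. It also assumes the ingredient list of each pizza is complete (a closed world); a real OWL reasoner works under an open-world assumption and would generally need the ingredient list to be declared complete before drawing this conclusion.

```python
from dataclasses import dataclass, field

# Hypothetical ingredient taxonomy: ingredient name -> class it belongs to.
INGREDIENT_CLASS = {
    "mozzarella": "Cheese",
    "tomato": "Vegetable",
    "mushroom": "Vegetable",
    "ham": "Meat",
    "salami": "Meat",
}


@dataclass
class Pizza:
    name: str
    ingredients: list[str] = field(default_factory=list)


def inferred_classes(pizza: Pizza) -> set[str]:
    """Return the classes the pizza belongs to, including inferred ones.

    Assumes pizza.ingredients is the complete list (closed world);
    ingredients missing from INGREDIENT_CLASS are treated as non-meat.
    """
    classes = {"Pizza"}
    has_meat = any(INGREDIENT_CLASS.get(i) == "Meat" for i in pizza.ingredients)
    if not has_meat:
        # Nobody declared this explicitly; it follows from the definition.
        classes.add("VegetarianPizza")
    return classes


margherita = Pizza("margherita", ["mozzarella", "tomato"])
parma = Pizza("parma", ["mozzarella", "ham"])

print(inferred_classes(margherita))
print(inferred_classes(parma))
```

Neither pizza is ever labelled as vegetarian by hand; the membership falls out of the class definition, which is the point of the example.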

无人问我粥可暖 2024-09-02 11:16:03

An ontology is a schema (model) describing the types (and possibly some individuals) in a domain, the relationships that may exist between types and individuals, and constraints on the way that individuals and properties may be combined.

One analogy is with the UML class diagrams - but ontologies have formal semantics, so can be machine-interpreted, rather than just being diagrams for human consumption.

Example:

Classes: Project, Person, ProjectManager. ProjectManager is a subclass of Person (apparently). People and Projects are disjoint.

Relationships: worksOn, manages. manages is a sub-property of worksOn.

Constraints: People work on Projects, not the other way around. Only Project Managers can manage projects.

This simple example enables machine inferences, e.g. if X manages Y, then we can infer that Y is a Project, and X is a Project Manager and therefore a Person.
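
To make that inference concrete, here is a small sketch in plain Python. It is not an OWL reasoner; the dictionaries `SUBCLASS_OF`, `SUBPROPERTY_OF`, `DOMAIN` and `RANGE` and the `infer` function are illustrative names I made up, and the individuals `alice` and `apollo` are invented. The disjointness constraint is not checked here.

```python
# Hypothetical encoding of the schema from the example.
SUBCLASS_OF = {"ProjectManager": "Person"}
SUBPROPERTY_OF = {"manages": "worksOn"}
DOMAIN = {"worksOn": "Person", "manages": "ProjectManager"}  # type of the subject
RANGE = {"worksOn": "Project", "manages": "Project"}         # type of the object


def infer(facts):
    """Return (types, relations) entailed by (subject, property, object) facts."""
    relations = set(facts)

    # Sub-property rule: X manages Y implies X worksOn Y.
    changed = True
    while changed:
        changed = False
        for s, p, o in list(relations):
            parent = SUBPROPERTY_OF.get(p)
            if parent and (s, parent, o) not in relations:
                relations.add((s, parent, o))
                changed = True

    # Domain and range rules: using a property tells us the types involved.
    types = set()
    for s, p, o in relations:
        if p in DOMAIN:
            types.add((s, DOMAIN[p]))
        if p in RANGE:
            types.add((o, RANGE[p]))

    # Subclass rule: every ProjectManager is also a Person.
    changed = True
    while changed:
        changed = False
        for individual, cls in list(types):
            parent = SUBCLASS_OF.get(cls)
            if parent and (individual, parent) not in types:
                types.add((individual, parent))
                changed = True

    return types, relations


types, relations = infer({("alice", "manages", "apollo")})
print(sorted(types))
print(sorted(relations))
```

From the single stated fact, the sketch derives that `apollo` is a Project, that `alice` is a ProjectManager and a Person, and that `alice` works on `apollo`.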

依 靠 2024-09-02 11:16:03

At some point, AI researchers decided that if we want to build a system that can somehow think, we should enable it to know what we know about the world. In other words, they wanted to impose our own understanding of the world on computers by building a database containing information about, and concise definitions of, the concepts and entities we know. Such databases have been built with various algorithms, but they turned out not to be very precise. You should have a look at one that is known to be among the best, called CYC.
http://sw.opencyc.org/
Enter a few words in the box and see what you get back.
Best wishes

極樂鬼 2024-09-02 11:16:03

Once upon a time, I assigned this question as a task to a good developer, because my superior believed in ontologies. It never materialized into a clear answer, and my superior was fired some time later. I'm still curious.

My current understanding is that this is the idea of words in a natural language (or "entities") being connected to each other by different relations. We then generalize that idea to any DB entities. And basically, we end up with nothing interesting and no useful query language.

I may be wrong.

虐人心 2024-09-02 11:16:03

What about Wikipedia?

an ontology is a formal representation
of the knowledge by a set of concepts
within a domain and the relationships
between those concepts

See 'Domain ontologies' and this and that for more details.

余生共白头 2024-09-02 11:16:03

Some of the comments above seem a bit dismissive.
I've used an ontology database in a real product and it was the only way to solve the problem. An ontology can be used to create a database that can encompass the complexities of the real world much better than something like a relational database. More "information" than "data". It's especially good when the relationships are complex and the information set is large and incomplete.
Especially neat is the query mechanism in a good ontology database - it intelligently uses the schema/ontology (such as any class hierarchies) to return answers that would not be found otherwise.
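
As a rough illustration of a hierarchy-aware query (a sketch under my own assumptions, not the product mentioned above), compare an exact-type lookup with one that walks the class hierarchy. The vehicle classes, the individuals and the function names are all hypothetical.

```python
# Hypothetical class hierarchy: class -> direct superclass.
SUBCLASS_OF = {"Car": "Vehicle", "Truck": "Vehicle", "SportsCar": "Car"}

# Hypothetical data: each individual is stored with one asserted type only.
ASSERTED_TYPE = {
    "herbie": "Car",
    "bigfoot": "Truck",
    "kitt": "SportsCar",
    "rex": "Dog",
}


def is_subclass(cls, ancestor):
    """True if cls equals ancestor or lies below it in the hierarchy."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def instances_exact(cls):
    """Naive lookup: only individuals asserted with exactly this type."""
    return sorted(n for n, t in ASSERTED_TYPE.items() if t == cls)


def instances_of(cls):
    """Hierarchy-aware lookup: also returns instances of any subclass."""
    return sorted(n for n, t in ASSERTED_TYPE.items() if is_subclass(t, cls))


print(instances_exact("Vehicle"))
print(instances_of("Vehicle"))
```

Nothing is stored as a plain `Vehicle`, so the exact lookup comes back empty, while the hierarchy-aware lookup finds the car, the truck and the sports car.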

唠甜嗑 2024-09-02 11:16:03

Coming from the Biological Sciences, Ontology is a word that represents a really easy idea, but is defined with other less-commonly used words.

a formal representation of the knowledge by a set of concepts within a domain and the relationships between those concepts

  • A representation of knowledge, or a "model"
  • A domain, or "a topic"
  • A set of concepts, or "things in the domain"
  • A set of relationships between concepts

So, in computer science terms, it's a graph, where the nodes correspond to things which are all part of the same topic, are annotated with topic-related data, and are connected to other nodes with relationship annotated edges.

As it is a model that doesn't fit into relational databases well, if you intend to store an Ontology you might want to use a graph database, or one of the popular relational database graph storage techniques.

The primary reason ontologies haven't overtaken relational databases in all aspects is that relational databases provide a simple, if less flexible, means of connecting two items: the foreign key. While this key doesn't permit a lot of annotation to describe the relationship, it does limit the number of approaches to data structuring, preventing people from creating every kind of relationship (which thankfully means limiting the number of wasteful relationships).

For example, in a "family tree" database based on Ontologies

  • The domain is one family's tree
  • The model is the individuals and their relationships within the family tree.
  • The concepts are the people in the family.
  • The relationships would be the edges indicating "mother", "father", "brother", "sister", etc.

Now comes the tricky part. You have "mother" and "father", but what about "parent"? If you omit "parent", your lookup logic is more complex, so let's include a new relationship, "parent", which means the mother of a person now has two links, "mother" and "parent" (as does the father).

What about "grandparent"? Again, doing it logically leaves some of the information out of the database, but storing it increases the overhead of maintaining the database.
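
The "do it logically" option can be sketched as follows: only "mother" and "father" edges are stored, and "parent", "grandparent" and "sibling" are computed at query time. This is a minimal sketch with made-up names and people; the dictionaries stand in for whatever graph store you would actually use.

```python
# Hypothetical stored edges: child -> mother, child -> father.
MOTHER = {"carol": "alice", "dave": "alice", "erin": "carol"}
FATHER = {"carol": "bob", "dave": "bob", "erin": "frank"}


def parents(person):
    """Derived relationship: the union of the mother and father edges."""
    return {p for p in (MOTHER.get(person), FATHER.get(person)) if p is not None}


def grandparents(person):
    """Composed relationship: one extra round of lookups per parent."""
    return {g for p in parents(person) for g in parents(p)}


def siblings(person):
    """Anyone else who shares at least one parent with this person."""
    everyone = set(MOTHER) | set(FATHER)
    return {c for c in everyone if c != person and parents(c) & parents(person)}


print(parents("erin"))
print(grandparents("erin"))
print(siblings("carol"))
```

Nothing redundant has to be kept in sync, but every composed relationship costs extra lookups, which is exactly the trade-off against storing "parent" and "grandparent" edges directly.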

"uncle", "aunt", "in-law", "father-in-law", etc. all add in one new relationship, and the power behind Ontologies is that you are not constrained as to the kinds of relationships you wish to add; however, the difficulties lie in knowing which relationships directly impact the solution (and the general lack of performance if you don't store the relationships directly, as you need to do multiple database lookups to find a "composed relationship").

你的他你的她 2024-09-02 11:16:03

A long time ago, I used an ontology database developed at Stanford (Protege).

The idea was to keep track of references. Books had authors and quotes. A quote had a link to a book, along with a page number. An author had links to books, books had publisher, publication date, links to authors. Similarly for articles and videos.

The idea was to insert a quote, and have ready access to the attribution, so I no longer had to keep track of what book and page a quotation was found in, the next time I used it.
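
For readers who want to picture the model, here is a minimal sketch of it as Python dataclasses. It is not how Protege stores anything; the class names, fields and the `attribution` helper are my own illustrative choices, and the sample values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Author:
    name: str


@dataclass(frozen=True)
class Book:
    title: str
    authors: tuple[Author, ...]  # links to authors
    publisher: str
    year: int


@dataclass(frozen=True)
class Quote:
    text: str
    book: Book  # link to the source book
    page: int


def attribution(quote: Quote) -> str:
    """Follow the links from a quote to build its reference line."""
    names = ", ".join(a.name for a in quote.book.authors)
    book = quote.book
    return f"{names}, {book.title} ({book.publisher}, {book.year}), p. {quote.page}"


# Placeholder data, purely for illustration.
author = Author("Example Author")
book = Book("Example Title", (author,), "Example Press", 2000)
quote = Quote("An example quotation.", book, 42)

print(attribution(quote))
```

The quote only stores a link and a page number; everything else in the reference is reached by following links.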

The ontology database provided a superb way to model the data. But using it was another matter. It took more time to pull the parts of a reference out of the database than it took to copy a complete quotation-and-reference-info from a Word doc.

All it would take to make something like that really useful would be an integration into a word processor. (Ideally, you would add references more or less normally, but then save them for later re-use, along with a link to the location where you used them! :__)

禾厶谷欠 2024-09-02 11:16:03

I am a total layman, but it appears to me that artificial intelligence research has a 50-year history that goes round in cycles.

  1. Extravagant predictions by academics.
  2. Generous funding by government.
  3. Modest results are produced.
  4. Funding is cut savagely.
  5. Time passes. The previous cycle is forgotten. Return to step 1.

We've been round the cycle twice. Possibly this time it will be different...?
