Taxonomy hierarchy representation format
We are planning to integrate a hierarchical taxonomy into our (Java-based) software solution.
Is there a standardized (and easy to use) format to represent hierarchical taxonomies? A format which would be the common exchange format used by different taxonomy editors?
I have been looking at OWL (RDF), PMML... but those are either quite complex or do not really seem to fit this purpose.
To give a simple example: we would like to represent a tree of concepts. Attached to each concept there would be some kind of data object (shown in brackets):
Vehicles (category := 'V')
|-> Car (code := 1)
| |-> Petrol (code := 2 && car_code := 'petrol')
| |-> Electrical (code := 2 && car_code := 'electrical')
|-> Plane (code := 1)
We could of course develop our own XML format using a serialization library like XStream. But if there is a good standard that is well supported in Java, I would prefer to use it.
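For concreteness, the tree above could be held in memory roughly as follows before any exchange format is chosen. This is only a minimal sketch: the class and method names (ConceptNode, with, add, render) are hypothetical and do not come from any standard or library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical concept node: a label, attached key/value data, and child concepts. */
final class ConceptNode {
    final String label;
    final Map<String, Object> data = new LinkedHashMap<>();
    final List<ConceptNode> children = new ArrayList<>();

    ConceptNode(String label) {
        this.label = label;
    }

    /** Attaches one data entry and returns this node so calls can be chained. */
    ConceptNode with(String key, Object value) {
        data.put(key, value);
        return this;
    }

    /** Adds a child concept and returns this node so calls can be chained. */
    ConceptNode add(ConceptNode child) {
        children.add(child);
        return this;
    }

    /** Renders the subtree as one indented line per concept. */
    String render() {
        StringBuilder out = new StringBuilder();
        render(out, 0);
        return out.toString();
    }

    private void render(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(label).append(' ').append(data).append('\n');
        for (ConceptNode child : children) {
            child.render(out, depth + 1);
        }
    }
}

final class ConceptTreeSketch {
    /** Builds the Vehicles example from the question. */
    static ConceptNode vehicles() {
        return new ConceptNode("Vehicles").with("category", "V")
            .add(new ConceptNode("Car").with("code", 1)
                .add(new ConceptNode("Petrol").with("code", 2).with("car_code", "petrol"))
                .add(new ConceptNode("Electrical").with("code", 2).with("car_code", "electrical")))
            .add(new ConceptNode("Plane").with("code", 1));
    }

    public static void main(String[] args) {
        System.out.print(vehicles().render());
    }
}
```

Whatever format is picked would need to round-trip both the parent/child links and the per-node data map.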
Comments (3)
You are looking for SKOS, the Simple Knowledge Organization System (see the SKOS Namespace Document).
SKOS is an ontology for representing taxonomies, hierarchies and thesauri. It is built around the broader and narrower properties, which state the relationships between terms.
You can represent your taxonomy with SKOS, serialize it as RDF and load it into an RDF database. To query it and retrieve hierarchy trees, use the SPARQL language.
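As a rough illustration of the SKOS approach, the sketch below writes the Vehicles tree from the question as RDF in Turtle syntax, linking each concept to its parent with skos:broader. The SKOS namespace URI and the skos:Concept, skos:prefLabel and skos:broader terms are standard; the ex: namespace, the class name SkosTurtleSketch and its methods are assumptions made up for this example. The Turtle is built with plain string handling rather than an RDF library, so labels are assumed to contain no quotes or backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SkosTurtleSketch {
    // The SKOS core namespace is standard; "ex:" is a made-up placeholder namespace.
    static final String PREFIXES =
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
        + "@prefix ex: <http://example.org/taxonomy#> .\n\n";

    // SPARQL 1.1 property path: every concept below ex:Vehicles, at any depth.
    // Running it requires a SPARQL engine, which this sketch does not include.
    static final String DESCENDANTS_QUERY =
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        + "PREFIX ex: <http://example.org/taxonomy#>\n"
        + "SELECT ?concept WHERE { ?concept skos:broader+ ex:Vehicles }";

    /** Emits one skos:Concept; a null parent marks the root of the hierarchy. */
    static String concept(String id, String label, String parentId) {
        StringBuilder ttl = new StringBuilder();
        ttl.append("ex:").append(id).append(" a skos:Concept ;\n");
        ttl.append("    skos:prefLabel \"").append(label).append("\"@en");
        if (parentId != null) {
            ttl.append(" ;\n    skos:broader ex:").append(parentId);
        }
        ttl.append(" .\n");
        return ttl.toString();
    }

    /** Serializes a child-to-parent map (insertion order preserved) as Turtle. */
    static String toTurtle(Map<String, String> parentOf) {
        StringBuilder ttl = new StringBuilder(PREFIXES);
        for (Map.Entry<String, String> entry : parentOf.entrySet()) {
            // The concept id doubles as its English label in this sketch.
            ttl.append(concept(entry.getKey(), entry.getKey(), entry.getValue()));
        }
        return ttl.toString();
    }

    /** The Vehicles example from the question as child -> parent pairs. */
    static Map<String, String> vehicles() {
        Map<String, String> parentOf = new LinkedHashMap<>();
        parentOf.put("Vehicles", null);
        parentOf.put("Car", "Vehicles");
        parentOf.put("Petrol", "Car");
        parentOf.put("Electrical", "Car");
        parentOf.put("Plane", "Vehicles");
        return parentOf;
    }

    public static void main(String[] args) {
        System.out.print(toTurtle(vehicles()));
    }
}
```

The data objects from the question (code, car_code) are left out here; they could probably be attached with skos:notation or with custom properties, depending on what the consuming tools expect.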
[Apologies for using a reply for what should be a comment on the question. It is just that the comment format is not suitable for this kind of "question redirect".]
While the question appears to be about a format to represent taxonomy hierarchies, the references to OWL, RDF and PMML point toward ontology solutions. Also, the perceived complexity of these ontology formats may be a sign that a simpler approach is warranted.
In a nutshell, you need to determine whether you really need an ontology framework rather than a taxonomy framework. It is easy to confuse these two related concepts, but in many instances a more flexible DBMS or even a simple XML-based schema descriptor is all that is required.
For example, to perform guided searches through catalogs of heterogeneous items, an EAV (entity-attribute-value) database back-end with a relatively simple hierarchical schema model can "fit the bill".
Or, to support/validate some entity extraction logic, a simple taxonomy where the leaf nodes contain the accepted texts may be enough.
On the other hand, if some reasoning on the basis of the schema is required, or for, say, fancy data mining efforts whereby the ontology drives the data-gathering bots, then you may effectively be talking about a semantic web / ontology application.
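To make the simpler end of that spectrum concrete, here is a minimal sketch of the entity-extraction case mentioned above: a taxonomy whose leaves carry the texts they accept, used to validate an extracted string. The class name AcceptedTextTaxonomy, the path notation and the accepted texts are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class AcceptedTextTaxonomy {
    // Maps each accepted text (trimmed, lower-cased) to the path of the leaf that owns it.
    private final Map<String, String> leafPathByText = new HashMap<>();

    /** Registers a leaf, identified by its path, together with the texts it accepts. */
    void addLeaf(String path, String... acceptedTexts) {
        for (String text : acceptedTexts) {
            leafPathByText.put(normalize(text), path);
        }
    }

    /** Returns the leaf path for an extracted text, or null if no leaf accepts it. */
    String classify(String extractedText) {
        return leafPathByText.get(normalize(extractedText));
    }

    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }

    /** Illustrative data only: the accepted texts below are made up. */
    static AcceptedTextTaxonomy vehicles() {
        AcceptedTextTaxonomy taxonomy = new AcceptedTextTaxonomy();
        taxonomy.addLeaf("Vehicles/Car/Petrol", "petrol", "gasoline");
        taxonomy.addLeaf("Vehicles/Car/Electrical", "electrical", "electric");
        taxonomy.addLeaf("Vehicles/Plane", "plane", "airplane");
        return taxonomy;
    }

    public static void main(String[] args) {
        AcceptedTextTaxonomy taxonomy = vehicles();
        System.out.println(taxonomy.classify(" Gasoline "));
        System.out.println(taxonomy.classify("boat"));
    }
}
```

Nothing here needs an ontology framework; a lookup table plus a path per leaf covers it, which is the point of separating the two kinds of requirement.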
Bioinformaticians use the OBO file format ( http://www.geneontology.org/GO.format.obo-1_2.shtml ) to store some well-known ontologies such as the Gene Ontology (a directed graph ontology). It comes with a Java parser: http://www.geneontology.org/GO.java.obo.parser.shtml
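To give a feel for how simple the format is, the sketch below reads the [Term] stanzas of an OBO document and keeps only the id, name and is_a tags. It is not the parser linked above, just a hand-rolled illustration; the class name OboSketch and the VEH: identifiers are invented, and escaping rules, the header and all other tags are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OboSketch {
    /** One [Term] stanza reduced to the three tags this sketch understands. */
    static final class Term {
        String id;
        String name;
        final List<String> parents = new ArrayList<>();
    }

    /** Parses [Term] stanzas; the header, other stanza types and unknown tags are skipped. */
    static Map<String, Term> parse(String obo) {
        Map<String, Term> terms = new LinkedHashMap<>();
        Term current = null;
        for (String raw : obo.split("\n")) {
            String line = raw.trim();
            if (line.startsWith("[")) {
                // A new stanza starts; only [Term] stanzas are collected.
                current = line.equals("[Term]") ? new Term() : null;
                continue;
            }
            int colon = line.indexOf(':');
            if (current == null || colon < 0) {
                continue;
            }
            String tag = line.substring(0, colon);
            String value = line.substring(colon + 1).trim();
            // Simplification: treat everything after '!' as a trailing comment.
            int bang = value.indexOf('!');
            if (bang >= 0) {
                value = value.substring(0, bang).trim();
            }
            if (tag.equals("id")) {
                current.id = value;
                terms.put(value, current);
            } else if (tag.equals("name")) {
                current.name = value;
            } else if (tag.equals("is_a")) {
                current.parents.add(value);
            }
        }
        return terms;
    }

    // Made-up sample in OBO 1.2 style, based on the Vehicles tree from the question.
    static final String SAMPLE =
        "format-version: 1.2\n\n"
        + "[Term]\nid: VEH:0000001\nname: Vehicles\n\n"
        + "[Term]\nid: VEH:0000002\nname: Car\nis_a: VEH:0000001 ! Vehicles\n";

    public static void main(String[] args) {
        for (Term term : parse(SAMPLE).values()) {
            System.out.println(term.id + " " + term.name + " is_a " + term.parents);
        }
    }
}
```

Because is_a can appear more than once per term, the format describes a graph rather than a strict tree, which may or may not matter for a plain taxonomy.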