XML to XML transformation in Java
I need to translate from XMI to OWL (serialized as RDF/XML) in Java. Essentially this is an XML-to-XML translation, so I could probably just play with regexes and use replaceAll to get what I need, but that seems a very messy way to do it.
What would you suggest so that it will be easy to customize later (my OWL model might change slightly in the future)?
My idea was to read the XMI into a class hierarchy created according to my OWL model and then use a template engine to output it as OWL (XML). Do you know of an easier approach that would still be easy to customize?
Comments (5)
XSL Transformations (XSLT) is perfect for this kind of job; in fact, it was designed for it :-)
To get started with XSLT, have a look at the zvon reference and its tutorial.
You could use XSLT to transform XML to XML.
This O'Reilly article is a good place to start.
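As a rough illustration of what running XSLT from Java can look like, here is a minimal sketch that uses the JDK's built-in javax.xml.transform API. The input format (a model element containing class elements), the stylesheet, and the names XsltRunner, TO_OWL_XSLT and transform are all made up for this example; real XMI is far more verbose and the real mapping to OWL would be richer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltRunner {
    // Hypothetical stylesheet: maps each <class name="..."/> of a made-up,
    // simplified model format to an owl:Class element.
    static final String TO_OWL_XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
        + " xmlns:owl='http://www.w3.org/2002/07/owl#'>"
        + "<xsl:template match='/model'>"
        + "<rdf:RDF><xsl:apply-templates select='class'/></rdf:RDF>"
        + "</xsl:template>"
        + "<xsl:template match='class'>"
        + "<owl:Class rdf:about='#{@name}'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet text to the XML text and returns the result. */
    static String transform(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Simplified stand-in for an XMI document (an assumption of this sketch).
        String xml = "<model><class name='Person'/><class name='Address'/></model>";
        System.out.println(transform(TO_OWL_XSLT, xml));
    }
}
```

In practice the stylesheet would more likely live in its own .xsl file and be loaded through a StreamSource built from a File, so that the mapping can be changed when the OWL model changes without recompiling the Java code.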
XMI is not a very good format for direct transformation into OWL: there are many different structures in XMI that have the same meaning (@stereotype="foo", stereotype/@name="foo", and stereotype/@xmi:id="{id of the foo stereotype}" all mean the same thing). I strongly advise using a two-stage process in which the XMI is first transformed into a canonical form where such references are resolved and any information you don't want to map into OWL is removed.
The XSLT key() function and xsl:key element will prove useful, if you're not already familiar with them. Although you can do it in XSLT 1.0 (and I did when nothing else was available), working in an XSLT 2.0 processor such as Saxon makes the transform much more concise. The best place to ask XSLT questions is the Mulberry list.
There was a tool on SourceForge that did this through a GUI, but I can't seem to find it. My intermediate transforms are owned by a previous employer. For code generation or XMI to XML, I use XSLT directly with the two-stage approach.
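The two-stage idea can be sketched in Java roughly as follows. This is only an illustration under loud assumptions: the input is a made-up, namespace-free stand-in for XMI (with a plain idref attribute in place of real xmi:id references), both stylesheets are invented, the mapping of a stereotype to rdfs:subClassOf is arbitrary, and the names TwoStageXmiToOwl, apply and convert are hypothetical. Stage 1 uses xsl:key and key() to resolve all three ways of naming a stereotype into one canonical attribute; stage 2 then only has to handle that one form.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TwoStageXmiToOwl {
    // Hypothetical, simplified stand-in for XMI: three classes refer to the
    // same stereotype by attribute, by nested name, and by id reference.
    static final String SAMPLE_XMI =
        "<model>"
        + "<stereotype id='s1' name='foo'/>"
        + "<class name='A' stereotype='foo'/>"
        + "<class name='B'><stereotype name='foo'/></class>"
        + "<class name='C'><stereotype idref='s1'/></class>"
        + "</model>";

    // Stage 1: resolve every variant into one canonical stereotype attribute.
    static final String CANONICALIZE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:key name='stereotypeById' match='model/stereotype' use='@id'/>"
        + "<xsl:template match='/model'>"
        + "<canonical><xsl:apply-templates select='class'/></canonical>"
        + "</xsl:template>"
        + "<xsl:template match='class'>"
        + "<class name='{@name}'>"
        + "<xsl:attribute name='stereotype'>"
        + "<xsl:choose>"
        + "<xsl:when test='@stereotype'><xsl:value-of select='@stereotype'/></xsl:when>"
        + "<xsl:when test='stereotype/@name'><xsl:value-of select='stereotype/@name'/></xsl:when>"
        + "<xsl:otherwise>"
        + "<xsl:value-of select=\"key('stereotypeById', stereotype/@idref)/@name\"/>"
        + "</xsl:otherwise>"
        + "</xsl:choose>"
        + "</xsl:attribute>"
        + "</class>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Stage 2: map the canonical form to OWL (the mapping itself is arbitrary).
    static final String TO_OWL_XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
        + " xmlns:rdfs='http://www.w3.org/2000/01/rdf-schema#'"
        + " xmlns:owl='http://www.w3.org/2002/07/owl#'>"
        + "<xsl:template match='/canonical'>"
        + "<rdf:RDF><xsl:apply-templates select='class'/></rdf:RDF>"
        + "</xsl:template>"
        + "<xsl:template match='class'>"
        + "<owl:Class rdf:about='#{@name}'>"
        + "<rdfs:subClassOf rdf:resource='#{@stereotype}'/>"
        + "</owl:Class>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies one stylesheet to one XML string and returns the result. */
    static String apply(String xslt, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    /** Runs stage 1 (canonicalize) and feeds its output into stage 2 (OWL). */
    static String convert(String xmi) {
        return apply(TO_OWL_XSLT, apply(CANONICALIZE_XSLT, xmi));
    }

    public static void main(String[] args) {
        System.out.println(convert(SAMPLE_XMI));
    }
}
```

Keeping the two stylesheets separate means that a change to the OWL model should mostly touch the second one, while quirks of the XMI exporter stay confined to the first.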
I agree with rsp and cb160 that XSLT is the tool for the job.
If you're on a Unix platform, you could use xsltproc to test the transformations on the command line. In my experience, that can really speed up development if you're not yet at home with XSL.
XSLT is designed for processing trees of XML nodes. Although some RDF serializations are a "tree" of XML nodes (RDF/XML and RDF/XML-Abbrev), the underlying RDF data model is a graph.
If your resulting RDF graph is not also a tree, you're going to have to do dirty things in your XSLT to traverse references, and performance, maintainability and sanity can suffer. Just be aware of this if you modify the OWL format and then want to convert back to non-RDF XML.
A simple (tree) example is as follows:
For conversions back to non-RDF XML, if you use the most basic RDF/XML form you will get a list of RDF statements immediately under the top-level rdf:RDF element. Transforming these can involve searching the entire list of statements over and over.
You might find the RDF/XML-Abbrev format easier to read, but it is not easy to process with XSLT because RDF's data model is unordered and one graph can have many equivalent XML forms that are incompatible with your XSLT. The example above can serialize as either of the following:
Pete Kirkham's suggestion of creating a canonical form for serialization will aid you in writing XSLTs. In most cases, given the exact same input, an RDF library will serialize the statements to the same format every time, but I would not depend on this in the long run, as data in an RDF graph is unordered.
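To make the point about equivalent serializations concrete, here is a small illustrative sketch in Java. The two documents are invented for this example (they are not the examples from the answer above) and use a hypothetical http://example.org/ vocabulary; both encode the same two statements, once as flat rdf:Description nodes and once as an abbreviated typed node. The class name RdfXmlForms and the helper count are also made up. The XPath expressions stand in for the match patterns an XSLT stylesheet would use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RdfXmlForms {
    static final String NAMESPACES =
        " xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
        + " xmlns:ex='http://example.org/'";

    // Basic form: one flat list of statements under rdf:RDF.
    static final String FLAT =
        "<rdf:RDF" + NAMESPACES + ">"
        + "<rdf:Description rdf:about='http://example.org/alice'>"
        + "<rdf:type rdf:resource='http://example.org/Person'/>"
        + "</rdf:Description>"
        + "<rdf:Description rdf:about='http://example.org/alice'>"
        + "<ex:name>Alice</ex:name>"
        + "</rdf:Description>"
        + "</rdf:RDF>";

    // Abbreviated form of the same graph: the rdf:type statement is folded
    // into the element name (a typed node element).
    static final String ABBREV =
        "<rdf:RDF" + NAMESPACES + ">"
        + "<ex:Person rdf:about='http://example.org/alice'>"
        + "<ex:name>Alice</ex:name>"
        + "</ex:Person>"
        + "</rdf:RDF>";

    // Paths a stylesheet written against one particular form might rely on.
    static final String TYPED_NODES = "count(/*/*[local-name()='Person'])";
    static final String DESCRIPTIONS = "count(/*/*[local-name()='Description'])";

    /** Evaluates a count(...) XPath expression against an XML string. */
    static int count(String xpath, String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for local-name() to drop the prefix
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Double n = (Double) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // The same graph, but each path only finds something in one form.
        System.out.println("typed nodes, abbrev: " + count(TYPED_NODES, ABBREV));
        System.out.println("typed nodes, flat:   " + count(TYPED_NODES, FLAT));
        System.out.println("descriptions, flat:  " + count(DESCRIPTIONS, FLAT));
        System.out.println("descriptions, abbrev:" + count(DESCRIPTIONS, ABBREV));
    }
}
```

A stylesheet that matches ex:Person elements would silently produce nothing for the flat document, which is why settling on one canonical serialization before applying XSLT tends to pay off.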