How should I manage different incompatible formats of XML-based documents?
I have an application which saves documents (think Word documents) in an XML-based format. Currently, C# classes generated from xsd files are used for reading and writing the document format, and all was well until recently, when I had to make a change to the format of the document. My concern is with backwards compatibility: future versions of my application need to be able to read documents saved by all previous versions, and ideally I also want older versions of my app to gracefully handle documents saved by future versions.
For example, supposing I change the schema of my document to add an (optional) extra element somewhere, then older versions of my application will simply ignore the extra element and there will be no problems:
<doc>
<!-- Existing document -->
<myElement>Hello World!</myElement>
</doc>
However, if a breaking change is made (an attribute is changed into an element, for example, or into a collection of elements), then past versions of my app should either ignore this element if it is optional, or otherwise inform the user that they are attempting to read a document saved with a newer version of my app. This is also currently causing me headaches, as all future versions of my app need entirely separate code for reading the two different documents.
An example of such a change would be the following xml:
<doc>
<!-- Existing document -->
<someElement contents="12" />
</doc>
Changing to:
<doc>
<!-- Existing document -->
  <someElement>
    <contents>12</contents>
    <contents>13</contents>
  </someElement>
</doc>
To prevent support headaches, I want to come up with a decent strategy for handling changes I might make in the future, so that the versions of my app I release now will be able to cope with them:
- Should the "version number" of the document be stored in the document itself, and if so what versioning strategy should be used? Should the document version match the .exe assembly version, or should a more complex strategy be used (for example, a major revision change indicates breaking changes, whereas minor revision increments indicate non-breaking changes such as extra optional elements)?
- What method should I use to read the document itself and how do I avoid replicating massive amounts of code for different versions of documents?
- Although XPath is obviously most flexible, it is a lot more work to implement than simply generating classes with xsd.
- On the other hand if DOM parsing is used then a new copy of the document xsd would be needed in source control for each breaking change, causing problems if fixes ever need to be applied to older schemas (old versions of the app are still supported).
Also, I've worked all of this out very loosely on the assumption that all changes I make can be split into these two categories of "breaking changes" and "non-breaking changes", but I'm not entirely convinced that this is a safe assumption to make.
Note that I use the term "document" very loosely - the contents don't resemble a document at all!
Thanks for any advice you can offer me.
Comments (3)
You definitely need a version number in the XML file, and I would suggest not tying it to the version of the application because it's really a separate entity. You may go through two or three versions of your app without ever changing the XML format, or you may wind up changing the format multiple times during development of a single release.
If you want older versions of the application to be able to read newer versions of the XML file then you can never, ever remove elements or change their names. You can always add elements and the older code will happily ignore them (one of the nice features of XML) but if you remove them then the old code won't be able to function.
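As a rough illustration of why additions are harmless, here is a minimal sketch in Python (chosen only for brevity; `read_greeting` and the `<addedLater>` element are hypothetical names). It assumes the old reader looks elements up by name; a reader that validates strictly against the old schema would not be this tolerant.

```python
import xml.etree.ElementTree as ET

def read_greeting(xml_text):
    """Read only the element this (hypothetical) old reader knows about."""
    root = ET.fromstring(xml_text)
    # findtext returns None when the element is absent;
    # siblings the reader does not know about are simply never looked at.
    return root.findtext("myElement")

# Document as written by the old version of the app.
old_doc = "<doc><myElement>Hello World!</myElement></doc>"
# Document as written by a newer version that added an optional element.
new_doc = "<doc><myElement>Hello World!</myElement><addedLater>42</addedLater></doc>"

print(read_greeting(old_doc))
print(read_greeting(new_doc))
```

Both calls return the same text, because the added element never enters the lookup. Had `<myElement>` been removed or renamed instead, the old reader would get `None` and have nothing to work with.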
Like Ishmael said, XSLT is a good way to convert the XML format from one version to another so that you don't wind up with a whole pile of parsing routines in your source code.
XSLT is an obvious choice here. Given that you can identify the version of your document, for each version of your schema, create an XSLT that transforms the previous version to your new version.
You can apply the transforms in sequence until you reach the current version. Thus you are only ever editing the latest document version. Of course, you will be unable to save to the old format and can break the document for older versions, but this is typical of many applications. If you absolutely need to save to the old version, just create a transform that goes the other way.
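The "apply the transforms in sequence" idea can be sketched as follows. This is not XSLT: plain Python functions stand in for the stylesheets so the example runs with the standard library alone. The `version` attribute on the root, the `MIGRATIONS` table and the rule that an unversioned file counts as version 1 are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def v1_to_v2(root):
    """Move the 'contents' attribute of each <someElement> into a child element."""
    for elem in list(root.iter("someElement")):
        value = elem.attrib.pop("contents", None)
        if value is not None:
            ET.SubElement(elem, "contents").text = value
    return root

# Hypothetical registry: source version -> function producing the next version.
MIGRATIONS = {1: v1_to_v2}
CURRENT_VERSION = 2

def upgrade(xml_text):
    """Parse a document and migrate it step by step to CURRENT_VERSION."""
    root = ET.fromstring(xml_text)
    # Assumption: files written before versioning existed count as version 1.
    version = int(root.get("version", "1"))
    if version > CURRENT_VERSION:
        raise ValueError(
            f"document version {version} is newer than supported version {CURRENT_VERSION}"
        )
    while version < CURRENT_VERSION:
        root = MIGRATIONS[version](root)
        version += 1
    root.set("version", str(version))
    return root

upgraded = upgrade('<doc><someElement contents="12" /></doc>')
print(ET.tostring(upgraded, encoding="unicode"))
```

With this shape, the rest of the application only ever sees the latest layout, and each new format version costs one extra migration step rather than another full parser.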
Like @Andy says, use the major build number of your app.
Could you add an attribute to the root element specifying version?
That way older versions won't be broken, and newer versions of your software will see the attribute and switch to a different loading method appropriately.
Version numbering itself would depend on your frequency of release. I would personally go with the major build number from your software, unless you foresee the format changing more often than that.
Edit: just noticed the bit about code duplication:
For that I would use the Factory Pattern, something like this:
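A minimal sketch of such a factory, written in Python for brevity. The class names, the `LOADERS` table and the `version` attribute on the root are hypothetical, and a missing attribute is assumed to mean the file predates versioning.

```python
import xml.etree.ElementTree as ET

class V1Loader:
    """Reads the old layout: <someElement contents="12" />."""
    def load(self, root):
        elem = root.find("someElement")
        return [int(elem.get("contents"))]

class V2Loader:
    """Reads the new layout: <someElement><contents>12</contents>...</someElement>."""
    def load(self, root):
        return [int(c.text) for c in root.findall("someElement/contents")]

# Hypothetical mapping from format version to loader class.
LOADERS = {1: V1Loader, 2: V2Loader}

def create_loader(root):
    """Factory: pick the loader that matches the document's version attribute."""
    # Assumption: a missing attribute means the file predates versioning (version 1).
    version = int(root.get("version", "1"))
    try:
        return LOADERS[version]()
    except KeyError:
        # Lets an older app tell the user the file came from a newer release.
        raise ValueError(f"unsupported document version {version}") from None

def load_document(xml_text):
    root = ET.fromstring(xml_text)
    return create_loader(root).load(root)

print(load_document('<doc><someElement contents="12" /></doc>'))
print(load_document(
    '<doc version="2"><someElement><contents>12</contents>'
    '<contents>13</contents></someElement></doc>'
))
```

Every loader returns the same in-memory shape (here just a list of integers), so only the parsing differs per version and the rest of the code is shared.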