Best practices for converters
What best practices and design patterns would you advise when creating a C# component that converts files from one format to another? For example, PDF to HTML or Word to HTML.
Comments (1)
Don't use files. Take base Stream references as inputs and outputs so you can bring data in from anywhere and write it anywhere (files, databases, network connections, memory, etc.).
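As a rough illustration of the Stream-based approach, here is a minimal sketch. The names IFormatConverter, UpperCaseConverter and Example are made up for this example, and the "conversion" (upper-casing UTF-8 text) is only a stand-in for real format handling.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical interface: the converter only ever sees streams, never file paths.
public interface IFormatConverter
{
    void Convert(Stream input, Stream output);
}

// Toy converter for illustration only: upper-cases UTF-8 text.
public sealed class UpperCaseConverter : IFormatConverter
{
    public void Convert(Stream input, Stream output)
    {
        // leaveOpen: true, so the caller keeps ownership of both streams.
        using var reader = new StreamReader(input, Encoding.UTF8, false, 1024, leaveOpen: true);
        using var writer = new StreamWriter(output, new UTF8Encoding(false), 1024, leaveOpen: true);
        writer.Write(reader.ReadToEnd().ToUpperInvariant());
    }
}

public static class Example
{
    // The same converter works purely in memory...
    public static string ConvertInMemory(IFormatConverter converter, string text)
    {
        using var input = new MemoryStream(Encoding.UTF8.GetBytes(text));
        using var output = new MemoryStream();
        converter.Convert(input, output);
        return Encoding.UTF8.GetString(output.ToArray());
    }

    // ...or against files, without the converter knowing the difference.
    public static void ConvertFile(IFormatConverter converter, string sourcePath, string targetPath)
    {
        using var input = File.OpenRead(sourcePath);
        using var output = File.Create(targetPath);
        converter.Convert(input, output);
    }
}
```

Because the caller owns the streams, the same converter can be pointed at a network stream or a database blob just as easily.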
If there is any chance you will ever want to do more than one format conversion (and in my experience it's very common for software to grow in scope, usually before it's released, but often soon after release), design an extensible internal representation that supports all the features of the other formats that are currently important or relevant to you, and use a pipeline approach that applies a double conversion (PDF -> MyFormat and then MyFormat -> HTML). That way, when you decide you also wish to convert from Word, you only have to write a Word -> MyFormat conversion to achieve Word -> HTML. This can cost very little to implement in the first place, and can give massive gains in robustness and implementation cost for each successive format you add support for.
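To make the double-conversion idea concrete, here is a small sketch. Everything in it is an assumption for illustration: Document plays the role of "MyFormat", PlainTextReader stands in for a real PDF or Word reader, and HtmlWriter emits only bare paragraphs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical intermediate representation ("MyFormat").
public sealed class Document
{
    public List<string> Paragraphs { get; } = new List<string>();
}

// One reader per source format, one writer per target format.
public interface IDocumentReader { Document Read(Stream input); }
public interface IDocumentWriter { void Write(Document document, Stream output); }

// Stand-in reader: each non-empty line of plain text becomes a paragraph.
public sealed class PlainTextReader : IDocumentReader
{
    public Document Read(Stream input)
    {
        var doc = new Document();
        using var reader = new StreamReader(input, Encoding.UTF8, false, 1024, leaveOpen: true);
        while (reader.ReadLine() is string line)
        {
            var text = line.Trim();
            if (text.Length > 0) doc.Paragraphs.Add(text);
        }
        return doc;
    }
}

// Writer for the target format: HTML-encodes each paragraph.
public sealed class HtmlWriter : IDocumentWriter
{
    public void Write(Document document, Stream output)
    {
        using var writer = new StreamWriter(output, new UTF8Encoding(false), 1024, leaveOpen: true);
        foreach (var paragraph in document.Paragraphs)
            writer.Write("<p>" + WebUtility.HtmlEncode(paragraph) + "</p>\n");
    }
}

public static class ConversionPipeline
{
    // Source -> Document -> target. Any reader can be paired with any writer.
    public static void Convert(IDocumentReader reader, IDocumentWriter writer,
                               Stream input, Stream output)
    {
        writer.Write(reader.Read(input), output);
    }
}
```

With N readers and M writers you get N x M conversions for the cost of N + M implementations, which is where the savings come from as formats are added.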
Try to write flexible code (consider how formats might change in future). When the next PDF, Word or HTML format comes out, how easily will you be able to upgrade the code to support the changes it introduces? By this I mean: try not to build arbitrary restrictions into the conversion, such as assuming that because the current format allows only 256 fonts, you can use a byte to store a font index. Leave a bit of room for expansion.
Build a progress reporting system into the design from day one, so it's easy to show a progress bar during the conversion. Also make sure your design doesn't preclude batch-processing hundreds of files in one go, even if converting a single file is all you need now.
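One possible shape for this, sketched with the standard IProgress<T> interface. BatchConverter and CallbackProgress are hypothetical names, and the per-file "conversion" is just a chunked copy standing in for real work.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical synchronous IProgress<T>; handy for console tools and tests.
// (The built-in Progress<T> instead posts callbacks to the captured
// SynchronizationContext, which is usually what a UI wants.)
public sealed class CallbackProgress<T> : IProgress<T>
{
    private readonly Action<T> _callback;
    public CallbackProgress(Action<T> callback) => _callback = callback;
    public void Report(T value) => _callback(value);
}

public static class BatchConverter
{
    // Converts one stream and reports a fraction in [0, 1].
    public static void ConvertOne(Stream input, Stream output, IProgress<double> progress)
    {
        var buffer = new byte[4096];
        long total = input.CanSeek ? input.Length - input.Position : -1;
        long done = 0;
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read); // a real converter would transform the data here
            done += read;
            if (total > 0) progress.Report((double)done / total);
        }
        // Unknown or zero length: report completion once at the end.
        if (total <= 0) progress.Report(1.0);
    }

    // Converts many streams and folds per-file progress into one overall fraction.
    public static void ConvertAll(IReadOnlyList<(Stream Input, Stream Output)> jobs,
                                  IProgress<double> overall)
    {
        for (int i = 0; i < jobs.Count; i++)
        {
            int index = i; // copy for the closure below
            var perFile = new CallbackProgress<double>(
                fraction => overall.Report((index + fraction) / jobs.Count));
            ConvertOne(jobs[i].Input, jobs[i].Output, perFile);
        }
    }
}
```

Nothing here knows about a progress bar: a UI can pass a Progress<double> bound to a control, while a batch job can pass a logger or a no-op.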
Keep the conversion code completely separate from the application that drives it; make sure the business logic (conversion) is totally divorced from any UI. The conversion(s) should be totally separate re-usable module(s), and capable of running in a batch-processing mode where there is no user interaction.