What is the best way to generalize and aggregate XML dumps in C#?

Posted on 2024-10-08 01:16:55

Here is the business part of the issue:

  • Several different companies each send an XML dump of the
    information to be processed.
  • The information sent by each company is similar, but not
    exactly the same.
  • Several more companies will soon be enlisted and will start
    sending information.

Now, the technical part of the problem: I want to write a generic solution in C# that takes in this information for processing. I will be transforming the XML in my C# class(es) to fit my database model.

Is there a pattern or solution that handles this generically, so that I do not need to change my solution when many more companies are added later?

What would be the best approach to writing my parser/transformer?

Comments (6)

酒儿 2024-10-15 01:16:55

This is how I have done something similar in the past.

As long as each company has its own fixed format which they use for their XML dump,

  1. Have a specific XSLT for each company.
  2. Have a way of indicating which dump comes from which company (for example, a separate DUMP folder for each company).
  3. In your program, based on 2, select 1 and apply it to the DUMP.
  4. All the XSLTs transform the XML to your one standard database schema.
  5. Save the result to your DB.

Each new company then costs at most one new XSLT.
Where the schemas are very similar, an existing XSLT can be reused and only the company-specific changes made to it.

Drawback to this approach: debugging XSLTs can be a bit more painful if you do not have the right tools. However, many XML editors (e.g. XML Spy) have excellent XSLT debugging capabilities.
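Steps 1 to 4 might look roughly like the sketch below. It is only an illustration under assumptions: `CompanyXsltRunner`, the company key, and the idea of registering stylesheets from text are all made up for the example; in practice the key could be the dump folder name and the stylesheets could live in files. The only library calls used are `XslCompiledTransform.Load` and `XslCompiledTransform.Transform`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper: maps a company key (e.g. the dump folder name) to a
// compiled stylesheet and applies it to an incoming dump.
public sealed class CompanyXsltRunner
{
    private readonly Dictionary<string, XslCompiledTransform> _transforms =
        new Dictionary<string, XslCompiledTransform>(StringComparer.OrdinalIgnoreCase);

    // Step 1: one stylesheet per company, registered from XSLT text.
    public void Register(string companyKey, string xsltText)
    {
        var xslt = new XslCompiledTransform();
        using (var reader = XmlReader.Create(new StringReader(xsltText)))
        {
            xslt.Load(reader);
        }
        _transforms[companyKey] = xslt;
    }

    // Steps 2 to 4: pick the stylesheet by company key and transform the
    // dump into the single standard XML format.
    public string ToStandardXml(string companyKey, string dumpXml)
    {
        if (!_transforms.TryGetValue(companyKey, out var xslt))
            throw new InvalidOperationException("No stylesheet registered for " + companyKey);

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(dumpXml)))
        using (var writer = XmlWriter.Create(output, xslt.OutputSettings))
        {
            xslt.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

With this shape, adding a company means one more `Register` call and one more stylesheet; the saving step (5) only ever sees the standard format.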

夜清冷一曲。 2024-10-15 01:16:55

Sounds to me like you are just asking for a design pattern (or set of patterns) that you could use to do this in a generic, future-proof manner, right?

Ideally, these are some of the properties you probably want:

  • Each "transformer" is decoupled from one another.
  • You can easily add new "transformers" without having to rewrite your main "driver" routine.
  • You don't need to recompile / redeploy your entire solution every time you modify a transformer, or at least add a new one.

Each "transformer" should ideally implement a common interface that your driver routine knows about - call it IXmlTransformer. The responsibility of this interface is to take in an XML file and return whatever object model / dataset you use to save to the database. Each of your transformers would implement this interface. For common logic that is shared by all transformers, you could either create a base class that they all inherit from or (my preferred choice) have a set of helper methods that any of them can call.
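As a rough sketch of that interface: only the name `IXmlTransformer` comes from the answer. `CustomerRecord`, `XmlHelpers`, `AcmeTransformer` and the sample XML layout (`cust` elements with `n` and `bal` attributes) are assumptions invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical canonical record that ends up in the database.
public sealed class CustomerRecord
{
    public string Name { get; set; }
    public decimal Balance { get; set; }
}

// The one contract the driver routine knows about.
public interface IXmlTransformer
{
    IReadOnlyList<CustomerRecord> Transform(XDocument dump);
}

// Shared logic lives in helper methods rather than in a base class.
public static class XmlHelpers
{
    public static decimal ParseAmount(string text) =>
        decimal.Parse(text.Trim(), CultureInfo.InvariantCulture);
}

// One transformer per company; this one assumes a made-up "acme" format.
public sealed class AcmeTransformer : IXmlTransformer
{
    public IReadOnlyList<CustomerRecord> Transform(XDocument dump) =>
        dump.Root.Elements("cust")
            .Select(e => new CustomerRecord
            {
                Name = (string)e.Attribute("n"),
                Balance = XmlHelpers.ParseAmount((string)e.Attribute("bal"))
            })
            .ToList();
}
```

The driver only ever sees `IXmlTransformer` and `CustomerRecord`, so a company-specific quirk stays inside that company's class.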

I would start by using a Factory to create each "transformer" from your main driver routine. The factory could use reflection to interrogate all the assemblies it can see for types that implement the interface, or you could use something like MEF, which could do a lot of the work for you. Your driver logic should use the factory to create all the transformers and store them.
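A reflection-based factory could be sketched like this. The interface is redeclared in a trimmed-down form (a single `CompanyName` property, which is an assumption) so that the block stands alone; `TransformerFactory` and the two sample transformers are likewise hypothetical. MEF would replace the hand-written scan, and is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Trimmed-down stand-in for the transformer contract.
public interface IXmlTransformer
{
    string CompanyName { get; }
}

public sealed class AcmeTransformer : IXmlTransformer
{
    public string CompanyName => "Acme";
}

public sealed class GlobexTransformer : IXmlTransformer
{
    public string CompanyName => "Globex";
}

public static class TransformerFactory
{
    // Scans the given assemblies for concrete IXmlTransformer types that
    // have a public parameterless constructor and creates one of each.
    public static IReadOnlyList<IXmlTransformer> CreateAll(params Assembly[] assemblies)
    {
        return assemblies
            .SelectMany(a => a.GetTypes())
            .Where(t => typeof(IXmlTransformer).IsAssignableFrom(t)
                        && t.IsClass
                        && !t.IsAbstract
                        && t.GetConstructor(Type.EmptyTypes) != null)
            .Select(t => (IXmlTransformer)Activator.CreateInstance(t))
            .ToList();
    }
}
```

A call such as `TransformerFactory.CreateAll(typeof(AcmeTransformer).Assembly)` picks up every transformer in that assembly, so dropping in a new class (or a new assembly passed to the factory) needs no change to the driver.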

Then you need some logic and a mechanism to map each XML file received to a given transformer - perhaps each XML file has a header or something similar that you could use to identify it. Again, you want to keep this decoupled from your main logic so that you can easily add new transformers without modifying the driver routine. You could, for example, supply the XML file to each transformer and ask it "can you transform this file?", leaving it up to each transformer to "take responsibility" for a given file.

Every time your driver routine gets a new XML file, it looks up the appropriate transformer, and runs it through; the result gets sent to the DB processing area. If no transformer can be found, you dump the file in a directory for interrogation later.
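The lookup-and-dispatch loop might be sketched as follows. `CanTransform` is one possible way to express the "can you transform this file?" question; the `Driver` class, the string-based result type, and the `save` / `quarantine` callbacks (standing in for the DB processing area and the holding directory) are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public interface IXmlTransformer
{
    // Each transformer decides for itself whether it owns a given file.
    bool CanTransform(XDocument dump);
    IReadOnlyList<string> Transform(XDocument dump);
}

// Hypothetical transformer that recognises dumps by their root element.
public sealed class AcmeTransformer : IXmlTransformer
{
    public bool CanTransform(XDocument dump) =>
        dump.Root != null && dump.Root.Name.LocalName == "acme";

    public IReadOnlyList<string> Transform(XDocument dump) =>
        dump.Root.Elements("cust").Select(e => (string)e.Attribute("n")).ToList();
}

public sealed class Driver
{
    private readonly IReadOnlyList<IXmlTransformer> _transformers;
    private readonly Action<IReadOnlyList<string>> _save;
    private readonly Action<XDocument> _quarantine;

    public Driver(IEnumerable<IXmlTransformer> transformers,
                  Action<IReadOnlyList<string>> save,
                  Action<XDocument> quarantine)
    {
        _transformers = transformers.ToList();
        _save = save;
        _quarantine = quarantine;
    }

    // Returns true when a transformer handled the dump, false when the
    // dump was set aside for later inspection.
    public bool Process(XDocument dump)
    {
        var transformer = _transformers.FirstOrDefault(t => t.CanTransform(dump));
        if (transformer == null)
        {
            _quarantine(dump);
            return false;
        }
        _save(transformer.Transform(dump));
        return true;
    }
}
```

Asking every transformer in turn is simple but linear in the number of companies; if that ever matters, the same interface could sit behind a lookup keyed on a header value.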

I would recommend reading a book like Agile Principles, Patterns and Practices by Robert Martin (http://www.amazon.co.uk/Agile-Principles-Patterns-Practices-C/dp/0131857258), which gives good examples of design patterns appropriate to situations like yours, such as Factory and DIP.

Hope that helps!

瑾兮 2024-10-15 01:16:55

The solution proposed by InSane is likely the most straightforward and definitely the most XML-friendly approach.

If you are looking to write your own code to convert the different data formats, then implement multiple reader entities, each of which reads data from one distinct format and transforms it to a unified format; your main code then works with these entities in a uniform way, i.e. by saving to the database.

Search for ETL (Extract-Transform-Load) to get more information: "What model/pattern should I use for handling multiple data sources?", http://en.wikipedia.org/wiki/Extract,_transform,_load

夜深人未静 2024-10-15 01:16:55

Using XSLT, as proposed in the currently most upvoted answer, just moves the problem from C# to XSLT.

You are still changing the pieces that process the XML, and you are still exposed to how well or poorly that code is structured, whether it is C# code or rules in the XSLT.

Regardless of whether you keep it in C# or go with XSLT for those bits, the key is to isolate the transformation of the XML you receive from the various companies into a single common format, whether that is an intermediate XML or a set of classes into which you load the data you are processing.

Whatever you do, avoid getting clever and trying to define your own generic transformation layer. If that is what you want, do use XSLT, since that is what it is for. If you go with C#, keep it simple: one transformation class per company, each implementing the simplest possible interface.

On the C# route, keep any reuse between the transformations to composition; do not even think of using inheritance for it. This is one of the areas where things get very ugly quickly if you go that way.
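To make the composition point concrete, here is a small sketch. Everything in it is assumed for illustration: the `Order` model, the `ICompanyTransformer` interface, the two company formats, and the `DateReader` component that both transformers hold as a field instead of inheriting from a shared base class.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical common model.
public sealed class Order
{
    public string Id { get; set; }
    public DateTime PlacedOn { get; set; }
}

// The simplest interface: one company-specific element in, one model out.
public interface ICompanyTransformer
{
    Order Transform(XElement source);
}

// Reusable piece of logic, shared by composition rather than inheritance.
public sealed class DateReader
{
    private readonly string _format;

    public DateReader(string format)
    {
        _format = format;
    }

    public DateTime Read(string text) =>
        DateTime.ParseExact(text, _format, CultureInfo.InvariantCulture);
}

public sealed class AcmeTransformer : ICompanyTransformer
{
    private readonly DateReader _dates = new DateReader("yyyy-MM-dd");

    public Order Transform(XElement source) => new Order
    {
        Id = (string)source.Attribute("id"),
        PlacedOn = _dates.Read((string)source.Attribute("date"))
    };
}

public sealed class GlobexTransformer : ICompanyTransformer
{
    private readonly DateReader _dates = new DateReader("dd/MM/yyyy");

    public Order Transform(XElement source) => new Order
    {
        Id = (string)source.Element("OrderNo"),
        PlacedOn = _dates.Read((string)source.Element("Placed"))
    };
}
```

Each transformer picks the shared components it needs and configures them itself, so a change for one company cannot ripple through a base class into the others.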

挥剑断情 2024-10-15 01:16:55

Have you considered BizTalk server?

笙痞 2024-10-15 01:16:55

Just playing the fence here and offering another solution for other readers.

The easiest way to get the data into your models within C# is to use XSLT to convert each company's data into a serialized form of your models. These are the basic steps I would take:

  1. Create a complete model of all your data and use XmlSerializer to write out the model.
  2. Create an XSLT that takes Company A's data and converts it into valid serialized XML for your model. Use the previously created XML file as a reference.
  3. Use Deserialize on the new XML you just created. You will now have a reference to your model object containing all the data from the company.
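The three steps could be sketched like this. The `Customer` and `CustomerBatch` model classes and the `ModelLoader` helper are assumptions for the example; the library calls are `XmlSerializer.Serialize`, `XslCompiledTransform.Transform` and `XmlSerializer.Deserialize`.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;
using System.Xml.Xsl;

// Hypothetical model classes.
public class Customer
{
    public string Name { get; set; }
}

[XmlRoot("CustomerBatch")]
public class CustomerBatch
{
    [XmlElement("Customer")]
    public List<Customer> Customers { get; set; } = new List<Customer>();
}

public static class ModelLoader
{
    // Step 1: serialize a sample model to get the reference XML that
    // each company's XSLT has to produce.
    public static string WriteReference(CustomerBatch sample)
    {
        var serializer = new XmlSerializer(typeof(CustomerBatch));
        var writer = new StringWriter();
        serializer.Serialize(writer, sample);
        return writer.ToString();
    }

    // Steps 2 and 3: run the company's XSLT, then deserialize the result
    // straight into the model.
    public static CustomerBatch Load(string xsltText, string companyXml)
    {
        var xslt = new XslCompiledTransform();
        using (var sheet = XmlReader.Create(new StringReader(xsltText)))
        {
            xslt.Load(sheet);
        }

        var buffer = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(companyXml)))
        using (var output = XmlWriter.Create(buffer, xslt.OutputSettings))
        {
            xslt.Transform(input, output);
        }

        var serializer = new XmlSerializer(typeof(CustomerBatch));
        using (var reader = new StringReader(buffer.ToString()))
        {
            return (CustomerBatch)serializer.Deserialize(reader);
        }
    }
}
```

Here the XSLT for a company has to emit a `CustomerBatch` root with `Customer` children, matching what `WriteReference` prints for a sample object.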