Design pattern for parsing binary file data and storing it in a database

Posted 2024-07-04 05:08:56 · 464 words · 14 views · 0 comments

Does anybody recommend a design pattern for taking a binary data file, parsing parts of it into objects and storing the resultant data into a database?

I think a similar pattern could be used for taking an XML or tab-delimited file and parsing it into its representative objects.

A common data structure would include:

(Header) (DataElement1) (DataElement1SubData1) (DataElement1SubData2) (DataElement2) (DataElement2SubData1) (DataElement2SubData2) (EOF)

I think a good design would include a way to change out the parsing definition based on the file type or some defined metadata included in the header. So a Factory Pattern would be part of the overall design for the Parser part.
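A minimal Python sketch of that idea, assuming a made-up layout: a 4-byte magic `DEMO`, a 1-byte format version, then a body of fixed-size elements. The names `register`, `parse_v1`, `parse_v2` and `parse_file` are hypothetical and only illustrate choosing the parser from header metadata.

```python
import struct
from typing import Callable, Dict, List, Tuple

# Hypothetical header: 4-byte magic followed by a 1-byte format version.
HEADER = struct.Struct("<4sB")

Parser = Callable[[bytes], List[Tuple[int, ...]]]
_PARSERS: Dict[int, Parser] = {}


def register(version: int) -> Callable[[Parser], Parser]:
    """Register a body parser for one format version."""
    def wrap(fn: Parser) -> Parser:
        _PARSERS[version] = fn
        return fn
    return wrap


@register(1)
def parse_v1(body: bytes) -> List[Tuple[int, ...]]:
    # Assumed v1 layout: each element is one unsigned 16-bit value.
    return list(struct.iter_unpack("<H", body))


@register(2)
def parse_v2(body: bytes) -> List[Tuple[int, ...]]:
    # Assumed v2 layout: unsigned 16-bit id plus unsigned 32-bit value.
    return list(struct.iter_unpack("<HI", body))


def parse_file(data: bytes) -> List[Tuple[int, ...]]:
    """Factory step: read the header, then dispatch to the matching parser."""
    magic, version = HEADER.unpack_from(data, 0)
    if magic != b"DEMO":
        raise ValueError("unexpected magic")
    try:
        parser = _PARSERS[version]
    except KeyError:
        raise ValueError(f"no parser for version {version}") from None
    return parser(data[HEADER.size:])
```

Supporting a new file type then means registering one more function; `parse_file` itself does not change.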

Comments (4)

迷离° 2024-07-11 05:08:56

The Strategy pattern may be one you want to look at, with the file parsing algorithm as the strategy.

Then you want a separate strategy for database insertion.
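One way this could look in Python, as a rough sketch: one interchangeable strategy for parsing and another for storage, joined by a small `import_file` function. The tab-delimited input, the `element` table and all class names are assumptions made for illustration.

```python
import csv
import io
import sqlite3
from typing import Iterable, Iterator, Protocol, Tuple

Record = Tuple[str, int]


class ParseStrategy(Protocol):
    def parse(self, raw: bytes) -> Iterable[Record]: ...


class StoreStrategy(Protocol):
    def store(self, records: Iterable[Record]) -> int: ...


class TabDelimitedParser:
    """Parsing strategy for 'name<TAB>value' lines (assumed format)."""

    def parse(self, raw: bytes) -> Iterator[Record]:
        reader = csv.reader(io.StringIO(raw.decode("utf-8")), delimiter="\t")
        for name, value in reader:
            yield name, int(value)


class SqliteStore:
    """Storage strategy that inserts records into a hypothetical table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS element (name TEXT, value INTEGER)"
        )

    def store(self, records: Iterable[Record]) -> int:
        rows = list(records)
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany("INSERT INTO element VALUES (?, ?)", rows)
        return len(rows)


def import_file(raw: bytes, parser: ParseStrategy, store: StoreStrategy) -> int:
    """Run whichever parsing and storage strategies the caller supplies."""
    return store.store(parser.parse(raw))
```

Usage might be `import_file(raw, TabDelimitedParser(), SqliteStore(sqlite3.connect(":memory:")))`; swapping in a binary parser or a different database leaves `import_file` untouched.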

装迷糊 2024-07-11 05:08:56

Use Lex and YACC. Unless you devote the next ten years exclusively to this subject, they will produce better and faster code every time.

超可爱的懒熊 2024-07-11 05:08:56

I fully agree with Orion Edwards, and it is usually the way I approach the problem; but lately I've been starting to see some patterns(!) to the madness.

For more complex tasks I usually use something like an interpreter (or a strategy) that uses some builder (or factory) to create each part of the data.

For streaming data, the entire parser would look something like an adapter, adapting from a stream object to an object stream (which usually is just a queue).
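As a sketch of that adapter idea in Python, a generator can stand in for the queue: it takes a byte stream and yields one record at a time. The framing used here (a 2-byte little-endian length prefix before each payload) is an assumption for the example, not something from the question.

```python
import io
import struct
from typing import BinaryIO, Iterator


def records(stream: BinaryIO) -> Iterator[bytes]:
    """Adapt a byte stream into a stream of record payloads."""
    while True:
        prefix = stream.read(2)
        if not prefix:
            return  # clean end of stream
        if len(prefix) < 2:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack("<H", prefix)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record")
        yield payload


# Example: two records, b"abc" and b"hi", read from an in-memory stream.
demo = io.BytesIO(b"\x03\x00abc\x02\x00hi")
for payload in records(demo):
    print(payload)
```

This assumes a buffered stream where `read(n)` returns fewer than `n` bytes only at end of data; a raw socket would need a loop that reads until the record is complete.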

For your example there would probably be one builder for the complete data structure (from head to EOF) which internally uses builders for the internal data elements (fed by the interpreter). Once the EOF is encountered an object would be emitted.
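A rough Python sketch of the nested-builder arrangement, assuming the interpreter has already turned the raw file into `(kind, value)` events. The event names `header`, `element`, `sub` and `eof`, and every class name, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Element:
    name: str
    sub_data: Tuple[str, ...]


@dataclass(frozen=True)
class Document:
    header: str
    elements: Tuple[Element, ...]


class ElementBuilder:
    """Collects the sub-data belonging to one data element."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._sub: List[str] = []

    def add_sub_data(self, value: str) -> None:
        self._sub.append(value)

    def build(self) -> Element:
        return Element(self._name, tuple(self._sub))


class DocumentBuilder:
    """Outer builder: consumes events and emits a Document at EOF."""

    def __init__(self) -> None:
        self._header: Optional[str] = None
        self._done: List[Element] = []
        self._current: Optional[ElementBuilder] = None

    def _flush(self) -> None:
        # Finish the element in progress, if there is one.
        if self._current is not None:
            self._done.append(self._current.build())
            self._current = None

    def feed(self, kind: str, value: str = "") -> Optional[Document]:
        if kind == "header":
            self._header = value
        elif kind == "element":
            self._flush()
            self._current = ElementBuilder(value)
        elif kind == "sub":
            if self._current is None:
                raise ValueError("sub-data before any element")
            self._current.add_sub_data(value)
        elif kind == "eof":
            self._flush()
            if self._header is None:
                raise ValueError("missing header")
            return Document(self._header, tuple(self._done))
        else:
            raise ValueError(f"unknown event {kind!r}")
        return None
```

`feed` returns `None` for every event except `eof`, at which point the finished, immutable `Document` is handed back.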

However, creating the objects in a switch statement in some factory function is probably the simplest way for many lesser tasks. Also, I like keeping my data objects immutable, as you never know when someone shoves concurrency down your throat :)
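For the simple case, a sketch in Python might be a single factory function that branches on a tag and returns frozen dataclasses. The tag values and the `Temperature` and `Label` types are invented for the example.

```python
import struct
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Temperature:
    celsius: float


@dataclass(frozen=True)
class Label:
    text: str


def make_element(tag: int, payload: bytes) -> Union[Temperature, Label]:
    """Factory function: pick the object type from a (hypothetical) tag."""
    if tag == 1:
        (value,) = struct.unpack("<f", payload)  # 32-bit little-endian float
        return Temperature(value)
    if tag == 2:
        return Label(payload.decode("ascii"))
    raise ValueError(f"unknown tag {tag}")
```

Because the dataclasses are declared with `frozen=True`, assigning to a field after construction raises `dataclasses.FrozenInstanceError`, which keeps the parsed objects safe to share between threads.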

水染的天色ゝ 2024-07-11 05:08:56
  1. Write your file parser, using whatever techniques come to mind.
  2. Write lots of unit tests to make sure all your edge cases are covered. Once you've done this, you will actually have a reasonable idea of the problem/solution. Right now you just have theories floating around in your head, most of which will turn out to be misguided.
  3. Refactor mercilessly. Your aim should be to delete about half of your code.

You'll find that your code at the end will either resemble an existing design pattern, or you'll have created a new one. You'll then be qualified to answer this question :-)
