Refactoring nested for loops

Posted on 2024-12-18 15:07:29

I have a parent-child relationship between two sets of data: a collection of parent documents and a collection of child documents. The requirement is that the parents and their corresponding children be exported into a single PDF document. A simple implementation of this could look as follows (Java-ish pseudocode below):

for (Document parentDocument : Documents) {
    ExportToPdf(parentDocument);
    for (Document childDocument : parentDocument.children()) {
        AppendToParentPdf(childDocument);
    }
}

Something like the above will probably solve the problem, but then the requirements suddenly change: each parent and each of its children now needs to be in a separate PDF. So the snippet above is modified by changing AppendToParentPdf() to ExportToPdf(), as follows:

for (Document parentDocument : Documents) {
    ExportToPdf(parentDocument);
    for (Document childDocument : parentDocument.children()) {
        ExportToPdf(childDocument);
    }
}

Going along this way, it will not take long before this seemingly trivial snippet suffers from some serious code smells.

My questions to SO are:

  1. Are there better representations of parent-child relationships such as the above, so that instead of brute-forcing my way through all the documents and their children in an O(n^2) fashion, I can use a different data structure or technique to traverse the entire structure more efficiently?

  2. In the scenario I described above, where the business rules are rather fluid about how the PDFs should be exported, is there a smarter way to code the export function? The export format is also transient: PDF can give way to DOC, CSV, XML and so on.

It would be great to get some perspective on this.

Thanks

Comments (6)

渡你暖光 2024-12-25 15:07:29

"Are there better representations of parent-child relationships such as the above where instead of brute-forcing my way through all the documents and their children in an O(n^2) fashion ..."

This is not O(N^2). It is O(N) where N is the total number of parent and child documents. Assuming that no child has more than one parent document, then you can't significantly improve the performance. Furthermore, the cost of the traversal is probably trivial compared with the cost of the calls that generate the PDFs.
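The O(N) claim can be checked by counting how often the loop bodies run. The following is a minimal sketch, not the asker's code: `Doc` and `TraversalCountDemo` are hypothetical names, and it assumes every child belongs to exactly one parent.

```java
import java.util.List;

// Hypothetical stand-in for the question's Document class (assumption).
final class Doc {
    private final String name;
    private final List<Doc> children;

    Doc(String name, List<Doc> children) {
        this.name = name;
        this.children = children;
    }

    String name() { return name; }
    List<Doc> children() { return children; }
}

class TraversalCountDemo {
    // Counts loop-body executions: one per parent plus one per child.
    static int countVisits(List<Doc> parents) {
        int visits = 0;
        for (Doc parent : parents) {
            visits++; // where ExportToPdf(parent) would be called
            for (Doc child : parent.children()) {
                visits++; // where ExportToPdf(child) would be called
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        List<Doc> parents = List.of(
            new Doc("p1", List.of(new Doc("c1", List.of()), new Doc("c2", List.of()))),
            new Doc("p2", List.of(new Doc("c3", List.of()))));
        // 2 parents + 3 children = 5 visits, not 2 * 3 or 5 * 5
        System.out.println(countVisits(parents)); // prints 5
    }
}
```

Each document is touched exactly once, so the work grows linearly with the total number of documents even though the loops are nested.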

The only case where you might want to consider optimizing is if child documents can be children of multiple parents. In that case, you may want to keep track of the documents for which you have already generated PDFs, and skip them if you revisit them in the traversal. The test for "have I seen this document before" can be implemented using a HashSet.

帝王念 2024-12-25 15:07:29

You could encapsulate what you want to do with a document in a handler. This will also allow you to define new handlers in the future that you can pass to existing code.

interface DocumentHandler {
    void process(Document d);
}

class ExportToPdf implements DocumentHandler { ... }
class AppendToParentPdf implements DocumentHandler { ... }

// Now you're just passing the interface whose implementation does something with the document
void handleDocument(DocumentHandler parentHandler, DocumentHandler childHandler) {
    for(Document parent : documents) {
        parentHandler.process(parent);

        for(Document child : parent.children()) {
            childHandler.process(child);
        }
    }
}

DocumentHandler appendToParent = new AppendToParentPdf();
DocumentHandler exportToPdf = new ExportToPdf();

// pass the child/parent handlers as needed
handleDocument(exportToPdf, appendToParent);
handleDocument(exportToPdf, exportToPdf);

As for efficiency, I'd say don't try to optimise unless you run into performance issues. In any case, the problem won't be with the nested loop but with the logic itself that processes the documents.

烟酉 2024-12-25 15:07:29

For your 2nd question, you could use the provider pattern or an extension of it.

Provider pattern: this pattern has its roots in the Strategy pattern. It lets you design your data and behavior behind an abstraction so that you can swap out the implementation at any time.
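One way to read "swap out the implementation at any time" is a small registry that looks up the export behavior by format name. This is only a sketch under assumptions: `ExportProviderDemo`, the format keys, and the use of plain strings in place of real documents and files are all hypothetical and not part of any existing API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ExportProviderDemo {
    // Maps a format key (e.g. "xml") to the behavior that renders a document.
    // Documents are modelled as plain strings to keep the sketch short (assumption).
    private final Map<String, Function<String, String>> providers = new HashMap<>();

    void register(String format, Function<String, String> exporter) {
        providers.put(format, exporter);
    }

    String export(String format, String documentText) {
        Function<String, String> exporter = providers.get(format);
        if (exporter == null) {
            throw new IllegalArgumentException("No exporter registered for " + format);
        }
        return exporter.apply(documentText);
    }

    public static void main(String[] args) {
        ExportProviderDemo registry = new ExportProviderDemo();
        registry.register("xml", text -> "<doc>" + text + "</doc>");
        registry.register("csv", text -> "\"" + text + "\"");
        System.out.println(registry.export("xml", "hello")); // prints <doc>hello</doc>
    }
}
```

With this shape, supporting a new format means registering one more entry; the code that walks the parents and children does not need to change.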

Smile简单爱 2024-12-25 15:07:29

I tried to fit this into a comment, but there's too much to say...

I don't see how the change you're talking about is a code smell. If the requirements change for this simple function, then they change. If you only need to make the change in one place, then it sounds like you've done a good job. If your client is going to need both ways of doing it (or more), then you might consider some sort of strategy pattern so you don't have to rewrite the surrounding code to do either function.

If you're making dozens of these changes per week, then it could get messy and you probably ought to make a plan for how to more effectively deal with a very busy axis of change. Otherwise, discipline and refactoring can help you keep it clean.

As to whether or not n² is a problem, it depends. How big is n? If you have to do this frequently (i.e. dozens of times per hour) and n is in the thousands, then you might have a problem. Otherwise I wouldn't sweat it as long as you're meeting or exceeding demand and your CPU/disk utilization is out of the danger zone.

硪扪都還晓 2024-12-25 15:07:29

The second question can be solved by simply creating an interface Exporter with the method export(Document doc), and then implementing it for each of the various formats, e.g. class DocExporterImpl implements Exporter.
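A rough sketch of that Exporter idea follows. It makes several assumptions: the `Document` class here is a hypothetical stand-in with only a title, `export` returns the rendered text instead of writing a file, and `CsvExporterImpl`, `XmlExporterImpl` and `ExporterDemo` are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the question's Document class (assumption).
final class Document {
    private final String title;
    private final List<Document> children;

    Document(String title, List<Document> children) {
        this.title = title;
        this.children = children;
    }

    String title() { return title; }
    List<Document> children() { return children; }
}

interface Exporter {
    // Returns the rendered text; a real exporter would probably write a file instead.
    String export(Document doc);
}

class CsvExporterImpl implements Exporter {
    @Override
    public String export(Document doc) {
        return "title\n" + doc.title();
    }
}

class XmlExporterImpl implements Exporter {
    @Override
    public String export(Document doc) {
        // No XML escaping here; the sketch assumes plain titles.
        return "<document title=\"" + doc.title() + "\"/>";
    }
}

class ExporterDemo {
    // The traversal stays the same; only the Exporter passed in changes.
    static List<String> exportAll(List<Document> parents, Exporter exporter) {
        List<String> outputs = new ArrayList<>();
        for (Document parent : parents) {
            outputs.add(exporter.export(parent));
            for (Document child : parent.children()) {
                outputs.add(exporter.export(child));
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        Document child = new Document("child", List.of());
        Document parent = new Document("parent", List.of(child));
        for (String out : exportAll(List.of(parent), new XmlExporterImpl())) {
            System.out.println(out);
        }
    }
}
```

Switching from XML to CSV is then a matter of passing `new CsvExporterImpl()` to `exportAll`.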

The first one is dependent on your requirements and no design pattern as such solves these problems. Can't help you there.

吃不饱 2024-12-25 15:07:29

Using a Set to keep track of which elements have already been exported might not be the most beautiful solution, but it will prevent the documents from being exported twice.

import java.util.HashSet;
import java.util.Set;

Set<Document> alreadyExported = new HashSet<>();

for (Document parentDocument : Documents) {
    ExportToPdf(parentDocument);
    for (Document childDocument : parentDocument.children()) {
        // add() returns false if the child is already in the set,
        // so each child is exported at most once
        if (alreadyExported.add(childDocument)) {
            ExportToPdf(childDocument);
        }
    }
}