矩阵代数设计分解
我正在考虑重构一些非常复杂的代码,这是我正在工作的项目的子系统。 我对这段代码的部分检查是,它非常复杂,并且包含大量输入、中间值和输出,具体取决于某些核心业务逻辑。
我想重新设计这段代码,使其更易于维护并且执行速度更快,因此首先我一直在尝试查看每个参数及其相互依赖关系。 这导致了一个相当大且混乱的图表,我想要一种简化该图表的机制。
不久前,我在一本关于 SOA 设计的书中发现了一种名为“矩阵设计分解”的技术,该技术使用输出矩阵及其对输入的依赖关系,应用某种形式的矩阵代数,并可以为这些依赖关系生成业务流程图。
我知道 http://www.designdecomposition.com/ 上有一个网络工具,但它是有限的您可以拥有的输入/输出依赖项的数量。 我尝试四处寻找这个工具的算法源(这样我就可以尝试自己实现它而不受大小限制),但是我没有运气。
有人知道我可以使用类似的技术吗? 目前我什至正在考虑采用依赖矩阵并应用一些遗传算法来看看进化是否可以提出更简单的工作流程...
干杯,
Aidos
编辑:
我将解释动机:
原始代码是为计算所有内容的系统编写的每次用户执行操作(添加、删除或修改项目的某些属性)时的值(大约 60)。 这段代码是十多年前编写的,并且明显显示出老化的迹象 - 其他人在系统中添加了更复杂的计算,现在我们得到了完全不合理的性能(在控制权返回给用户之前最多 2 分钟)。 已决定将计算与用户操作分离,并提供一个按钮来“重新计算”值。
我的问题出现是因为正在进行如此多的计算,并且它们基于这样的假设:所有所需的数据都可用于计算 - 现在,当我尝试重新实现计算时,我不断遇到问题,因为我没有没有得到此计算所依赖的不同计算的结果。
这就是我想使用矩阵分解方法的地方。 MD 方法允许我指定所有输入和输出,并为我提供可用于生成所有输出的“最简单”工作流程。
然后,我可以使用这个“工作流程”来了解我需要执行的计算的优先级,以获得相同的结果而不产生任何异常。 它还向我展示了计算系统的哪些部分可以并行化,以及分叉点和连接点在哪里(我暂时还不用担心那部分)。 目前我所拥有的只是一个非常大的矩阵,其中显示了很多依赖项,但不知道从哪里开始。
我将从我的评论中详细说明一下:
我不想在实际程序中使用 EA 流程中的解决方案。 我想采用依赖矩阵并将其分解为模块,然后我将手动编码 - 这纯粹是一种设计辅助 - 我只是对这些模块的输入/输出感兴趣。 基本上是这些计算之间复杂的相互依赖性的表示,以及一些优先级的想法。
假设我有 A 需要 B 和 C。D 需要 A 和 E。F 需要 B、A 和 E,我想有效地将问题空间从一组复杂的依赖项划分为一个“工作流程”,我可以检查该工作流程以获得更好的结果理解。 一旦我有了这种理解,我就可以想出一个更好的设计/实现,而且仍然是人类可读的,所以对于这个例子,我知道我需要计算 A,然后是 C,然后是 D,然后是 F。--
我
知道这看起来有点像奇怪的是,如果你看看我在基于矩阵的分解之前链接到的网站,应该会让你对我的想法有一些了解......
I am looking at refactoring some very complex code which is a subsystem of a project I have at work. Part of my examination of this code is that it is incredibly complex, and contains a lot of inputs, intermediate values and outputs depending on some core business logic.
I want to redesign this code to be easier to maintain as well as executing a hell of a lot faster, so to start off with I have been trying to look at each of the parameters and their dependencies on each other. This has lead to quite a large and tangled graph and I would like a mechanism for simplifying this graph.
A while back I came across a technique in a book about SOA design called "Matrix Design Decomposition" which uses a matrix of outputs and the dependencies they have on the inputs, applies some form of matrix algebra and can generate Business Process diagrams for those dependencies.
I know there is a web tool available at http://www.designdecomposition.com/ however it is limited in the number of input/output dependencies you can have. I have tried looking around for the algorithmic source for this tool (so I could attempt to implement it myself without the size limitation), however I have had no luck.
Does anybody know a similar technique that I could use? Currently I am even considering taking the dependency matrix and applying some Genetic Algorithms to see if evolution can come up with a simpler workflow...
Cheers,
Aidos
EDIT:
I will explain the motivation:
The original code was written for a system which computed all of the values (about 60) every time the user performed an operation (adding, removing or modifying certain properties of a item). This code was written over ten years ago and is definitely showing signs of age - others have added more complex calculations into the system and now we are getting completely unreasonable performance (up to 2 minutes before control is returned to the user). It has been decided to detach the calculations from the user actions and provide a button to "recalculate" the values.
My problem arises because there are so many calculations that are going on and they are based on the assumption that all of the required data will be available for their computation - now when I try to re-implement the calculations I keep encountering problems because I haven't got the result for a different calculation that this calculation relies on.
This is where I want to use the matrix-decomposition approach. The MD approach allows me to specify all of the inputs and outputs and gives me the "simplest" workflow that I can use for generating all of the outputs.
I can then use this "workflow" to know the precedence of the calculations I need to perform to get the same result without generating any exceptions. It also shows me which parts of the calculation system I can parallelise and where the fork and join points will be (I won't worry about that part just yet). At the moment all I have is an insanely large matrix with lots of dependencies showing in it, with no idea where to start.
I will elaborate from my comment a little more:
I don't want to use the solution from the EA process in the actual program. I want to take the dependency matrix and decompose it into modules that I will then code manually - this is purely a design aid - I am just interested in what the inputs/outputs for these modules will be. Basically a representation of the complex interdependencies between these calculations, as well as some idea of precedence.
Say I have A requires B and C. D requires A and E. F requires B, A and E, I want to effectively partition the problem space from a complex set of dependencies into a "workflow" that I can examine to get a better understanding. Once I have this understanding I can come up with a better design / implementation that is still human readable, so for the example I know I need to calculate A, then C, then D, then F.
--
I know this seems kind of strange, if you take a look at the website I linked to before the matrix based decomposition there should give you some understanding of what I am thinking of...
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
kquinn,如果我认为他指的是一段代码(我曾经在那里工作过),那么它已经是一个黑盒解决方案,没有人可以理解。 他并不想让事情变得更复杂,事实上更简单。 他想要实现的是一整堆相互关联的计算。
目前发生的情况是,每当发生任何变化时,都会发生大量事件,导致大量计算启动,进而导致更多事件持续发生,直到最终达到平衡状态。
我认为他想做的是找到那些外围计算的依赖关系,并从那里开始工作,这样它们就可以被重写,并找到一种方法让计算发生,而不是因为它们需要这样做。
关于简化图表,我无法提供太多建议,因为不幸的是,我在这方面没有太多经验。也就是说,我将开始寻找那些没有依赖性的外围计算,然后从那里遍历图表。 开始构建一个新的框架,其中以最简单的方式包含每个计算的核心业务逻辑,并在此过程中重构其中的废话。
kquinn, If it's the piece of code I think he's referring to (I used to work there), it's already a black box solution that no human can understand as is. He's not looking to make it more complicated, less in fact. What he's trying to achieve is a whole heap of interlinked calculations.
What currently happens, is that whenever anything changes, it's an avalanche of events which cause a whole bunch of calculations to fire off, which in turn causes a whole bunch more events which continues on until finally it reaches a state of equilibrium.
What I assume he wants to do is find the dependencies for those outlying calculations and work in from there so they can be rewritten and find a way for the calculations from happening for the sake of it, rather than because they need to.
I can't offer much advice in regards to simplifying the graph, as unfortunately it's not something I have much experience in. That said, I would start looking for those outlying calculations which have no dependencies, and just traverse the graph from there. Start building up a new framework that includes the core business logic of each calculation in the simplest possible way, and refactor the crap out of it along the way.
如果正如您所说,这是“核心业务逻辑”,那么您真的不想使用花哨的分解和进化算法来产生世界上没有人理解或有能力的“黑匣子”解决方案修改。 如果这些技术中的任何一个实际上产生了任何有用的结果,我会感到非常惊讶; 人脑在理清复杂关系方面仍然比任何机器都更有能力。
您要做的是传统的重构:清理各个过程,简化它们并在可能的情况下合并它们。 您的目标是使代码清晰,这样您的继任者就不必经历相同的过程。
If this is, as you say, "core business logic", then you really don't want to be screwing around with fancy decompositions and evolutionary algorithms that produce a "black box" solution that no one in the world understands or is capable of modifying. I would be very surprised if any of these techniques actually yielded any useful result; the human brain is still incomprehensibly more capable than any machine at untangling complicated relationships.
What you want to do is traditional refactoring: clean up the individual procedures, streamlining them and merging them where possible. Your goal is to make the code clear, so your successor doesn't have to go through the same process.
您使用什么语言?
您的问题应该很容易使用 Java Executors 和 Future<> 进行建模。 任务,但类似的框架也许也可以在您选择的平台上使用?
另外,如果我理解正确的话,您想要为大量相互依赖的计算生成一条关键路径——这是动态完成的事情,还是您“只是”需要静态分析?
关于算法解决方案; 拿起最接近的数值分析教科书,刷新您对奇异值分解和 LU 分解的记忆; 我从头到尾猜测这就是您链接到的工具背后的原因。
编辑:由于您使用的是 Java,我将给出建议的简要概述:
-> 使用线程池执行器轻松并行所有计算
-> 使用 Future<> 的对象映射解决相互依赖关系 或 FutureTask<>:s,即,如果变量是 A、B 和 C,其中 A = B + C,请执行以下操作:
现在,如果我还没有完全关闭(而且我很可能是这样)自从我从事 Java 工作以来,您应该能够通过调整线程池大小或使用不断增长的线程池来解决明显的死锁问题。 (不过,您仍然必须确保不存在相互依赖的任务,例如 A = B + C,且 B = A + 1...)
What language are you using?
Your problem should be pretty easy to model using Java Executors and Future<> tasks, but a similar framework is perhaps availabe on your chosen platform as well?
Also, if I understand this correctly, you want to generate a critical path for a large set of interdependent calculations -- is that something done dynamically, or do you "just" need a static analysis?
Regarding an algorithmic solution; pick up the closest copy of your numerical analysis textbook and refresh your memory on singular value decompositions and LU factorization; I'm guessing from the top off my head that this is what lies behind the tool you linked to.
EDIT: Since you're using Java, I'll give a brief outline of a suggestion proposal:
-> Use a threadpool executor to parallellize all calculations easily
-> Solve interdependencies with an object map of Future<> or FutureTask<>:s, i.e. if you variables are A, B and C, where A = B + C, do something like this:
Now, if I'm not totally off (and I may very well be, it was a while since I worked in Java), you should be able to solve the apparent deadlock problem by tuning the thread pool size, or use a growing thread pool. (You still have to make sure that there are no interdependent tasks though, such as if A = B + C, and B = A + 1...)
如果黑盒是线性的,您可以通过简单地连接许多输入向量和许多输出向量来发现所有系数。
您有输入 x[i] 和输出 y[i],然后创建一个矩阵 Y,其列为 y[0]、y[1]、... y[n],以及一个矩阵 X,其列为 x[ 0]、x[1]、...、x[n]。 会有一个变换 Y = T * X,那么你可以确定 T = Y * inverse(X)。
但既然你说它很复杂,我敢打赌它不是线性的。 然后,如果您仍然想要一个通用框架,您可以使用因子图
https://ieeexplore.ieee。 org/document/910572
我很好奇你是否能做到这一点。
我认为更容易的是理解代码并使用最佳实践重写它。
If the black-box is linear you can discover all the coefficients by simply concatenating many vectors of input and many vectors of output.
you have input x[i] and output y[i], then you create a matrix Y whose columns are y[0], y[1], ... y[n], and a matrix X whose columns are x[0], x[1], ..., x[n]. There will be a transformation Y = T * X, then you may determine T = Y * inverse(X).
But since you said it is complex I bet it is not linear. Then if you still want a general framework you can use this a factor-graph
https://ieeexplore.ieee.org/document/910572
I would be curious if you can do this.
What I think is easier is to understand the code and rewrite it using the best practices.