Keeping code readable while optimizing
I am writing a scientific program in Python and C that contains some complex physical simulation algorithms. After implementing an algorithm, I found that there are many possible optimizations to improve performance. Common ones are precalculating values, hoisting calculations out of loops, replacing simple matrix algorithms with more complex ones, and so on. But a problem arises. The unoptimized algorithm is much slower, but its logic and its connection to the theory are much clearer and more readable. It is also harder to extend and modify the optimized algorithm.
So the question is: what techniques should I use to keep the code readable while improving performance? Right now I am trying to keep both a fast branch and a clear branch and develop them in parallel, but maybe there are better methods?
Just as a general remark (I'm not too familiar with Python): I would suggest you make sure that you can easily exchange the slow parts of the 'reference implementation' with the 'optimized' parts (e.g., use something like the Strategy pattern).
This will allow you to cross-validate the results of the more sophisticated algorithms (to ensure you did not mess up the results), and will keep the overall structure of the simulation algorithm clear (separation of concerns). You can place the optimized algorithms into separate source files / folders / packages and document them separately, in as much detail as necessary.
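As a rough illustration of the swap-and-cross-validate idea above, here is a minimal Python sketch. The physics (pairwise energy of point charges on a line) and every name in it (`energy_reference`, `energy_optimized`, `simulate`, `cross_validate`) are hypothetical stand-ins chosen for the example, not anything from the original program.

```python
import math
from typing import Callable, Sequence

# Hypothetical example problem: E = sum over pairs i<j of q_i * q_j / |x_i - x_j|
EnergyFn = Callable[[Sequence[float], Sequence[float]], float]


def energy_reference(x: Sequence[float], q: Sequence[float]) -> float:
    """Reference strategy: loops over every ordered pair and halves each term."""
    total = 0.0
    n = len(x)
    for i in range(n):
        for j in range(n):
            if i != j:
                total += 0.5 * q[i] * q[j] / abs(x[i] - x[j])
    return total


def energy_optimized(x: Sequence[float], q: Sequence[float]) -> float:
    """Optimized strategy: visits each unordered pair once, hoists x[i] and q[i]."""
    total = 0.0
    n = len(x)
    for i in range(n - 1):
        xi, qi = x[i], q[i]
        for j in range(i + 1, n):
            total += qi * q[j] / abs(xi - x[j])
    return total


def simulate(x: Sequence[float], q: Sequence[float],
             energy: EnergyFn = energy_reference) -> float:
    """The high-level algorithm depends only on the strategy's interface."""
    return energy(x, q)


def cross_validate(fast: EnergyFn, slow: EnergyFn,
                   x: Sequence[float], q: Sequence[float],
                   rel_tol: float = 1e-9) -> bool:
    """Check that the optimized strategy agrees with the reference one."""
    return math.isclose(fast(x, q), slow(x, q), rel_tol=rel_tol)
```

Calling `simulate(x, q, energy=energy_optimized)` switches implementations without touching the simulation code, and `cross_validate` can run in the test suite on small inputs where the slow version is still affordable.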
Apart from this, try to avoid the usual traps: don't do premature optimization (check if it is actually worth it, e.g. with a profiler), and don't re-invent the wheel (look for available libraries).
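To make the "check with a profiler" advice concrete, here is a small sketch using Python's standard `cProfile` and `pstats` modules. The functions `slow_step` and `run_simulation` are made-up placeholders for real simulation code, and `profile` is just a hypothetical convenience wrapper.

```python
import cProfile
import io
import pstats


def slow_step(n):
    """Placeholder for an expensive inner computation."""
    total = 0.0
    for i in range(n):
        total += (i % 7) ** 0.5
    return total


def run_simulation(steps=50, n=2000):
    """Placeholder for the top-level simulation loop."""
    return sum(slow_step(n) for _ in range(steps))


def profile(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the costliest call paths come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

Printing the report from `profile(run_simulation)` shows which functions dominate the run time, which is the evidence to gather before deciding that any particular optimization is worth its readability cost.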
Yours is a very good question that arises in almost every piece of code, however simple or complex, that's written by any programmer who wants to call himself a pro.
I try to remember and keep in mind that a reader newly come to my code has pretty much the same crude view of the problem and the same straightforward (maybe brute force) approach that I originally had. Then, as I get a deeper understanding of the problem and paths to the solution become clearer, I try to write comments that reflect that better understanding. I sometimes succeed and those comments help readers and, especially, they help me when I come back to the code six weeks later. My style is to write plenty of comments anyway and, when I don't (because: a sudden insight gets me excited; I want to see it run; my brain is fried), I almost always greatly regret it later.
It would be great if I could maintain two parallel code streams: the naïve way and the more sophisticated optimized way. But I have never succeeded in that.
To me, the bottom line is that if I can write clear, complete, succinct, accurate and up-to-date comments, that's about the best I can do.
Just one more thing that you know already: optimization usually doesn't mean shoehorning a ton of code onto one source line, perhaps by calling a function whose argument is another function whose argument is another function whose argument is yet another function. I know that some do this to avoid storing a function's value temporarily. But it does very little (usually nothing) to speed up the code and it's a bitch to follow. No news to you, I know.
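A tiny, made-up illustration of that last point: both functions below compute the same speed from velocity components, one as a single nested expression and one with named intermediate values. The function names are hypothetical, and no timing claim is made here beyond what the paragraph above already says.

```python
import math


def speed_nested(vx, vy, vz):
    # Everything on one line: a function of a function of a function.
    return math.sqrt(sum(map(lambda c: c * c, (vx, vy, vz))))


def speed_named(vx, vy, vz):
    # Same computation with named intermediate values.
    squared_components = [c * c for c in (vx, vy, vz)]
    speed_squared = sum(squared_components)
    return math.sqrt(speed_squared)
```

The named version gives each step a label that can be matched against the formula in the theory and inspected in a debugger.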
It is common to assume you must give up readability to get performance.
That's not necessarily so.
You need to find out what exactly the program is spending much of its time doing, and why.
Notice, I didn't say you need to do any measuring.
Here's an example of what I mean.
Chances are very good that you can make some simple changes to avoid wasted motion, but don't fix anything until the program itself has told you what to fix.
Also see http://en.wikipedia.org/wiki/Literate_programming#Example
I've also heard that this is what http://en.wikipedia.org/wiki/Aspect-oriented_programming attempts to help with, though I haven't really looked into it. (It just seems to be a fancy way of saying "put your optimizations and your debugs and your other junk outside of the function you're writing".)
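This is not aspect-oriented programming proper, but a Python decorator gives a loosely similar separation: the optimization lives outside the function body. The sketch below assumes a hypothetical `legendre` function written straight from the standard three-term recurrence, with precalculation added by the standard library's `functools.lru_cache`.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # the optimization: remember already-computed values
def legendre(n, x):
    """Legendre polynomial P_n(x) from the recurrence
    n * P_n = (2n - 1) * x * P_{n-1} - (n - 1) * P_{n-2}."""
    if n == 0:
        return 1.0
    if n == 1:
        return x
    return ((2 * n - 1) * x * legendre(n - 1, x)
            - (n - 1) * legendre(n - 2, x)) / n
```

The body still reads like the recurrence from the textbook; removing the decorator line gives back the unoptimized version for comparison.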