Good strategies for developing throwaway code?

Posted 2024-08-03 07:29:28

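I frequently write throwaway code (in a research environment), for example to explore an algorithm or a model for a scientific property or process. Many of these "experiments" are one-offs, but sometimes I find that I need to use a few of them later. For example, I have just unearthed string-matching code I wrote 7 years ago (stopped because of other priorities) which is now valuable for a coworker's project. Having looked at it (did I really write such impenetrable code?) I realise there are some things I could have done back then to help me when I restarted the "project" ("experiment" is still a better word). The earlier experiment "worked", but I know that at the time I would not have had time to refactor, as my priorities lay elsewhere.

What approaches are cost-effective for digging up and reusing such work?

EDIT: I have answered my own question (below) because there are issues beyond the actual source itself.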

起风了 2024-08-10 07:29:28

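I disagree with all of the "write comments" answers. Comments are being offered as a remedy for code that is not understandable in itself.

Get yourself a copy of Code Complete (Steve McConnell, 2nd edition). If you learn the techniques for writing maintainable code in the first place, it won't take you any more time, and you'll be able to return to your work later with far less trouble.

Which would you prefer:

Cryptic code with comments?
Mostly OK code without?

I strongly prefer the latter. In the situations where the cryptic code turns out to be uncommented, the OK code is still easy to understand, and comments are one more place where the original developer can make mistakes. The code may be buggy, but it is never wrong about what it actually does.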
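To make the comparison concrete, here is a small illustrative contrast in Python. Both functions are invented for this example (they are not from the original answer) and compute the same thing: the fraction of positions at which two equal-length strings agree.

```python
# Hypothetical example: two versions of the same routine.


# Version 1: cryptic code that leans on a comment.
def f(a, b):
    # n = matches, returns ratio (nothing stops this comment going stale)
    n = 0
    for i in range(len(a)):
        if a[i] == b[i]:
            n += 1
    return n / len(a)


# Version 2: mostly OK code; the names carry the intent.
def fraction_of_matching_positions(first, second):
    if len(first) != len(second):
        raise ValueError("strings must have the same length")
    if not first:
        raise ValueError("strings must be non-empty")
    matching = sum(1 for x, y in zip(first, second) if x == y)
    return matching / len(first)
```

If the comment in the first version were lost or became outdated, a reader would have to reverse-engineer the loop. The second version stays readable without it, and it also makes the failure cases explicit instead of leaving them to an IndexError or ZeroDivisionError.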

Once you're comfortable with Code Complete, I'd recommend The Pragmatic Programmer, as it gives slightly higher-level software development advice.

半衬遮猫 2024-08-10 07:29:28

[Answering own question]
There are several other aspects of the problem which haven't been raised and which I would have found useful when revisiting the code. Some of these may be "self-evident", but remember that this code predates SVN and IDEs.

Discoverability. It has been difficult even to find the code. I believe it's in my SourceForge project, but there are so many versions and branches over 7 years that I can't find it. So I would need a system that searches code, and until IDEs appeared I don't think there was one.
What does it do? The current checkout contains about 13 classes (all in one package, as it wasn't easy to refactor at the time). Some are clear (DynamicAligner) but others are opaque (MainBox, so named because it extended a Swing Box). There are four main() programs, and there are actually about 3 subprojects in the distribution. So it is critical to have an external manifest of what the components actually are.
Instructions on how to run it. When run, main() offers a brief command-line usage (e.g. DynamicAligner file1 file2), but it doesn't say what the contents of the files actually look like. I knew this at the time, of course, but not now. So there should be associated example files in sibling directories. These are more valuable than trying to document file formats.
Does it still work? It should be possible to run each example without thinking. The first question will be whether the associated libraries, runtimes, etc. are still relevant and available. One ex-coworker wrote a system which only runs with a particular version of Python. The only answer is to rewrite. So we should certainly avoid lock-in where possible, and I have trained myself (though not necessarily my coworkers) to do this.
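One way to make "run each example without thinking" cheap is a tiny smoke-test runner kept next to the examples. The following is only a sketch in Python under stated assumptions: the function name `run_examples` and the idea of describing each example as a command list are mine, not part of the original project.

```python
import subprocess


def run_examples(examples, timeout_seconds=60):
    """Run each example command and report whether it exited cleanly.

    `examples` maps a label to an argument list, as accepted by
    subprocess.run. Returns a dict mapping each label to True
    (exit code 0) or False (any other outcome).
    """
    results = {}
    for label, command in examples.items():
        try:
            completed = subprocess.run(
                command,
                capture_output=True,
                timeout=timeout_seconds,
            )
            results[label] = completed.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            # A missing executable or a hung program both count as
            # "no longer works" for the purposes of this check.
            results[label] = False
    return results
```

It could be called with something like `{"align": ["java", "DynamicAligner", "examples/file1", "examples/file2"]}`, where the file paths are placeholders. A False entry does not say why an example broke, only that it needs attention, which is usually enough to decide whether old code is worth reviving.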

So how can my coworkers and I avoid problems in the future? I think the first step is a discipline of creating a "project" (however small) whenever you create code, and of putting these projects under version control. This may sound obvious to some of you, but in some environments (academia, domestic) there is a significant overhead to setting up a project management system. I suspect that the majority of academic code is not under any version control.

Then there is the question of how the projects should be organized. They can't be on SourceForge by default, as the code is (a) trivial and (b) not open by default. We need a server that can host both communal projects and private ones. I estimate that the effort to set this up and run it is about 0.1 FTE, that is, 20 days a year across all parties (installation, training, maintenance). If there are easier options I'd like to know, as this is a large expense in some cases: do I spend my time setting up a server, or do I write papers?

The project should try to encourage good discipline. This is really what I was hoping to get from this question. It could include:

A template of required components (manifest, README, log of commits, examples, required libraries, etc.). Not all projects can run under Maven (e.g. FORTRAN).
A means of searching a large number (hundreds at least) of small projects for mnemonic strings (I liked the idea of dumping the code in Google Docs, and this may be a fruitful avenue, but it's extra maintenance effort).
Clear naming conventions. These are more valuable than comments. I now regularly have names of the type iterateOverAllXAndDoY. I try to use createX() rather than getX() when the routine actually creates information. I have a bad habit of calling routines process() rather than convertAllBToY().
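The template idea in the list above does not depend on any build tool. As a minimal sketch, assuming Python is available, a short script can stamp out the same skeleton for every new experiment. The file names and placeholder texts below are illustrative assumptions, not an established convention; the log of commits is left to whatever version control system is in use.

```python
from pathlib import Path

# Hypothetical template: file names and placeholder text are assumptions.
TEMPLATE_FILES = {
    "README.txt": "What this experiment is for and how to run it.\n",
    "MANIFEST.txt": "One line per component: name, purpose, entry point.\n",
    "LIBRARIES.txt": "Required libraries and runtimes, with versions.\n",
}
TEMPLATE_DIRS = ["src", "examples"]


def create_project_skeleton(root, name):
    """Create root/name with the template files and directories.

    Existing files are left untouched, so the function is safe to re-run.
    Returns the path of the project directory.
    """
    project = Path(root) / name
    project.mkdir(parents=True, exist_ok=True)
    for directory in TEMPLATE_DIRS:
        (project / directory).mkdir(exist_ok=True)
    for filename, placeholder in TEMPLATE_FILES.items():
        target = project / filename
        if not target.exists():
            target.write_text(placeholder, encoding="utf-8")
    return project
```

Because the skeleton is plain files and directories, it works equally for Java, FORTRAN or anything else, and the empty MANIFEST.txt is a standing reminder to record what each component is for.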

I am aware of, but haven't used, Git, Mercurial and Google Code. I do not know how much effort these take to set up or how many of my concerns they answer. I would be delighted if there were an IDE plugin that helped create better code (e.g. by flagging a "poor choice of method name").

And whatever the approaches are, they have to come naturally to people who do not naturally have good code discipline, and they have to be worth the effort.

晨与橙与城 2024-08-10 07:29:28

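As the excellent answers to your other post show, and as I know from my own experience, there is a gulf that is hard to cross between software used for research and software that has been engineered. In my opinion, Code Complete might help, but not much. It is an economic question: is refactoring everything for reuse worth it, compared with the occasional payoff of finding a later use for something? Your balance point may vary.

Here is a practical tip for storing snippets. Instead of writing full comments, add a few keywords:

"graph isomorphism wrapper"
"polymer simulated annealing"
"string match feynman"
"equilibrium"

and then put the code somewhere that is searchable by Google, such as a GMail account.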
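The answer relies on Google's search for retrieval. As a local alternative (my substitution, not what the answer proposes), the same keyword trick can be sketched in Python: keep the snippets in one directory tree and search it for files that contain every keyword. The function name and the default list of file suffixes are assumptions for illustration.

```python
from pathlib import Path


def find_snippets(root, keywords, suffixes=(".py", ".java", ".f", ".txt")):
    """Return paths under `root` whose text contains every keyword.

    Matching is case-insensitive. Bytes that cannot be decoded as UTF-8
    are replaced rather than causing the file to be skipped.
    """
    wanted = [keyword.lower() for keyword in keywords]
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        if all(keyword in text for keyword in wanted):
            matches.append(path)
    return matches
```

A snippet whose first line is a comment such as `# keywords: string match feynman` would then be found by `find_snippets("snippets", ["feynman", "match"])`, where `snippets` is a placeholder directory name. This reads every file on each search, which is fine for hundreds of small projects but would need an index for much larger collections.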

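EDIT: I might add that free Google Sites are effectively searchable wikis, and they are a good place to put code, either as attachments or pasted in.

Also, I should say that I'm a fan of Code Complete and for years have given copies to grad students who write software for scientific research. It's a good start, but there is no silver bullet. I'm currently writing a paper on using open-source frameworks to solve scientific data management problems, and one of the conclusions is that some software engineering expertise is essential for long-running systems. Many scientific projects should probably budget for this from the start.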

旧人哭 2024-08-10 07:29:28

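I'll echo what others have said about commenting the "why" of the code and its intended usage, but I would also add this:

Code as if you plan to put it into production, even when you're just messing around. Code for:

Clarity and readability
Following the coding conventions of the time (naming conventions, etc.). Even though such conventions change over time, you are more likely to understand the code later if you stick to the standards.
Security (if applicable)
Performance (if applicable)

I particularly stress the first point, but the others are important too. I find that if I reuse "test code" later, I tend to just use it as it is, provided it works, rather than refactoring it.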

ぺ禁宫浮华殁 2024-08-10 07:29:28

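I think the most important thing (if refactoring isn't going to happen) is to comment the code and document your thought process at the time. That will make the code less impenetrable and help you find the good parts when you need them.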

对你再特殊 2024-08-10 07:29:28

No, No, No, No, No!

Do not write throwaway code even in a research environment. Please!

Currently I'm messing with just such "throwaway code", namely the BLAST project. The thing is that it started as a playground but then happened to become somewhat successful. Now it's a neat tool with many concepts implemented, but the code is virtually unmaintainable. But that's not the main point.

The main point is that you do research so that engineers can later benefit from your findings. Having done good scientific work on a general concept and written a tool that proves it successful, you can easily forget that you're not doing it only for publication and a PhD. You do it for the benefit of mankind. Your code may contain a bunch of "special cases" that were hard to debug, and a set of quirks and hacks that do not fit into any conference article. It's especially important to document and comment such things throughout your code.

If a developer decides to implement your concepts in a commercial product, he can study the quirks and hacks in your code, and the implementation will then have fewer bugs than it might otherwise have had. Everyone says "Wow, his research on A really is useful!" But if you write "throwaway" code, they say "his concept looks nice on paper, but X tried to implement it and drowned in a bunch of bugs".

(EDIT: taken from comments below) To help future developers of your codebase, you don't need much. First, comment what each function does. Second, make sure that every non-obvious fix of a tricky bug is placed in a separate commit in the revision-control system (with an appropriate comment, of course). That's quite enough. And if you even make things modular (even if they're not ready for outright reuse, which is three times more costly according to Brooks), you will be adored by the engineers who implement your research.
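As a sketch of how little this asks for, here is what "comment what each function does" plus a documented quirk might look like in Python. The function and the decimal-comma quirk are hypothetical, invented purely for illustration; they are not taken from BLAST or any real project.

```python
def parse_measurements(lines):
    """Convert text lines into a list of floats, one per non-blank line.

    Quirk (hypothetical, for illustration): some input files are assumed
    to use a decimal comma ("3,14") instead of a decimal point. Those
    values are accepted too; without the replacement below, float()
    would raise ValueError on them.
    """
    values = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # blank lines carry no data
        values.append(float(stripped.replace(",", ".")))
    return values
```

The docstring states what the function does, and the quirk is recorded at the exact place where someone reimplementing the idea would otherwise trip over it.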

I think the world would be a better place if researchers threw away their hubris and stopped haughtily thinking that they're not those dirty coders who do the menial job of writing good code. Writing good code is not just a job for those stupid programmers. It is a really valuable thing that everyone should strive for. Without it, your experimental ground, your code, your brainchild will just die.
