I am asking this question because I know there are a lot of well-read CS types on here who can give a clear answer.
I am wondering if such an AI exists (or is being researched/developed) that writes programs by generating and compiling code all on its own and then progresses by learning from previous iterations. I am talking about work that would make us programmers obsolete. I'm imagining something that learns what works and what doesn't in a programming language by trial and error.
I know this sounds pie-in-the-sky so I'm asking to find out what's been done, if anything.
Of course even a human programmer needs inputs and specifications, so such an experiment has to have carefully defined parameters. Like, if the AI were going to explore different timing functions, that aspect would have to be clearly defined.
But with a sophisticated learning AI I'd be curious to see what it might generate.
I know there are a lot of human qualities computers can't replicate, like our judgement, tastes and prejudices. But my imagination likes the idea of a program that spits out a website after a day of thinking and lets me see what it came up with; even so, I would often expect it to be garbage, but maybe once a day I could give it feedback and help it learn.
Another avenue of this thought is that it would be nice to give a high-level description like "menued website" or "image tools" and have it generate code with enough depth to be useful as a code-completion module for me to then code in the details. But I suppose that could be envisioned as a non-intelligent, static, hierarchical code-completion scheme.
How about it?
Such tools exist. They are the subject of a discipline called Genetic Programming. How you evaluate their success depends on the scope of their application.
They have been extremely successful (orders of magnitude more efficient than humans) at designing optimal programs for the management of industrial processes, automated medical diagnosis, or integrated circuit design. Those processes are well constrained, with an explicit and immutable success measure, and a great amount of "universe knowledge", that is, a large set of rules about what is a valid, working program and what is not.
They have been totally useless at trying to build mainstream programs that require user interaction, because the main thing a learning system needs is an explicit "fitness function", i.e. an evaluation of the quality of the current solution it has come up with.
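For concreteness, here is a minimal toy sketch of what an explicit fitness function buys you (the target vector, operators and parameters are invented for illustration): the "program" is just a vector of digits and fitness is a simple numeric score, which is easy to define for a well-constrained task and essentially impossible to define for "a program users find pleasant to interact with".

```python
import random

TARGET = [3, 1, 4, 1, 5, 9, 2, 6]          # the behaviour we want to reproduce

def fitness(candidate):
    """Higher is better: number of positions that already match the target."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(candidate):
    """Randomly perturb one position of a copy."""
    child = list(candidate)
    child[random.randrange(len(child))] = random.randint(0, 9)
    return child

population = [[random.randint(0, 9) for _ in TARGET] for _ in range(30)]
for generation in range(500):
    population.sort(key=fitness, reverse=True)   # let the fittest survive...
    if fitness(population[0]) == len(TARGET):
        break
    survivors = population[:10]
    # ...and reproduce (here by mutation only, to keep the sketch short)
    population = survivors + [mutate(random.choice(survivors)) for _ in range(20)]

print(generation, population[0])
```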
Another domain worth looking at when dealing with "program learning" is Inductive Logic Programming, although it is more often used for automatic demonstration or language/taxonomy learning.
Disclaimer: I am not a native English speaker nor an expert in the field; I am an amateur - expect imprecisions and/or errors in what follows. So, in the spirit of Stack Overflow, don't be afraid to correct and improve my prose and/or my content. Note also that this is not a complete survey of automatic programming techniques (code generation (CG) from Model-Driven Architectures (MDAs) merits at least a passing mention).
I want to add more to what Varkhan answered (which is essentially correct).
The Genetic Programming (GP) approach to Automatic Programming conflates, with its fitness functions, two different problems ("self-compilation" is conceptually a no-brainer):
w.r.t. self-improvement/adaptation, refer to Jürgen Schmidhuber's Goedel machines: self-referential universal problem solvers making provably optimal self-improvements. (As a side note: his work on artificial curiosity is also interesting.) Also relevant for this discussion are Autonomic Systems.
w.r.t. program synthesis, I think it is possible to classify 3 main branches: stochastic (probabilistic - like the above-mentioned GP), inductive and deductive.
GP is essentially stochastic because it produces the space of candidate programs with heuristics such as crossover, random mutation, gene duplication, gene deletion, and so on (then it tests programs with the fitness function and lets the fittest survive and reproduce).
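A hedged toy sketch of those operators (the primitives and the target behaviour f(x) = x*x + 1 are invented for illustration): candidate programs are tiny arithmetic expression trees, varied by subtree crossover and random mutation, then ranked by a fitness function measuring error against the target.

```python
import operator
import random

OPS = {'+': operator.add, '*': operator.mul, '-': operator.sub}

def random_tree(depth=3):
    """Grow a random expression tree over x, small constants and OPS."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(['x', random.randint(0, 3)])
    return (random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1))

def run(tree, x):
    """Evaluate an expression tree at the point x."""
    if tree == 'x':
        return x
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](run(left, x), run(right, x))

def fitness(tree):
    """Lower is better: squared error against the target on a few sample points."""
    return sum((run(tree, x) - (x * x + 1)) ** 2 for x in range(-5, 6))

def mutate(tree):
    """Replace the whole tree or one branch with a fresh random subtree."""
    if not isinstance(tree, tuple) or random.random() < 0.2:
        return random_tree(2)
    op, left, right = tree
    return (op, mutate(left), right) if random.random() < 0.5 else (op, left, mutate(right))

def crossover(a, b):
    """Graft a random subtree of b somewhere into a (crude subtree exchange)."""
    if not isinstance(a, tuple) or random.random() < 0.3:
        return b if not isinstance(b, tuple) else random.choice(b[1:])
    op, left, right = a
    return (op, crossover(left, b), right) if random.random() < 0.5 else (op, left, crossover(right, b))

population = [random_tree() for _ in range(50)]
for _ in range(100):
    population.sort(key=fitness)
    best = population[:10]
    population = best + [mutate(crossover(random.choice(best), random.choice(best)))
                         for _ in range(40)]

print(population[0], fitness(population[0]))
```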
Inductive program synthesis is usually known as Inductive Programming (IP), of which Inductive Logic Programming (ILP) is a sub-field. That is, in general the technique is not limited to the synthesis of logic programs or to synthesizers written in a logic programming language (nor is either limited to "..automatic demonstration or language/taxonomy learning").
IP is often deterministic (but there are exceptions): it starts from an incomplete specification (such as example input/output pairs) and uses it to constrain the search space of candidate programs satisfying that specification, which are then tested (generate-and-test approach), or it directly synthesizes a program by detecting recurrences in the given examples, which are then generalized (data-driven or analytical approach). The process as a whole is essentially statistical induction/inference - i.e. deciding what to include in the incomplete specification is akin to random sampling.
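For concreteness, a toy generate-and-test sketch (the primitive library and the "sort the list" specification are invented for illustration): the incomplete specification is a handful of input/output pairs, the search space is every short composition of library primitives, and each candidate is simply run against the examples.

```python
from itertools import product

# Spec given only as examples (here: "sort the list"); everything else is guessed.
examples = [([3, 1, 2], [1, 2, 3]), ([5, 4], [4, 5])]

primitives = {                      # hypothetical library of building blocks
    'reverse': lambda xs: list(reversed(xs)),
    'sort':    lambda xs: sorted(xs),
    'tail':    lambda xs: xs[1:],
    'double':  lambda xs: xs + xs,
}

def satisfies(program, examples):
    """Run the composition on every example input and compare with its output."""
    for inp, out in examples:
        value = inp
        for name in program:
            value = primitives[name](value)
        if value != out:
            return False
    return True

def synthesize(max_length=3):
    """Enumerate compositions of increasing length; return the first that fits."""
    for length in range(1, max_length + 1):
        for program in product(primitives, repeat=length):
            if satisfies(program, examples):
                return program
    return None

print(synthesize())   # ('sort',) is the shortest composition matching the spec
```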
Generate-and-test and data-driven/analytical§ approaches can be quite fast, so both are promising (even if only small synthesized programs have been demonstrated in public so far), but generate-and-test (like GP) is embarrassingly parallel, so notable improvements (scaling to realistic program sizes) can be expected. But note that Incremental Inductive Programming (IIP)§, which is inherently sequential, has been demonstrated to be orders of magnitude more effective than non-incremental approaches.
§ These links are directly to PDF files: sorry, I am unable to find an abstract.
Programming by Demonstration (PbD) and Programming by Example (PbE) are end-user development techniques known to leverage inductive program synthesis in practice.
Deductive program synthesis instead starts with a (presumed) complete (formal) specification (logic conditions). One of the techniques leverages automated theorem provers: to synthesize a program, it constructs a proof of the existence of an object meeting the specification; hence, via the Curry-Howard-de Bruijn isomorphism (the proofs-as-programs and formulae-as-types correspondences), it extracts a program from the proof. Other variants include the use of constraint solving and the deductive composition of subroutine libraries.
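As a toy illustration of the last variant (the routine names and types are invented), deductive composition over a subroutine library can be sketched as a search for a chain of typed arrows leading from the type of the available input to the type demanded by the specification; the chain found is the synthesized program.

```python
# Hypothetical library: name -> (argument type, result type, implementation).
library = {
    'parse_int':  ('str',   'int',   int),
    'to_celsius': ('int',   'float', lambda f: (f - 32) * 5.0 / 9.0),
    'format':     ('float', 'str',   lambda c: f'{c:.1f} C'),
}

def synthesize(source, target, depth=4):
    """Chain typed arrows from the source type toward the goal type."""
    if source == target:
        return []
    if depth == 0:
        return None
    for name, (arg, result, _) in library.items():
        if arg == source:
            rest = synthesize(result, target, depth - 1)
            if rest is not None:
                return [name] + rest
    return None

plan = synthesize('str', 'float')           # -> ['parse_int', 'to_celsius']
value = '212'
for name in plan:
    value = library[name][2](value)
print(plan, value)                          # 100.0 (degrees Celsius for 212 F)
```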
In my opinion, inductive and deductive synthesis are in practice attacking the same problem from two somewhat different angles, because what constitutes a complete specification is debatable (besides, a complete specification today can become incomplete tomorrow - the world is not static).
When (if) these techniques (self-improvement/adaptation and program synthesis) mature, they promise to raise the amount of automation provided by declarative programming (whether such a setting is to be considered "programming" is sometimes debated): we will concentrate more on Domain Engineering and Requirements Analysis and Engineering than on manual software design and development, manual debugging, manual system performance tuning and so on (possibly with less accidental complexity compared to that introduced by current manual, non-self-improving/adapting techniques). This will also promote a level of agility yet to be demonstrated by current techniques.