Well, of course, natural languages are rarely clear, simple, clean, lovely, concise, or understandable, which is one of the reasons that most programming is done in languages far from natural.
My answer to this would be that the ideal programming language lies somewhere between a natural language and a very formal language.
At one extreme, there are the formal, minimal, mathematical languages. Take, for example, Brainfuck:
,>++++++[<-------->-],[<+>-]<. // according to Wikipedia, this means addition
Or, what's somewhat preferable to the above mess, any type of lambda calculus.
λxy.x
λxy.y
This is one possible way of expressing the Boolean truth values in lambda calculus. Doesn't look very neat, especially when you build logical operators (such as AND being e.g. λpq.pqp) around them.
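To see these definitions at work, here's a sketch of reducing AND TRUE FALSE step by step, using TRUE = λxy.x, FALSE = λxy.y and AND = λpq.pqp from above:
AND TRUE FALSE
= (λpq.pqp) (λxy.x) (λxy.y)
→ (λxy.x) (λxy.y) (λxy.x)
→ λxy.y
= FALSE
TRUE simply selects its first argument, so the whole expression collapses to FALSE - correct, but hardly something you'd want to write production software in.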
I claim that most people could not write production code in such a minimalistic, hard-to-grasp language.
The problem on the other end of the spectrum, namely natural languages as they are spoken by humans, is that languages with too much complexity and flexibility allow the programmer to express vague and indefinite things that mean nothing to today's computers. Let's take this sample program:
MAYBE IT WILL RAIN CATS AND DOGS LATER ON. WOULD YOU LIKE THIS, DEAR COMPUTER?
IF SO, PRINT "HELLO" ON THE SCREEN.
IF YOU HATE RAIN MORE THAN GEORGE DOES, PRINT SOME VAGUE GARBAGE INSTEAD.
(IN THE LATTER CASE, IT IS UP TO YOU WHERE YOU OUTPUT THAT GARBAGE.)
Now this is an obvious case of vagueness. But sometimes you would get things wrong with more reasonable natural language programs, such as:
READ AN INTEGER NUMBER FROM THE TERMINAL.
READ ANOTHER INTEGER NUMBER FROM THE TERMINAL.
IF IT IS LARGER THAN ZERO, PRINT AN ERROR.
Which number is IT referring to? And what kind of error should be printed? (You forgot to specify it.) You would have to be really careful to be extremely explicit about what you mean.
It's already too easy to misunderstand other humans. How do you expect a computer to do better?
Thus, a computer language's syntax and grammar have to be strict enough that they don't allow ambiguity. A statement must evaluate in a deterministic way. (There may be corner cases; I'm talking about the general case here.)
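To illustrate, here's a minimal C sketch of the two-integer program above. The formal language forces the choices the natural-language version left open: I'm assuming "IT" means the second number, and the error text is invented, since the original specified neither:

#include <stdio.h>

int main(void) {
    int first, second;

    /* READ AN INTEGER NUMBER FROM THE TERMINAL. */
    if (scanf("%d", &first) != 1)
        return 1;

    /* READ ANOTHER INTEGER NUMBER FROM THE TERMINAL. */
    if (scanf("%d", &second) != 1)
        return 1;

    /* IF IT IS LARGER THAN ZERO, PRINT AN ERROR.
       Here "IT" and the error message both had to be made explicit. */
    if (second > 0)
        fprintf(stderr, "error: the second number is larger than zero\n");

    return 0;
}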
I personally prefer languages with a very limited set of keywords. You can quickly learn such a language, and you don't have to choose between 10,000 ways of achieving one goal simply because there are 10,000 keywords for doing the same thing (as in: GO/WALK/RUN/TROD/SLEEPWALK/etc. TO THE FRIDGE AND GET ME A BEER!). It means that if you need to think about 10,000 different ways of doing something, it won't be due to the language, but due to the fact that there are 9,999 stupid ways to do it, and 1 elegant solution that just shines more than all the others.
Note that I wrote all natural language examples in upper-case. That's because I sort of had good old GW-BASIC and COBOL in mind while I wrote this. There've been some examples of programming languages that lean on natural language, and I think history has shown that they are, in general, somewhat less widespread than e.g. terse C-style languages.
I recently read that according to Gartner there are over 400 billion lines of COBOL source code in active use worldwide today.
That doesn't prove anything other than that banks and governments are fond of their legacy code, but you could construe it as a testament to the success of English-like programming languages. I'm not aware of any other programming language that is so close to English and so verbose.
Aside from that, I tend to agree with the other respondents: Programmers prefer not to type so much, and in general a language based on mathematics-like shorthand is both more expressive and more precise than one based on English.
There's a point where terse, expressive code looks like line noise. Perl, APL and J come to mind as examples with "illegible one-liners." Programmers are humans, and it may be beneficial to leave them with some similarity to natural language to give their brains something familiar to hold on to. Thus, I advocate a happy medium that's reminiscent of, but not too close to, natural language.
"When a programming language is created that allows programmers to program in simple English, it will be discovered that programmers cannot speak English." ~ Unknown
In my (not so) humble opinion, no.
Natural language is full of ambiguities. Normally we do not think of them, because humans can easily disambiguate them based on many criteria that are often unavailable to the computer. First off, we have knowledge about the world (elephants don't fit in pajamas), but we also use more senses than just hearing when we speak to each other, body language to name one. The intonation and manner in which things are said also help a lot to disambiguate. It is harder to catch irony or sarcasm in written text, which is more or less a transcription of what we would say, more so in the case of IM, less so in the case of well-written articles. In general there are loads and loads of ambiguities in natural language, for instance where the PPs, prepositional phrases, attach:
"Dump the sacks with the flour."
"Dump the sacks with the fork-lift."
Any human immediately tells where the PP will attach: it's reasonable to have sacks with flour in them, and it's reasonable to use a fork-lift to dump something. Another very troublesome area is the word "and", which messes up the grammar horrendously, as do all the references we use, pronouns in general, but also more complex references, e.g. "Bill bought a Dodge Viper; sadly the car was a lemon".
So we have three options. One: keep the ambiguities in and try to deal with them, accepting very many disambiguation errors and very, very slow parsing (no LALR or LL parser will work here). Two: try to make an artificial grammar that resembles natural language while keeping it deterministic, which is more reasonable but still horrible. We would then have a language that falsely resembles English but isn't, which is confusing. We'd have none of the benefits of a proper syntax and none of the benefits of natural language, but an oversized, overly wordy monstrosity with a difficult and unintuitive grammar, difficult to learn and slow to write.
The third way is realizing that we need a succinct way of expressing ourselves that can also be processed by a computer, not resembling any natural language but focusing on being an unambiguous description of an algorithm. This increases readability, especially if you compare it to a very precise natural-language counterpart. This is why many people prefer to read the pseudo-code as well when dealing with difficult problems or advanced algorithms: it relieves us of the trouble of dealing with ambiguities, and is better suited for expressing computer instructions.
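To make the contrast concrete, here's an invented toy example: a spelled-out, precise natural-language spec (in the comment) next to the succinct formal version, summing the integers from 1 to n:

/* "SET THE SUM TO ZERO. FOR EACH INTEGER FROM ONE UP TO AND
   INCLUDING N, ADD THAT INTEGER TO THE SUM. PRINT THE SUM." */
#include <stdio.h>

int main(void) {
    int n = 10, sum = 0;
    for (int i = 1; i <= n; i++)
        sum += i;
    printf("%d\n", sum); /* prints 55 for n = 10 */
    return 0;
}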
The issue isn't so much that it's easier to describe complex ideas using one approach or the other, but it certainly is easier to understand machine languages (at least for machines). The biggest issue is, as always, ambiguity. Computers are terrible at understanding it, so most programming-language grammars need to be constructed to remove all ambiguity, or the language as a whole must be constructed so that ambiguity isn't actually a problem (this is tricky).
Any programming language that allows for ambiguity would be terribly error-prone; and any natural language that doesn't allow ambiguity would be terribly verbose and convoluted (I'm looking at you, Lojban [ok, maybe Lojban isn't so bad, still…]).
The propensity some people show for preferring natural languages as programming languages might essentially be rooted in the desire to eventually be able to feed a physics textbook into a parser, whereupon it'll do your homework when asked.
Of course, that's not to say that programming languages shouldn't have hints of natural language: Especially for OOP it makes good sense to have calling grammar resemble natural grammar, like in Obj-C, which is sort of a game of mad libs:
[pot makeCoffee:strong withSugar:NO];
Doing the same in BrainFuck would be, well, a brainfuck; three full pages of code to flip a switch will do that to you.
In essence: the best languages are (probably) the ones that resemble natural languages without pretending to be one. (Avoiding the uncanny valley of programming languages [if there is such a thing], if you will. [Subclauses! Yay!])
I think the fourth language I coded professionally in (after Fortran, Pascal and COBOL) was Natural, a pretty obscure 4GL of 1980s vintage for developing mainframe systems against an ADABAS database.
It was called Natural, I believe, because it had pretensions to be so: supposedly management-readable like COBOL, but minus the fluff.
Which should tell you that attempts at 'natural' programming languages have a commercial history of over 30 years now (more if you count COBOL), but they have pretty much lost out to languages that don't pretend to be 'natural' yet do allow the programmer to define the problem succinctly. When I first started coding, the 1GL -> 2GL -> 3GL evolution wasn't that old, and the progression to 4GL (defined then as a more English-like programming language) for mainstream work seemed an obvious next step. It hasn't worked out that way. If anything, getting up to speed with coding has got harder, because there are more abstract concepts to learn.
SQL was originally designed with natural language in mind. Fortunately it hasn't held on too tightly to this, and the advances since its conception have been less "naturalistic".
But anybody who has tried to write a complicated query in SQL will tell you that it's not that easy. You have to worry about the scope of some keywords over your query. You end up with an incredibly hard-to-understand query that does some crazy shit, and you re-write it every time you need to change something, because that's easier.
Natural language programming is a bad idea. The further you get from assembly, the more mistakes you can make: not in terms of logical errors or anything like that, but in terms of having the wrong assumptions about how the script interpreter/bytecode interpreter/compiler makes your code run on the CPU.
It seems to be a great feature for beginners, or for people who program as a "secondary activity". But I doubt you could reach the complexity and versatility of actual programming languages with natural language.
If there were a programming language that actually adhered to all of the conventions of the natural language it mimics, that would be fantastic.
In reality, however, a lot of so-called "natural" programming languages have far stricter syntax than English, which means that although they are easily readable, it is debatable whether they are actually all that easy to write.
What makes sense in English is often a syntax error in AppleScript.
Everyday language isn't so clear, simple, clean, lovely, concise and understandable - to a computer. However, to a human, readability counts for a lot, and the closer you get to a natural language, the easier it is to read. That's why we're not all using assembly language.
If you have a completely natural language, there are a lot of things that need to be handled - the sentence needs to be parsed, each word must be understood - and there is plenty of room for ambiguity. That's generally not a good thing for a programming language, because then we're venturing into psychic programming - the computer has to figure out what you were thinking, which is not at all easy to get right.
However, if you can make something sufficiently close to natural language - and yes, Inform 7 is probably the best example - so that sentences look natural but still have some structure you need to follow - then the code is almost instantly readable, even to people who don't know the language. There's usually also less specialized syntax to remember - because you're really just talking (a slightly modified form of) English - but if you have to do something out of the ordinary, you might have to jump through some hoops to do it.
In practice, most languages don't bother with this, because that makes it easier for them to allow you to be precise. However, some will still hover closer to the "natural language". This can be a good thing: if you have to translate some pseudocode algorithm to a language, you don't need to manipulate it as much to make it work, reducing the risk that you make an error in the translation.
As an example, let's compare C and Pascal. This Pascal code:
for i := 1 to 10 do begin
j := j + 1;
end;
is equivalent to this C code:
for (i = 1; i <= 10; i++) {
j = j + 1;
}
If you had no prior knowledge of either syntax, the Pascal version is generally going to be simpler to read, if only because it's not as complex as a C for loop.
Let's also consider operators. Pascal and C both share +, - and *. They also both have /, but with different semantics: In C, / does an integer division if both operands are integers; in Pascal, it always does a "real" division and uses div for integer division. That means that you have to take the types into account when figuring out what actually happens in that line of code.
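A quick sketch of that pitfall in C (the Pascal equivalents would be 7 div 2 for the first line and 7 / 2 for the second):

#include <stdio.h>

int main(void) {
    int a = 7, b = 2;
    printf("%d\n", a / b);         /* both operands are ints, so / truncates: prints 3 */
    printf("%f\n", (double)a / b); /* promote one operand to get a "real" division: prints 3.500000 */
    return 0;
}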
C also has a bunch of other operators: &&, ||, &, |, ^, <<, >> - in Pascal, those operators are instead named and, or, and, or, xor, shl, shr (Pascal uses the same keywords for the logical and the bitwise variants). Instead of relying on some semi-arbitrary sequence of characters, it's spelled out more. It's instantly obvious that xor is - well, XOR - unlike the C version, where there's no obvious correlation between ^ and XOR.
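For illustration, the bitwise ones side by side, with the Pascal spellings in the comments:

#include <stdio.h>

int main(void) {
    unsigned x = 5, y = 3;
    printf("%u\n", x & y);  /* Pascal: x and y -> 1 */
    printf("%u\n", x | y);  /* Pascal: x or y  -> 7 */
    printf("%u\n", x ^ y);  /* Pascal: x xor y -> 6 */
    printf("%u\n", x << 2); /* Pascal: x shl 2 -> 20 */
    printf("%u\n", x >> 1); /* Pascal: x shr 1 -> 2 */
    return 0;
}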
Of course, this is to some degree a matter of opinion: I much prefer a Pascal-like syntax to a C-like syntax, because I think it's more readable, but that doesn't mean everyone else does: A more natural language is usually going to be more verbose, and some people simply dislike that extra level of verbosity.
Basically, it's a matter of choosing what makes the most sense for the problem domain: if the problem domain is very limited (like with Inform), then a natural language makes perfect sense. If it's a very generic domain (like with C), then you either need far more advanced processing than we are currently capable of, or a lot of verbosity to fill in the details - and in that case, you have to choose a balance depending on what sort of users will be using the language (regular people need more naturalness; people who know programming are usually comfortable enough with less natural languages and will prefer something closer to that end).
I think the question is: who reads and who writes the application code in question? I think that, regardless of the language or architecture, a trained software developer should be writing the code, and analyzing the code as bugs arise.
A natural language is too ambiguous to be used as a programming language. It has to be artificially constrained to eliminate ambiguities.
But that defeats the purpose of having a "natural" programming language, because you get all of its verbosity and none of its advantages in expressiveness.