A better, simpler example of a "semantic conflict"?

Posted on 2024-08-27 05:54:28

I like to distinguish three different types of conflict from a version control system (VCS):

textual
syntactic
semantic

A textual conflict is one that is detected by the merge or update process. This is flagged by the system. A commit of the result is not permitted by the VCS until the conflict is resolved.

A syntactic conflict is not flagged by the VCS, but the result will not compile. Therefore this should also be picked up by even a slightly careful programmer. (A simple example might be a variable rename by Left and some added lines using that variable by Right. The merge will probably have an unresolved symbol. Alternatively, this might introduce a semantic conflict by variable hiding.)
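The variable-hiding case mentioned in the parenthesis can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `limit`, `cap`, and the `clamp_*` functions are hypothetical, and the sketch assumes the two edits are far enough apart in the real file for the textual merge to succeed (here they are compressed onto neighbouring lines).

```cpp
#include <cassert>

// Hypothetical file-scope setting that exists in every version.
const int limit = 100;

// Left's version: the local variable `limit` was renamed to `cap`.
int clamp_left(int value) {
    int cap = 10;                        // L: renamed from `limit`
    if (value > cap) value = cap;        // L
    return value;
}

// Right's version: a new line uses the local `limit`.
int clamp_right(int value) {
    int limit = 10;                      // local hides the file-scope `limit`
    if (value > limit) value = limit;
    if (value < -limit) value = -limit;  // R: added lower bound
    return value;
}

// Merged result: compiles, because a file-scope `limit` still exists.
int clamp_merged(int value) {
    int cap = 10;                        // L
    if (value > cap) value = cap;        // L
    if (value < -limit) value = -limit;  // R: now binds to the file-scope `limit`
    return value;
}

// Shows that the merged function no longer behaves like Right's version.
void check_hiding_conflict() {
    assert(clamp_right(-50) == -10);   // Right intended a lower bound of -10
    assert(clamp_merged(-50) == -50);  // merged code compares against -100
    assert(clamp_merged(50) == 10);    // Left's upper bound still works
}
```

Without the file-scope `limit`, the same merge would fail to compile and would be a syntactic conflict instead.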

Finally, a semantic conflict is not flagged by the VCS, the result compiles, but the code may have problems running. In mild cases, incorrect results are produced. In severe cases, a crash could be introduced. Even these should be detected before commit by a very careful programmer, through either code review or unit testing.

My example of a semantic conflict uses SVN (Subversion) and C++, but those choices are not really relevant to the essence of the question.

The base code is:

int i = 0;
int odds = 0;
while (i < 10)
{
    if ((i & 1) != 0)
    {
        odds *= 10;
        odds += i;
    }
    // next
    ++ i;
}
assert (odds == 13579);

The Left (L) and Right (R) changes are as follows.

Left's 'optimisation' (changing the values the loop variable takes):

int i = 1; // L
int odds = 0;
while (i < 10)
{
    if ((i & 1) != 0)
    {
        odds *= 10;
        odds += i;
    }
    // next
    i += 2; // L
}
assert (odds == 13579);

Right's 'optimisation' (changing how the loop variable is used):

int i = 0;
int odds = 0;
while (i < 5) // R
{
    odds *= 10;
    odds += 2 * i + 1; // R
    // next
    ++ i;
}
assert (odds == 13579);

This is the result of a merge or update, and is not detected by SVN (which is correct behaviour for the VCS), so it is not a textual conflict. Note that it compiles, so it is not a syntactic conflict.

int i = 1; // L
int odds = 0;
while (i < 5) // R
{
    odds *= 10;
    odds += 2 * i + 1; // R
    // next
    i += 2; // L
}
assert (odds == 13579);

The assert fails because odds is 37.
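To make the comparison easy to reproduce, the four versions can be wrapped in functions and checked side by side. This is a minimal sketch; the function names (`odds_base`, `odds_left`, `odds_right`, `odds_merged`, `check_odds_versions`) are hypothetical and are not part of the original code.

```cpp
#include <cassert>

// Base: visit 0..9 and append every odd value as a decimal digit.
int odds_base() {
    int i = 0;
    int odds = 0;
    while (i < 10) {
        if ((i & 1) != 0) {
            odds *= 10;
            odds += i;
        }
        ++i;
    }
    return odds;
}

// Left: start at 1 and step by 2, so only odd values are visited.
int odds_left() {
    int i = 1;  // L
    int odds = 0;
    while (i < 10) {
        if ((i & 1) != 0) {
            odds *= 10;
            odds += i;
        }
        i += 2;  // L
    }
    return odds;
}

// Right: count 0..4 and compute the k-th odd number directly.
int odds_right() {
    int i = 0;
    int odds = 0;
    while (i < 5) {  // R
        odds *= 10;
        odds += 2 * i + 1;  // R
        ++i;
    }
    return odds;
}

// Merged: Left's stepping combined with Right's bound and formula.
int odds_merged() {
    int i = 1;  // L
    int odds = 0;
    while (i < 5) {  // R
        odds *= 10;
        odds += 2 * i + 1;  // R
        i += 2;  // L
    }
    return odds;
}

// Each change is correct on its own; only the combination is wrong.
void check_odds_versions() {
    assert(odds_base() == 13579);
    assert(odds_left() == 13579);
    assert(odds_right() == 13579);
    assert(odds_merged() == 37);  // i takes the values 1 and 3, giving 3 and then 37
}
```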

So my question is as follows. Is there a simpler example than this? Is there a simple example where the compiled executable has a new crash?

As a secondary question, are there cases of this that you have encountered in real code? Again, simple examples are especially welcome.

回忆躺在深渊里 2024-09-03 05:54:28

It is not easy to come up with simple, relevant examples, and this comment sums up why best:

If the changes are close by, then trivial resolutions are more likely to be correct (because those that are incorrect are more likely to touch the same parts of the code and thus result in non-trivial conflicts), and in those few cases where they aren’t, the problem will manifest itself relatively quickly and probably in an obvious way.

[Which is basically what your example illustrates]

But detecting semantic conflicts introduced by merges between changes in widely separated areas of the code is likely to require holding more of the program in your head than most programmers can – or in projects the size of the kernel, than any programmer can.
So even if you did review those 3-way diffs manually, it would be a comparatively useless exercise: the effort would be far disproportionate with the gain in confidence.
In fact, I would argue that merging is a red herring:
this sort of semantic clash between disparate but interdependent parts of the code is inevitable the moment they can evolve separately.
How this concurrent development process is organized – DVCS; CVCS; tarballs and patches; everyone edits the same files on a network share – is of no consequence at all to that fact.
Merging doesn’t cause semantic clashes, programming causes semantic clashes.

In other words, the real cases of semantic conflicts I have encountered in real code after a merge were not simple, but rather quite complex.

That being said, the simplest example, as illustrated by Martin Fowler in his article Feature Branch, is a method rename:

The problem I worry more about is a semantic conflict.
A simple example of this is that if Professor Plum changes the name of a method that Reverend Green's code calls. Refactoring tools allow you to rename a method safely, but only on your code base.
So if G1-6 contain new code that calls foo, Professor Plum can't tell in his code base as he doesn't have it. You only find out on the big merge.
A function rename is a relatively obvious case of a semantic conflict.
In practice they can be much more subtle.
Tests are the key to discovering them, but the more code there is to merge the more likely you'll have conflicts and the harder it is to fix them.
It's the risk of conflicts, particularly semantic conflicts, that make big merges scary.

As Ole Lynge mentions in his answer (upvoted), Martin Fowler wrote a post today (at the time of this edit) about "semantic conflict", including the following illustration:

[Figure: semantic conflict illustration]

Again, this is based on function renaming, although subtler cases based on internal function refactoring are also mentioned:

The simplest example is that of renaming a function.
Say I think that the method clcBl would be easier to work with if it were called calculateBill.
So the first point here is that however powerful your tooling is, it will only protect you from textual conflicts.
There are, however, a couple of strategies that can significantly help us deal with them
The first of these is SelfTestingCode. Tests are effectively probing our code to see if their view of the code's semantics are consistent with what the code actually does
The other technique that helps is to merge more often
Often people try to justify DVCSs based on how they make feature branching easy. But that misses the issues of semantic conflicts.
If your features are built quickly, within a couple of days, then you'll run into less semantic conflicts (and if less than a day, then it's in effect the same as CI). However we don't see such short feature branches very often.
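In C++, a plain rename usually ends in a compile error after the merge (a syntactic conflict in the question's terms), but a variation with virtual functions compiles cleanly and silently misbehaves. The sketch below is that variation, not the example from the quoted article: the `Account` and `DiscountAccount` types and the `bill_for` helper are hypothetical, and only the method names `clcBl` and `calculateBill` are borrowed from the quote.

```cpp
#include <cassert>

// Left renamed the virtual method clcBl to calculateBill in the base class
// and in every subclass that existed at the time.
struct Account {
    virtual ~Account() = default;
    virtual int calculateBill() const { return 100; }  // L: was clcBl
};

// Right, unaware of the rename, added a subclass that overrides the old name.
// After the merge this declares a brand-new virtual function instead.
struct DiscountAccount : Account {
    virtual int clcBl() const { return 80; }  // R: intended as an override
};

// Existing caller, already updated by Left's rename.
int bill_for(const Account& account) {
    return account.calculateBill();
}

// The merged program compiles, but the discount is never applied.
void check_rename_conflict() {
    DiscountAccount discounted;
    assert(discounted.clcBl() == 80);     // Right's code still works in isolation
    assert(bill_for(discounted) == 100);  // but callers get the base-class bill
}
```

Marking Right's method with `override` would turn this back into a compile error, which is one reason that keyword is worth using; a test on `bill_for` would catch it either way.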

I think a middle ground needs to be found between short-lived branches and feature branches.
And merging often is key if you have a group of developers on the same feature branch.
