Delphi 2009 - Bug? Adding supposedly invalid values to a set

Posted 2024-10-14 16:00:15, 827 words, 6 views, 0 comments

First of all, I'm not a very experienced programmer. I'm using Delphi 2009 and have been working with sets, which seem to behave very strangely and even inconsistently to me. I guess it might be me, but the following looks like there's clearly something wrong:

unit test;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Edit1: TEdit;
    procedure Button1Click(Sender: TObject);
  private
    test: set of 1..2;
  end;

var Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
begin
  test := [3];
  if 3 in test then
    Edit1.Text := '3';
end;

end.

If you run the program and click the button, then, sure enough, it displays the string "3" in the text field. However, if you try the same thing with a number like 100, nothing is displayed (which, in my opinion, is the correct behavior). Am I missing something, or is this some kind of bug? Advice would be appreciated!

EDIT: So far, it seems that I'm not alone with my observation. If someone has some inside knowledge of this, I'd be very glad to hear about it. Also, if there are people with Delphi 2010 (or even Delphi XE), I would appreciate it if you could do some tests on this or even general set behavior (such as "test: set of 256..257") as it would be interesting to see if anything has changed in newer versions.

Comments (6)

木格 2024-10-21 16:00:15

I was curious enough to take a look at the compiled code that gets produced, and I figured out the following about how sets work in Delphi 2010. It explains why you can do test := [8] when test: set of 1..2, and why Assert(8 in test) fails immediately after.

How much space is actually used?

A set of Byte has one bit for every possible byte value: 256 bits in all, or 32 bytes. A set of 1..2 requires 1 byte, but surprisingly a set of 100..101 also requires only one byte, so Delphi's compiler is pretty smart about memory allocation. On the other hand, a set of 7..8 requires 2 bytes, and a set based on an enumeration that only includes the values 0 and 101 requires (gasp) 13 bytes!

Test code:

type
  TTestEnumeration = (te0=0, te101=101);
  TTestEnumeration2 = (tex58=58, tex101=101);

procedure Test;
var A: set of 1..2;
    B: set of 7..8;
    C: set of 100..101;
    D: set of TTestEnumeration;
    E: set of TTestEnumeration2;
begin
  ShowMessage(IntToStr(SizeOf(A))); // => 1
  ShowMessage(IntToStr(SizeOf(B))); // => 2
  ShowMessage(IntToStr(SizeOf(C))); // => 1
  ShowMessage(IntToStr(SizeOf(D))); // => 13
  ShowMessage(IntToStr(SizeOf(E))); // => 6
end;

Conclusions:

  • The basic model behind the set is the set of byte, with 256 possible bits, 32 bytes.
  • Delphi determines the required contiguous sub-range of the total 32-byte range and uses that. For set of 1..2 it probably only uses the first byte, so SizeOf() returns 1. For set of 100..101 it probably only uses the 13th byte, so SizeOf() returns 1. For set of 7..8 it is probably using the first two bytes, so we get SizeOf()=2. This is an especially interesting case, because it shows us that bits are not shifted left or right to optimize storage. The other interesting case is set of TTestEnumeration2: it uses 6 bytes, even though there are lots of unusable bits in there.
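The byte counts reported above fit a simple model: the set occupies every byte from the one holding the lowest element's bit to the one holding the highest element's bit. The sketch below only illustrates that model and is not the compiler's actual algorithm. PredictedSetSize is a hypothetical helper introduced here, and it deliberately ignores any rounding the compiler might apply to ranges other than the five measured above.

```pascal
program SetSizeSketch;

{$APPTYPE CONSOLE}

type
  TTestEnumeration = (te0 = 0, te101 = 101);
  TTestEnumeration2 = (tex58 = 58, tex101 = 101);

// Hypothetical helper: number of bytes between the byte that holds bit Lo
// and the byte that holds bit Hi, inclusive. No rounding is modelled.
function PredictedSetSize(Lo, Hi: Byte): Integer;
begin
  Result := (Hi div 8) - (Lo div 8) + 1;
end;

var
  A: set of 1..2;
  B: set of 7..8;
  C: set of 100..101;
  D: set of TTestEnumeration;
  E: set of TTestEnumeration2;

begin
  // Each line prints the model's prediction next to what the compiler reports.
  Writeln('1..2     predicted ', PredictedSetSize(1, 2), ', SizeOf ', SizeOf(A));
  Writeln('7..8     predicted ', PredictedSetSize(7, 8), ', SizeOf ', SizeOf(B));
  Writeln('100..101 predicted ', PredictedSetSize(100, 101), ', SizeOf ', SizeOf(C));
  Writeln('0, 101   predicted ', PredictedSetSize(0, 101), ', SizeOf ', SizeOf(D));
  Writeln('58, 101  predicted ', PredictedSetSize(58, 101), ', SizeOf ', SizeOf(E));
end.
```

For these five declarations the formula yields 1, 2, 1, 13 and 6, the same figures as the SizeOf results listed in the test code above.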

What kind of code is generated by the compiler?

Test 1, two sets, both using the "first byte".

procedure Test;
var A: set of 1..2;
    B: set of 2..3;
begin
  A := [1];
  B := [1];
end;

For those who understand assembler, have a look at the generated code yourself. For those who don't, the generated code is equivalent to:

begin
  A := CompilerGeneratedArray[1];
  B := CompilerGeneratedArray[1];
end;

And that's not a typo: the compiler uses the same pre-compiled value for both assignments. CompilerGeneratedArray[1] = 2.

Here's another test:

procedure Test2;
var A: set of 1..2;
    B: set of 100..101;
begin
  A := [1];
  B := [1];
end;

Again, in pseudo-code, the compiled code looks like this:

begin
  A := CompilerGeneratedArray1[1];
  B := CompilerGeneratedArray2[1];
end;

Again, no typo: this time the compiler uses different pre-compiled values for the two assignments. CompilerGeneratedArray1[1]=2 while CompilerGeneratedArray2[1]=0. The compiler-generated code is smart enough not to overwrite the bits in "B" with invalid values (because B holds information about bits 96..103), yet it uses very similar code for both assignments.
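One way to look at the stored bit pattern without reading the disassembly is to overlay a Byte on a one-byte set. This is a minimal sketch, assuming the set really occupies a single byte as measured earlier. RawA is a name introduced here purely for illustration.

```pascal
program SetRawByteSketch;

{$APPTYPE CONSOLE}

var
  A: set of 1..2;
  RawA: Byte absolute A;  // overlays the single byte that backs A

begin
  A := [1];
  Writeln('A := [1] stores the byte value ', RawA);

  A := [2];
  Writeln('A := [2] stores the byte value ', RawA);

  A := [];
  Writeln('A := []  stores the byte value ', RawA);
end.
```

If the first line reports 2, that agrees with the CompilerGeneratedArray[1] = 2 value described above, meaning element 1 maps to bit 1 of the byte.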

Conclusions

  • All set operations work perfectly well IF you test with values that are in the base set. For set of 1..2, test with 1 and 2. For set of 7..8, only test with 7 and 8. I don't consider the set to be broken. It serves its purpose very well all over the VCL (and it has a place in my own code as well).
  • In my opinion the compiler generates sub-optimal code for set assignments. I don't think the table lookups are required; the compiler could generate the values inline, and the code would have the same size but better locality.
  • In my opinion, having set of 1..2 behave the same as set of 0..7 is a side effect of the previous lack of optimization in the compiler.
  • In the OP's case (var test: set of 1..2; test := [7]) the compiler should generate an error. I would not classify this as a bug, because I don't think the compiler's behavior is supposed to be defined in terms of "what to do with bad code by the programmer" but in terms of "what to do with good code by the programmer". Nonetheless, the compiler should generate the "Constant expression violates subrange bounds" error, as it does if you try this code:

procedure Test;
var t: 1..2;
begin
  t := 3;
end;

  • At runtime, if the code is compiled with {$R+}, the bad assignment should raise an error, as it does if you try this code:

procedure Test;
var t: 1..2;
    i: Integer;
begin
  {$R+}
  for i:=1 to 3 do
    t := i;
  {$R-}
end;
ゝ偶尔ゞ 2024-10-21 16:00:15

According to the official documentation on sets (my emphasis):

The syntax for a set constructor is: [ item1, ..., itemn ] where each item is either an expression denoting an ordinal of the set's base type

Now, according to Subrange types:

When you use numeric or character constants to define a subrange, the base type is the smallest integer or character type that contains the specified range.

Therefore, if you specify

type
  TNum = 1..2;

then the base type will be byte (most likely), and so, if

type
  TSet = set of TNum;
var
  test: TSet;

then

test := [255];

will work, but not

test := [256];

all according to the official specification.
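A quick way to check the storage side of this argument on a given compiler is to print the sizes and bounds involved. The sketch below only reports those numbers. It does not prove which predefined type the compiler picks as the base type, and the program name is made up for illustration.

```pascal
program SubrangeBaseTypeSketch;

{$APPTYPE CONSOLE}

type
  TNum = 1..2;
  TSet = set of TNum;

begin
  // Storage size of the subrange type itself, in bytes
  Writeln('SizeOf(TNum) = ', SizeOf(TNum));
  // Storage size of a set built on that subrange, in bytes
  Writeln('SizeOf(TSet) = ', SizeOf(TSet));
  // Declared bounds of the subrange
  Writeln('Low..High    = ', Low(TNum), '..', High(TNum));
end.
```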

水水月牙 2024-10-21 16:00:15

I have no "inside knowledge", but the compiler logic seems rather transparent.

First, the compiler thinks that any set like set of 1..2 is a subset of set of 0..255. That is why set of 256..257 is not allowed.

Second, the compiler optimizes memory allocation, so it allocates only 1 byte for set of 1..2. The same 1 byte is allocated for set of 0..7, and there seems to be no difference between the two sets at the binary level. In short, the compiler allocates as little memory as possible with alignment taken into account (that means, for example, that the compiler never allocates 3 bytes for a set; it allocates 4 bytes even if the set fits into 3 bytes, like set of 1..20).

There is some inconsistency in the way the compiler treats sets, which can be demonstrated by the following code sample:

type
   TTestSet = set of 1..2;
   TTestRec = packed record
     FSet: TTestSet;
     FByte: Byte;
   end;

var
  Rec: TTestRec;

procedure TForm9.Button3Click(Sender: TObject);
begin
  Rec.FSet:= [];
  Rec.FByte:= 1;           // as a side effect we set 8-th element of FSet
                           //   (FSet actually has no 8-th element - only 0..7)
  Assert(8 in Rec.FSet);   // The assert should fail, but it does not!
  if 8 in Rec.FSet then    // another display of the bug
    Edit1.Text := '8';
end;
后知后觉 2024-10-21 16:00:15

A set is stored as a number and can actually hold values that are not in the enumeration on which the set is based. I would expect an error, at least when Range Checking is on in the compiler options, but this doesn't seem to be the case. I'm not sure if this is a bug or by design though.

[edit]

It is odd, though:

type
  TNum = 1..2;
  TSet = set of TNum;

var
  test: TSet;
  test2: TNum;

test2 := 4;  // Not accepted
test := [4]; // Accepted
海的爱人是光 2024-10-21 16:00:15

Off the top of my head, this was a side effect of allowing non-contiguous enumeration types.

The same holds for .NET bitflags: because in both cases the underlying types are compatible with integer, you can insert any integer into them (in Delphi limited to 0..255).

--jeroen

感性不性感 2024-10-21 16:00:15

As far as I'm concerned, no bugs there.

For example, take the following code:

var aByte: Byte;
begin
  aByte := 255;
  aByte := aByte + 1;
  if aByte = 0 then
    ShowMessage('Is this a bug?');
end;

Now, you can get two results from this code. If you compiled with Range Checking on, an exception will be raised on the second statement. If you did NOT compile with Range Checking, the code will execute without any error and display the message dialog.
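The two outcomes can be observed in a small console program. This is a sketch under the assumption that range checking is controlled with the {$R+} directive rather than the project options. The program name and messages are made up for illustration.

```pascal
program RangeCheckSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;  // provides ERangeError

var
  aByte: Byte;

begin
  {$R+}  // range checking on for the statements below
  aByte := 255;
  try
    aByte := aByte + 1;  // 256 does not fit in a Byte
    Writeln('No exception, aByte = ', aByte);
  except
    on E: ERangeError do
      Writeln('ERangeError: ', E.Message);
  end;
  {$R-}
end.
```

Changing {$R+} to {$R-} at the top of the block lets the same assignment go through without an exception, which is the second outcome described above.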

The situation you encountered with the sets is similar, except that there is no compiler switch to force an exception to be raised in this situation (Well, as far as I know...).

Now, from your example:

private         
  test: set of 1..2;  

That essentially declares a Byte-sized set (if you call SizeOf(Test), it should return 1). A byte-sized set can only contain 8 elements. In this case, it can contain [0] to [7].

Now, some examples:

begin
  test := [8]; //Here, we try to set the 9th bit of a Byte sized variable. It doesn't work
  Test := [4]; //Here, we try to set the 5th bit of a Byte Sized variable. It works.      
end;

Now, I need to admit I would kind of expect the "Constant expression violates subrange bounds" error on the first line (but not on the second).

So yeah... there might be a small issue with the compiler.

As for your results being inconsistent... I'm pretty sure that using values outside the set's subrange isn't guaranteed to give consistent results across different versions of Delphi (maybe not even across different compiles). So if your range is 1..2, stick with [1] and [2].
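Since the compiler does not reject out-of-range set elements here, one option is to validate values yourself before adding them. The following is a minimal sketch and not an established idiom from the VCL. CheckedInclude, TNum and TNumSet are names introduced for illustration.

```pascal
program CheckedSetSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;  // provides ERangeError

type
  TNum = 1..2;
  TNumSet = set of TNum;

// Hypothetical helper: refuses values outside TNum instead of relying on
// the compiler or on {$R+} to catch them.
procedure CheckedInclude(var S: TNumSet; Value: Integer);
begin
  if (Value < Low(TNum)) or (Value > High(TNum)) then
    raise ERangeError.CreateFmt('%d is outside %d..%d',
      [Value, Low(TNum), High(TNum)]);
  Include(S, TNum(Value));
end;

var
  Test: TNumSet;

begin
  Test := [];
  CheckedInclude(Test, 1);      // accepted: 1 is in 1..2
  try
    CheckedInclude(Test, 3);    // rejected: 3 is outside 1..2
  except
    on E: ERangeError do
      Writeln('Rejected: ', E.Message);
  end;
  Writeln('1 in Test: ', 1 in Test);
  Writeln('2 in Test: ', 2 in Test);
end.
```

The explicit check keeps the behavior independent of how a particular compiler version happens to lay out the set.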
