Why is the reference count of this string 4? (Delphi 2007)

Posted on 2024-11-17 05:41:50

This is a very Delphi specific question (maybe even Delphi 2007 specific). I am currently writing a simple StringPool class for interning strings. As a good little coder I also added unit tests and found something that baffled me.

This is the code for interning:

function TStringPool.Intern(const _s: string): string;
var
  Idx: Integer;
begin
  if FList.Find(_s, Idx) then
    Result := FList[Idx]
  else begin
    Result := _s;
    if FMakeStringsUnique then
      UniqueString(Result);
    FList.Add(Result);
  end;
end;

Nothing really fancy: FList is a sorted TStringList, so all the code does is look up the string in the list and, if it is already there, return the existing string. If it is not yet in the list, the code first calls UniqueString (when FMakeStringsUnique is set) to ensure a reference count of 1 and then adds the string to the list. (I checked the reference count of Result: it is 3 after 'hallo' has been added twice, as expected.)
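For readers who want to try this outside the original unit, here is a minimal, self-contained sketch of such a pool. The class name TSketchStringPool, its constructor and the demo code are assumptions made for illustration and are not the dzlib implementation; only the body of Intern follows the code above. TStringList compares case-insensitively by default, so the sketch sets CaseSensitive explicitly; whether the original class does the same is not stated in the question.

```pascal
program StringPoolSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Hypothetical stand-in for TStringPool; not the dzlib class.
  TSketchStringPool = class
  private
    FList: TStringList;
    FMakeStringsUnique: Boolean;
  public
    constructor Create(AMakeStringsUnique: Boolean);
    destructor Destroy; override;
    function Intern(const S: string): string;
  end;

constructor TSketchStringPool.Create(AMakeStringsUnique: Boolean);
begin
  inherited Create;
  FMakeStringsUnique := AMakeStringsUnique;
  FList := TStringList.Create;
  FList.CaseSensitive := True; // assumption: 'Hallo' and 'hallo' are distinct
  FList.Sorted := True;        // Find only works on a sorted list
end;

destructor TSketchStringPool.Destroy;
begin
  FList.Free;
  inherited Destroy;
end;

function TSketchStringPool.Intern(const S: string): string;
var
  Idx: Integer;
begin
  if FList.Find(S, Idx) then
    Result := FList[Idx]       // reuse the string already held by the list
  else begin
    Result := S;
    if FMakeStringsUnique then
      UniqueString(Result);    // give the pool its own copy
    FList.Add(Result);
  end;
end;

var
  Pool: TSketchStringPool;
  a, b, ia, ib: string;
begin
  Pool := TSketchStringPool.Create(True);
  try
    // Two equal strings living in two separate heap buffers.
    a := 'hallo';
    UniqueString(a);
    b := 'hallo';
    UniqueString(b);
    ia := Pool.Intern(a);
    ib := Pool.Intern(b);
    // After interning, both results should refer to the same buffer.
    if Pointer(ia) = Pointer(ib) then
      Writeln('interned strings share one buffer')
    else
      Writeln('interned strings differ');
  finally
    Pool.Free;
  end;
end.
```

The pointer comparison in the demo is the same identity check that the unit test performs with CheckEquals on the two pointers.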

Now to the testing code:

procedure TestStringPool.TestUnique;
var
  s1: string;
  s2: string;
begin
  s1 := FPool.Intern('hallo');
  CheckEquals(2, GetStringReferenceCount(s1));
  s2 := s1;
  CheckEquals(3, GetStringReferenceCount(s1));
  CheckEquals(3, GetStringReferenceCount(s2));
  UniqueString(s2);
  CheckEquals(1, GetStringReferenceCount(s2));
  s2 := FPool.Intern(s2);
  CheckEquals(Integer(Pointer(s1)), Integer(Pointer(s2)));
  CheckEquals(3, GetStringReferenceCount(s2));
end;

This adds the string 'hallo' to the string pool twice and checks the string's reference count, and also that s1 and s2 indeed point to the same string descriptor.

Every CheckEquals works as expected except the last one. It fails with the error "expected: <3> but was: <4>".

So, why is the reference count 4 here? I would have expected 3:

  • s1
  • s2
  • and another one in the StringList

This is Delphi 2007 and the strings are therefore AnsiStrings.

Oh yes, the function GetStringReferenceCount is implemented as:

function GetStringReferenceCount(const _s: AnsiString): integer;
var
  ptr: PLongWord;
begin
  ptr := Pointer(_s);
  if ptr = nil then begin
    // special case: Empty strings are represented by NIL pointers
    Result := MaxInt;
  end else begin
    // The string descriptor contains the following two longwords:
    // Offset -1: Length
    // Offset -2: Reference count
    Dec(Ptr, 2);
    Result := ptr^;
  end;
end;

In the debugger the same can be evaluated as:

plongword(integer(pointer(s2))-8)^
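The offsets named in the comments can also be written as a record overlay. The following is a sketch that assumes a 32-bit Delphi compiler, where the two Longint fields directly in front of the character data hold the reference count and the length; TSketchStrHeader and SketchRefCount are names made up for this illustration. Reading the count as a signed Longint also copes with string constants, whose count is -1; the PLongWord version would read that as $FFFFFFFF and may raise a range check error if range checking is enabled.

```pascal
program RefCountSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Assumed layout of the two fields directly before the character data.
  PSketchStrHeader = ^TSketchStrHeader;
  TSketchStrHeader = packed record
    RefCnt: Longint;  // byte offset -8 from the string pointer
    Len: Longint;     // byte offset -4 from the string pointer
  end;

function SketchRefCount(const S: AnsiString): Longint;
var
  P: PSketchStrHeader;
begin
  if Pointer(S) = nil then
    Result := MaxInt           // empty strings are nil pointers, as above
  else begin
    P := Pointer(S);
    Dec(P);                    // step back one header record (8 bytes)
    Result := P^.RefCnt;
  end;
end;

var
  s, t: AnsiString;
begin
  s := 'hallo';
  UniqueString(s);             // heap copy referenced by s alone
  Writeln(SketchRefCount(s));  // expected: 1
  t := s;                      // second reference to the same buffer
  Writeln(SketchRefCount(s));  // expected: 2
  t := '';                     // release the second reference
  Writeln(SketchRefCount(s));  // expected: 1
end.
```

The const parameter matters here: passing the string by const does not add a reference of its own, so the function reports the count as the caller sees it.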

Just to add to the answer from Serg (which seems to be 100% correct):

If I replace

s2 := FPool.Intern(s2);

with

s3 := FPool.Intern(s2);
s2 := '';

and then check the reference count of s3 (and s1), it is 3 as expected. The phenomenon occurs only because the result of FPool.Intern(s2) is assigned to s2 again (s2 is both a parameter and the destination for the function result). Delphi introduces a hidden string variable to receive the result.

Also, if I change the function to a procedure:

procedure TStringPool.Intern(var _s: string);

the reference count is 3 as expected because no hidden variable is required.


In case anybody is interested in this TStringPool implementation: It's open source under the MPL and available as part of dzlib, which in turn is part of dzchart:

https://sourceforge.net/p/dzlib/code/HEAD/tree/dzlib/trunk/src/u_dzStringPool.pas

But as said above: It's not exactly rocket science. ;-)

Comments (1)

千秋岁 2024-11-24 05:41:50

Test this:

function RefCount(const _s: AnsiString): integer;
var
  ptr: PLongWord;
begin
  ptr := Pointer(_s);
  Dec(Ptr, 2);
  Result := ptr^;
end;

function Add(const S: string): string;
begin
  Result:= S;
end;

procedure TForm9.Button1Click(Sender: TObject);
var
  s1: string;
  s2: string;

begin
  s1:= 'Hello';
  UniqueString(s1);
  s2:= s1;
  ShowMessage(Format('%d', [RefCount(s1)]));   // 2
  s2:= Add(s1);
  ShowMessage(Format('%d', [RefCount(s1)]));   // 2
  s1:= Add(s1);
  ShowMessage(Format('%d', [RefCount(s1)]));   // 3
end;

If you write s1 := Add(s1), the compiler creates a hidden local string variable to receive the function result, and this variable holds the additional reference that shows up in the count. You need not worry about it.
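To make the effect visible, the hidden variable can be written out by hand. This is only a sketch of the idea, assuming a 32-bit Delphi compiler: AddInto and Tmp are hypothetical names, with AddInto standing in for Add and Tmp playing the role of the compiler's temporary. Whether and when the real compiler introduces such a temporary is an implementation detail and may differ between versions.

```pascal
program HiddenTempSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

function RefCount(const S: AnsiString): Longint;
var
  P: PLongint;
begin
  P := Pointer(S);
  Dec(P, 2);               // two Longints back: the reference count
  Result := P^;
end;

// Stand-in for the function Add: the result is delivered through a
// var parameter instead of a function result.
procedure AddInto(const S: AnsiString; var Dest: AnsiString);
begin
  Dest := S;
end;

procedure Demo;
var
  s1, s2: AnsiString;
  Tmp: AnsiString;         // plays the role of the hidden temporary
begin
  s1 := 'Hello';
  UniqueString(s1);
  s2 := s1;
  Writeln(RefCount(s1));   // expected: 2 (s1, s2)
  AddInto(s1, Tmp);        // the "function result" lands in Tmp first
  s1 := Tmp;               // then it is assigned to the real destination
  Writeln(RefCount(s1));   // expected: 3 (s1, s2, Tmp)
  Tmp := '';               // a real temporary is released at routine exit
  Writeln(RefCount(s1));   // expected: 2 (s1, s2)
end;

begin
  Demo;
end.
```

Because a compiler-generated temporary is finalized only when the enclosing routine exits, a count read inside that routine still includes it, which matches the 3 reported by the last ShowMessage above.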
