How to make TStringList sort differently in Delphi

Posted on 2024-08-19 11:23:19

I have a simple TStringList. I do a TStringList.Sort on it.

Then I notice that the underscore "_" sorts before the capital letter "A". This was in contrast to a third party package that was sorting the same text and sorted _ after A.

According to the ANSI character set, A-Z are characters 65 - 90 and _ is 95. So it looks like the 3rd party package is using that order and TStringList.Sort isn't.

I drilled down into the guts of TStringList.Sort, and it sorts using AnsiCompareStr (case sensitive) or AnsiCompareText (case insensitive). I tried it both ways, setting my StringList's CaseSensitive value to true and then to false. But in both cases, the "_" sorts first.

I just can't imagine that this is a bug in TStringList. So there must be something else here that I am not seeing. What might that be?

What I really need to know is how I can get my TStringList to sort so that it is in the same order as the other package.

For reference, I am using Delphi 2009 and I'm using Unicode strings in my program.
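The behaviour described above can be reproduced with a small console program. This is only a minimal sketch, assuming Delphi 2009 or later on Windows; the program name and the sample strings are made up for illustration, and the order printed by Sort depends on the user's locale.

```pascal
program SortUnderscoreDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  List: TStringList;
  I: Integer;
begin
  List := TStringList.Create;
  try
    List.Add('B');
    List.Add('_');
    List.Add('A');

    // Sort goes through AnsiCompareStr/AnsiCompareText (locale-aware).
    List.Sort;
    for I := 0 to List.Count - 1 do
      Writeln(List[I]);

    // CompareStr compares character codes: '_' is 95 and 'A' is 65,
    // so the result is positive and '_' would sort after 'A'.
    Writeln(CompareStr('_', 'A') > 0);

    // The locale-aware result may differ; it depends on the user locale.
    Writeln(AnsiCompareStr('_', 'A') < 0);
  finally
    List.Free;
  end;
end.
```

If the two Boolean lines both print TRUE, the list is being sorted by locale rules rather than by character code, which matches the difference seen against the third-party package.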


So the final answer here is to override CompareStrings and replace the Ansi compares with whatever you want (e.g. the non-Ansi compares), as follows:

type
  TMyStringList = class(TStringList)
  protected
    function CompareStrings(const S1, S2: string): Integer; override;
  end;

function TMyStringList.CompareStrings(const S1, S2: string): Integer;
begin
  if CaseSensitive then
    Result := CompareStr(S1, S2)
  else
    Result := CompareText(S1, S2);
end;
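For completeness, here is how the descendant class might be used. This is a sketch under the assumption of Delphi 2009 or later; the program name and sample data are illustrative, and the class is repeated so that the example compiles on its own.

```pascal
program OrdinalSortDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  TMyStringList = class(TStringList)
  protected
    function CompareStrings(const S1, S2: string): Integer; override;
  end;

function TMyStringList.CompareStrings(const S1, S2: string): Integer;
begin
  // Ordinal comparisons instead of the locale-aware Ansi versions.
  if CaseSensitive then
    Result := CompareStr(S1, S2)
  else
    Result := CompareText(S1, S2);
end;

var
  List: TMyStringList;
  I: Integer;
begin
  List := TMyStringList.Create;
  try
    List.CaseSensitive := True;
    List.Add('B');
    List.Add('_');
    List.Add('A');
    List.Sort; // Sort ends up calling the overridden CompareStrings
    for I := 0 to List.Count - 1 do
      Writeln(List[I]); // prints A, B, _ (65, 66, 95)
  finally
    List.Free;
  end;
end.
```

One reason to prefer the override is that Find and IndexOf also go through CompareStrings, so a list with Sorted set to True stays consistent with the ordinal order.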

Comments (3)

独﹏钓一江月 2024-08-26 11:23:19

Define "correctly".
i18n sorting totally depends on your locale.
So I totally agree with PA that this is not a bug: the default Sort behaviour works as designed to allow i18n to work properly.

Like Gerry mentions, TStringList.Sort uses AnsiCompareStr and AnsiCompareText (I'll explain in a few lines how it does that).

But TStringList is flexible: it contains Sort, CustomSort and CompareStrings, all of which are virtual (so you can override them in a descendant class).
Furthermore, when you call CustomSort, you can plug in your own Compare function.

At the end of this answer is a Compare function that does what you want:

  • Case sensitive (when the list's CaseSensitive property is True)
  • Not using any locale
  • Just compare the ordinal value of the characters of the strings

CustomSort is defined like this:

procedure TStringList.CustomSort(Compare: TStringListSortCompare);
begin
  if not Sorted and (FCount > 1) then
  begin
    Changing;
    QuickSort(0, FCount - 1, Compare);
    Changed;
  end;
end;

By default, the Sort method has a very simple implementation, passing a default Compare function called StringListCompareStrings:

procedure TStringList.Sort;
begin
  CustomSort(StringListCompareStrings);
end;

So, if you define your own Compare function that is compatible with TStringListSortCompare, you can define your own sorting.
TStringListSortCompare is defined as a global function taking the TStringList and two indexes referring to the items you want to compare:

type
  TStringListSortCompare = function(List: TStringList; Index1, Index2: Integer): Integer;

You can use the StringListCompareStrings as a guideline for implementing your own:

function StringListCompareStrings(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := List.CompareStrings(List.FList^[Index1].FString,
                                List.FList^[Index2].FString);
end;

So, by default TStringList.Sort defers to TStringList.CompareStrings:

function TStringList.CompareStrings(const S1, S2: string): Integer;
begin
  if CaseSensitive then
    Result := AnsiCompareStr(S1, S2)
  else
    Result := AnsiCompareText(S1, S2);
end;

These in turn use the underlying Windows API function CompareString with the default user locale LOCALE_USER_DEFAULT:

function AnsiCompareStr(const S1, S2: string): Integer;
begin
  Result := CompareString(LOCALE_USER_DEFAULT, 0, PChar(S1), Length(S1),
    PChar(S2), Length(S2)) - 2;
end;

function AnsiCompareText(const S1, S2: string): Integer;
begin
  Result := CompareString(LOCALE_USER_DEFAULT, NORM_IGNORECASE, PChar(S1),
    Length(S1), PChar(S2), Length(S2)) - 2;
end;
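The "- 2" in both functions maps the CompareString return values (1 for less than, 2 for equal, 3 for greater than) onto the usual negative/zero/positive convention. The following is a small sketch of that mapping, assuming Delphi 2009 or later on Windows; the program name, the locally declared constant and the sample strings are illustrative only.

```pascal
program CompareStringResultDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

const
  // Documented CompareString result for equal strings,
  // redeclared under a local (hypothetical) name for this demo.
  DemoCstrEqual = 2;

var
  S1, S2: string;
  Raw: Integer;
begin
  S1 := 'abc';
  S2 := 'abc';

  // Raw API result: 1 = less than, 2 = equal, 3 = greater than.
  Raw := CompareString(LOCALE_USER_DEFAULT, 0, PChar(S1), Length(S1),
    PChar(S2), Length(S2));
  Writeln(Raw = DemoCstrEqual);

  // Subtracting 2 gives 0 for equal strings, which is what
  // AnsiCompareStr returns for the same input.
  Writeln(Raw - 2);
  Writeln(AnsiCompareStr(S1, S2));
end.
```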

Finally, the Compare function you need. Again, the limitations:

  • Case sensitive (when the list's CaseSensitive property is True)
  • Not using any locale
  • Just compare the ordinal value of the characters of the strings

This is the code:

function StringListCompareStringsByOrdinalCharacterValue(List: TStringList; Index1, Index2: Integer): Integer;
var
  First: string;
  Second: string;
begin
  First := List[Index1];
  Second := List[Index2];
  if List.CaseSensitive then
    Result := CompareStr(First, Second)
  else
    Result := CompareText(First, Second);
end;
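To plug this function in, pass it to CustomSort. The program below is a minimal sketch, assuming Delphi 2009 or later; the program name and sample strings are made up, and the compare function is repeated so the example is self-contained.

```pascal
program CustomSortDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

function StringListCompareStringsByOrdinalCharacterValue(List: TStringList;
  Index1, Index2: Integer): Integer;
begin
  // Compare by character code, honouring the list's CaseSensitive setting.
  if List.CaseSensitive then
    Result := CompareStr(List[Index1], List[Index2])
  else
    Result := CompareText(List[Index1], List[Index2]);
end;

var
  List: TStringList;
  I: Integer;
begin
  List := TStringList.Create;
  try
    List.CaseSensitive := True;
    List.Add('b');
    List.Add('_');
    List.Add('A');

    // CustomSort does nothing when Sorted is True, so leave Sorted off.
    List.CustomSort(StringListCompareStringsByOrdinalCharacterValue);

    for I := 0 to List.Count - 1 do
      Writeln(List[I]); // prints A, _, b (character codes 65, 95, 98)
  finally
    List.Free;
  end;
end.
```

Note that CustomSort only affects this one sort call. If the list is later switched to Sorted := True, insertions and Find still use CompareStrings, so overriding CompareStrings in a descendant may be the better fit for lists that must stay sorted.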

Delphi ain't closed, quite the opposite: often it is a really flexible architecture.
It often takes just a bit of digging to see where you can hook into that flexibility.

--jeroen

黯然#的苍凉 2024-08-26 11:23:19

AnsiCompareStr / AnsiCompareText take more than the character code into account. They take the user's locale into account, so "e" will sort along with "é", "ê" etc.

To make it sort in ASCII order, use a custom compare function as described here

纸伞微斜 2024-08-26 11:23:19

AnsiCompareStr (CompareString with LOCALE_USER_DEFAULT) is faulty, because it treats characters with diacritical marks as equal to their base characters:

e1
é1
e2
é2

The correct order is (for example, for Czech):

e1
e2
é1
é2

Does anybody know how to avoid this error in ordering?


11.2.2010: I must apologize; the described behavior is fully in accordance with linguistic rules. Although I think it is silly and "bad", it is not an error in the API function.

Explorer in Windows XP uses so-called intuitive filename ordering, which gives better results, but it can't be used programmatically.
