Lazarus: an equivalent of Chr() for Unicode symbols

Posted on 2024-11-27 23:01:44

Is there a function in Free Pascal that produces a Unicode symbol from its code point (e.g. U+1D15E)? Unfortunately, Chr() works only with ANSI symbols (with codes less than 127).
I want to use symbols from a custom symbol font, and it is very inconvenient to put them into the source code directly (Lazarus shows them as ? or something else because they are absent from the system fonts).

Comments (4)

回眸一遍 2024-12-04 23:01:44

Take a look at this page. I assume that Free Pascal uses either UTF-16, in which case the character becomes a surrogate pair of two WideChars (see the table), or UTF-8, in which case it becomes a sequence of byte values (again, see the table).

UTF-8:

const
  HalfNoteString = UTF8String(#$F0#$9D#$85#$9E);

UTF-16:

const
  HalfNoteString = UnicodeString(#$D834#$DD5E);

The names of the string types may differ, as I don't know FreePascal very well. Perhaps AnsiString and WideString.
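
The two constants above can also be derived from the code point instead of being looked up in a table. The following is a minimal sketch, assuming a reasonably recent Free Pascal in objfpc mode; CodePointToUTF16 and CodePointToUTF8 are hypothetical helper names, not RTL functions, and no validation of the input is attempted.

```pascal
program CodePointDemo;

{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL): encode one code point as UTF-16.
function CodePointToUTF16(CodePoint: Cardinal): UnicodeString;
begin
  if CodePoint < $10000 then
  begin
    // Basic Multilingual Plane: a single 16-bit code unit.
    SetLength(Result, 1);
    Result[1] := WideChar(CodePoint);
  end
  else
  begin
    // Supplementary planes: subtract $10000, then split the remaining 20 bits.
    Dec(CodePoint, $10000);
    SetLength(Result, 2);
    Result[1] := WideChar($D800 + (CodePoint shr 10));    // high surrogate
    Result[2] := WideChar($DC00 + (CodePoint and $3FF));  // low surrogate
  end;
end;

// Hypothetical helper (not part of the RTL): encode one code point as UTF-8.
function CodePointToUTF8(CodePoint: Cardinal): UTF8String;
begin
  if CodePoint < $80 then
  begin
    SetLength(Result, 1);
    Result[1] := AnsiChar(CodePoint);
  end
  else if CodePoint < $800 then
  begin
    SetLength(Result, 2);
    Result[1] := AnsiChar($C0 or (CodePoint shr 6));
    Result[2] := AnsiChar($80 or (CodePoint and $3F));
  end
  else if CodePoint < $10000 then
  begin
    SetLength(Result, 3);
    Result[1] := AnsiChar($E0 or (CodePoint shr 12));
    Result[2] := AnsiChar($80 or ((CodePoint shr 6) and $3F));
    Result[3] := AnsiChar($80 or (CodePoint and $3F));
  end
  else
  begin
    SetLength(Result, 4);
    Result[1] := AnsiChar($F0 or (CodePoint shr 18));
    Result[2] := AnsiChar($80 or ((CodePoint shr 12) and $3F));
    Result[3] := AnsiChar($80 or ((CodePoint shr 6) and $3F));
    Result[4] := AnsiChar($80 or (CodePoint and $3F));
  end;
end;

var
  U16: UnicodeString;
  U8: UTF8String;
  I: Integer;
begin
  // Print the code units as hex so the result does not depend on the console font.
  U16 := CodePointToUTF16($1D15E);
  for I := 1 to Length(U16) do
    Write(HexStr(Ord(U16[I]), 4), ' ');
  WriteLn;                               // D834 DD5E

  U8 := CodePointToUTF8($1D15E);
  for I := 1 to Length(U8) do
    Write(HexStr(Ord(U8[I]), 2), ' ');
  WriteLn;                               // F0 9D 85 9E
end.
```

For U+1D15E this reproduces the byte and surrogate values used in the two constants above.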

思念满溢 2024-12-04 23:01:44

I have never used Free Pascal, but if I were you, I'd try

var
  s: WideChar;                        // Char is a single byte by default in Free Pascal
begin
  s := WideChar($222b);               // Just cast a word
end.

or, if the compiler is really stubborn,

var
  s: WideChar;                        // Must be two bytes wide, or the write overruns s
begin
  PWord(@s)^ := $222b;                // Forcibly write a word
end.

安静被遗忘 2024-12-04 23:01:44

Current Unicode status of FPC, to the best of my knowledge:

  1. The code page of literals can be set with $codepage: http://www.freepascal.org/docs-html/prog/progsu81.html
  2. FPC 2.4.x+ does have unicodestring (it is more or less the Kylix widestring), but with only basic routine support (pos and copy, not routines like format), and the string "record" lacks the code page field.
  3. Lazarus widgets expect UTF-8 in normal ansistrings (D7..D2007-style ansistrings without code page data), and programmers must manually insert conversions where necessary. So on Windows the widgets ARE mostly using Unicode (-W) calls, but take ansistrings containing UTF-8.
  4. FPC itself doesn't follow the UTF-8-in-ansistring scheme, so for some string-accepting routines in sysutils there are special routines in Lazarus that assume UTF-8 and call the -W variants.
  5. An FPC ansistring uses the system default 1-byte encoding: ANSI on Windows, UTF-8 on most other platforms.
  6. Trunk (2.7.1) provides support for the new D2009+ ansistring (with code pages).
  7. There has been no discussion yet of how to deal with the default string type (e.g. will "string" be utf8string on *nix and unicodestring on Windows, or unicodestring or utf8string everywhere?).
  8. Other unicodestring-related enhancements (like encoding parameters to tstringlist.savetofile) are not implemented. The same goes for the pseudo-objects (like TCharacter, which are AFAIK mostly static).

Update: 2.7.1 has a variable-encoding ansistring type, and Lazarus has been fixed to keep working. Nothing really takes advantage of it yet, though: most of the RTL still uses -A calls, and the prototypes of sysutils and system procedures that take strings haven't been changed to rawbytestring yet.
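
The manual conversion mentioned in point 3 can be sketched with the System unit's UTF8Encode and UTF8Decode. This is only an illustration under assumptions: the variable names are made up, and how the resulting string is handed to a widget (for example, assigning it to a Caption) depends on the Lazarus version in use.

```pascal
program Utf8RoundTrip;

{$mode objfpc}{$H+}

var
  Wide: UnicodeString;   // UTF-16 text, as used by the -W Windows calls
  Narrow: UTF8String;    // UTF-8 bytes, the form Lazarus widgets expect
begin
  // Build U+1D15E as a surrogate pair without relying on the source file encoding.
  SetLength(Wide, 2);
  Wide[1] := WideChar($D834);
  Wide[2] := WideChar($DD5E);

  // UTF-16 -> UTF-8: the value one would pass on to a Lazarus control.
  Narrow := UTF8Encode(Wide);
  WriteLn(Length(Wide), ' UTF-16 code units -> ', Length(Narrow), ' UTF-8 bytes');

  // UTF-8 -> UTF-16: the reverse step, e.g. for text read back from a control.
  if UTF8Decode(Narrow) = Wide then
    WriteLn('round trip OK')
  else
    WriteLn('round trip changed the text');
end.
```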

淡淡绿茶香 2024-12-04 23:01:44

I assume the problem is converting a UCS4 value (which is really just the Unicode code point number) to UTF-16.

In Delphi, you can use the UCS4StringToUnicodeString function.

Warning: be careful with the UCS4String type. It is actually a zero-terminated dynamic array, not a string (which means it is zero-based).

var
  S1: UCS4String;
  S: string;

begin
  SetLength(S1, 2);
  S1[0]:= UCS4Char($1D15E);
  S1[1]:= UCS4Char(0);
  S:= UCS4StringToUnicodeString(S1);
  ShowMessage(Format('%d, %x, %x', [Length(S), Ord(S[1]), Ord(S[2])]));
end;
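
The same steps can be wrapped in a small reusable function. This is a sketch that assumes the Free Pascal System unit provides UCS4StringToUnicodeString as Delphi does; CodePointToString is a hypothetical name, and the result is printed as hex code units in a console program rather than shown with ShowMessage.

```pascal
program Ucs4Demo;

{$mode objfpc}{$H+}

// Hypothetical convenience wrapper: one code point in, a UnicodeString out.
function CodePointToString(CodePoint: UCS4Char): UnicodeString;
var
  Buffer: UCS4String;
begin
  SetLength(Buffer, 2);      // one code point plus the terminating zero
  Buffer[0] := CodePoint;    // UCS4String is zero-based
  Buffer[1] := 0;
  Result := UCS4StringToUnicodeString(Buffer);
end;

var
  S: UnicodeString;
  I: Integer;
begin
  S := CodePointToString($1D15E);
  // Print each UTF-16 code unit as four hex digits.
  for I := 1 to Length(S) do
    Write(HexStr(Ord(S[I]), 4), ' ');
  WriteLn;
end.
```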