HTML entity converter in Ada
I want to write an Ada program that replaces Latin-1 characters with the applicable HTML entities, but my code does not work: test.txt and converted.txt are always the same. My tutor said that the code is correct.
Thanks in advance!
Here is my code:
with Ada.Text_IO;

procedure Entity_Converter is
   use Ada.Text_IO;
   Source      : File_Type;
   Target      : File_Type;
   Source_Char : Character;
begin
   Open (Source, In_File, "test.txt");
   Create (Target, Out_File, "converted.txt");
   while not End_Of_File (Source) loop
      Get (Source, Source_Char);
      case Source_Char is
         when 'ä' =>
            Put (Target, "&auml;");
         when 'Ä' =>
            Put (Target, "&Auml;");
         when 'ö' =>
            Put (Target, "&ouml;");
         when 'Ö' =>
            Put (Target, "&Ouml;");
         when 'ü' =>
            Put (Target, "&uuml;");
         when 'Ü' =>
            Put (Target, "&Uuml;");
         when 'ß' =>
            Put (Target, "&szlig;");
         when others =>
            Put (Target, Source_Char);
      end case;
   end loop;
   Close (Source);
   Close (Target);
end Entity_Converter;
Comments (2)
The result depends on the encoding of both the source text and the test file. To address the former, use the constants of the package Ada.Characters.Latin_1. The latter depends on your editor.
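A minimal sketch of that suggestion, not necessarily what the answerer had in mind: the case statement from the question can name each character through the constants in Ada.Characters.Latin_1 instead of through character literals, so the choices no longer depend on how the compiler decodes the source file. The renaming L1 is introduced here only for brevity, and the sketch assumes that test.txt really is Latin-1 encoded and is read one byte per Character.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Entity_Converter is
   use Ada.Text_IO;

   --  Hypothetical shorthand for the standard package; not in the question.
   package L1 renames Ada.Characters.Latin_1;

   Source      : File_Type;
   Target      : File_Type;
   Source_Char : Character;
begin
   Open (Source, In_File, "test.txt");
   Create (Target, Out_File, "converted.txt");
   --  As in the question, Get skips line terminators, so line breaks
   --  in the input are not copied to the output.
   while not End_Of_File (Source) loop
      Get (Source, Source_Char);
      --  The Latin_1 constants are static, so they are legal case choices.
      case Source_Char is
         when L1.LC_A_Diaeresis    => Put (Target, "&auml;");
         when L1.UC_A_Diaeresis    => Put (Target, "&Auml;");
         when L1.LC_O_Diaeresis    => Put (Target, "&ouml;");
         when L1.UC_O_Diaeresis    => Put (Target, "&Ouml;");
         when L1.LC_U_Diaeresis    => Put (Target, "&uuml;");
         when L1.UC_U_Diaeresis    => Put (Target, "&Uuml;");
         when L1.LC_German_Sharp_S => Put (Target, "&szlig;");
         when others               => Put (Target, Source_Char);
      end case;
   end loop;
   Close (Source);
   Close (Target);
end Entity_Converter;
```

Because this source contains only ASCII, it compiles the same way whatever encoding the editor saves it in. Whether the input file matches is a separate matter and still depends on how test.txt was written.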
I'm running on a Mac and I copied your source. When I compiled it, it complained that (for example) 'ä' needed double quotes; a hint that the source uses wide characters. It seems it's in UTF-8 [1], so I compiled with -gnatW8, which appeared to be successful.

I then ran the program on a copy of its own source text, and it failed to transform the text.

Compiling with -gnatdg, which makes GNAT produce a representation of its internal source tree, I get output which looks to me as though GNAT has read the UTF-8 encoding of ä and used the Latin-1 version for the case statement; not unreasonable given that it says Character, and quite enough to explain why it failed to convert itself.

I then tried using Ada.Wide_Text_IO and Wide_Character. Sadly the program failed, for the same reason as before. Could we be looking at a feature? Or even a bug?

[1] The file may have ended up in UTF-8 because of the roundabout way I downloaded it, of course.
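To illustrate the mismatch described above, here is a small self-contained sketch. It assumes the input is UTF-8 and reaches the program one byte per Character; the procedure name Show_Mismatch and both constants are made up for this illustration. In UTF-8, ä (U+00E4) is stored as the two bytes 16#C3# and 16#A4#, while a case choice of type Character stands for the single Latin-1 value 16#E4#.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Show_Mismatch is
   use Ada.Text_IO;

   --  U+00E4 (a with diaeresis) encoded as UTF-8 is the byte pair C3 A4.
   UTF8_A_Diaeresis : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A4#);

   --  The single Latin-1 character that the case choice stands for.
   Latin1_A_Diaeresis : constant Character :=
     Ada.Characters.Latin_1.LC_A_Diaeresis;  --  position 16#E4#
begin
   --  Compare each UTF-8 byte with the Latin-1 character, as the
   --  case statement in the question effectively does.
   for I in UTF8_A_Diaeresis'Range loop
      Put_Line
        ("byte" & Integer'Image (Character'Pos (UTF8_A_Diaeresis (I)))
         & " matches Latin-1 a-diaeresis: "
         & Boolean'Image (UTF8_A_Diaeresis (I) = Latin1_A_Diaeresis));
   end loop;
end Show_Mismatch;
```

Neither byte equals 16#E4#, so under these assumptions both bytes fall through to the others branch and are copied unchanged, which is consistent with converted.txt coming out identical to the input.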