Tcl: getting the ASCII code for every character in a string

Posted 2024-08-10 05:18:22

I need to get the ASCII code for every character in a string. Actually, it's every character in a (small) file. The first 3 lines below successfully pull all of a file's contents into a string (per this recipe):

set fp [open "store_order_create_ddl.sql" r]
set data [read $fp]
close $fp

I believe I am correctly determining the ASCII code for each character (see http://wiki.tcl.tk/1497). However, I'm having trouble figuring out how to loop over every character in the string.

First of all, I don't think the following is an especially idiomatic way of looping over the characters in a string with Tcl. Second, and more importantly, it behaves incorrectly, inserting an extra element between every character.

Below is the code I've written to act on the contents of the "data" variable set above, followed by some sample output.

CODE:

for {set i 0} {$i < [string length $data]} {incr i} {
  set char [string index $data $i]
  scan $char %c ascii
  puts "char: $char (ascii: $ascii)"
}

OUTPUT:

char: C (ascii: 67)
char:  (ascii: 0)
char: R (ascii: 82)
char:  (ascii: 0)
char: E (ascii: 69)
char:  (ascii: 0)
char: A (ascii: 65)
char:  (ascii: 0)
char: T (ascii: 84)
char:  (ascii: 0)
char: E (ascii: 69)
char:  (ascii: 0)
char:   (ascii: 32)
char:  (ascii: 0)
char: T (ascii: 84)
char:  (ascii: 0)
char: A (ascii: 65)
char:  (ascii: 0)
char: B (ascii: 66)
char:  (ascii: 0)
char: L (ascii: 76)
char:  (ascii: 0)
char: E (ascii: 69)

Comments (2)

蓝梦月影 2024-08-17 05:18:22

The following code should work:

set data {CREATE TABLE}
foreach char [split $data ""] {
    lappend output [scan $char %c]
}
set output ;# 67 82 69 65 84 69 32 84 65 66 76 69

As for the extra characters in your output, the problem seems to be with the input data from the file. Is there some reason there would be null characters (\0) between every character in the file?
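One possible explanation can be reproduced without the file. The sketch below is only an illustration and assumes Tcl 8.5/8.6, where the native-byte-order UTF-16 encoding is named "unicode". It encodes a short string as UTF-16 and then walks the raw bytes one at a time, the way a loop over mis-decoded file data would. The variable names (bytes, codes, text) are made up for this example.

```tcl
# Encode a short string as UTF-16 bytes ("unicode" is the Tcl 8.x name,
# assumed here; the byte order is the platform's native order).
set bytes [encoding convertto unicode "CREATE"]

# Walk the raw bytes one at a time and record each byte's numeric value.
set codes {}
foreach byte [split $bytes ""] {
    lappend codes [scan $byte %c]
}
puts "raw byte codes: $codes"

# Decoding the bytes with the matching encoding recovers the six characters.
set text [encoding convertfrom unicode $bytes]
puts "decoded length: [string length $text]"
```

Each letter is paired with a 0 byte in the raw list (whether the 0 comes before or after the letter depends on byte order), which matches the pattern in the question's output.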

酒解孤独 2024-08-17 05:18:22

Came across this older question while looking for something else. I'm going to answer it for the benefit of anyone else who may be looking for an answer.

First off, understand what character encodings are. The source data in the example is NOT ASCII-encoded, so ASCII character codes (0-127) have no meaning on their own. In this example, though, the encoding appears to be UTF-16, which includes the ASCII codes as a subset. What you probably want is the full range of "character" codes from 0 to 255, but depending on your system, the source of the data, etc., codes 128-255 may be ANSI, ISO, or some other strange code page. What you want to do is convert the data into a format you know how to handle, using the "encoding" command: for example the very common ISO 8859-1 (encoding "iso8859-1"), which is very similar to the Windows 1252 standard encoding (encoding "cp1252"), or UTF-8 (encoding "utf-8"):

set data [encoding convertto utf-8 $data] ;# For UTF-8

set data [encoding convertto iso8859-1 $data] ;# For ISO 8859-1

and so on. If you're reading the data from a file, you may also want to set the file encoding (via fconfigure) before reading the data, to make sure the file data is read correctly. Look up the man pages for "encoding" (and "fconfigure") for more details on handling character set encodings.
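A minimal sketch of the fconfigure approach, under a few assumptions: Tcl 8.6 (for "file tempfile"), the encoding name "unicode" for native-byte-order UTF-16, and a throwaway temporary file standing in for the real .sql file so the example is self-contained. The variable names are illustrative only.

```tcl
# Write a small UTF-16 sample file (a stand-in for the real input file).
set out [file tempfile path]
fconfigure $out -encoding unicode
puts -nonewline $out "CREATE TABLE"
close $out

# Read it back, declaring the encoding before the first read.
set fp [open $path r]
fconfigure $fp -encoding unicode
set data [read $fp]
close $fp
file delete $path

# Collect the character code of every character in the decoded string.
set codes {}
foreach char [split $data ""] {
    lappend codes [scan $char %c]
}
puts $codes ;# 67 82 69 65 84 69 32 84 65 66 76 69
```

With the channel encoding set, the loop sees one element per character and no interleaved zeros. A real UTF-16 file may begin with a byte-order mark, which may then show up as an extra first character (code 65279) that needs to be skipped.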

Once you have the encoding of the data under control, the rest of the example code should work as expected.
