How do I print an integer in assembly-level programming without printf from the C library? (itoa, integer to decimal ASCII string)

Posted on 2025-01-22 09:29:12

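Can anyone show me pure assembly code for displaying the value in a register in decimal format? Please don't suggest using the printf hack and then compiling with GCC.

Description:

I did some research and some experiments with NASM and found that I could use the printf function from the C library to print an integer. I did so by linking the object file with the GCC compiler, and everything works well enough.

However, what I want to achieve is to print the value stored in any register in decimal form.

I did some more research and found that interrupt vector 021h for the DOS command line can display strings and characters, with either 2 or 9 in the AH register and the data in DX.

Conclusion:

None of the examples I found showed how to display the content of a register in decimal form without using the C library's printf. Does anyone know how to do this in assembly?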

倚栏听风 2025-01-29 09:29:12

You need to write a binary to decimal conversion routine, and then use the decimal digits to produce "digit characters" to print.

You have to assume that something, somewhere, will print a character on your output device of choice. Call this subroutine "print_character"; assume it takes a character code in EAX and preserves all the registers. (If you don't have such a subroutine, you have an additional problem that should be the basis of a different question.)

If you have the binary code for a digit (e.g., a value from 0-9) in a register (say, EAX), you can convert that value to a character for the digit by adding the ASCII code for the "zero" character to the register. This is as simple as:

       add     eax, 0x30    ; convert digit in EAX to corresponding character digit

You can then call print_character to print the digit character code.

To output an arbitrary value, you need to pick off digits and print them.

Picking off digits fundamentally requires working with powers of ten. It is easiest to work with one power of ten, e.g., 10 itself. Imagine we have a divide-by-10 routine that took a value in EAX, and produced a quotient in EDX and a remainder in EAX. I leave it as an exercise for you to figure out how to implement such a routine.

Then a simple routine with the right idea produces one digit for every digit position the value might have. A 32-bit register holds unsigned values up to about 4 billion (4294967295), so you might get 10 digits printed. So:

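         mov    eax, valuetoprint
         mov    ecx, 10        ;  digit count to produce
next:    call   dividebyten    ;  remainder (low digit) in EAX, quotient in EDX
         add    eax, 0x30      ;  convert digit to its ASCII character
         call   print_character
         mov    eax, edx       ;  continue with the quotient
         dec    ecx
         jne    next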

This works... but prints the digits in reverse order. Oops! Well, we can take advantage of the pushdown stack to store digits produced, and then pop them off in reverse order:

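         mov    eax, valuetoprint
         mov    ecx, 10        ;  digit count to generate
loop1:   call   dividebyten    ;  remainder (low digit) in EAX, quotient in EDX
         add    eax, 0x30
         push   eax            ;  save the digit character
         mov    eax, edx
         dec    ecx
         jne    loop1
         mov    ecx, 10        ;  digit count to print
loop2:   pop    eax            ;  digits come back most significant first
         call   print_character
         dec    ecx
         jne    loop2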

Left as an exercise to the reader: suppress leading zeros. Also, since we are writing digit characters to memory, instead of writing them to the stack we could write them to a buffer, and then print the buffer content. Also left as an exercise to the reader.
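One possible take on both exercises, sketched in C rather than assembly so the logic is easy to follow. The function name format_u32 and the 11-byte output buffer are assumptions made for this illustration; they are not part of the answer above. The local array stands in for the stack, and leading zeros are skipped before the digits are copied out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the answer): converts value to decimal text.
 * It mirrors the two loops above: always generate 10 digits, then skip the
 * leading zeros while copying them out. out must hold at least 11 bytes. */
size_t format_u32(uint32_t value, char *out)
{
    char digits[10];                 /* plays the role of the stack */
    for (int i = 0; i < 10; i++) {   /* first loop: least significant digit first */
        digits[i] = (char)('0' + value % 10);
        value /= 10;
    }

    int top = 9;                     /* index of the most significant digit */
    while (top > 0 && digits[top] == '0')
        top--;                       /* suppress leading zeros, keep one digit for 0 */

    size_t len = 0;
    while (top >= 0)                 /* second loop: "pop" in reverse order */
        out[len++] = digits[top--];
    out[len] = '\0';
    return len;                      /* number of characters written */
}
```

The returned length could be handed to a single "print this buffer" call instead of calling print_character once per digit.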


呆° 2025-01-29 09:29:12

You need to turn a binary integer into a string/array of ASCII decimal digits manually. ASCII digits are represented by 1-byte integers in the range '0' (0x30) to '9' (0x39). http://www.asciitable.com/

For power-of-2 bases like hex, see How to convert a binary integer number to a hex string? Converting between binary and a power-of-2 base allows many more optimizations and simplifications because each group of bits maps separately to a hex / octal digit.

Most operating systems / environments don't have a system call that accepts integers and converts them to decimal for you. You have to do that yourself before sending the bytes to the OS, or copying them to video memory yourself, or drawing the corresponding font glyphs in video memory...

By far the most efficient way is to make a single system call that does the whole string at once, because a system call that writes 8 bytes is basically the same cost as writing 1 byte.

This means we need a buffer, but that doesn't add to our complexity much at all. 2^32-1 is only 4294967295, which is only 10 decimal digits. Our buffer doesn't need to be large, so we can just use the stack.

The usual algorithm produces digits LSD-first (Least Significant Digit first). Since printing order is MSD-first, we can just start at the end of the buffer and work backwards. For printing or copying elsewhere, just keep track of where it starts, and don't bother about getting it to the start of a fixed buffer. No need to mess with push/pop to reverse anything, just produce it backwards in the first place.

char *itoa_end(unsigned long val, char *p_end) {
  const unsigned base = 10;
  char *p = p_end;
  do {
    *--p = (val % base) + '0';
    val /= base;
  } while(val);                  // runs at least once to print '0' for val=0.

  // write(1, p,  p_end-p);
  return p;  // let the caller know where the leading digit is
}

gcc/clang do an excellent job, using a magic constant multiplier instead of div to divide by 10 efficiently. (Godbolt compiler explorer for asm output).
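To make the "magic constant multiplier" idea concrete, here is a minimal C sketch. The names div10 and mod10 are invented for this example. The constant 0xCCCCCCCD (2^35 / 10, rounded up) combined with a right shift by 35 is a commonly used pair for dividing a 32-bit unsigned value by 10; a compiler may emit something equivalent but not identical.

```c
#include <stdint.h>

/* Divide a 32-bit unsigned value by 10 without a division instruction.
 * 0xCCCCCCCD is ceil(2^35 / 10); the 64-bit product shifted right by 35
 * gives the exact quotient for every 32-bit input. */
uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Remainder derived from the quotient: x - 10 * (x / 10). */
uint32_t mod10(uint32_t x)
{
    return x - 10u * div10(x);
}
```

In assembly this roughly corresponds to one widening multiply plus a shift in place of div, which is why it tends to be faster.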

This code-review Q&A has a nice efficient NASM version of that, which accumulates the string into an 8-byte register instead of into memory, ready to store where you want the string to start without extra copying.

To handle signed integers:

Use this algorithm on the unsigned absolute value. (val = val<0 ? 0U-val : val;, i.e. xor-zero / sub / cmovs which keeps the original value around; Godbolt). If the original input was negative, stick a '-' in front at the end, when you're done. So for example, -10 runs this with 10, producing 2 ASCII bytes. Then you store a '-' in front, as a third byte of the string.
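A minimal C sketch of the signed case, assuming a 32-bit input. The name itoa_signed_end is made up for this illustration; it follows the same write-backwards convention as itoa_end, and the caller is assumed to provide at least 11 bytes before p_end.

```c
#include <stdint.h>

/* Hypothetical helper: like itoa_end, but for int32_t.
 * Writes the text so that it ends just before p_end and returns a pointer
 * to the first character (the '-' sign if the input was negative). */
char *itoa_signed_end(int32_t val, char *p_end)
{
    /* Unsigned absolute value; 0u - x is well defined even for INT32_MIN. */
    uint32_t mag = val < 0 ? 0u - (uint32_t)val : (uint32_t)val;
    char *p = p_end;
    do {
        *--p = (char)('0' + mag % 10);   /* least significant digit first */
        mag /= 10;
    } while (mag);
    if (val < 0)
        *--p = '-';                      /* the sign goes in front, last */
    return p;
}
```

Doing the negation in unsigned arithmetic avoids the overflow that -val would cause for the most negative value.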

Here's a simple commented NASM version of the unsigned algorithm, using div (slow but shorter code) for 32-bit unsigned integers and a Linux write system call. It should be easy to port this to 32-bit-mode code just by changing the registers to ecx instead of rcx. But add rsp, 24 will become add esp, 20 because push ecx is only 4 bytes, not 8. (You should also save/restore esi for the usual 32-bit calling conventions, unless you're making this into a macro or internal-use-only function.)

The system-call part is specific to 64-bit Linux. Replace that with whatever is appropriate for your system, e.g. call the VDSO page for efficient system calls on 32-bit Linux, or use int 0x80 directly for inefficient system calls. See calling conventions for 32 and 64-bit system calls on Unix/Linux. Or see rkhb's answer on another question for a 32-bit int 0x80 version that works the same way.

If you just need the string without printing it, rsi points to the first digit after leaving the loop. You can copy it from the tmp buffer to the start of wherever you actually need it. Or if you generated it into the final destination directly (e.g. pass a pointer arg), you can pad with leading zeros until you reach the front of the space you left for it. There's no simple way to find out how many digits it's going to be before you start unless you always pad with zeros up to a fixed width.

ALIGN 16
; void print_uint32(uint32_t edi)
; x86-64 System V calling convention.  Clobbers RSI, RCX, RDX, RAX.
; optimized for simplicity and compactness, not speed (DIV is slow)
global print_uint32
print_uint32:
    mov    eax, edi              ; function arg

    mov    ecx, 0xa              ; base 10
    push   rcx                   ; ASCII newline '\n' = 0xa = base
    mov    rsi, rsp
    sub    rsp, 16               ; not needed on 64-bit Linux, the red-zone is big enough.  Change the LEA below if you remove this.

;;; rsi is pointing at '\n' on the stack, with 16B of "allocated" space below that.
.toascii_digit:                ; do {
    xor    edx, edx
    div    ecx                   ; edx=remainder = low digit = 0..9.  eax/=10
                                 ;; DIV IS SLOW.  use a multiplicative inverse if performance is relevant.
    add    edx, '0'
    dec    rsi                 ; store digits in MSD-first printing order, working backwards from the end of the string
    mov    [rsi], dl

    test   eax,eax             ; } while(x);
    jnz  .toascii_digit
;;; rsi points to the first digit


    mov    eax, 1               ; __NR_write from /usr/include/asm/unistd_64.h
    mov    edi, 1               ; fd = STDOUT_FILENO
    ; pointer already in RSI    ; buf = last digit stored = most significant
    lea    edx, [rsp+16 + 1]    ; yes, it's safe to truncate pointers before subtracting to find length.
    sub    edx, esi             ; RDX = length = end-start, including the \n
    syscall                     ; write(1, string /*RSI*/,  digits + 1)

    add  rsp, 24                ; (in 32-bit: add esp,20) undo the push and the buffer reservation
    ret

Public domain. Feel free to copy/paste this into whatever you're working on. If it breaks, you get to keep both pieces. (If performance matters, see the links below; you'll want a multiplicative inverse instead of div.)
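The earlier remark about generating the digits directly into the final destination and padding with leading zeros can be sketched in C as follows. The function name format_u32_padded and its return convention are assumptions for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes val right-aligned into dst[0..width-1],
 * filling the unused leading positions with '0'. Returns 0 on success and
 * -1 if the value needs more than width digits. No terminator is written. */
int format_u32_padded(uint32_t val, char *dst, size_t width)
{
    char *p = dst + width;
    while (p > dst) {
        /* Once val reaches 0, this keeps storing '0', which is the padding. */
        *--p = (char)('0' + val % 10);
        val /= 10;
    }
    return val == 0 ? 0 : -1;
}
```

Because the loop always fills the whole field, the start of the string is known in advance and nothing needs to be copied afterwards.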

And here's code to call print_uint32 in a loop counting down to 0 (including 0). Putting it in the same file is convenient.

ALIGN 16
global _start
_start:
    mov    ebx, 100
.repeat:
    lea    edi, [rbx + 0]      ; put +whatever constant you want here.
    call   print_uint32
    dec    ebx
    jge   .repeat


    xor    edi, edi
    mov    eax, 231
    syscall                             ; sys_exit_group(0)

Assemble and link with

yasm -felf64 -Worphan-labels -gdwarf2 print-integer.asm &&
ld -o print-integer print-integer.o

./print-integer
100
99
...
1
0

Use strace to see that the only system calls this program makes are write() and exit(). (See also the gdb / debugging tips at the bottom of the x86 tag wiki, and the other links there.)

Related:

32-bit version of this, using int 0x80 for the write system call at the end. Pretty much the same loop.
With printf: How to print a number in assembly NASM? has x86-64 and i386 answers.
NASM Assembly convert input to integer? is the other direction, string->int.
Printing an integer as a string with AT&T syntax, with Linux system calls instead of printf: an AT&T version of the same thing (but for 64-bit integers). See that for more comments about performance, and a benchmark of div vs. compiler-generated code using mul.
Add 2 numbers and print the result using Assembly x86: a 32-bit version that's very similar to this.
This code-review Q&A uses a multiplicative inverse like a compiler would. It also accumulates the string into an 8-byte register instead of into memory, ready to store where you want the string to start without extra copying.
How to convert a binary integer number to a hex string?: power-of-2 bases are special. The answer includes scalar loops (branchy and table-lookup) and SIMD (SSE2, SSSE3, AVX2, and AVX512, which is amazing for this).

High-performance versions

Some optimized integer-to-decimal-string (itoa) versions from Daniel Lemire's blog: without AVX-512, and much faster with AVX-512 IFMA.
With NEON SIMD on Apple M1.
Some older articles: "How to print integers really fast", a blog post comparing some strategies in C.
Such as using x % 100 to create more ILP (Instruction Level Parallelism), and either a lookup table or a simpler multiplicative inverse (that only has to work for a limited range, like in this answer) to break up the 0..99 remainder into 2 decimal digits.
e.g. with (x * 103) >> 10 using one imul r,r,imm8 / shr r,10 as shown in another answer. Possibly somehow folding that into the remainder calculation itself.
https://tia.mat.br/posts/2014/06/23/integer_to_string_conversion.html is a similar article.
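A small C sketch of the two-digits-at-a-time idea from the list above, using (x * 103) >> 10 to split a 0..99 value. The function names split2 and itoa_pairs_end are invented for this illustration, and the code is a plain scalar version, not the code from any of the linked articles.

```c
#include <stdint.h>

/* Split a value in 0..99 into two ASCII digits. (x * 103) >> 10 equals
 * x / 10 only on this limited range, which is all that is needed here. */
void split2(uint32_t x, char *out)
{
    uint32_t tens = (x * 103u) >> 10;
    out[0] = (char)('0' + tens);
    out[1] = (char)('0' + (x - 10u * tens));
}

/* Hypothetical helper: writes val backwards from p_end, producing two
 * digits per division by 100. Returns a pointer to the first digit. */
char *itoa_pairs_end(uint32_t val, char *p_end)
{
    char *p = p_end;
    while (val >= 100) {
        p -= 2;
        split2(val % 100, p);    /* low two digits */
        val /= 100;
    }
    if (val >= 10) {
        p -= 2;
        split2(val, p);          /* two leading digits */
    } else {
        *--p = (char)('0' + val);  /* single leading digit, also handles 0 */
    }
    return p;
}
```

Halving the number of full-width divisions shortens the dependency chain, which is where the extra instruction-level parallelism comes from.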


那支青花 2025-01-29 09:29:12

I can't comment, so I'm posting this as an answer.
@Ira Baxter, perfect answer. I just want to add that you don't need to divide 10 times, as you do by setting register ECX to 10. Just keep dividing the number in EAX until EAX == 0:

loop1: call dividebyten
       ...
       cmp eax,0
       jnz loop1

You also have to keep track of how many digits the original number had, so that the second loop pops exactly that many:

       mov ecx,0
loop1: call dividebyten
       inc ecx

Anyway, Ira Baxter, you helped me; there are just a few ways to optimize the code :)

This is not only about optimization but also about formatting. When you want to print the number 54, you want to print 54, not 0000000054 :)
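The suggestion amounts to the following, sketched here in C under the assumption of a 32-bit unsigned input. The function name format_u32_counted is made up for this example; the local array and the count variable stand in for the stack and the counter register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper modelling the push/pop loops: divide until the
 * quotient is 0 while counting the pushed digits, then pop exactly that
 * many. out must hold at least 11 bytes. Returns the number of digits. */
size_t format_u32_counted(uint32_t value, char *out)
{
    char stack[10];
    size_t count = 0;                 /* plays the role of the counter register */
    do {                              /* runs once even for 0, so "0" is produced */
        stack[count++] = (char)('0' + value % 10);   /* push */
        value /= 10;
    } while (value != 0);

    size_t len = 0;
    while (count > 0)
        out[len++] = stack[--count];  /* pop: most significant digit first */
    out[len] = '\0';
    return len;
}
```

Testing the quotient after the division, rather than before it, is what keeps an input of 0 from printing nothing.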


哆啦不做梦 2025-01-29 09:29:12

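1-9 are 1-9. Beyond that, there has to be some conversion that I don't know either. Say you have 41H in AX (EAX) and you want to print 65, not "A", without making a service call. I think you need to print the character representations of a 6 and a 5. There must be a constant that can be added to get there. You need a modulus operator (however you do that in assembly) and a loop over all the digits.

Not sure, but that's my guess.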
