Number formatting in Bash with thousands separators

Posted on 2025-01-07 18:27:34

I have a number, 12343423455.23353, and I want to format it with thousands separators, so that the output is 12,343,423,455.23353.

≈。彩虹 2025-01-14 18:27:34

$ printf "%'.3f\n" 12345678.901
12,345,678.901

雨后咖啡店 2025-01-14 18:27:34

tl;dr

Use numfmt, if GNU utilities are available, such as on Linux by default:
- numfmt --grouping 12343423455.23353 # -> 12,343,423,455.23353 in locale en_US
Otherwise, use printf with the ' field flag wrapped in a shell function that preserves the number of input decimal places (does not hard-code the number of output decimal places).
- groupDigits 12343423455.23353 # -> 12,343,423,455.23353 in locale en_US
- See the bottom of this answer for the definition of groupDigits(), which also supports multiple input numbers.
Ad-hoc alternatives involving subshells that also preserve the number of input decimal places (these assume that the input contains a decimal mark, and that it is either . or ,):
- A modular, but somewhat inefficient variant that accepts the input number via stdin (and can therefore also be used with pipeline input):
  (n=$(</dev/stdin); f=${n#*[.,]}; printf "%'.${#f}f\n" "$n") <<<12343423455.23353
- A significantly faster, but less modular alternative that uses intermediate variable $n:
  n=12343423455.23353; (f=${n#*[.,]}; printf "%'.${#f}f\n" "$n")
Alternatively, consider use of my Linux/macOS grp CLI (installable with npm install -g grp-cli):
- grp -n 12343423455.23353

In all cases there are caveats; see below.
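The preference order in the tl;dr (numfmt first, printf otherwise) can be sketched as a single Bash function. This is only an illustration: the name formatGrouped is hypothetical and not part of the answer, and the fallback branch assumes the input's decimal mark, if any, is . or ,.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (formatGrouped is an illustrative name):
# prefer GNU numfmt when it is installed, otherwise fall back to printf.
formatGrouped() {
  local n=$1 f=''
  if command -v numfmt >/dev/null 2>&1; then
    numfmt --grouping "$n"
  else
    # Count the input's decimal places; integers keep zero places.
    [[ $n == *[.,]* ]] && f=${n#*[.,]}
    printf "%'.${#f}f\n" "$n"
  fi
}

# Usage: formatGrouped 12343423455.23353
```

Negative numbers are not handled here, since numfmt would try to parse a leading - as an option.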

Ignacio Vazquez-Abrams's answer contains the crucial pointer for use with printf: the ' field flag (following the %) formats a number with the active locale's thousands separator:

- Note that man printf (man 1 printf) does not contain this information itself: the utility / shell builtin printf ultimately calls the library function printf(), and only man 3 printf gives the full picture with respect to supported formats.
- Environment variables LC_NUMERIC and, indirectly, LANG or LC_ALL control the active locale with respect to number formatting.
- Both numfmt and printf respect the active locale, both with respect to the thousands separator and the decimal mark ("decimal point").
- Using just printf by itself, as in Ignacio's answer, requires that you hard-code the number of output decimal places, rather than preserving however many decimal places the input has; it is this limitation that groupDigits() below overcomes.
- printf "%'.<numDecPlaces>f" does have one advantage over numfmt --grouping, however:
  - numfmt only accepts decimal numbers, whereas printf's %f also accepts hexadecimal integers (e.g., 0x3e8) and numbers in decimal scientific notation (e.g., 1e3).
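To illustrate the last point, here is a small Bash sketch that feeds a hexadecimal integer and a number in scientific notation to printf's %f. The helper name formatFixed1 is made up for this example, and one decimal place is hard-coded for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical helper: format each argument with one decimal place,
# applying the active locale's digit grouping via the ' flag.
formatFixed1() {
  printf "%'.1f\n" "$@"
}

# A hexadecimal integer and scientific notation; both denote 1000.
formatFixed1 0x3e8 1e3
```

Both arguments produce the same line; whether it contains a separator depends on the active locale.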

Caveats

Locales without grouping: Some locales, notably C and POSIX, by definition do NOT apply grouping, so use of ' has no effect in that event.
Real-world locale inconsistencies across platforms:
- (LC_ALL='de_DE.UTF-8'; printf "%'.1f\n" 1000) # SHOULD yield: 1.000,0
- Linux: yields 1.000,0, as expected.
- macOS/BSD: Unexpectedly yields 1000,0 - NO grouping(!).
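Since grouping may silently not happen (C/POSIX locales, or platform quirks like the one above), a script can probe for it at runtime. The following Bash sketch uses a hypothetical function name, localeGroupsDigits, and simply checks whether formatting 1000 changes the string.

```shell
#!/usr/bin/env bash
# Hypothetical check: does the active locale group digits at all?
# If the ' flag has no effect, printf returns the plain string 1000.
localeGroupsDigits() {
  [[ $(printf "%'d" 1000) != 1000 ]]
}

if localeGroupsDigits; then
  echo "grouping is active"
else
  echo "no grouping in this locale; the ' flag has no effect" >&2
fi
```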
Input number format: When you pass a number to numfmt or printf, it:
- mustn't already contain digit grouping
- must already use the active locale's decimal mark
- For example:
  - (LC_ALL='lt_LT.UTF-8'; printf "%'.1f\n" 1000,1) # -> '1 000,1'
  - OK: input number is not grouped and uses Lithuanian decimal mark (comma).
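If the input might already be grouped, it has to be cleaned up first. A minimal Bash sketch, assuming the caller knows which character is the grouping separator (stripGrouping is a hypothetical helper; it defaults to a comma, as in en_US):

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove a known grouping character from a number
# string so that it can be passed to numfmt or printf.
# Assumption: the caller knows the separator (default: comma).
stripGrouping() {
  local num=$1 sep=${2:-,}
  printf '%s\n' "${num//"$sep"/}"
}

stripGrouping '12,343,423,455.23353'   # -> 12343423455.23353
stripGrouping '1.000,5' .              # -> 1000,5
```

This does not convert the decimal mark; the result must still match the active locale's decimal mark.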
Portability: POSIX doesn't require the printf utility (as opposed to the C printf() library function) to support floating-point format characters such as %f, given that POSIX[-like] shells are integer-only; in practice, however, I'm not aware of any shells/platforms that do not.
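Where portability is a concern, the missing guarantee can be probed instead of assumed. A sketch with a hypothetical function name, printfSupportsGroupedFloat, that tests whether the current shell's printf accepts %f combined with the ' flag:

```shell
#!/usr/bin/env bash
# Hypothetical feature probe: succeeds if this shell's printf accepts
# the %f conversion together with the ' grouping flag.
printfSupportsGroupedFloat() {
  printf "%'.1f" 1 >/dev/null 2>&1
}

if ! printfSupportsGroupedFloat; then
  echo "printf lacks %f or the ' flag here" >&2
fi
```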
Rounding errors and overflow:
- When using numfmt and printf as described, round-trip conversion occurs (string -> number -> string), which is subject to rounding errors; in other words: reformatting with digit grouping can lead to a different number.
- Because format character f uses IEEE-754 double-precision floating-point values, only up to 15 significant digits (counted irrespective of the location of the decimal mark) are guaranteed to be preserved accurately (though specific numbers may survive with more digits). In practice, numfmt and GNU printf can accurately handle more than that; see below. If anyone knows how and why, let me know.
- With too many significant digits or too-large a value present, the behavior differs between numfmt and printf in general, and between printf implementations across platforms; for example:

numfmt:

[Fixed in coreutils 8.24, according to @pixelbeat] Starting with 20 significant digits, the value overflows quietly(!) - presumably a bug (as of GNU coreutils 8.23):

# 20 significant digits cause quiet overflow:
$ (fractPart=0000000000567890; num="1000.${fractPart}"; numfmt --grouping "$num")
-92.23372036854775807    # QUIET OVERFLOW

By contrast, a number that is too large does generate an error by default.

printf:

Linux printf handles up to 20 significant digits accurately, whereas the BSD/macOS implementation is limited to 17:

# Linux: 21 significant digits cause rounding error:
$ (fractPart=00000000005678901; num="1000.${fractPart}"; printf "%'.${#fractPart}f\n" "$num")
1,000.00000000005678902  # ROUNDING ERROR

# BSD/macOS: 18 significant digits cause rounding error:
$ (fractPart=00000000005678; num="1000.${fractPart}"; printf "%'.${#fractPart}f\n" "$num")
1,000.00000000005673  # ROUNDING ERROR

The Linux version never seems to overflow, whereas the BSD/macOS version reports an error with numbers that are too large.
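Because the limits differ by platform, one way to guard against silent rounding is to test whether a given number survives the round trip. The Bash sketch below is an illustration under stated assumptions: survivesRoundTrip is a hypothetical name, it runs in the C locale (so the input must use . as the decimal mark), and the input must not be grouped or zero-padded.

```shell
#!/usr/bin/env bash
# Hypothetical check: does a number survive the string -> float -> string
# round trip unchanged? The function body is a subshell, so the locale
# change and the variables do not leak into the caller.
survivesRoundTrip() (
  LC_ALL=C
  num=$1 fractPart=''
  [[ $num == *.* ]] && fractPart=${num#*.}
  [[ $(printf "%.${#fractPart}f" "$num" 2>/dev/null) == "$num" ]]
)

survivesRoundTrip 1000.5 && echo "1000.5 is preserved"
survivesRoundTrip 0.1000000000000000000000000000000000000000001 ||
  echo "43 decimal places are not preserved"
```

The check omits the ' flag on purpose, so the comparison is not affected by separators.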

Bash shell function `groupDigits()`:

# SYNOPSIS
#   groupDigits num ...
# DESCRIPTION
#   Formats the specified number(s) according to the rules of the
#   current locale in terms of digit grouping (thousands separators).
#   Note that input numbers
#     - must not already be digit-grouped themselves,
#     - must use the *current* locale's decimal mark.
#   Numbers can be integers or floats.
#   Processing stops at the first number that can't be formatted, and a
#   non-zero exit code is returned.
# CAVEATS
#   - No input validation is performed.
#   - printf(1) is not guaranteed to support non-integer formats by POSIX,
#     though not doing so is rare these days.
#   - Round-trip number conversion is involved (string > double > string)
#     so rounding errors can occur.
# EXAMPLES
#   groupDigits 1000 # -> '1,000'
#   groupDigits 1000.5 # -> '1,000.5'
#   (LC_ALL=lt_LT.UTF-8; groupDigits 1000,5) # -> '1 000,5'
groupDigits() {
  local decimalMark fractPart num  # num is the loop variable; keep it local too
  decimalMark=$(printf "%.1f" 0); decimalMark=${decimalMark:1:1}
  for num; do
    fractPart=${num##*${decimalMark}}; [[ "$num" == "$fractPart" ]] && fractPart=''
    printf "%'.${#fractPart}f\n" "$num" || return
  done
}
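The function above performs no input validation, as its CAVEATS note. A caller who wants some could pre-check arguments with a pattern such as the following Bash sketch. The name isPlainNumber is hypothetical, and the pattern deliberately rejects hexadecimal and scientific notation even though printf would accept them. It also cannot tell a grouped 1,000 from a decimal-comma number, so it is a rough filter only.

```shell
#!/usr/bin/env bash
# Hypothetical validator: accepts an optional sign, digits, and at most
# one decimal mark (. or ,) followed by digits.
isPlainNumber() {
  [[ $1 =~ ^[+-]?[0-9]+([.,][0-9]+)?$ ]]
}

for candidate in 1000.5 1,000.5 abc; do
  if isPlainNumber "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: rejected"
  fi
done
```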
