Printing thousand-separated floats with GAWK
I must process some huge files with gawk. My main problem is that I have to print some floats using thousands separators. E.g., 10000 should appear as 10.000 and 10000,01 as 10.000,01 in the output.
I (and Google) came up with this function, but it fails for floats:
function commas(n) {
    gsub(/,/,"",n)
    point = index(n,".") - 1
    if (point < 0) point = length(n)
    while (point > 3) {
        point -= 3
        n = substr(n,1,point)"."substr(n,point + 1)
    }
    sub(/-\./,"-",n)
    return d n
}
Since it fails with floats, I'm now thinking of splitting the input into an integer part and a fractional (< 1) part, formatting the integer part, and then gluing the two back together. But isn't there a better way to do it?
Disclaimer:
- I'm not a programmer.
- I know that the thousands separator can be set via some shell environment variables, but this must work in different environments with different language and/or locale settings.
- English is my second language, sorry if I'm using it incorrectly.
Comments (2)
It fails with floats because you're passing in European-style numbers (1.000.000,25 for a million and a quarter). The function you've given should work if you just swap the commas and periods. I'd test the current version first with 1000000.25 to see if it works with non-European numbers.
The following awk script can be called with "echo 1 | awk -f xx.gawk" and it will show you both the "normal" and the European version in action. It outputs:
Obviously, you're only interested in the functions; real-world code would use the input stream to pass values to the functions, not a fixed string.
The functions are identical except in their handling of commas and periods. We'll call them separators and decimals in the following description:
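The script itself is not reproduced in this thread. As a rough sketch of the idea only (not the answerer's original code), a single function can take the separator and the decimal mark as parameters, so one implementation covers both styles. The names `group_digits` and `group` are made up for this illustration, and the sketch assumes the input contains no thousands separators yet:

```shell
#!/bin/sh
# group_digits SEP DEC  (hypothetical helper, not from the original answer)
# Reads one number per line on stdin and prints it with SEP inserted
# between groups of three digits in the integer part. DEC is the decimal
# mark already used in the input. The value is handled purely as a string.
group_digits() {
    awk -v sep="$1" -v dec="$2" '
    function group(n, s, d,    sign, frac, point) {
        # Split off a leading minus sign so it never touches a separator.
        sign = ""
        if (substr(n, 1, 1) == "-") { sign = "-"; n = substr(n, 2) }
        # Split off the fractional part, including the decimal mark.
        frac = ""
        point = index(n, d)
        if (point > 0) { frac = substr(n, point); n = substr(n, 1, point - 1) }
        # Walk leftwards through the integer digits, three at a time.
        point = length(n)
        while (point > 3) {
            point -= 3
            n = substr(n, 1, point) s substr(n, point + 1)
        }
        return sign n frac
    }
    { print group($0, sep, dec) }
    '
}

# European style: period as separator, comma as decimal mark.
printf '10000\n10000,01\n' | group_digits "." ","   # prints 10.000 and 10.000,01
# "Normal" style: comma as separator, period as decimal mark.
printf '1000000.25\n' | group_digits "," "."        # prints 1,000,000.25
```

Because the number is never converted to a numeric value, the result does not depend on the locale of the machine running it, which seems to be what the question asks for.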
To go with Pax's answer:
Read the "Conversion" section of the GNU awk manual, which talks explicitly about the effect of your LOCALE environment variable on the string representation of numeric types.
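If numbers do have to pass through awk's numeric conversion, one possible way to keep the result predictable across machines is to pin the locale for that one invocation. This is a minimal sketch, assuming the input uses a period as its decimal mark; `fixed2` is a made-up name for the example:

```shell
#!/bin/sh
# fixed2 (hypothetical helper): print each input number with two decimals.
# LC_ALL=C applies only to this awk process and forces the C locale, so
# the decimal mark in the output is a period whatever the user has set.
fixed2() {
    LC_ALL=C awk '{ printf "%.2f\n", $0 }'
}

echo 2.5 | fixed2   # prints 2.50
```

Separators can then be added afterwards with plain string operations, which are not affected by the locale.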