Converting a Python float to a string without losing precision

Posted 2024-09-14 15:23:36

I am maintaining a Python script that uses xlrd to retrieve values from Excel spreadsheets and then does various things with them. Some of the cells in the spreadsheet are high-precision numbers, and they must remain as such. When retrieving the value of one of these cells, xlrd gives me a float such as 0.38288746115497402.

However, I need to get this value into a string later on in the code. Doing either str(value) or unicode(value) will return something like "0.382887461155". The requirements say that this is not acceptable; the precision needs to be preserved.

I've tried a couple things so far to no success. The first was using a string formatting thingy:

data = "%.40s" % (value) 
data2 = "%.40r" % (value)

But both produce the same rounded number, "0.382887461155".
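
For context, here is a small sketch of what these format specifiers do. It is written for Python 3 (an assumption; the question uses 2.6.4). A precision on %s or %r is only a maximum length applied to the output of str() or repr(), so it can truncate but never add digits. Python 2's str() rounded floats to 12 significant digits, which "%.12g" mimics here.

```python
value = 0.38288746115497402

# A precision on %s / %r truncates str(value) / repr(value) to that many characters.
as_s = "%.40s" % value
as_r = "%.40r" % value
assert as_s == str(value)[:40]
assert as_r == repr(value)[:40]

# Python 2's str() kept 12 significant digits; "%.12g" reproduces that rounding.
rounded = "%.12g" % value
print(rounded)

# 17 significant digits are always enough to identify a 64-bit float.
full = "%.17g" % value
print(full)
```

The 12-digit string no longer converts back to the original float, while the 17-digit one does.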

Upon searching around for people with similar problems on SO and elsewhere on the internet, a common suggestion was to use the Decimal class. But I can't change the way the data is given to me (unless somebody knows of a secret way to make xlrd return Decimals). And when I try to do this:

data = Decimal(value)

I get a TypeError: Cannot convert float to Decimal. First convert the float to a string. But obviously I can't convert it to a string, or else I will lose the precision.

So yeah, I'm open to any suggestions -- even really gross/hacky ones if necessary. I'm not terribly experienced with Python (more of a Java/C# guy myself) so feel free to correct me if I've got some kind of fundamental misunderstanding here.

EDIT: Just thought I would add that I am using Python 2.6.4. I don't think there are any formal requirements stopping me from changing versions; it just has to not mess up any of the other code.

完美的未来在梦里 2024-09-21 15:23:36

I'm the author of xlrd. There is too much confusion in the other answers and comments to rebut in comments, so I'm doing it in an answer.

@katriealex: """precision being lost in the guts of xlrd""" --- entirely unfounded and untrue. xlrd reproduces exactly the 64-bit float that's stored in the XLS file.

@katriealex: """It may be possible to modify your local xlrd installation to change the float cast""" --- I don't know why you would want to do this; you don't lose any precision by floating a 16-bit integer!!! In any case that code is used only when reading Excel 2.X files (which had an INTEGER-type cell record). The OP gives no indication that he is reading such ancient files.

@jloubert: You must be mistaken. "%.40r" % a_float is just a baroque way of getting the same answer as repr(a_float).

@EVERYBODY: You don't need to convert a float to a decimal to preserve the precision. The whole point of the repr() function is that the following is guaranteed:

float(repr(a_float)) == a_float

Python 2.X (X <= 6) repr gives a constant 17 decimal digits of precision, as that is guaranteed to reproduce the original value. Later Pythons (2.7, 3.1) give the minimal number of decimal digits that will reproduce the original value.

Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32
>>> f = 0.38288746115497402
>>> repr(f)
'0.38288746115497402'
>>> float(repr(f)) == f
True

Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit (Intel)] on win32
>>> f = 0.38288746115497402
>>> repr(f)
'0.382887461154974'
>>> float(repr(f)) == f
True

So the bottom line is that if you want a string that preserves all the precision of a float object, use preserved = repr(the_float_object) ... recover the value later by float(preserved). It's that simple. No need for the decimal module.
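
A minimal sketch of that round trip, assuming Python 3 and finite values. The helper names float_to_text and text_to_float are hypothetical, introduced only for illustration.

```python
def float_to_text(x):
    """Return a string that float() maps back to exactly x (hypothetical helper)."""
    return repr(x)


def text_to_float(s):
    """Recover the float from a string produced by float_to_text."""
    return float(s)


samples = [0.38288746115497402, 0.1, 1e-300, 123456789.12345678, -2.5]
for f in samples:
    preserved = float_to_text(f)
    # The recovered value is identical to the original, not merely close.
    assert text_to_float(preserved) == f
    # The fixed 17-significant-digit form round-trips as well.
    assert float("%.17g" % f) == f
```

The loop uses exact equality on purpose: the point of repr() is that nothing is lost, so no tolerance is needed.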

会发光的星星闪亮亮i 2024-09-21 15:23:36

You can use repr() to convert to a string without losing precision, then convert to a Decimal:

>>> from decimal import Decimal
>>> f = 0.38288746115497402
>>> d = Decimal(repr(f))
>>> print d
0.38288746115497402
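
For readers on newer interpreters, a hedged sketch of the same idea in Python 3. From Python 2.7 and 3.2 onward the Decimal constructor also accepts a float directly, in which case it captures the float's exact binary value rather than the shorter repr() string. Both forms convert back to the same float.

```python
from decimal import Decimal

f = 0.38288746115497402

d_short = Decimal(repr(f))  # decimal value of the round-tripping repr() string
d_exact = Decimal(f)        # exact value stored in the float, digit for digit

# Both Decimals identify the same float, even though they differ as decimals.
assert float(d_short) == f
assert float(d_exact) == f
print(d_short)
print(d_exact)
```

Which one to use depends on intent: d_short is what a person would type, d_exact is what the hardware stores.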

初与友歌 2024-09-21 15:23:36

EDIT: I am wrong. I shall leave this answer here so the rest of the thread makes sense, but it's not true. Please see John Machin's answer above. Thanks guys =).

If the above answers work that's great -- it will save you a lot of nasty hacking. However, at least on my system, they won't. You can check this with e.g.

import sys
print( "%.30f" % sys.float_info.epsilon )

~~That number is the smallest float that your system can distinguish from zero. Anything smaller than that may be randomly added or subtracted from any float when you perform an operation.~~ This means that, at least on my Python setup, the precision is lost inside the guts of xlrd, and there seems to be nothing you can do without modifying it. Which is odd; I'd have expected this case to have occurred before, but apparently not!
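
A note on sys.float_info.epsilon, since the struck-out claim above is a common misreading: epsilon is the gap between 1.0 and the next representable float, not the smallest float distinguishable from zero. The sketch below assumes IEEE 754 doubles, which is what CPython uses on common platforms.

```python
import sys

eps = sys.float_info.epsilon

# epsilon is the spacing of floats just above 1.0.
assert 1.0 + eps != 1.0
assert 1.0 + eps / 2 == 1.0  # half the gap rounds back to 1.0

# Floats far smaller than epsilon are still distinct from zero.
tiny = sys.float_info.min  # smallest positive normalized float
assert 0.0 < tiny < eps
print("%.30f" % eps)
```

So a small epsilon says nothing about precision being lost inside xlrd; it only describes relative spacing near 1.0.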

It may be possible to modify your local xlrd installation to change the float cast. Open up site-packages\xlrd\sheet.py and go down to line 1099:

...
elif rc == XL_INTEGER:
                    rowx, colx, cell_attr, d = local_unpack('<HH3sH', data)
                    self_put_number_cell(rowx, colx, float(d), self.fixed_BIFF2_xfindex(cell_attr, rowx, colx))
...

Notice the float cast -- you could try changing that to a decimal.Decimal and see what happens.
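
As John Machin's answer points out, the float cast in that snippet cannot lose anything, because the value unpacked by the 'H' format code is an unsigned 16-bit integer. A quick check, sketched in Python 3 with a made-up two-byte buffer standing in for real record data:

```python
import struct

# Every unsigned 16-bit value survives the int -> float -> int round trip.
assert all(int(float(n)) == n for n in range(2 ** 16))

# '<H' unpacks a little-endian unsigned short; the bytes here are illustrative only.
(d,) = struct.unpack("<H", b"\xff\xff")
assert float(d) == 65535.0
```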

豆芽 2024-09-21 15:23:36

EDIT: Cleared my previous answer b/c it didn't work properly.

I'm on Python 2.6.5 and this works for me:

a = 0.38288746115497402
print repr(a)
type(repr(a))    #Says it's a string

Note: This just converts to a string. You'll need to convert to Decimal yourself later if needed.

ゞ花落谁相伴 2024-09-21 15:23:36

As has already been said, a float isn't precise at all - so preserving precision can be somewhat misleading.

Here's a way to get every last bit of information out of a float object:

>>> from decimal import Decimal
>>> str(Decimal.from_float(0.1))
'0.1000000000000000055511151231257827021181583404541015625'

Another way would be like so.

>>> 0.1.hex()
'0x1.999999999999ap-4'

Both strings represent the exact contents of the float. Almost anything else interprets the float as Python thinks it was probably intended (which most of the time is correct).
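
Both exact forms can also be turned back into the original float. A short sketch, assuming Python 3:

```python
from decimal import Decimal

x = 0.1
exact_decimal = str(Decimal.from_float(x))  # full decimal expansion of the stored value
exact_hex = x.hex()                         # hexadecimal significand and binary exponent

# Each string recovers exactly the same float.
assert float.fromhex(exact_hex) == x
assert float(Decimal(exact_decimal)) == x
print(exact_decimal)
print(exact_hex)
```

The hex form is the more compact of the two and is read back by float.fromhex() without any decimal rounding step.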
