NumPy: converting an array from float to string

Posted 2024-10-24 09:56:17 | Length: 599 | Views: 2 | Comments: 0

I have an array of floats that I have normalised to one (i.e. the largest number in the array is 1), and I want to use it as colour indices for a graph. matplotlib specifies grayscale colours as strings holding a number between 0 and 1, so I wanted to convert the array of floats to an array of strings. I attempted this with astype('str'), but that appears to create some values that are not the same as (or even close to) the originals.

I notice this because matplotlib complains about finding the number 8 in the array, which is odd as it was normalised to one!

In short, I have an array phis, of float64, such that:

numpy.where(phis.astype('str').astype('float64') != phis)

is non-empty. This is puzzling: (hopefully naively) it looks like a bug in numpy. Is there anything I could have done wrong to cause this?

Edit: after investigation, this appears to be due to the way the string function handles high-precision floats. The same happens with a vectorized toString function (as in robbles' answer); however, if the lambda function is:

lambda x: "%.2f" % x

Then the graphing works - curiouser and curiouser. (Obviously the arrays are no longer equal however!)
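The trade-off described in the edit can be sketched in plain Python. This is only an illustration under assumptions: the sample values below are hypothetical stand-ins for the asker's phis array, and Python 3 is assumed (its repr() round-trips floats exactly).

```python
# Hypothetical sample values standing in for the normalised `phis` array.
phis = [0.0, 1.0 / 3.0, 0.123456789, 1.0]

# Full-precision conversion: repr() in Python 3 round-trips exactly.
full = [repr(x) for x in phis]
full_back = [float(s) for s in full]

# Two-decimal conversion: short strings in [0, 1], but lossy.
short = ["%.2f" % x for x in phis]
short_back = [float(s) for s in short]

print(full_back == phis)   # True: nothing was lost
print(short)               # ['0.00', '0.33', '0.12', '1.00']
print(short_back == phis)  # False: 1/3 became 0.33
```

The two-decimal strings are no longer equal to the original floats, which matches the asker's remark, but each one is still a number between 0 and 1 written as text.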

你是暖光i 2024-10-31 09:56:17

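You seem a bit confused about how numpy arrays work behind the scenes. Each item in an array must be the same size.

The string representation of a float doesn't work this way. For example, repr(1.3) yields '1.3', but repr(1.33) yields '1.3300000000000001' (on Python versions before 2.7 and 3.1; later versions print the shortest string that round-trips, '1.33').

An accurate string representation of a floating point number produces a variable-length string.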
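A small sketch of the variable-length point, assuming Python 3. The "%.17g" format is used here only to imitate the 17-significant-digit output that older interpreters produced for repr(); the sample values are arbitrary.

```python
# Compare the modern shortest repr() with a 17-significant-digit rendering.
values = (1.3, 1.33, 0.1, 1.0 / 3.0)

shortest = [repr(v) for v in values]          # shortest string that round-trips
seventeen = ["%.17g" % v for v in values]     # imitates the older repr() output
lengths = [len(s) for s in shortest]

for v, s, old in zip(values, shortest, seventeen):
    print(s, "|", old, "|", len(s), "characters")
```

Either way the strings come out with different lengths, which is what a fixed-width array element cannot accommodate without padding or truncation.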

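Because numpy arrays consist of elements that are all the same size, numpy requires you to specify the length of the strings within the array when you're using string arrays.

If you use x.astype('str'), it will always convert things to an array of strings of length 1.

For example, using x = np.array(1.344566), x.astype('str') yields '1'!

You need to be more explicit and use the '|Sx' dtype syntax, where x is the length of the string for each element of the array.

For example, use x.astype('|S10') to convert the array to strings of length 10.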
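A minimal sketch of the explicit-width idea, with two assumptions that are not in the answer: it uses the unicode 'U' dtype codes rather than '|S' so the results are ordinary Python 3 strings, and the second sample value (8.5e-05) is made up to show how a truncated string could read as 8. Recent NumPy releases appear to pick a sufficient width on their own for astype('str'), so the length-1 behaviour may only reproduce when the width is forced as below.

```python
import numpy as np

# Hypothetical values; 8.5e-05 is written as '8.5e-05' when converted to text.
x = np.array([1.344566, 8.5e-05])

narrow = x.astype('U1')   # one character per element, the rest is cut off
wide = x.astype('U10')    # ten characters per element

print(narrow)  # each value reduced to its first character
print(wide)    # enough room for these particular values
```

Ten characters happen to be enough here; a value whose text form is longer would still be cut short, so the width has to be chosen with the data in mind.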

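Even better, avoid numpy arrays of strings altogether. They are usually a bad idea, and nothing in your description of the problem suggests a reason to use them in the first place...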

陪我终i 2024-10-31 09:56:17

If you have an array of numbers and you want an array of strings, you can write:

strings = ["%.2f" % number for number in numbers]

If your numbers are floats, the result is a list of the same numbers formatted as strings with two decimals.

>>> a = [1,2,3,4,5]
>>> min_a, max_a = min(a), max(a)
>>> a_normalized = [float(x-min_a)/(max_a-min_a) for x in a]
>>> a_normalized
[0.0, 0.25, 0.5, 0.75, 1.0]
>>> a_strings = ["%.2f" % x for x in a_normalized]
>>> a_strings
['0.00', '0.25', '0.50', '0.75', '1.00']

Notice that it also works with numpy arrays:

>>> a = numpy.array([0.0, 0.25, 0.5, 0.75, 1.0])
>>> print(["%.2f" % x for x in a])
['0.00', '0.25', '0.50', '0.75', '1.00']

A similar methodology can be used if you have a multi-dimensional array:

new_array = numpy.array(["%.2f" % x for x in old_array.reshape(old_array.size)])
new_array = new_array.reshape(old_array.shape)

Example:

>>> x = numpy.array([[0,0.1,0.2],[0.3,0.4,0.5],[0.6, 0.7, 0.8]])
>>> y = numpy.array(["%.2f" % w for w in x.reshape(x.size)])
>>> y = y.reshape(x.shape)
>>> print(y)
[['0.00' '0.10' '0.20']
 ['0.30' '0.40' '0.50']
 ['0.60' '0.70' '0.80']]

If you check the Matplotlib example for the function you are using, you will notice it uses a similar approach: build an empty array and fill it with strings, one element at a time. The relevant part of the referenced code is:

colortuple = ('y', 'b')
colors = np.empty(X.shape, dtype=str)
for y in range(ylen):
    for x in range(xlen):
        colors[x, y] = colortuple[(x + y) % len(colortuple)]

surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, facecolors=colors,
        linewidth=0, antialiased=False)
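One detail worth checking when adapting that pattern to grayscale strings: np.empty(..., dtype=str) creates one-character elements, which is fine for 'y' and 'b' but not for '0.25'. The sketch below is an illustration with made-up values, not part of the Matplotlib example; the width 'U4' is an assumption that fits the "%.2f" format for numbers in [0, 1].

```python
import numpy as np

# Hypothetical normalised values to be turned into grayscale strings.
values = np.array([[0.0, 0.25], [0.5, 1.0]])

too_narrow = np.empty(values.shape, dtype=str)   # dtype '<U1': one character
colors = np.empty(values.shape, dtype='U4')      # room for strings like '0.25'

for idx, v in np.ndenumerate(values):
    too_narrow[idx] = "%.2f" % v   # silently truncated to the first character
    colors[idx] = "%.2f" % v

print(too_narrow)
print(colors)
```

The filled colors array has the same shape as values, so it can be passed wherever a per-element colour array of that shape is expected.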

痴梦一场 2024-10-31 09:56:17

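I ran into this problem when my pandas dataframes started having float precision issues that were bleeding into their string representations when doing df.round(2).astype(str).

I ended up going with np.char.mod("%.2f", phys), which uses broadcasting to run "%.2f".__mod__(el) on each element of the dataframe instead of iterating in Python, which makes quite a difference if your dataframes are large enough. Using finite-length strings (as the accepted answer suggests) was a no-go for me, because in my case retaining the decimals was more important than having an exact number of significant digits.

I would have tried numpy.format_float_positional, which is meant for formatting and is supposedly a lot faster than the printf-style equivalent that Python uses, but it doesn't work element-wise (or at all) on ndarrays, and manually iterating is the part I wanted to avoid.

There are no ufuncs for formatting, so as far as I know this is probably the most efficient way.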
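A short sketch of the np.char.mod call on a plain ndarray. The array below is a hypothetical stand-in (named phis after the question) rather than the answerer's dataframe.

```python
import numpy as np

# Hypothetical 2-D array of normalised values.
phis = np.array([[0.0, 1.0 / 3.0], [0.5, 1.0]])

# Apply the "%.2f" format to every element; the result keeps the input shape.
formatted = np.char.mod("%.2f", phis)

print(formatted)
print(formatted.dtype)
```

Because the format is applied to the whole array in one call, no explicit Python loop or reshape is needed for multi-dimensional input.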

暮年 2024-10-31 09:56:17

This is probably slower than what you want, but you can do:

>>> tostring = numpy.vectorize(lambda x: str(x))
>>> numpy.where(tostring(phis).astype('float64') != phis)
(array([], dtype=int64),)

It looks like it rounds off the values when it converts to str from float64, but this way you can customize the conversion however you like.
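The "customize the conversion" remark can be sketched as follows, assuming Python 3 and a reasonably recent NumPy. The sample values are hypothetical, and the two converters are only examples of what the lambda might contain.

```python
import numpy as np

# Hypothetical stand-in for the asker's normalised array.
phis = np.array([0.0, 1.0 / 3.0, 0.123456789, 1.0])

# Full-precision conversion, as in the answer.
to_string = np.vectorize(lambda x: str(x))
exact = to_string(phis)
mismatches = np.where(exact.astype('float64') != phis)

# Customised conversion: two decimals, as in the question's edit.
two_decimals = np.vectorize(lambda x: "%.2f" % x)(phis)

print(mismatches)     # no indices when the round trip is exact
print(two_decimals)
```

np.vectorize is a convenience wrapper around a Python-level loop, which is consistent with the answer's warning that this may be slower than desired.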

花海 2024-10-31 09:56:17

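If the main problem is the loss of precision when converting from a float to a string, one possible approach is to convert the floats to Decimals: http://docs.python.org/library/decimal.html.

In Python 2.7 and higher you can convert a float directly to a Decimal object.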
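A minimal sketch of that conversion using the standard-library decimal module. The value 0.1 and the two-decimal quantize step are illustrative assumptions, not part of the answer.

```python
from decimal import Decimal

phi = 0.1  # hypothetical value from the normalised array

# Decimal(float) captures the exact binary value, with no rounding.
exact = Decimal(phi)

# Round explicitly to two decimals when a short string is wanted.
rounded = exact.quantize(Decimal("0.01"))

print(exact)                      # many digits: the exact stored value
print(rounded)                    # 0.10
print(float(str(exact)) == phi)   # True: the string form loses nothing
```

The exact form differs from Decimal('0.1') because the float 0.1 is not exactly one tenth, so the rounding step is where any loss of precision becomes an explicit choice.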
