How do I bin a series of float values into a histogram in Python?
I have a set of float values (always less than 1) that I want to bin into a histogram,
i.e. each bar in the histogram covers a range of values within [0, 0.150).
The data I have looks like this:
0.000
0.005
0.124
0.000
0.004
0.000
0.111
0.112
With my code below I expect to get a result that looks like
[0, 0.005) 5
[0.005, 0.011) 0
...etc..
I tried to do such binning with this code of mine,
but it doesn't seem to work. What's the right way to do it?
#! /usr/bin/env python
import fileinput, math

log2 = math.log(2)

def getBin(x):
    return int(math.log(x+1)/log2)

diffCounts = [0] * 5

for line in fileinput.input():
    words = line.split()
    diff = float(words[0]) * 1000;
    diffCounts[ str(getBin(diff)) ] += 1

maxdiff = [i for i, c in enumerate(diffCounts) if c > 0][-1]
print maxdiff
maxBin = max(maxdiff)

for i in range(maxBin+1):
    lo = 2**i - 1
    hi = 2**(i+1) - 1
    binStr = '[' + str(lo) + ',' + str(hi) + ')'
    print binStr + '\t' + '\t'.join(map(str, (diffCounts[i])))
Comments (3)
When possible, don't reinvent the wheel. NumPy has everything you need:
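A minimal sketch of what this can look like with numpy.histogram, using the sample values from the question. The bin width of 0.005 over [0, 0.150) is an assumption read off the question's expected output, and the values are scaled to integer thousandths (as the question's own code does with * 1000) so that the bin edges are exact.

```python
import numpy as np

# Sample values from the question.
data = np.array([0.000, 0.005, 0.124, 0.000, 0.004, 0.000, 0.111, 0.112])

# Assumption: bins are 0.005 wide and cover [0, 0.150).
# Integer thousandths keep the edges exact, so a value such as 0.005
# cannot slip to the wrong side of an edge through float rounding.
scaled = np.rint(data * 1000).astype(int)
edges = np.arange(0, 151, 5)  # 0, 5, 10, ..., 150

# Each bin is half-open [lo, hi); np.histogram closes only the last one.
counts, _ = np.histogram(scaled, bins=edges)

for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print("[%.3f, %.3f)\t%d" % (lo / 1000.0, hi / 1000.0, n))
```

For the sample data this puts four values in [0.000, 0.005) and one in [0.005, 0.010), since 0.005 itself belongs to the second bin.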
The first error comes from the list index str(getBin(diff)): why are you converting an int to a str when an int is needed? Fix that, and the next error is an index out of range, because you've only made 5 buckets. I don't understand your bucketing scheme, but let's make it 50 buckets and see what happens.
maxdiff is a single value out of your list of ints, so what is max doing here? Remove it, and the next error comes from map: sure enough, you're using a single value as the second argument to map. Let's simplify the last two lines. After that, the script runs and prints its output.
I'm not sure what else to do here, since I don't really understand the bucketing you are hoping to use. It seems to involve binary powers, but it isn't making sense to me...
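The fixes described above can be sketched as follows, ported to Python 3. This is only one plausible reading: the names get_bin and bin_counts are illustrative, the sample data is inlined instead of being read through fileinput, math.log2 stands in for math.log(x)/math.log(2), and the final print line is a guess at the simplification rather than the exact code from this answer.

```python
import math

def get_bin(n):
    # Same log2 bucketing as the question's getBin:
    # bucket i holds scaled values in [2**i - 1, 2**(i+1) - 1).
    return int(math.log2(n + 1))

def bin_counts(values, n_buckets=50):
    # 50 buckets instead of the original 5.
    counts = [0] * n_buckets
    for v in values:
        # Assumption: inputs have three decimals, so rounding the
        # scaled value gives an exact integer number of thousandths.
        diff = round(v * 1000)
        counts[get_bin(diff)] += 1  # integer index, no str()
    return counts

data = [0.000, 0.005, 0.124, 0.000, 0.004, 0.000, 0.111, 0.112]
counts = bin_counts(data)

# Index of the last non-empty bucket; already a single int, so no max().
max_bin = [i for i, c in enumerate(counts) if c > 0][-1]

for i in range(max_bin + 1):
    lo = 2**i - 1
    hi = 2**(i + 1) - 1
    # Format the single count directly instead of passing it to map().
    print("[%d,%d)\t%d" % (lo, hi, counts[i]))
```

For the sample data this places three values in [0,1), two in [3,7) and three in [63,127), all in thousandths, so the log2 scheme yields bins that double in width and not the fixed-width bins shown in the question's expected output.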