Compressing a sine wave table
I have a large array with 1024 entries holding 7-bit values in range(14, 86).
This means there are multiple ranges of indices that map to the same value.
For example,
consider the index range 741 to 795. It maps to 14
consider the index range 721 to 740. It maps to 15
consider the index range 796 to 815. It maps to 15
I want to feed this map to a python program that would spew out the following:
if((index >= 741) and (index <= 795)) return 14;
if((index >= 721) and (index <= 740)) return 15;
if((index >= 796) and (index <= 815)) return 15;
Some code to groupby the mapped values is ready, but I am having difficulty coding up an expression using pairwise.
Has anyone done something similar before?
I have uploaded the dataset in two forms:
Usual, ordered by index.
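To illustrate, here is a minimal sketch of the kind of generator I mean, using itertools.groupby over (index, value) pairs (illustrative only; emit_ranges is a made-up name):

```python
from itertools import groupby
from operator import itemgetter

def emit_ranges(table):
    """Yield one C-style if-statement per run of equal consecutive values."""
    # Group (index, value) pairs by value; each group is one contiguous run.
    for value, run in groupby(enumerate(table), key=itemgetter(1)):
        run = list(run)
        start, end = run[0][0], run[-1][0]
        yield f"if((index >= {start}) and (index <= {end})) return {value};"
```

Sorting the emitted lines by value and then by start index would give the ordering shown in the example output above.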
If you don't mind slightly different values due to rounding, I can compress that really well for you.
Here is code to actually do what you want; it works with the uploaded dataset as test data. You would need to switch the order of val and index in the for loop here if you've got indexes in the first column.
Edit: Whoops, I was missing a line. Fixed. Also, your multiplier was actually 36, not 35; I must have rounded (14, 86) to (15, 85) in my head.
Edit 2: To show you how to store only a quarter of the table. If you subtract the offset out of the table, just make the first two -vals[...] instead. Also, the compare at the bottom is fuzzy because I get 72 off-by-one errors for this. This is simply because your values are rounded to integers; they are all places where the true value is halfway between two integers, so there is very little decrease in accuracy.
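The quarter-table idea in Edit 2 can be sketched like this for a single-cycle sine table (a hedged illustration, not the answer's actual code; the names vals and sine_lookup are my own). The sign flips in the last two branches are where the -vals[...] forms come in:

```python
import math

N = 1024          # full table length for one sine cycle
QUARTER = N // 4  # store only the first quadrant (plus the peak)
vals = [round(36 * math.sin(2 * math.pi * i / N)) for i in range(QUARTER + 1)]

def sine_lookup(i):
    """Reconstruct round(36*sin(2*pi*i/N)) from the quarter table."""
    i %= N
    if i < QUARTER:            # first quadrant: read directly
        return vals[i]
    if i < 2 * QUARTER:        # second quadrant: mirror the index
        return vals[2 * QUARTER - i]
    if i < 3 * QUARTER:        # third quadrant: negate the value
        return -vals[i - 2 * QUARTER]
    return -vals[N - i]        # fourth quadrant: mirror and negate
```

Subtracting the +50 offset out of the stored table is what makes the plain negation in the last two branches valid.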
After this was closed, I belatedly found this solution: "What's the most Pythonic way to identify consecutive duplicates in a list?".
NB: with a periodic fn like sine, you can get by with storing only a quarter (i.e. 256 values) or half of the table, then perform a little (fixed-point) arithmetic on the index at lookup time. As I commented, if you further don't store the offset of +50, you need one bit less, at the cost of one integer addition after lookup. Hence 79% compression is easily achievable. RLE will give you more. Even if the fn has noise, you can still get decent compression with this general approach.
As agf pointed out, your
f(n) = 50 + 36*sin(72*pi*n/1024) = 50 + g(n)
, say. So tabulate the 256 values of
g(n) = 36*sin(72*pi*n/1024)
only for the range n = 0..255. Then f(n) is easily computed by:
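A sketch of that lookup, assuming only the periodicity of g: g(n + 256) = g(n), because 72*pi*256/1024 = 18*pi is a whole number (9) of cycles:

```python
import math

# 256-entry table of g(n) = 36*sin(72*pi*n/1024), rounded to integers
g_table = [round(36 * math.sin(72 * math.pi * n / 1024)) for n in range(256)]

def f(n):
    # g repeats every 256 samples (9 full cycles), so mask the index
    # into the quarter-size table and add the +50 offset back at lookup time.
    return 50 + g_table[n & 255]
```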
Anyway here's a general table compressor solution which will generate (istart,iend,value) triples.
I racked my brains over how to do this more Pythonically using list comprehensions and itertools.takewhile(); it needs polishing.
(NB I started the table-compressor approach before agf changed his approach... was trying to get an itertools or list-comprehension solution)