Compressing a sine wave table
I have a large array with 1024 entries holding 7-bit values in range(14, 86).
This means there are multiple ranges of indices that map to the same value.
For example,
consider the index range 741 to 795. It maps to 14
consider the index range 721 to 740. It maps to 15
consider the index range 796 to 815. It maps to 15
I want to feed this map to a python program that would spew out the following:
if((index >= 741) and (index <= 795)) return 14;
if((index >= 721) and (index <= 740)) return 15;
if((index >= 796) and (index <= 815)) return 15;
Some code to groupby the mapped values is ready, but I am having difficulty coding up an expression using pairwise.
Has anyone done something similar before?
I have uploaded the dataset in two forms:
Usual, ordered by index.
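To illustrate, here is a minimal sketch of the kind of generator I mean, using itertools.groupby over (index, value) pairs (illustrative only; emit_ranges is a made-up name):

```python
from itertools import groupby
from operator import itemgetter

def emit_ranges(table):
    """Yield one C-style if-statement per run of equal consecutive values."""
    # Group (index, value) pairs by value; each group is one contiguous run.
    for value, run in groupby(enumerate(table), key=itemgetter(1)):
        run = list(run)
        start, end = run[0][0], run[-1][0]
        yield f"if((index >= {start}) and (index <= {end})) return {value};"
```

Sorting the emitted lines by value and then by start index would give the ordering shown in the example output above.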
If you don't mind slightly different values due to rounding, I can compress that really well for you.
Here is code to actually do what you want; it works with the uploaded dataset as test data. You would need to switch the order of val and index in the for loop here if you've got indexes in the first column.
Edit: Whoops, I was missing a line. Fixed. Also, your multiplier was actually 36, not 35; I must have rounded (14, 86) to (15, 85) in my head.
Edit 2: To show you how to store only a quarter of the table. If you subtract the offset out of the table, just make the first two -vals[...] instead. Also, the compare at the bottom is fuzzy because I get 72 off-by-one errors for this. This is simply because your values are rounded to integers; they are all places where the true value is halfway between two integers, so there is very little decrease in accuracy.
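The quarter-table idea in Edit 2 can be sketched like this for a single-cycle sine table (a hedged illustration, not the answer's actual code; the names vals and sine_lookup are my own). The sign flips in the last two branches are where the -vals[...] forms come in:

```python
import math

N = 1024          # full table length for one sine cycle
QUARTER = N // 4  # store only the first quadrant (plus the peak)
vals = [round(36 * math.sin(2 * math.pi * i / N)) for i in range(QUARTER + 1)]

def sine_lookup(i):
    """Reconstruct round(36*sin(2*pi*i/N)) from the quarter table."""
    i %= N
    if i < QUARTER:            # first quadrant: read directly
        return vals[i]
    if i < 2 * QUARTER:        # second quadrant: mirror the index
        return vals[2 * QUARTER - i]
    if i < 3 * QUARTER:        # third quadrant: negate the value
        return -vals[i - 2 * QUARTER]
    return -vals[N - i]        # fourth quadrant: mirror and negate
```

Subtracting the +50 offset out of the stored table is what makes the plain negation in the last two branches valid.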
After this was closed, I belatedly found this solution: "What's the most Pythonic way to identify consecutive duplicates in a list?".
NB: with a periodic fn like sine, you can get by with storing only a quarter (i.e. 256 values) or half of the table, then perform a little (fixed-point) arithmetic on the index at lookup time. As I commented, if you further don't store the offset of +50, you need one bit less, at the cost of one integer addition after lookup. Hence 79% compression is easily achievable. RLE will give you more. Even if the fn has noise, you can still get decent compression with this general approach.
As agf pointed out, your
f(n) = 50 + 36*sin(72*pi*n/1024) = 50 + g(n)
, say. So tabulate the 256 values of
g(n) = 36*sin(72*pi*n/1024)
only for the range n = 0..255. Then f(n) is easily computed by:
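A sketch of that lookup, assuming only the periodicity of g: g(n + 256) = g(n), because 72*pi*256/1024 = 18*pi is a whole number (9) of cycles:

```python
import math

# 256-entry table of g(n) = 36*sin(72*pi*n/1024), rounded to integers
g_table = [round(36 * math.sin(72 * math.pi * n / 1024)) for n in range(256)]

def f(n):
    # g repeats every 256 samples (9 full cycles), so mask the index
    # into the quarter-size table and add the +50 offset back at lookup time.
    return 50 + g_table[n & 255]
```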
Anyway here's a general table compressor solution which will generate (istart,iend,value) triples.
I racked my brains over how to do this more Pythonically using list comprehensions and itertools.takewhile(); it needs polishing.
(NB I started the table-compressor approach before agf changed his approach... was trying to get an itertools or list-comprehension solution)