How to sum a torch tensor without a loop?
I have an array of bin edges and I need the sum of the values inside each bin.
Currently the code looks like this:
output = torch.zeros((16, 10))  # 10 corresponds to the number of bins
for l in range(10):
    output[:, l] = data[:, bin_edges[l]:bin_edges[l + 1]].sum(axis=-1)
Is it possible to avoid loops and improve the performance?
Comments (1)
Normally, to optimize code by vectorization, you want to construct a single big tensor on which the result can be computed in a single operation.
But here your bins might have different lengths, so you can't stack them into one tensor directly.
That is a common situation in time-series processing, though, so PyTorch has some utilities to handle it, such as torch.nn.utils.rnn.pad_sequence.
Using that utility I was able to optimize the function a bit, but the gain depends on the data shape and on the number and length of the bins, and sometimes performance even decreases.
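To make the role of pad_sequence concrete, here is a small sketch of what it does with tensors of different lengths. The three example tensors are made up for illustration. Each one becomes a column of the result, shorter ones are filled with zeros, and a single sum over the padded dimension then gives one total per original tensor.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three "bins" of different lengths (illustrative values only).
chunks = [
    torch.tensor([1.0, 2.0]),
    torch.tensor([3.0]),
    torch.tensor([4.0, 5.0, 6.0]),
]

# With the default batch_first=False the result has shape
# (max_length, number_of_sequences); missing entries are padded with 0.
padded = pad_sequence(chunks)

# Zeros do not change a sum, so one reduction yields all per-bin totals.
totals = padded.sum(dim=0)
print(padded.shape)
print(totals)
```

Because the padding value is zero, this trick is safe for sums but would need a different padding value, or a mask, for reductions such as a mean or a maximum.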
Please note that pad_sequence assumes that you want to make bins from the first dimension of your data, whereas you make bins from the last dimension, so the optimization would work better if you can reorganize your data accordingly.
Code
Implementations
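A minimal sketch of how a pad_sequence-based version could look, assuming data is a 2-D tensor of shape (batch, length) and bin_edges is a strictly increasing sequence of integer indices. The function name bin_sum_padded is hypothetical and the code is an illustration of the idea described above, not necessarily the exact implementation that was benchmarked.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def bin_sum_padded(data, bin_edges):
    """Sum data[:, bin_edges[i]:bin_edges[i + 1]] for every bin i.

    Assumes a 2-D `data` of shape (batch, length) and strictly
    increasing `bin_edges` (a list or 1-D tensor of ints).
    """
    edges = [int(e) for e in bin_edges]
    sizes = [b - a for a, b in zip(edges[:-1], edges[1:])]

    # pad_sequence pads along the first dimension, so the binned
    # axis is moved to the front before splitting.
    data_t = data.transpose(0, 1)[edges[0]:edges[-1]]  # (covered_length, batch)

    # One (size_i, batch) view per bin; no data is copied here.
    chunks = torch.split(data_t, sizes, dim=0)

    # (max_size, n_bins, batch), shorter bins are zero-padded.
    padded = pad_sequence(list(chunks))

    # Zeros do not affect the sum; transpose back to (batch, n_bins).
    return padded.sum(dim=0).transpose(0, 1)


data = torch.arange(12.0).reshape(2, 6)
result = bin_sum_padded(data, [0, 2, 3, 6])
print(result)
```

The padded tensor takes max_size * n_bins * batch elements, so when one bin is much longer than the others most of the work is spent on padding, which is one plausible reason the speed-up varies with the bin layout.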
Results
Assert that both functions are equivalent:
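One way such a check could be written, as a sketch: the loop version follows the code in the question, the padded version is a compact pad_sequence-based variant, and both function names, the random test data, and the chosen bin edges are illustrative assumptions.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def bin_sum_loop(data, bin_edges):
    """Baseline from the question: one slice-and-sum per bin."""
    n_bins = len(bin_edges) - 1
    out = torch.zeros((data.shape[0], n_bins), dtype=data.dtype)
    for l in range(n_bins):
        out[:, l] = data[:, bin_edges[l]:bin_edges[l + 1]].sum(dim=-1)
    return out


def bin_sum_padded(data, bin_edges):
    """pad_sequence-based variant; assumes strictly increasing int edges."""
    sizes = [b - a for a, b in zip(bin_edges[:-1], bin_edges[1:])]
    chunks = torch.split(data.t()[bin_edges[0]:bin_edges[-1]], sizes, dim=0)
    return pad_sequence(list(chunks)).sum(dim=0).t()


torch.manual_seed(0)
data = torch.randn(16, 100)
bin_edges = [0, 5, 12, 30, 31, 50, 64, 70, 85, 99, 100]  # 10 bins

expected = bin_sum_loop(data, bin_edges)
actual = bin_sum_padded(data, bin_edges)

# The two versions add the values in a different order, so compare
# with a small tolerance instead of exact equality.
assert actual.shape == expected.shape == (16, 10)
assert torch.allclose(actual, expected, atol=1e-5)
```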
Time for different shapes and edges: