Is there an efficient way to pass "all" as a numpy index?

Posted on 2025-02-09 17:32:41

I have code that generates a boolean array that acts as a mask on numpy arrays, along the lines of:

import numpy

def func():
    a    = numpy.arange(10)
    mask = a % 2 == 0
    return a[mask]

Now, I need to separate this into a case where the mask is created, and one where it is not created and all values are used instead. This could be achieved as follows:

def func(use_mask):
    a    = numpy.arange(10)
    if use_mask:
        mask = a % 2 == 0
    else:
        mask = numpy.ones(10, dtype=bool)
    return a[mask]

However, this becomes extremely wasteful for large arrays, since an equally large boolean array must first be created.
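To put a rough number on that cost, here is a small sketch (the size `n` is an arbitrary value picked for illustration, not something from my real code). NumPy stores one byte per boolean element, so the all-True mask needs as many bytes as the array has elements, and indexing with it copies the data on top of that:

```python
import numpy

n = 1_000_000  # arbitrary size, chosen only for illustration
a = numpy.arange(n)

# An all-True mask stores one byte per element.
full_mask = numpy.ones(n, dtype=bool)
print(full_mask.nbytes)  # 1000000

# Boolean-mask indexing also returns a copy of the selected data.
selected = a[full_mask]
print(numpy.shares_memory(a, selected))  # False
```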

My question is thus: Is there something I can pass as an "index" to recreate the behavior of such an everywhere-true array?

Systematically changing occurrences of a[mask] to something else involving some indexing magic etc. is a valid solution, but just avoiding the masking entirely via an expanded case distinction or something else that changes the structure of the code is not desired, as it would impair readability and maintainability (see next paragraph).


For the sake of completeness, here's what I'm currently considering doing, though this makes the code messier and less streamlined since it expands the if/else beyond where it technically needs to be (in reality, the mask is used more than once, hence every occurrence would need to be contained within the case distinction; I used f1 and f2 as examples here):

def func(use_mask):
    a    = numpy.arange(10)
    if use_mask:
        mask = a % 2 == 0
        r   = f1(a[mask])
        q   = f2(a[mask], r)
        return q
    else:
        r   = f1(a)
        q   = f2(a, r)
        return q


Comments (1)

聚集的泪 2025-02-16 17:32:41

Recall that a[:] returns the contents of a (even if a is multidimensional). We cannot store the : in the mask variable, but we can use a slice object equivalently:

def func(use_mask):
    a = numpy.arange(10)
    if use_mask:
        mask = a % 2 == 0
    else:
        mask = slice(None)
    return a[mask]

This does not use any memory to create the index array. I'm not sure what the CPU usage of the a[slice(None)] operation is, though.
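One difference worth keeping in mind, shown in a minimal sketch (the helper name `select` is hypothetical and only mirrors `func` above): indexing with `slice(None)` is basic indexing, so NumPy returns a view of `a` without copying any data, whereas a boolean mask returns a copy. The same slice object also works on a multidimensional array.

```python
import numpy

def select(a, use_mask):
    # Hypothetical helper mirroring func() above, taking the array as input.
    mask = (a % 2 == 0) if use_mask else slice(None)
    return a[mask]

a = numpy.arange(10)
everything = select(a, use_mask=False)
evens = select(a, use_mask=True)

# The slice result is a view on a; the masked result is a copy.
print(numpy.shares_memory(a, everything))  # True
print(numpy.shares_memory(a, evens))       # False

# slice(None) behaves like ":" for a multidimensional array as well.
b = numpy.arange(6).reshape(2, 3)
print(numpy.array_equal(b[slice(None)], b))  # True
```

Because the unmasked result is a view, writing into it also changes `a`. If that matters for the caller, the unmasked branch could return `a[mask].copy()` instead, at the price of copying the data.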
