How can I prevent perl from reading past the end of a tied array that shrinks when accessed?

Posted 2024-08-15 04:30:03

Is there any way to force Perl to call FETCHSIZE on a tied array before each call to FETCH? My tied array knows its maximum size, but could shrink from this size depending on the results of earlier FETCH calls. Here is a contrived example that filters a list to only the even elements with lazy evaluation:

use warnings;
use strict;

package VarSize;

sub TIEARRAY { bless $_[1] => $_[0] }
sub FETCH {
    my ($self, $index) = @_;
    splice @$self, $index, 1 while $$self[$index] % 2;
    $$self[$index]
}
sub FETCHSIZE {scalar @{$_[0]}}

my @source = 1 .. 10;

tie my @output => 'VarSize', [@source];

print "@output\n";  # array changes size as it is read, perl only checks size
                    # at the start, so it runs off the end with warnings
print "@output\n";  # knows correct size from start, no warnings

For brevity I have omitted a bunch of error-checking code (such as how to deal with accesses starting from an index other than 0).

EDIT: if, instead of the two print statements above, ONE of the following two lines is used, the first works fine and the second throws warnings.

print "$_ " for @output;   # for loop "iterator context" is fine,
                           # checks FETCHSIZE before each FETCH, ends properly

print join " " => @output; # however a list context expansion 
                           # calls FETCHSIZE at the start, and runs off the end

Update:

The actual module that implements a variable-sized tied array is called List::Gen, which is up on CPAN. The function is filter, which behaves like grep but works with List::Gen's lazy generators. Does anyone have any ideas that could make the implementation of filter better?

(The test function is similar, but returns undef in failed slots, keeping the array size constant. That of course has different usage semantics than grep.)
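To make the contrast with the shrinking array concrete, here is a minimal sketch of the constant-size approach described above. The package name ConstSize is hypothetical and this is not List::Gen's actual implementation; it only illustrates the idea of returning undef in failed slots so that FETCHSIZE never changes.

```perl
use strict;
use warnings;

package ConstSize;   # hypothetical name, not part of List::Gen

sub TIEARRAY  { bless $_[1] => $_[0] }
sub FETCHSIZE { scalar @{ $_[0] } }
sub FETCH {
    my ($self, $index) = @_;
    my $value = $self->[$index];
    # Failed (odd) slots yield undef; nothing is spliced, so the size is stable
    return (defined($value) && $value % 2 == 0) ? $value : undef;
}

package main;

tie my @tested => 'ConstSize', [1 .. 10];

# The caller has to skip the undef slots itself, unlike with grep
my @kept = grep { defined } @tested;
print "@kept\n";
```

Because the size is fixed, perl never reads past the end, but every consumer has to cope with undef entries.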

抚你发端 2024-08-22 04:30:03

sub FETCH {
    my ($self, $index) = @_;
    my $size = $self->FETCHSIZE;
    ...
}

Ta da!

I suspect what you're missing is they're just methods. Methods called by tie magic, but still just methods you can call yourself.
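As a sketch of what that could look like for the example in the question (the package name VarSizeChecked and the choice to return undef past the end are assumptions, not something the question specifies), FETCH can call FETCHSIZE itself and refuse to read beyond the current end:

```perl
use strict;
use warnings;

package VarSizeChecked;   # hypothetical variant of the question's VarSize

sub TIEARRAY  { bless $_[1] => $_[0] }
sub FETCHSIZE { scalar @{ $_[0] } }

sub FETCH {
    my ($self, $index) = @_;
    # Drop odd elements at this position, re-checking the size each time
    splice @$self, $index, 1
        while $index < $self->FETCHSIZE && $self->[$index] % 2;
    # Past the (possibly shrunken) end: return undef without reading the array
    return $index < $self->FETCHSIZE ? $self->[$index] : undef;
}

package main;

tie my @output => 'VarSizeChecked', [1 .. 10];

# List-context expansion still asks for the original number of slots,
# so make the trailing undefs visible instead of interpolating them
my @seen = map { defined $_ ? $_ : 'undef' } @output;
print "@seen\n";
```

This avoids the modulus-on-undef warnings inside FETCH, but the caller still receives undef for the slots that no longer exist.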

Listing out the contents of a tied array basically boils down to this:

my @array;
my $tied_obj = tied @array;
for my $idx (0..$tied_obj->FETCHSIZE-1) {
    push @array, $tied_obj->FETCH($idx);
}

return @array;

So you don't get any opportunity to control the number of iterations. Nor can FETCH reliably tell whether it's being called from @array, $array[$idx], or @array[@idxs]. This sucks. Ties kinda suck, and they're really slow: about 3 times slower than a normal method call and 10 times slower than a regular array.
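One way to observe this is to count the calls. The sketch below uses a hypothetical CountingArray class (not from the question) that tallies FETCH and FETCHSIZE calls, then compares a list-context expansion with a foreach loop. Exact counts may differ between perl versions, so none are claimed here.

```perl
use strict;
use warnings;

package CountingArray;   # hypothetical helper, for illustration only

my %calls = (FETCH => 0, FETCHSIZE => 0);

sub TIEARRAY  { bless $_[1] => $_[0] }
sub FETCHSIZE { $calls{FETCHSIZE}++; scalar @{ $_[0] } }
sub FETCH     { $calls{FETCH}++;     $_[0][ $_[1] ] }

# Return the counters accumulated so far and reset them
sub take_counts {
    my %snapshot = %calls;
    $calls{$_} = 0 for keys %calls;
    return \%snapshot;
}

package main;

tie my @tied => 'CountingArray', [10, 20, 30];

my @copy = @tied;                        # list-context expansion
my $list_counts = CountingArray::take_counts();

my @looped;
push @looped, $_ for @tied;              # foreach iteration
my $loop_counts = CountingArray::take_counts();

printf "list:    FETCHSIZE=%d FETCH=%d\n", @{$list_counts}{qw(FETCHSIZE FETCH)};
printf "foreach: FETCHSIZE=%d FETCH=%d\n", @{$loop_counts}{qw(FETCHSIZE FETCH)};
```

If the foreach line reports more FETCHSIZE calls than the list line, that matches the behavior described in the question: the loop re-checks the size as it goes, while the expansion sizes the list up front.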

Your example already breaks expectations about arrays (10 elements go in, 5 elements come out). What happens when a user asks for $array[3]? Do they get undef? One alternative is to just use the object API: if your thing doesn't behave exactly like an array, pretending it does will only add confusion. Another is to use an object with array dereferencing overloaded.
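A minimal sketch of the overloading alternative, assuming a hypothetical LazyEvens class with a hard-coded even-number filter: the object is a hash, and the core overload pragma maps array dereferencing onto a method that computes the filtered list on first use.

```perl
use strict;
use warnings;

package LazyEvens;   # hypothetical class, for illustration only

use overload
    '@{}'    => sub { $_[0]->as_array },  # @$obj and $obj->[3] go through here
    fallback => 1;

sub new {
    my ($class, @source) = @_;
    # A hash-based object, so $self->{...} is unaffected by the @{} overload
    return bless { source => [@source], cache => undef }, $class;
}

# Run the filter once, on first use, and remember the result
sub as_array {
    my ($self) = @_;
    $self->{cache} //= [ grep { $_ % 2 == 0 } @{ $self->{source} } ];
    return $self->{cache};
}

sub size { scalar @{ $_[0]->as_array } }

package main;

my $evens = LazyEvens->new(1 .. 10);
print "@$evens\n";          # prints "2 4 6 8 10"
print $evens->size, "\n";   # prints "5"
```

The filtering is lazy only in the sense that it is deferred until the first dereference; once it has run, perl sees a plain array with a stable size, so the FETCH/FETCHSIZE ordering problem does not arise.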

So, what you're doing can be done, but it's difficult to get it to work well. What are you really trying to accomplish?

混吃等死 2024-08-22 04:30:03

I think the order in which perl calls the FETCH/FETCHSIZE methods can't be changed. It's part of perl's internals.
Why not just avoid the warnings explicitly:

sub FETCH {
    my ($self, $index) = @_;
    splice @$self, $index, 1 while ($$self[$index] || 0) % 2;
    exists $$self[$index] ? $$self[$index] : '' ## replace '' with default value
}
