Perl IPC without using shared memory?
Part of the code is:
sub _getPages {
    my $self    = shift;
    my $rel_url = lc(shift);
    my @turls   = ();
    my $urls    = [];
    my $ipc_share = tie $urls, 'IPC::Shareable', undef, { destroy => 1 };

    foreach my $stag (@{$self->{SUPP_TAGS}}) {
        push(@{$urls},
            map { lc($self->_normalizeSupportURL($_->url(),
                      $self->{MECH_O}->getGlobalMechInstance()->uri->authority,
                      $self->{MECH_O}->getGlobalMechInstance()->uri->scheme)) }
            grep { ((index($_->url, $rel_url) > -1) || ($_->url =~ m{^/})) &&
                   $_->url !~ m/answer|mailto:/i }
            $self->{MECH_O}->getGlobalMechInstance()->find_all_links( text_regex    => qr/$stag/i ),
            $self->{MECH_O}->getGlobalMechInstance()->find_all_links( name_regex    => qr/$stag/i ),
            $self->{MECH_O}->getGlobalMechInstance()->find_all_links( url_abs_regex => qr/$stag/i ));
    }

    @{$urls} = uniq(@{$urls});

    foreach my $url (@{$urls}) {
        if (!exists($self->{UNQ_URLS}->{lc($url)})) {
            $self->{UNQ_URLS}->{lc($url)} = 1;
            $self->{SUPP_PROC}->start and next;
            if (eval { $self->{MECH_O}->getGlobalMechInstance()->get($url); }) {
                push(@{$urls},
                    map { lc($self->_normalizeSupportURL($_->url(),
                              $self->{MECH_O}->getGlobalMechInstance()->uri->authority,
                              $self->{MECH_O}->getGlobalMechInstance()->uri->scheme)) }
                    grep { ((index($_->url, $rel_url) > -1) || ($_->url =~ m{^/}) ||
                            $_->url =~ m/\d+\.\d+\.\d+\.\d+/) &&
                           $_->url !~ m/answer|mailto:/i }
                    $self->{MECH_O}->getGlobalMechInstance()->find_all_links( text_regex    => qr/chat/i ),
                    $self->{MECH_O}->getGlobalMechInstance()->find_all_links( name_regex    => qr/chat/i ),
                    $self->{MECH_O}->getGlobalMechInstance()->find_all_links( url_abs_regex => qr/chat/i ));
            }
            $self->{SUPP_PROC}->finish;
        }
    }
    $self->{SUPP_PROC}->wait_all_children;
    return uniq(@{$urls});
}
Basically, what I am trying to do is share $urls
between the processes so I can add URLs to it, but I keep getting:
Could not create semaphore set: No space left on device
which has something to do with the kernel (Ubuntu 10.04 LTS) parameters (SEMMNI, SEMMNS).
I increased them, but it still doesn't help, so I am probably doing something wrong here.
Is there another way (probably a Storable
-related solution...) to share an array between processes?
Thanks,
You may have already done this, but it is always good to confirm exactly what is failing and whether the change you made took effect. To confirm that this really is semget() returning ENOSPC, you could run it with:
strace -o outfile CMD
and then look for ENOSPC in
outfile
to confirm which system call returned it. To confirm that adjusting SEMMNI and SEMMNS worked, you can:
cat /proc/sys/kernel/sem
(as man proc says, SEMMNI is the fourth field and SEMMNS is the second field).
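A quick sketch of that check, plus one way to raise the limits (the values passed to sysctl here are only example numbers, not a recommendation):

```shell
# Field order in /proc/sys/kernel/sem, per proc(5):
#   SEMMSL  SEMMNS  SEMOPM  SEMMNI
cat /proc/sys/kernel/sem

# Pull out the two fields the ENOSPC error depends on; given a sample
# line "250 32000 32 128" this prints "SEMMNS=32000 SEMMNI=128"
awk '{print "SEMMNS=" $2, "SEMMNI=" $4}' /proc/sys/kernel/sem

# Raise the limits for the running kernel (example values only)
sudo sysctl -w kernel.sem="250 64000 32 256"

# Count the semaphore sets already allocated, in case something is
# leaking them instead of the limit being genuinely too low
ipcs -s | tail -n +4 | wc -l
```

If the count from ipcs keeps growing across runs, the real problem is likely leaked semaphore sets (e.g. IPC::Shareable segments not being cleaned up) rather than the limits themselves.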
Now, to address your question of "what else could I use?" directly, here are some choices:
My first choice: use threads. You don't show the code that starts the other processes, but since you are sharing a Perl array between them, I suspect all the processes run the same Perl script (or the code could be written that way). So instead of forking processes, use threads and use the thread locking primitives to share the
@urls
array between threads. I say this is my first choice because multi-threading is more commonly done with threads than with forking these days, so there are lots of great examples, and the available modules have seen lots of use (and they typically don't depend on Sys V interfaces).
My second choice would be to use File::Map to share the data between processes. Again, this avoids the Sys V interfaces and is likely every bit as fast as shared memory, since the system will of course cache pages of the shared file in RAM (you can even ask the system to pin the file into RAM if you like). As with the threads suggestion above, don't forget to use appropriate locking.
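A minimal sketch of the threads approach using threads::shared; collect() and the URLs are placeholders standing in for the real crawl logic:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

# A shared array visible to all threads; every push must hold the
# lock, otherwise concurrent pushes can interleave badly.
my @urls :shared;

sub collect {
    my ($base) = @_;
    for my $i (1 .. 3) {
        lock(@urls);
        push @urls, "$base/page$i";
    }
}

my @workers = map { threads->create(\&collect, "http://example.com/t$_") } 1 .. 2;
$_->join for @workers;

# 2 workers x 3 pushes each = 6 entries
print scalar(@urls), " urls collected\n";
```

Because @urls is marked :shared, the pushes from both worker threads land in the same array, which is exactly what the tied IPC::Shareable array was trying (and failing) to do across forked processes.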
Finally, I don't see any locking calls in your code, so is it possible that you have one process generating the URLs while the other processes access the data structure read-only? If so, another option is to feed the URLs to the sub-processes via pipes. But depending on how many URLs you typically have, and whether they really are read-only for the children, this idea may not apply.
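A bare-bones sketch of the pipe idea; the URLs are illustrative and the print in the child stands in for the real Mechanize get():

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parent writes URLs down a pipe; the child reads them one per line.
pipe(my $reader, my $writer) or die "pipe: $!";

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {                       # child: consume URLs
    close $writer;
    while (my $url = <$reader>) {
        chomp $url;
        print "child fetched $url\n";  # placeholder for the real get()
    }
    exit 0;
}

close $reader;                         # parent: produce URLs
print {$writer} "$_\n" for qw(http://example.com/a http://example.com/b);
close $writer;                         # closing the pipe is the child's EOF
waitpid($pid, 0);
```

With one writer and one reader per pipe there is no shared mutable structure at all, so no locking is needed; for several children you would open one pipe per child (or use a module such as Parallel::ForkManager's data-passing support).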
Hope that gives you some viable alternatives.