Boehm GC C++ garbage collector: Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
I am using the Boehm C++ garbage collector in an application. The application uses the Levenshtein deterministic finite automaton Python program to calculate the Levenshtein distance between two strings. I have ported the Python program to C++ on CentOS Linux using gcc 4.1.2.
Recently, I noticed that after running the application for more than 10 minutes, I get a SIGABRT with the error message: Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS. I was wondering if anyone knew how to fix or work around this problem.
Here is my gdb stack trace. Thank you.
Program received signal SIGABRT, Aborted.
(gdb) bt
#0 0x002ed402 in __kernel_vsyscall ()
#1 0x00b1bdf0 in raise () from /lib/libc.so.6
#2 0x00b1d701 in abort () from /lib/libc.so.6
#3 0x00e28db4 in GC_abort (msg=0xf36de0 "Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS")
at ../Source/misc.c:1079
#4 0x00e249a0 in GC_add_to_heap (p=0xb7cb7000, bytes=65536) at ../Source/alloc.c:812
#5 0x00e24e45 in GC_expand_hp_inner (n=16) at ../Source/alloc.c:966
#6 0x00e24fc5 in GC_collect_or_expand (needed_blocks=1, ignore_off_page=0) at ../Source/alloc.c:1032
#7 0x00e2519a in GC_allocobj (sz=6, kind=1) at ../Source/alloc.c:1087
#8 0x00e31e90 in GC_generic_malloc_inner (lb=20, k=1) at ../Source/malloc.c:138
#9 0x00e31fde in GC_generic_malloc (lb=20, k=1) at ../Source/malloc.c:194
#10 0x00e322b8 in GC_malloc (lb=20) at ../Source/malloc.c:319
#11 0x00df5ab5 in gc::operator new (size=20) at ../Include/gc_cpp.h:275
#12 0x00de7cb7 in __automata_combined_test2__::DFA::levenshtein_automata (this=0xb7b49080, term=0xb7cb5d20, k=1)
at ../Source/automata_combined_test2.cpp:199
#13 0x00e3a085 in cDedupe::AccurateNearCompare (this=0x8052cd8,
Str1_=0x81f1a1d "GEMMA OSTRANDER GEM 10
DICARLO", ' ' <repeats 13 times>, "01748SUE WOLFE SUE 268 POND", ' ' <repeats 16 times>,
"01748REGINA SHAKIN REGI16 JAMIE", ' ' <repeats 15 times>, "01748KATHLEEN MAZUR CATH10 JAMIE "
...,
Str2_=0x81f2917 "LINDA ROBISON LIN 53 CHESTNUT", ' ' <repeats 12 times>,
"01748MICHELLE LITAVIS MICH15 BLUEBERRY", ' ' <repeats 11 times>, "01748JOAN TITUS JO 6 SMITH",
' ' <repeats 15 times>, "01748MELINDA MCDOWELL MEL 24 SMITH "..., Size_=10,
Update:
I looked at the Boehm Garbage Collector source and header files and realized that the Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS error message could be fixed by adding -DLARGE_CONFIG to the CFLAGS section in my GNUmakefile.
I tested this change to my GNUmakefile and found that the Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS error message no longer occurred. However, I am now getting a segmentation fault (core dump). Using gdb, I found that the segmentation fault occurs in the following function at line 20 (which I have annotated):
set<tuple2<__ss_int, __ss_int> *> *NFA::next_state(set<tuple2<__ss_int, __ss_int> *> *states, str *input) {
tuple2<__ss_int, __ss_int> *state;
set<tuple2<__ss_int, __ss_int> *>::for_in_loop __3;
set<tuple2<__ss_int, __ss_int> *> *__0, *dest_states;
dict<str *, set<tuple2<__ss_int, __ss_int> *> *> *state_transitions;
__iter<tuple2<__ss_int, __ss_int> *> *__1;
__ss_int __2;
dest_states = (new set<tuple2<__ss_int, __ss_int> *>());
FOR_IN_NEW(state,states,0,2,3)
state_transitions = (this->transitions)->get(state, ((dict<str *, set<tuple2<__ss_int, __ss_int> *> *> *)((new dict<void *, void *>()))));
dest_states->update(state_transitions->get(input, new set<tuple2<__ss_int, __ss_int> *>()));
dest_states->update(state_transitions->get(NFA::ANY, new set<tuple2<__ss_int, __ss_int> *>()));
END_FOR
return (new set<tuple2<__ss_int, __ss_int> *>(this->_expand(dest_states),1));//line20
}
I was wondering if it was possible to modify this function to fix the segmentation fault. Thank you.
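For reference, this generated C++ corresponds to a Python method of roughly the following shape (a sketch based on the generated code above, not an exact copy of my source):

def next_state(self, states, input):
    # Collect the states reachable from 'states' on 'input',
    # treating NFA.ANY as a wildcard transition.
    dest_states = set()
    for state in states:
        state_transitions = self.transitions.get(state, {})
        dest_states.update(state_transitions.get(input, set()))
        dest_states.update(state_transitions.get(NFA.ANY, set()))
    return frozenset(self._expand(dest_states))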
Answer:
I finally figured out how to fix the GC out-of-memory segmentation fault. I replaced the setdefault and get constructs in the Python program. For example, I replaced the Python statement self.transitions.setdefault(src, {}).setdefault(input, set()).add(dest).
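A minimal sketch of this kind of rewrite (the exact code I ended up with may differ slightly) expands the setdefault chain into explicit checks, so the default dict and set are only created when the keys are actually missing:

# Sketch: replace chained setdefault calls with explicit membership checks
# so that throwaway dict()/set() defaults are not allocated on every call.
if src not in self.transitions:
    self.transitions[src] = {}
state_transitions = self.transitions[src]
if input not in state_transitions:
    state_transitions[input] = set()
state_transitions[input].add(dest)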
Also, I replaced another of the Python get/setdefault statements in the same way.
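For example (an illustrative pattern rather than my exact code), a lookup such as state_transitions.get(input, set()) can be rewritten so that the empty default set is only created when the key is actually present:

# Sketch: avoid constructing a default set() on every lookup.
if input in state_transitions:
    dest_states.update(state_transitions[input])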
Finally, I made sure to put __shedskin__.init() outside of any for or while loop, because __shedskin__.init() calls the GC allocator. The purpose of all of these changes is to reduce the pressure on the GC allocator. I have tested these changes with 3 million calls to the GC allocator and have yet to get a segmentation fault. Thank you.