OpenMP: Causes of heap corruption, anyone?
EDIT: I can run the same program twice simultaneously without any problem. How can I duplicate this with OpenMP or with some other method?
This is the basic framework of the problem.
// Defined elsewhere
class SomeClass
{
public:
    void Function()
    {
        // Allocate some memory
        float *Data;
        Data = new float[1024];

        // Declare a struct which will be used by functions defined in the DLL
        SomeStruct Obj;
        Obj = MemAllocFunctionInDLL(Obj);

        // Call it
        FunctionDefinedInDLL(Data, Obj);

        // Clean up
        MemDeallocFunctionInDLL(Obj);
        delete [] Data;
    }
};

void Bar()
{
    #pragma omp parallel for
    for (int j = 0; j < 10; ++j)
    {
        SomeClass X;
        X.Function();
    }
}
I've verified that the _CrtIsValidHeapPointer() assertion fails when MemDeallocFunctionInDLL() tries to deallocate some memory.
Is this because both threads are writing to the same memory?
So to fix this, I thought I'd make SomeClass private (this is totally alien to me, so any help is appreciated).
void Bar()
{
    SomeClass X;
    #pragma omp parallel for default(shared) private(X)
    for (int j = 0; j < 10; ++j)
    {
        X.Function();
    }
}
And now it fails when it tries to allocate memory for Data at the beginning.
Note: I can make changes to the DLL if required.
Note: It runs perfectly without #pragma omp parallel for.
EDIT: Now Bar looks like this:
void Bar()
{
    int j;
    #pragma omp parallel for default(none) private(j)
    for (j = 0; j < 10; ++j)
    {
        SomeClass X;
        X.Function();
    }
}
Still no luck.
Comments (3)
Check whether MemAllocFunctionInDLL, FunctionDefinedInDLL, and MemDeallocFunctionInDLL are thread-safe or re-entrant. In other words, do these functions use static or shared variables? If so, you need to make sure these variables are not corrupted by other threads.
The fact that everything is fine without the omp for suggests that some of these functions were not written to be thread-safe.
I'd like to see what kind of memory allocation/free functions are used in Mem(Alloc|Dealloc)FunctionInDLL.
Added: I'm pretty sure the functions in your DLL are not thread-safe. You can run the program twice concurrently without a problem. Yes, that should be okay unless your program uses system-wide shared resources (such as global memory or memory shared among processes), which is very rare. Separate processes share no variables, so your program works fine.
But invoking these functions from multiple threads (that is, within a single process) crashes your program. That means some variables are shared among the threads, and they may have been corrupted.
It's not an OpenMP problem, just a multithreading bug. It could be simple to solve. Please check whether the DLL functions are safe to call concurrently from many threads.
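One quick way to test this hypothesis is to let only one thread at a time into the library. The sketch below is not the asker's DLL: UnsafeLibraryCall and its static buffer are hypothetical stand-ins for a routine that keeps hidden shared state, and dll_calls is just a label chosen here. If the crash disappears once the calls are wrapped in a critical section, the library is very likely the part that is not thread-safe.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a DLL routine that is NOT thread-safe:
// it keeps hidden state in a static buffer shared by every caller.
static std::vector<float> g_scratch;

float UnsafeLibraryCall(const float* data, int n)
{
    assert(n > 0);
    g_scratch.assign(data, data + n);   // every call overwrites the shared buffer
    float sum = 0.0f;
    for (float v : g_scratch) sum += v;
    return sum;
}

// Runs the loop in parallel but serializes every call into the library.
float RunSerialized(int iterations)
{
    float total = 0.0f;
    #pragma omp parallel for reduction(+:total)
    for (int j = 0; j < iterations; ++j)
    {
        float data[1024];               // private: declared inside the loop body
        for (int i = 0; i < 1024; ++i) data[i] = 1.0f;

        float r = 0.0f;
        // Only one thread at a time may execute this block.
        #pragma omp critical(dll_calls)
        {
            r = UnsafeLibraryCall(data, 1024);
        }
        total += r;
    }
    return total;
}
```

Serializing the calls removes most of the parallel speedup, so this is best treated as a diagnostic step rather than a fix.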
How to privatize static variables
Say that we have some global or static variables. Privatization is nothing but creating a private copy of each one for every thread, and then making every reference to such a variable go to the calling thread's own copy.
Yes, it's pretty simple. However, this sort of code may have a false sharing problem. False sharing is a matter of performance, though, not correctness.
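A minimal sketch of this kind of privatization, under stated assumptions: s_calls, RecordCall, CountCalls, ThreadIndex, and MaxThreads are names invented for this illustration, not taken from the DLL in question. The shared counter becomes one slot per thread, indexed by omp_get_thread_num(). The _OPENMP guards let the same code build without OpenMP, in which case it runs on a single thread.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Index of the calling thread; 0 when built without OpenMP.
static int ThreadIndex()
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Upper bound on the team size; 1 when built without OpenMP.
static int MaxThreads()
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Before privatization: one counter written by every thread (a data race).
//     static int s_calls;
// After privatization: one slot per thread.
static std::vector<int> s_calls;

void RecordCall()
{
    // Was: ++s_calls;
    ++s_calls[ThreadIndex()];
}

int CountCalls(int iterations)
{
    assert(iterations >= 0);
    s_calls.assign(MaxThreads(), 0);    // sized before the parallel region starts

    #pragma omp parallel for
    for (int j = 0; j < iterations; ++j)
        RecordCall();

    // Combine the per-thread copies after the parallel region has finished.
    int total = 0;
    for (int c : s_calls) total += c;
    return total;
}
```

In C++11 and later, declaring the variable thread_local is another way to get one copy per thread without indexing by hand.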
First, just try to privatize all static and global variables. Then check correctness. Next, look at the speedup you get. If the speedup scales well (say 3.7x faster on a quad core), then it's okay. But if the speedup is low (such as 2x on a quad core), then you are probably looking at a false sharing problem. To solve false sharing, all you need to do is put some padding in the data structures.
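The padding idea can be sketched as follows. The 64-byte cache-line size is an assumption (it is common on x86 but not universal), and PaddedCounter, SumPadded, ThreadIndex, and MaxThreads are hypothetical names for this example. Each per-thread slot is padded to a full line so that two threads never write to the same cache line.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// One per-thread slot, padded so that consecutive slots sit 64 bytes apart.
// 64 is an assumed cache-line size.
struct PaddedCounter
{
    long value;
    char pad[64 - sizeof(long)];
};
static_assert(sizeof(PaddedCounter) == 64, "slot should fill one assumed cache line");

// Index of the calling thread; 0 when built without OpenMP.
static int ThreadIndex()
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Upper bound on the team size; 1 when built without OpenMP.
static int MaxThreads()
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

long SumPadded(int iterations)
{
    assert(iterations >= 0);
    std::vector<PaddedCounter> counters(MaxThreads());   // value-initialized to zero

    #pragma omp parallel for
    for (int j = 0; j < iterations; ++j)
        ++counters[ThreadIndex()].value;   // each thread touches only its own slot

    long total = 0;
    for (const PaddedCounter& c : counters) total += c.value;
    return total;
}
```

The result is the same with or without the padding; only the speed changes, which is why the speedup has to be measured to see whether the padding matters.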
Instead of delete Data; you must write delete [] Data;
Wherever you use new [], make sure to use delete [].
It looks like your problem is not specific to OpenMP. Did you try running your application without the #pragma omp parallel for?
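As a small illustration of the pairing rule, the sketch below allocates the same buffer twice: once with new [] matched by delete [], and once with std::vector, which releases its storage automatically. The function names are made up for this example.

```cpp
#include <cassert>
#include <vector>

// Raw array: new [] must be matched by delete [].
float SumWithRawArray(int n)
{
    assert(n > 0);
    float* data = new float[n];         // array form of new
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    float sum = 0.0f;
    for (int i = 0; i < n; ++i) sum += data[i];

    delete [] data;                     // array form of delete; plain delete here is undefined behavior
    return sum;
}

// std::vector: no manual delete at all, so there is nothing to mismatch.
float SumWithVector(int n)
{
    assert(n > 0);
    std::vector<float> data(n, 1.0f);   // n elements, each 1.0f

    float sum = 0.0f;
    for (float v : data) sum += v;
    return sum;                         // storage is freed when data goes out of scope
}
```

The pointer returned by data.data() can be passed to a function that expects a float*, such as the DLL call in the question.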
default(shared) means all variables are shared between threads, which is not what you want. Change that to default(none).
private(X) gives each thread its own copy of X; however, none of the copies is initialised from the original X, so any set-up you performed on it will not necessarily have been done for them.
I think you'd be better off with your initial approach: put a breakpoint in the Dealloc call and see what the memory pointer is and what it contains. You can look at the guard bytes to tell whether the memory was overwritten at the end of a single call or after a thread has run.
Incidentally, I am assuming this works if you run it once, without the omp loop?
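To illustrate why the initial approach needs no private clause: an object declared inside the loop body is constructed fresh for every iteration and is never shared between threads. In the sketch below, Worker is a hypothetical stand-in for SomeClass and the atomic counter exists only to make the constructions observable.

```cpp
#include <atomic>
#include <cassert>

// Counts constructions across all threads; atomic so concurrent increments are safe.
static std::atomic<int> g_constructed(0);

// Hypothetical stand-in for SomeClass from the question.
class Worker
{
public:
    Worker() : data_(new float[1024]) { ++g_constructed; }
    ~Worker() { delete [] data_; }
    Worker(const Worker&) = delete;             // owns a raw buffer, so copying is disabled
    Worker& operator=(const Worker&) = delete;

    float Run()
    {
        for (int i = 0; i < 1024; ++i) data_[i] = 1.0f;
        float sum = 0.0f;
        for (int i = 0; i < 1024; ++i) sum += data_[i];
        return sum;
    }

private:
    float* data_;
};

float RunAll(int iterations)
{
    assert(iterations >= 0);
    float total = 0.0f;
    #pragma omp parallel for reduction(+:total)
    for (int j = 0; j < iterations; ++j)
    {
        Worker x;               // one object per iteration, private to the executing thread
        total += x.Run();
    }
    return total;
}
```

Each Worker here owns only its own buffer, so the loop is safe; any remaining corruption in the real program would then point at state shared inside the DLL rather than at SomeClass.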