Stack overflow in Fortran 90

Posted 2024-11-03 15:03:57 · 4070 words · 6 views · 0 comments

I have written a fairly large program in Fortran 90. It has been working beautifully for quite a while, but today I tried to step it up a notch and increase the problem size (it is a research non-standard FE-solver, if that helps anyone...) Now I get the "stack overflow" error message and naturally the program terminates without giving me anything useful to work with.

The program starts with setting up all relevant arrays and matrices, and after that is done it prints a few lines of stats regarding this to a log-file. Even with my new, larger problem, this works fine (albeit a little slow), but then it fails as the "number crunching" gets going.

What confuses me is that everything at that point is already allocated (and that worked without errors). I'm not entirely sure what the stack is (Wikipedia and several threads here didn't help much, since I have only a quite basic knowledge of the "behind the scenes" workings of a computer).

Assume, for instance, that I have some arrays declared as:

INTEGER,DIMENSION(64) :: IA
REAL(8),DIMENSION(:,:),ALLOCATABLE :: AA, BB

which after some initialization routines (i.e. read input from file and such) are allocated as (I store some size-integers for easier passing to subroutines in IA of fixed size):

ALLOCATE( AA(N1,N2) , BB(N1,N2) )
IA(1) = N1
IA(2) = N2

This is basically what happens in the initial portion, and so far so good. But when I then call a subroutine

CALL ROUTINE_ONE(AA,BB,IA)

And the routine looks like (nothing fancy):

SUBROUTINE ROUTINE_ONE(AA,BB,IA)
IMPLICIT NONE
INTEGER,DIMENSION(64) :: IA
REAL(8),DIMENSION(IA(1),IA(2)) :: AA, BB
...
do lots of other stuff
...
END SUBROUTINE ROUTINE_ONE

Now I get an error! The output to the screen says:

forrtl: severe (170): Program Exception - stack overflow

However, when I run the program with the debugger it breaks at line 419 in a file called winsig.c (not my file, but probably part of the compiler?). It seems to be part of a routine called sigreterror: and it is the default case that has been invoked, returning the text Invalid signal or error. There is a comment line attached to this which strangely says /* should never happen, but compiler can't tell */ ...?

So I guess my question is, why does this happen and what is actually happening? I thought that as long as I can allocate all the relevant memory I should be fine? Does the call to the subroutine make copies of the arguments, or just pointers to them? If the answer is copies then I can see where the problem might be, and if so: any ideas on how to get around it?

The problem I try to solve is big, but not insane in any way. Standard FE-solvers can handle bigger problems than my current one. I run the program on a Dell PowerEdge 1850 and the OS is Microsoft Server 2008 R2 Enterprise. According to systeminfo at the cmd prompt I have 8GB of physical memory and almost 16GB virtual. As far as I understand the total of all my arrays and matrices should not add up to more than maybe 100MB - about 5.5M integer(4) and 2.5M real(8) (which according to me should be only about 44MB, but let's be fair and add another 50MB for overhead).

I use the Intel Fortran compiler integrated with Microsoft Visual Studio 2008.


Adding some actual source code to clarify a bit

! Update continuum state
CALL UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,&
                    bmtrx,detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
                    effstress,aa,fi,errmsg)

is the actual call to the routine. The big arrays are posc, bmtrx and aa - all others are at least an order of magnitude smaller (if not more). posc is INTEGER(4), and bmtrx and aa are REAL(8).

SUBROUTINE UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,bmtrx,&
                    detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
                    effstress,aa,fi,errmsg)

    IMPLICIT NONE

    !I/O
    INTEGER(4) :: iTask, errmsg
    INTEGER(4) :: iArray(64)
    INTEGER(4),DIMENSION(iArray(15),iArray(15),iArray(5)) :: posc
    INTEGER(4),DIMENSION(iArray(22),iArray(21)+1) :: nodedof
    INTEGER(4),DIMENSION(iArray(29),iArray(3)+2) :: elm
    REAL(8),DIMENSION(iArray(14)) :: dof, dof_k
    REAL(8),DIMENSION(iArray(12)*iArray(17),iArray(15)*iArray(5)) :: bmtrx
    REAL(8),DIMENSION(iArray(5)*iArray(17)) :: detjac
    REAL(8),DIMENSION(iArray(17)) :: w
    REAL(8),DIMENSION(iArray(23),iArray(19)) :: mtrlprops
    REAL(8),DIMENSION(iArray(8),iArray(8),iArray(23)) :: demtrx
    REAL(8) :: dt
    REAL(8),DIMENSION(2,iArray(12)*iArray(17)*iArray(5)) :: stress
    REAL(8),DIMENSION(iArray(12)*iArray(17)*iArray(5)) :: strain
    REAL(8),DIMENSION(2,iArray(17)*iArray(5)) :: effstrain, effstress
    REAL(8),DIMENSION(iArray(25)) :: aa
    REAL(8),DIMENSION(iArray(14)) :: fi 

    !Locals
    INTEGER(4) :: i, e, mtrl, i1, i2, j1, j2, k1, k2, dim, planetype, elmnodes, &
        Nec, elmpnodes, Ndisp, Nstr, Ncomp, Ngpt, Ndofelm
    INTEGER(4),DIMENSION(iArray(15)) :: doflist
    REAL(8),DIMENSION(iArray(12)*iArray(17),iArray(15)) :: belm
    REAL(8),DIMENSION(iArray(17)) :: jelm
    REAL(8),DIMENSION(iArray(12)*iArray(17)*iArray(5)) :: dstrain
    REAL(8),DIMENSION(iArray(12)*iArray(17)) :: s
    REAL(8),DIMENSION(iArray(17)) :: ep, es, dep
    REAL(8),DIMENSION(iArray(15),iArray(15)) :: kelm
    REAL(8),DIMENSION(iArray(15)) :: felm

    dim       = iArray(1)
...

And it fails before the last line above.

Comments (6)

小镇女孩 2024-11-10 15:03:57

As per steabert's request, I'll just summarize the conversation in the comments here where it's a bit more visible, even though M.S.B.'s answer already gets right to the nub of the problem.

In technical programming, where procedures often have large local arrays for intermediate computation, this happens a lot. Local variables are generally stored on the stack, which is typically (and quite reasonably) a small fraction of overall system memory -- usually of order 10MB or so. When the local variable sizes exceed the stack size, you see exactly the symptoms described here -- a stack overflow occurring after a call to the relevant subroutine but before its first executable statement.

So when this problem happens, the best thing to do is to find the relevant large local variables, and decide what to do. In this case, at least the variables belm and dstrain were getting quite sizable.
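To get a feel for how quickly such locals outgrow the stack, here is a minimal sketch that adds up the bytes two of the automatic arrays would need. The values assigned to n5, n12, n15 and n17 are purely hypothetical stand-ins for iArray(5), iArray(12), iArray(15) and iArray(17); the real values are not given in the question. The 1 MB figure is the Windows default quoted elsewhere in this thread.

```fortran
program estimate_local_arrays
  use iso_fortran_env, only: int64
  implicit none

  ! Hypothetical sizes, for illustration only (not taken from the question).
  integer(int64), parameter :: n5  = 20000_int64  ! assumed value of iArray(5)
  integer(int64), parameter :: n12 = 6_int64      ! assumed value of iArray(12)
  integer(int64), parameter :: n15 = 24_int64     ! assumed value of iArray(15)
  integer(int64), parameter :: n17 = 8_int64      ! assumed value of iArray(17)

  integer(int64), parameter :: real8_bytes = 8_int64
  integer(int64), parameter :: stack_bytes = 1024_int64 * 1024_int64  ! 1 MB default

  integer(int64) :: belm_bytes, dstrain_bytes

  ! REAL(8),DIMENSION(iArray(12)*iArray(17),iArray(15)) :: belm
  belm_bytes = n12 * n17 * n15 * real8_bytes

  ! REAL(8),DIMENSION(iArray(12)*iArray(17)*iArray(5)) :: dstrain
  dstrain_bytes = n12 * n17 * n5 * real8_bytes

  ! Guard the arithmetic so a typo in the sketch is caught at run time.
  if (belm_bytes /= 9216_int64) error stop "unexpected belm size"
  if (dstrain_bytes /= 7680000_int64) error stop "unexpected dstrain size"

  print '(a,i0,a)', 'belm    needs ', belm_bytes, ' bytes'
  print '(a,i0,a)', 'dstrain needs ', dstrain_bytes, ' bytes'
  if (belm_bytes + dstrain_bytes > stack_bytes) then
    print '(a)', 'These locals alone would not fit in a 1 MB stack.'
  end if
end program estimate_local_arrays
```

With these assumed numbers dstrain alone is several megabytes, because its size scales with iArray(5); the other locals in the routine add to the total.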

Once the variables are located, and you've confirmed that's the problem, there are a few options. As MSB points out, if you can make your arrays smaller, that's one option. Alternatively, you can make the stack size larger; under Linux, that's done with ulimit -s [newsize]. That really just postpones the problem, though, and you have to do something different on Windows machines.

The other class of ways to avoid this problem is not to put the large data on the stack, but in the rest of memory (loosely, the "heap"). You can do that by giving the arrays the save attribute (in C, static); this places the variable in static storage rather than on the stack (strictly speaking that is not the heap either) and makes the values persistent between calls. The downside is that this potentially changes the behavior of the subroutine, means the subroutine can't be used recursively, and similarly makes it non-threadsafe (if you're ever in a position where multiple threads will enter the routine simultaneously, they'll each see the same copy of the local variable and potentially overwrite each other's results). The upside is that it's easy and very portable -- it should work everywhere. However, this will only work with fixed-size local variables; if the temporary arrays have sizes that depend on the inputs, you can't do this (since there'd no longer be a single variable to save; it could be a different size every time the procedure is called).
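As a minimal sketch of the save approach and its side effect, assume a routine with one large fixed-size scratch array. The module name, routine name and the size nbig are all hypothetical, chosen only for illustration.

```fortran
module scratch_demo
  implicit none
contains
  subroutine accumulate(x, total)
    real(8), intent(in)  :: x
    real(8), intent(out) :: total
    integer, parameter :: nbig = 2000000      ! hypothetical fixed size, about 16 MB
    real(8), save :: work(nbig)               ! saved: kept out of the stack
    logical, save :: first_call = .true.

    if (first_call) then
      work = 0.0d0
      first_call = .false.
    end if

    ! The contents of work survive from one call to the next.
    work(1) = work(1) + x
    total = work(1)
  end subroutine accumulate
end module scratch_demo

program save_demo
  use scratch_demo
  implicit none
  real(8) :: t

  call accumulate(1.0d0, t)
  call accumulate(2.0d0, t)

  ! The second call sees the value left behind by the first one.
  if (t /= 3.0d0) error stop "saved array did not persist"
  print *, 'running total after two calls:', t
end program save_demo
```

The persistence shown here is exactly the behavior change mentioned above: a routine that silently relied on fresh scratch space on every call would need an explicit reset.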

There are compiler-specific options which put all arrays (or all arrays larger than some given size) on the heap rather than on the stack; every Fortran compiler I know has an option for this. For ifort, used in the OP's post, it's -heap-arrays on Linux, or /heap-arrays on Windows. For gfortran, this may actually be the default. This is good for making sure you know what's going on, but it means you have to have different incantations for every compiler to make sure your code works.

Finally, you can make the offending arrays allocatable. Allocated memory goes on the heap, but the variable which refers to it is on the stack, so you get the benefits of both approaches. Also, this is completely standard Fortran and so totally portable. The downside is that it requires code changes. Also, the allocation process can take nontrivial amounts of time; so if you're going to be calling the routine zillions of times, you may notice this slows things down slightly. (This possible performance regression is easy to fix, though; if you'll be calling it zillions of times with the same size arrays, you can have an optional argument to pass in a pre-allocated work array and use that instead, so that you only allocate/deallocate once.)

Allocating/deallocating each time would look like:

SUBROUTINE UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,bmtrx,&
                    detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
                    effstress,aa,fi,errmsg)

    IMPLICIT NONE

    !...arguments.... 


    !Locals
    !...
    REAL(8),DIMENSION(:,:), allocatable :: belm
    REAL(8),DIMENSION(:), allocatable :: dstrain

    allocate(belm(iArray(12)*iArray(17),iArray(15)))
    allocate(dstrain(iArray(12)*iArray(17)*iArray(5)))

    !... work

    deallocate(belm)
    deallocate(dstrain)

Note that if the subroutine does a lot of work (e.g., takes seconds to execute), the overhead from a couple of allocates/deallocates should be negligible. If not, and you want to avoid the overhead, using the optional arguments for preallocated workspace would look something like:

SUBROUTINE UpdateContinuumState(iTask,iArray,posc,dof,dof_k,nodedof,elm,bmtrx,&
                    detjac,w,mtrlprops,demtrx,dt,stress,strain,effstrain,&
                    effstress,aa,fi,errmsg,workbelm,workdstrain)

    IMPLICIT NONE

    !...arguments.... 
    real(8),dimension(:,:), optional, target :: workbelm
    real(8),dimension(:), optional, target :: workdstrain
    !Locals
    !...

    REAL(8),DIMENSION(:,:), pointer :: belm
    REAL(8),DIMENSION(:), pointer :: dstrain

    if (present(workbelm)) then
       belm => workbelm
    else
       allocate(belm(iArray(12)*iArray(17),iArray(15)))
    endif
    if (present(workdstrain)) then
       dstrain => workdstrain
    else
       allocate(dstrain(iArray(12)*iArray(17)*iArray(5)))
    endif

    !... work

    if (.not.(present(workbelm))) deallocate(belm)
    if (.not.(present(workdstrain))) deallocate(dstrain)
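
On the calling side, the optional-workspace pattern might be used roughly as follows. This is a cut-down sketch, not the original routine: the module work_demo, the routine update_state and the size n are hypothetical. One point worth stating is that optional and assumed-shape dummy arguments require an explicit interface, which is why the routine sits in a module here.

```fortran
module work_demo
  implicit none
contains
  subroutine update_state(n, res, work)
    integer, intent(in)  :: n
    real(8), intent(out) :: res
    real(8), dimension(:), optional, target :: work  ! caller-supplied scratch
    real(8), dimension(:), pointer :: tmp
    integer :: i

    if (present(work)) then
      tmp => work            ! reuse the caller's buffer, no allocation
    else
      allocate(tmp(n))       ! fall back to a heap allocation per call
    end if

    do i = 1, n
      tmp(i) = real(i, 8)
    end do
    res = sum(tmp(1:n))

    if (.not. present(work)) deallocate(tmp)
  end subroutine update_state
end module work_demo

program reuse_workspace
  use work_demo
  implicit none
  integer, parameter :: n = 1000               ! hypothetical problem size
  real(8), allocatable, target :: scratch(:)
  real(8) :: r_reused, r_fresh
  integer :: k

  allocate(scratch(n))                         ! allocate once, outside the loop
  do k = 1, 100
    call update_state(n, r_reused, work=scratch)
  end do

  call update_state(n, r_fresh)                ! same routine without workspace

  if (r_reused /= r_fresh) error stop "results differ"
  print *, 'with workspace:', r_reused, ' without:', r_fresh
  deallocate(scratch)
end program reuse_workspace
```

Both call styles give the same result; the only difference is where the scratch memory comes from and how often it is allocated.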
无所谓啦 2024-11-10 15:03:57

Not all of the memory is created when the program starts. When you call the subroutine, the executable creates the memory that the subroutine needs for local variables. Typically, arrays with simple declarations that are local to that subroutine -- neither allocatable nor pointer -- are allocated on the stack. You could simply have run out of stack space when you reached these declarations. You might have reached a 2GB limit on a 32-bit OS with some array. Sometimes executable statements implicitly create a temporary array on the stack.

Possible solutions: 1) make your arrays smaller (not attractive), 2) make the stack larger, 3) some compilers have options to switch from placing arrays on the stack to dynamically allocating them, similar to the method used for "allocate", 4) identify large arrays and make them allocatable.
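
As a sketch of the "implicit temporary" case: passing an array expression as an actual argument gives the compiler nothing to point at, so it has to materialize the result somewhere before the call. Whether that temporary lands on the stack or the heap depends on the compiler and its options, so treat the comments below as an assumption rather than a guarantee. The names and the size n are hypothetical, and n is kept small so the example runs even with a small stack.

```fortran
program hidden_temporaries
  implicit none
  integer, parameter :: n = 200                ! hypothetical size, kept small
  real(8), allocatable :: a(:,:), b(:,:), tmp(:,:)
  real(8) :: s_implicit, s_explicit

  allocate(a(n,n), b(n,n), tmp(n,n))
  a = 1.0d0
  b = 2.0d0

  ! "a + b" has no storage of its own, so the compiler must create an
  ! n*n temporary to pass to total(); some compilers place it on the stack.
  s_implicit = total(a + b)

  ! Writing the result into an allocatable array first keeps the large
  ! intermediate in memory that the program allocated explicitly.
  tmp = a + b
  s_explicit = total(tmp)

  if (s_implicit /= s_explicit) error stop "sums differ"
  print *, 'sum via expression:', s_implicit, ' via explicit array:', s_explicit

contains

  function total(x) result(s)
    real(8), intent(in) :: x(:,:)
    real(8) :: s
    s = sum(x)
  end function total

end program hidden_temporaries
```

If a stack overflow shows up at an executable statement rather than at routine entry, expressions like the first call are a reasonable place to start looking.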

糖果控 2024-11-10 15:03:57

The stack is the memory area where the information needed to return from a function, and the information locally defined in a function is stored. So a stack overflow may indicate you have a function that calls another function which in its turn calls another function, etc.

I am not familiar with Fortran (anymore), but another cause might be that those functions declare tons of local variables, or at least variables that need a lot of space.

A last one: the stack is typically rather small, so it's not a priori relevant how much memory the machine has. It should be quite simple to instruct the linker to increase the stack size, at least if you are certain it's just a lack of space, and not a bug in your application.

Edit: do you use recursion in your program? Recursive calls can eat through the stack very quickly.

Edit: have a look at this: (emphasis mine)

On Windows, the stack space to be
reserved for the program is set using
the /Fn compiler option, where n is
the number of bytes. Additionally,
the stack reserve size can be
specified through the Visual Studio
IDE which adds the Microsoft Linker
option /STACK: to the linker command
line. To set this, go to Property
Pages>Configuration
Properties>Linker>System>Stack Reserve
Size. There you can specify the stack
size in bytes in either decimal or
C-language notation. If not specified,
the default stack size is 1MB.

情话难免假 2024-11-10 15:03:57

The only problem I ran into with similar test code is the 2 GB allocation limit for 32-bit compilation. When I exceed it, I get an error message at line 419 in winsig.c

2GB Allocation Limit Error

Here is the test code

program FortranCon

implicit none

! Variables
INTEGER :: IA(64), S1
REAL(8), DIMENSION(:,:), ALLOCATABLE :: AA, BB
REAL(4) :: S2
INTEGER, PARAMETER :: N = 10960
IA(1)=N
IA(2)=N

ALLOCATE( AA(N,N), BB(N,N) )
AA(1:N,1:N) = 1D0
BB(1:N,1:N) = 2D0

CALL TEST(AA,BB,IA)

S1 = SIZEOF(AA)                 !Size of each array
S2 = 2*DBLE(S1)/1024/1024       !Total size for 2 arrays in Mb

WRITE (*,100) S2, ' Mb'         ! When allocation reached 2Gb then
100 FORMAT (F8.1,A)                 ! exception occurs in Win32

DEALLOCATE( AA, BB )

end program FortranCon


SUBROUTINE TEST(AA,BB,IA)
IMPLICIT NONE
INTEGER, DIMENSION(64),INTENT(IN) :: IA    
REAL(8), DIMENSION(IA(1),IA(2)),INTENT(INOUT) :: AA,BB

! ... do stuff with AA, BB
END SUBROUTINE

With N=10960 it runs OK, showing 1832.9 MB. With N=11960 it crashes. Of course, when I compile for x64 it works fine. Each array takes 8*N^2 bytes of storage. I don't know if it helps, but I recommend using the INTENT attribute for the dummy arguments.

甚是思念 2024-11-10 15:03:57

Are you using any parallelization? That can be a problem with statically declared arrays. Try making all the bigger arrays ALLOCATABLE; otherwise, they will be placed on the stack in autoparallel or OpenMP threads.

不离久伴 2024-11-10 15:03:57

For me the issue was the stack reserve size. I changed the stack reserve size from 0 to 100000000 and recompiled the code. The code now runs smoothly.
