Trying to work around a numpy.core._exceptions._ArrayMemoryError in my code
I have a data frame, data, with shape (10000, 257). I need to preprocess this data frame so that I can use it in an LSTM, which requires a 3-dimensional input: (nrows, ntimesteps, nfeatures). I am working with the code snippet provided here:
def univariate_processing(variable, window):
    import numpy as np
    # create empty 2D matrix from variable
    V = np.empty((len(variable)-window+1, window))
    # take each row/time window
    for i in range(V.shape[0]):
        V[i,:] = variable[i : i+window]
    V = V.astype(np.float32)  # set common data type
    return V
def RNN_regprep(df, y, len_input, len_pred):  #, test_size):
    # create 3D matrix for multivariate input
    X = np.empty((df.shape[0]-len_input+1, len_input, df.shape[1]))
    # Iterate univariate preprocessing on all variables - store them in X
    for i in range(df.shape[1]):
        X[:, :, i] = univariate_processing(df[:,i], len_input)
    # create 2D matrix of y sequences
    y = y.reshape((-1,))  # reshape to 1D if needed
    Y = univariate_processing(y, len_pred)
    ## Trim dataframes as explained
    X = X[:-(len_pred + 1), :, :]
    Y = Y[len_input:-1, :]
    # Set common datatype
    X = X.astype(np.float32)
    Y = Y.astype(np.float32)
    return X, Y
X, y = RNN_regprep(data, label, len_input=200, len_pred=1)
Running this produces the following error:
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 28.9 GiB for an array with shape (10000, 200, 257) and data type float64
I do understand that this is more of an issue with the memory available on my server. Is there anything I can change within my code to avoid this memory error, or at least reduce the memory consumption?
Answers (1)
This is what windowed views are for. Using my recipe here:
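A minimal sketch of such a windowing recipe, built on numpy.lib.stride_tricks.as_strided; the name window_view and the stand-in array var are illustrative assumptions, not necessarily the exact original recipe:

import numpy as np

def window_view(a, window):
    # Sliding windows along axis 0, returned as a read-only zero-copy view.
    # For input of shape (n, f) the result has shape (n - window + 1, window, f)
    # and shares its memory with a.
    shape = (a.shape[0] - window + 1, window) + a.shape[1:]
    strides = (a.strides[0], a.strides[0]) + a.strides[1:]
    return np.lib.stride_tricks.as_strided(a, shape=shape,
                                           strides=strides, writeable=False)

var = np.random.rand(10000, 257).astype(np.float32)  # stands in for the question's data
windows = window_view(var, 200)  # (9801, 200, 257) view, no new 3D allocation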
Now you have a windowed view over var. But, importantly, it's using the exact same data as var, just looking into it in a windowed way. So you can do:
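A sketch of that step, reusing window_view from above and mirroring the trimming in the question's RNN_regprep with len_input=200 and len_pred=1 (the label array is a stand-in):

label = np.random.rand(10000).astype(np.float32)  # stands in for the question's label
X = window_view(var, 200)  # (9801, 200, 257), still a view into var
Y = window_view(label, 1)  # (10000, 1)
# Trim exactly as the question's RNN_regprep does for len_pred=1:
X = X[:-2, :, :]   # X[:-(len_pred + 1)]
Y = Y[200:-1, :]   # Y[len_input:-1]

Since slicing a view yields another view, X and Y here never materialize the full float64 array that triggered the error.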
That should significantly reduce your memory allocation, no "magic" required :)
You can also try a library windowing function, which does almost the same thing. In this case, your answer would be:
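Assuming the suggestion refers to something like numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20+), a sketch would be:

from numpy.lib.stride_tricks import sliding_window_view

# Windows over axis 0 come out with shape (9801, 257, 200), so move the
# window axis to the middle for the (samples, timesteps, features) layout
# the LSTM expects. Both the windowing and the transpose are zero-copy views.
X = sliding_window_view(var, window_shape=200, axis=0).transpose(0, 2, 1)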