RuntimeError when normalizing the input of the ReLU function

When I try to pass the maximum activation value from the previous layer to normalize the input of the ReLU in the next layer, I encounter the runtime error below. However, when I pass a fixed value instead, it works fine without any error.

File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 175, in backward allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: Trying to backward through the graph a second time (or directly access saved
tensors after they have already been freed). Saved intermediate values of the graph are freed
when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to
backward through the graph a second time or if you need to access saved tensors after calling
backward.
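
For reference, this error generally means that a backward pass revisits a part of the autograd graph whose saved tensors were already freed by an earlier backward() call. A minimal toy example (unrelated to my model) that triggers the same message:

import torch

# Toy example: calling backward() twice through the same graph raises the same
# RuntimeError, because the intermediate tensors saved for the backward pass
# are freed after the first backward() call.
x = torch.randn(3, requires_grad=True)
y = torch.sigmoid(x).sum()
y.backward()   # first backward pass: the graph's saved tensors are freed here
y.backward()   # second backward pass: RuntimeError, graph already freed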

As you can see in the code below, I pass the argument prev_layer_max from the previous layer and encounter the error:

import torch.nn as nn
import torch.nn.functional as F

class th_norm_ReLU(nn.Module):
    def __init__(self, modify):
        super(th_norm_ReLU, self).__init__()
        self.therelu = F.relu

    def forward(self, input, prev_layer_max):
        # scale the input so that its maximum equals prev_layer_max, then apply ReLU
        output = input * (prev_layer_max / input.max())
        norm_output = self.therelu(output)
        return norm_output

But if I use a fixed value instead of the passed prev_layer_max argument, as in the code below where I make it equal to 1, it works normally without any error:

    def forward(self, input, prev_layer_max = 1):
        output = input * (1 / input.max())
        norm_output = self.therelu(output)
        return norm_output

The training loop is as follows:

for epoch in range(params.epochs):
    running_loss = 0

    start_time = time.time()
    for i, (images, labels) in enumerate(train_loader):
        model.train()
        model.zero_grad()  
        optimizer.zero_grad()  
        labels.to(device)
        images = images.float().to(device)
        outputs = model(images, epoch)
        loss = criterion(outputs.cpu(), labels)  
        running_loss += loss.item()  
        loss.backward()  
        optimizer.step()

Here is the forward method of the model, where I record the max activation of each layer in a list (thresh_list):

def forward(self, input, epoch):
    x = self.conv1(input)
    x = self.relu(x, 1)
    self.thresh_list[0] = max(self.thresh_list[0], x.max())  # to get the max activation
    x = self.conv_dropout(x)
    x = self.conv2(x)
    x = self.relu(x, self.thresh_list[0])
    self.thresh_list[1] = max(self.thresh_list[1], x.max())
    x = self.pool1(x)
    x = self.conv_dropout(x)
    x = self.conv3(x)
    x = self.relu(x, self.thresh_list[1])
    self.thresh_list[2] = max(self.thresh_list[2], x.max())

The ReLU function I call is:

self.relu = th_norm_ReLU(True)
        

and the th_norm_ReLU module is shown above.
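
To make this self-contained, here is a small standalone sketch (toy tensors and made-up names, not my actual network) of the same pattern of storing x.max() across iterations and feeding it into the next forward pass; it hits the same RuntimeError on the second iteration:

import torch

# Toy sketch with made-up names (w, stored_max), not the real model:
# the stored max is itself part of the autograd graph of the iteration
# in which it was computed.
w = torch.randn(3, requires_grad=True)
stored_max = None
for step in range(2):
    x = torch.sigmoid(w)
    if stored_max is not None:
        x = x * stored_max        # links this graph to the previous iteration's graph
    stored_max = x.max()          # graph-connected, like the entries of thresh_list
    x.sum().backward()            # second iteration: RuntimeError, graph already freed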
