nn.Embedding layers do not get trained

Posted on 2025-01-11 09:33:38

While building a new neural network, I used a feature embedding layer as the input layer to embed categorical features. However, I noticed that these embeddings (which should be trainable) are not updated during training. Everything except the embeddings gets updated.

Since I have several categorical features, I define my FeatureEmbedder module as follows.

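from typing import List

import torch
from torch import nn
from gluonts.core.component import validated  # assumption: @validated is the GluonTS decorator


class FeatureEmbedder(nn.Module):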
    # Discuss if we need this implementation or if the FeatureEmbedded from gluonts.torch.model.modules.feature works
    @validated()
    def __init__(
        self,
        cardinalities: List[int],
        embedding_dimensions: List[int],
    ):
        super().__init__()
        
        self._num_embedded_features = len(cardinalities)
        self.embedders = [
            torch.nn.Embedding(num_embeddings=card, embedding_dim=dim)
            for card, dim in zip(cardinalities, embedding_dimensions)
        ]

    def forward(self, features):
        """
        :param features: (-1, num_features)
        :return:
            Embedding with shape (-1, sum([self.embedding_dimensions]))
        """
        embedded_features = torch.cat(
            [
                embedder(features[:, i].long())
                for i, embedder in enumerate(self.embedders)
            ],
            dim=-1,
        )
        return embedded_features

I noticed that, when I first call my training function and print the model, the FeatureEmbedder submodule does not seem to have any parameters, which I cannot explain:

  | Name                      | Type                        | Params
--------------------------------------------------------------------------
0 | model                     | NNetwork                    | 193   
1 | model.embedder            | FeatureEmbedder             | 0     
2 | model.model               | Sequential                  | 160   
3 | model.model.0             | Linear                      | 50    
4 | model.model.1             | ReLU                        | 0     
5 | model.model.2             | Linear                      | 110   
6 | model.model.3             | ReLU                        | 0     
7 | model.output_layer        | Sequential                  | 33    
8 | model.output_layer.linear | Linear                      | 33    
9 | loss                      | ScaledNegativeLogLikelihood | 0     
--------------------------------------------------------------------------
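
The zero in the Params column can be checked outside the training loop. The following is only a minimal sketch: ListEmbedder is a hypothetical stand-in for the FeatureEmbedder above (without @validated), count_parameters is a helper introduced here, and the cardinalities and dimensions are made up. It counts what module.parameters() exposes when the embeddings are stored in a plain Python list.

```python
import torch
from torch import nn


class ListEmbedder(nn.Module):
    """Hypothetical stand-in that stores its embeddings in a plain Python list."""

    def __init__(self, cardinalities, embedding_dimensions):
        super().__init__()
        self.embedders = [
            nn.Embedding(num_embeddings=card, embedding_dim=dim)
            for card, dim in zip(cardinalities, embedding_dimensions)
        ]


def count_parameters(module: nn.Module) -> int:
    """Number of scalar parameters that module.parameters() exposes."""
    return sum(p.numel() for p in module.parameters())


# Made-up sizes, for illustration only.
list_embedder = ListEmbedder(cardinalities=[5, 3], embedding_dimensions=[2, 2])

# Every embedding weight requires grad, yet none is registered on the module.
all_require_grad = all(e.weight.requires_grad for e in list_embedder.embedders)
registered = count_parameters(list_embedder)
print(all_require_grad, registered)
```

If this count is zero, an optimizer built from model.parameters() never receives the embedding weights, which would be consistent with the summary table above.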

Can someone help explain why my embeddings are not trained with the rest of the network? I have checked, and as far as I can tell all tensors / embedders require grad (xx.requires_grad is True).
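
For comparison, here is a hedged sketch of one possible remedy, not necessarily the only one: holding the embeddings in nn.ModuleList, which registers each of them as a submodule. RegisteredEmbedder, the sizes, the toy loss, and the SGD settings are all assumptions made for illustration.

```python
import torch
from torch import nn


class RegisteredEmbedder(nn.Module):
    """Hypothetical variant of FeatureEmbedder that uses nn.ModuleList."""

    def __init__(self, cardinalities, embedding_dimensions):
        super().__init__()
        # nn.ModuleList registers every embedding as a submodule of this module.
        self.embedders = nn.ModuleList(
            [
                nn.Embedding(num_embeddings=card, embedding_dim=dim)
                for card, dim in zip(cardinalities, embedding_dimensions)
            ]
        )

    def forward(self, features):
        return torch.cat(
            [emb(features[:, i].long()) for i, emb in enumerate(self.embedders)],
            dim=-1,
        )


torch.manual_seed(0)
embedder = RegisteredEmbedder(cardinalities=[5, 3], embedding_dimensions=[2, 2])
num_registered = sum(p.numel() for p in embedder.parameters())  # 5*2 + 3*2

optimizer = torch.optim.SGD(embedder.parameters(), lr=0.1)
before = embedder.embedders[0].weight.detach().clone()

features = torch.tensor([[0, 1], [4, 2]])  # made-up category indices
loss = embedder(features).sum()  # toy loss, only to produce gradients
loss.backward()
optimizer.step()

# The first embedding table should differ from its snapshot after one step.
changed = not torch.equal(before, embedder.embedders[0].weight.detach())
print(num_registered, changed)
```

With the weights registered, they show up in parameters() and are handed to the optimizer, so requires_grad being True is necessary but not sufficient on its own.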

Thanks!
