Embedding layer - torch.nn.Embedding in PyTorch


I'm quite new to NNs, so sorry if my question is quite dumb. I was just reading code on GitHub and found that the pros use embeddings (in that case not word embeddings), but may I please just ask in general:

  1. Does the embedding layer have trainable variables that learn over time to improve the embedding?
  2. Could you provide some intuition about it and the circumstances in which to use it? For example, would house price regression benefit from it?
  3. If so (that it learns), what is the difference from just using linear layers?
>>> embedding = nn.Embedding(10, 3)
>>> input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
>>> input
tensor([[1, 2, 4, 5],
        [4, 3, 2, 9]])

>>> embedding(input)
tensor([[[-0.0251, -1.6902,  0.7172],
         [-0.6431,  0.0748,  0.6969],
         [ 1.4970,  1.3448, -0.9685],
         [-0.3677, -2.7265, -0.1685]],

        [[ 1.4970,  1.3448, -0.9685],
         [ 0.4362, -0.4004,  0.9400],
         [-0.6431,  0.0748,  0.6969],
         [ 0.9124, -2.3616,  1.1151]]])


Comments (1)

月下客 2025-01-24 01:51:12


In short, the embedding layer has learnable parameters, and the usefulness of the layer depends on what inductive bias you want to impose on the data.

Does the embedding layer have trainable variables that learn over time to improve the embedding?

Yes, as stated in the docs under the Variables section, it has an embedding weight that is altered during the training process.
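For illustration, here is a minimal check (my own sketch, not part of the original answer) that the lookup table is an ordinary trainable parameter:

import torch.nn as nn

embedding = nn.Embedding(10, 3)
# The lookup table itself is the trainable variable:
print(type(embedding.weight))          # <class 'torch.nn.parameter.Parameter'>
print(embedding.weight.requires_grad)  # True
print(embedding.weight.shape)          # torch.Size([10, 3])

Any optimizer given embedding.parameters() will update this matrix during training, just like a linear layer's weight.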

Could you provide some intuition about it and the circumstances in which to use it? For example, would house price regression benefit from it?

An embedding layer is commonly used in NLP tasks where the input is tokenized. This means that the input is discrete in a sense and can be used to index the weight (which is basically what the embedding layer does in forward mode). This discrete treatment implies that inputs like 1, 2, and 42 are entirely different (until a semantic correlation has been learnt). House price regression has a continuous input space, where values such as 1.0 and 1.1 are likely more correlated than the values 1.0 and 42.0. This kind of assumption about the hypothesis space is called an inductive bias, and pretty much every machine learning architecture conforms to some sort of inductive bias. I believe it is possible to use embedding layers for regression problems, which would require some kind of discretization, but the model would not benefit from it.
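As a sketch of where an embedding does fit in a house-price setting: a genuinely categorical input can be embedded while continuous inputs are fed in directly. The model below and the names in it (HousePriceModel, a neighborhood-id feature) are illustrative assumptions, not something from the question:

import torch
import torch.nn as nn

class HousePriceModel(nn.Module):
    def __init__(self, num_neighborhoods, emb_dim=4):
        super().__init__()
        # Embed the discrete feature; concatenate the continuous one as-is.
        self.neighborhood_emb = nn.Embedding(num_neighborhoods, emb_dim)
        self.head = nn.Linear(emb_dim + 1, 1)

    def forward(self, neighborhood_id, area):
        emb = self.neighborhood_emb(neighborhood_id)    # (batch, emb_dim)
        x = torch.cat([emb, area.unsqueeze(1)], dim=1)  # (batch, emb_dim + 1)
        return self.head(x).squeeze(1)                  # (batch,)

model = HousePriceModel(num_neighborhoods=50)
price = model(torch.LongTensor([3, 17]), torch.tensor([120.0, 85.5]))
print(price.shape)  # torch.Size([2])

Discretizing the continuous inputs themselves (e.g. binning the area and embedding the bin index) would throw away the ordering that makes 1.0 closer to 1.1 than to 42.0, which is exactly the point above.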

If so (that it learns), what is the difference from just using linear layers?

There is a big difference: a linear layer performs a matrix multiplication with its weight, as opposed to using it as a lookup table. During backpropagation through the embedding layer, gradients only propagate to the rows corresponding to the indices used in the lookup, and gradients for duplicate indices are accumulated.
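Here is a small sketch of both claims (assuming a bias-free linear layer so the parameters match): an embedding lookup equals multiplying a one-hot vector by the same weight matrix, and its gradient only touches the rows that were looked up:

import torch
import torch.nn as nn
import torch.nn.functional as F

embedding = nn.Embedding(10, 3)
linear = nn.Linear(10, 3, bias=False)
with torch.no_grad():
    linear.weight.copy_(embedding.weight.t())  # share the same parameters

idx = torch.LongTensor([1, 2, 4])
one_hot = F.one_hot(idx, num_classes=10).float()
print(torch.allclose(embedding(idx), linear(one_hot)))  # True

# Gradients only reach the rows used in the lookup,
# and duplicate indices accumulate:
embedding(torch.LongTensor([4, 4])).sum().backward()
print(embedding.weight.grad[4])  # tensor([2., 2., 2.]) -- row 4 used twice
print(embedding.weight.grad[0])  # tensor([0., 0., 0.]) -- row 0 never used

The lookup avoids materializing the one-hot matrix and the full matrix product, which is why embeddings scale to vocabularies with millions of rows.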
