The flair model can give a representation of any word (it can handle the OOV problem), while the BERT model splits an unknown word into several sub-words.
For example, the word "hjik" will be represented by one vector in flair, while in BERT it will be divided into several pieces (because it's OOV), and we will therefore get one vector per sub-word. So from flair we'll have one vector, while from BERT we might have two or more vectors.
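To make the splitting behaviour concrete, here is a minimal sketch of WordPiece-style greedy longest-match-first tokenization, the scheme BERT uses. The toy vocabulary below is invented for illustration; real BERT models use a learned vocabulary of roughly 30,000 pieces, so the exact split of "hjik" will differ.

```python
# Sketch of WordPiece-style greedy longest-match-first tokenization.
# The vocabulary here is a toy example, not a real BERT vocab.
def wordpiece_tokenize(word, vocab):
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub  # continuation pieces are marked with "##"
            if sub in vocab:
                piece = sub
                break
            end -= 1  # shrink the candidate span from the right
        if piece is None:
            return ["[UNK]"]  # nothing matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

vocab = {"h", "##ji", "##k", "play", "##ing"}
print(wordpiece_tokenize("hjik", vocab))     # ['h', '##ji', '##k']
print(wordpiece_tokenize("playing", vocab))  # ['play', '##ing']
```

An OOV word like "hjik" thus yields several sub-word pieces, and the model produces one vector per piece, which is exactly the mismatch the question is about.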
The question here is: how does the flairNLP library handle this issue?
NOTE: If you don't know, can you at least suggest a proper way to handle it?
The TransformerWordEmbeddings class has default handling for words split into multiple sub-words, which you can control with the subtoken_pooling parameter (your choices are "first", "last", "first_last", and "mean"); see the info here: https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/TRANSFORMER_EMBEDDINGS.md#pooling-operation
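The pooling strategies named above can be sketched in a few lines of NumPy. This is an illustration of what each option computes over the sub-word vectors, not flair's actual implementation; the vectors below are made-up numbers standing in for one OOV word's sub-word embeddings.

```python
import numpy as np

# Sketch of pooling several sub-word vectors into one word vector,
# mirroring the subtoken_pooling options ("first", "last", "first_last", "mean").
def pool_subtokens(subword_vectors, strategy="first"):
    vecs = np.asarray(subword_vectors)
    if strategy == "first":
        return vecs[0]                               # keep the first piece's vector
    if strategy == "last":
        return vecs[-1]                              # keep the last piece's vector
    if strategy == "first_last":
        return np.concatenate([vecs[0], vecs[-1]])   # doubles the dimension
    if strategy == "mean":
        return vecs.mean(axis=0)                     # element-wise average
    raise ValueError(f"unknown pooling strategy: {strategy}")

# Three sub-word vectors of dimension 4, e.g. for the pieces of "hjik":
subs = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0]]
print(pool_subtokens(subs, "mean"))              # [5. 6. 7. 8.]
print(pool_subtokens(subs, "first_last").shape)  # (8,)
```

In flair itself you would just pass the option when constructing the embedding, e.g. `TransformerWordEmbeddings('bert-base-uncased', subtoken_pooling='mean')`, and each token then comes back with a single vector regardless of how many pieces BERT split it into.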