Adding reserved tokens to tft.vocabulary
I would like to append words that are not part of the training samples (i.e. the <mask> and <pad> tokens) to the vocabulary created by tft.vocabulary.
I see in the docs that the tft.vocabulary function can take an argument key_fn, about which the docs say:
Supply key_fn if you would like to generate a vocabulary with coverage over specific keys.
But with the key_fn below, it still does not append the <mask> and <pad> tokens to the vocabulary.
def _key_fn(x):
    return tf.constant(['<mask>', '<pad>'])

vocab = tft.vocabulary(
    words,
    key_fn=lambda x: _key_fn(x),
    top_k=config.VOCAB_SIZE
)
Comments (1)
What is it that you're trying to achieve?
I don't think that key_fn is related, as it only affects the ordering of the vocabulary (and the top k, when provided). Could you compute the vocabulary after appending the added information?
tft.vocabulary(tf.strings.join([words, '<mask>', '<pad>']), ...)
This would result in the vocabulary including the added suffix.
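Note that tf.strings.join concatenates strings elementwise, so the call above attaches the suffix to every word rather than adding <mask> and <pad> as standalone entries. If standalone reserved tokens are the goal, one alternative (not from the comment above, just a sketch) is to post-process the vocabulary after it has been computed. The example below assumes the vocabulary is available as a plain text file with one token per line and no stored frequencies; the helper names prepend_reserved_tokens and rewrite_vocab_file, and the choice of putting the reserved tokens first, are hypothetical.

```python
from typing import Iterable, List, Sequence

# Hypothetical reserved tokens; their order here decides their integer ids.
RESERVED_TOKENS = ("<pad>", "<mask>")


def prepend_reserved_tokens(
    vocab_tokens: Iterable[str],
    reserved: Sequence[str] = RESERVED_TOKENS,
) -> List[str]:
    """Return a new token list with the reserved tokens placed first.

    Reserved tokens that already occur in the vocabulary are not duplicated.
    """
    reserved_set = set(reserved)
    rest = [tok for tok in vocab_tokens if tok not in reserved_set]
    return list(reserved) + rest


def rewrite_vocab_file(src_path: str, dst_path: str) -> None:
    """Copy a one-token-per-line vocabulary file, adding the reserved tokens.

    Assumes the file holds only tokens (no frequency column).
    """
    with open(src_path, encoding="utf-8") as f:
        tokens = [line.rstrip("\n") for line in f if line.strip()]
    with open(dst_path, "w", encoding="utf-8") as f:
        for tok in prepend_reserved_tokens(tokens):
            f.write(tok + "\n")
```

With this approach the final vocabulary has up to two more entries than top_k, so you may want to pass a top_k reduced by the number of reserved tokens if the total size has to match config.VOCAB_SIZE.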