tfhub的Roberta示例产生了错误的“在变体主机 - >设备复制中:张量类型的非DMA-COPY尝试:字符串”

发布于 2025-01-31 12:34:09 字数 1616 浏览 6 评论 0 原文

我想使用TFHUB的 Roberta-Base 模型。我正在尝试运行句子馈送到模型作为输入时。我会收到以下错误无效的参数:在变体主机 - >设备复制中:张量类型的非DMA-COPY尝试:String 。我使用的是Python 3.7,Tensorflow 2.5和Tensorflow_hub 0.12。

如果我尝试用相应的bert版本替换预处理器和编码器,则以上代码可以正常工作。但是,我也希望它也为罗伯塔(Roberta)工作(如下所示)。

preprocess = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4", trainable=True)
# define a text embedding model
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
preprocessor = hub.KerasLayer("https://tfhub.dev/jeongukjae/roberta_en_cased_preprocess/1")
encoder_inputs = preprocessor(text_input)

encoder = hub.KerasLayer("https://tfhub.dev/jeongukjae/roberta_en_cased_L-12_H-768_A-12/1", trainable=True)
encoder_outputs = encoder(encoder_inputs)
pooled_output = encoder_outputs["pooled_output"]      # [batch_size, 768].
sequence_output = encoder_outputs["sequence_output"]  # [batch_size, seq_length, 768].

model = tf.keras.Model(text_input, pooled_output)

# You can embed your sentences as follows
sentences = tf.constant(["(your text here)"])
print(model(sentences))

此外,如果我使用CPU而不是GPU,则与Roberta预处理器/编码器上面的代码似乎可以工作因为我需要在大量数据上微调模型。

I would like to use the roberta-base model from tfhub. I am trying to run the example below, although I get an error when I try to feed sentences to model as input. I get the following error Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string. I am using python 3.7, tensorflow 2.5, and tensorflow_hub 0.12.

If I try to replace preprocessor and encoder with the corresponding BERT versions, the code above works. However, I would like it to work for RoBERTa as well (as shown below).

preprocess = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4", trainable=True)
# define a text embedding model
text_input = tf.keras.layers.Input(shape=(), dtype=tf.string)
preprocessor = hub.KerasLayer("https://tfhub.dev/jeongukjae/roberta_en_cased_preprocess/1")
encoder_inputs = preprocessor(text_input)

encoder = hub.KerasLayer("https://tfhub.dev/jeongukjae/roberta_en_cased_L-12_H-768_A-12/1", trainable=True)
encoder_outputs = encoder(encoder_inputs)
pooled_output = encoder_outputs["pooled_output"]      # [batch_size, 768].
sequence_output = encoder_outputs["sequence_output"]  # [batch_size, seq_length, 768].

model = tf.keras.Model(text_input, pooled_output)

# You can embed your sentences as follows
sentences = tf.constant(["(your text here)"])
print(model(sentences))

enter image description here

Additionally, the code above with the RoBERTa preprocessor/encoder seems to work if I use CPU instead of GPU (adding with tf.device('/cpu:0')), but this is not feasible because I need to fine-tune a model on lots of data.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。
列表为空,暂无数据
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文