How to build a Distiller class for BERT and variants using TF models from Hugging Face


I'm trying to build a "Distillation" class for Knowledge Distillation using TF models from Hugging Face. I started with this and tried to modify it.

I built a dataset that looks like this:

DatasetDict({
train: Dataset({
    features: ['text', 'labels', 'input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 512
})
validation: Dataset({
    features: ['text', 'labels', 'input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 128
})
test: Dataset({
    features: ['text', 'labels', 'input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 160
})})

Calling features on the "train" split, I get:

{'labels': ClassLabel(num_classes=4, names=['A', 'B', 'C', 'D'], id=None), 'text': Value(dtype='string', id=None)}

from which I created the train set (and likewise the validation and test sets) as follows:

tf_train_dataset = my_tokenized_dataset["train"].to_tf_dataset(
    columns=["attention_mask", "input_ids", "token_type_ids"],
    label_cols=["labels"],
    shuffle=True,
    collate_fn=data_collator,
    batch_size=8,
)
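
For completeness, my_tokenized_dataset and data_collator come from a fairly standard tokenize-and-collate setup, roughly like this (the exact checkpoint name and options in my notebook may differ slightly, so take this as a sketch; my_dataset stands for the DatasetDict shown above):

from transformers import AutoTokenizer, DataCollatorWithPadding

checkpoint = "bert-base-uncased"  # placeholder: any BERT-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize_fn(batch):
    # adds the input_ids, token_type_ids and attention_mask columns
    return tokenizer(batch["text"], truncation=True)

my_tokenized_dataset = my_dataset.map(tokenize_fn, batched=True)

# dynamically pads each batch and returns TensorFlow tensors
data_collator = DataCollatorWithPadding(tokenizer=tokenizer, return_tensors="tf")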

Assuming this is correct (I followed the HF tutorials), I called

teacher = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=4) 
student = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=4) 
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True) 
teacher.compile(optimizer=opt, loss=loss, metrics=["accuracy"]) # where opt is Adam with a lr_scheduler

and fit the teacher model for 3 epochs, reaching about 95% validation accuracy.
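
For reference, opt is an Adam optimizer with a learning-rate schedule. Something along the following lines (using transformers' create_optimizer helper; the learning rate and warmup values are placeholders, not necessarily what I used) produces that kind of optimizer and reproduces the teacher training:

from transformers import create_optimizer

batch_size = 8
num_epochs = 3
num_train_steps = (512 // batch_size) * num_epochs  # 512 training rows, see the DatasetDict above

# Adam with a linearly decaying learning rate (optionally with warmup)
opt, lr_schedule = create_optimizer(
    init_lr=5e-5,
    num_warmup_steps=0,
    num_train_steps=num_train_steps,
)

teacher.compile(optimizer=opt, loss=loss, metrics=["accuracy"])
teacher.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=num_epochs)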

Leaving the Distiller class from the TF tutorial unchanged, the problem arises when the student_loss needs to be computed. In detail, I checked that

x = {'input_ids': <tf.Tensor 'IteratorGetNext:1' shape=(None, None) dtype=int64>, 'token_type_ids': <tf.Tensor 'IteratorGetNext:2' shape=(None, None) dtype=int64>, 'attention_mask': <tf.Tensor 'IteratorGetNext:0' shape=(None, None) dtype=int64>}

and

y = Tensor("IteratorGetNext:3", shape=(None,), dtype=int64)

while

Teacher predictions: TFSequenceClassifierOutput(loss=None, logits=<tf.Tensor 'tf_bert_for_sequence_classification_1/classifier/BiasAdd:0' shape=(None, 4) dtype=float32>, hidden_states=None, attentions=None)
Student predictions: TFSequenceClassifierOutput(loss=None, logits=<tf.Tensor 'tf_bert_for_sequence_classification_2/classifier/BiasAdd:0' shape=(None, 4) dtype=float32>, hidden_states=None, attentions=None)
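
For reference, the relevant part of Distiller.train_step, which I left essentially as in the tutorial (paraphrased here, so treat it as a sketch rather than my exact code), is:

def train_step(self, data):
    x, y = data

    # Forward pass of the (frozen) teacher
    teacher_predictions = self.teacher(x, training=False)

    with tf.GradientTape() as tape:
        # Forward pass of the student
        student_predictions = self.student(x, training=True)

        # Hard-label loss: this is the line that raises the TypeError below
        student_loss = self.student_loss_fn(y, student_predictions)

        # Soft-label loss between temperature-scaled teacher and student outputs
        distillation_loss = (
            self.distillation_loss_fn(
                tf.nn.softmax(teacher_predictions / self.temperature, axis=1),
                tf.nn.softmax(student_predictions / self.temperature, axis=1),
            )
            * self.temperature**2
        )
        loss = self.alpha * student_loss + (1 - self.alpha) * distillation_loss

    # Backpropagate through the student only
    trainable_vars = self.student.trainable_variables
    gradients = tape.gradient(loss, trainable_vars)
    self.optimizer.apply_gradients(zip(gradients, trainable_vars))

    # Update and report metrics plus both loss terms
    self.compiled_metrics.update_state(y, student_predictions)
    results = {m.name: m.result() for m in self.metrics}
    results.update({"student_loss": student_loss, "distillation_loss": distillation_loss})
    return results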

The error is the following:

---------------------------------------------------------------------------
TypeError                                 Traceback         (most recent call last)
<ipython-input-62-a15be00a8e5e> in <module>()
11 
12 # Distill teacher to student
---> 13 distiller.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=3)
14 
15 # Evaluate student on test dataset

1 frames
/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
65     except Exception as e:  # pylint: disable=broad-except
66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
68     finally:
69       del filtered_tb

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in autograph_handler(*args, **kwargs)
1145           except Exception as e:  # pylint:disable=broad-except
1146             if hasattr(e, "ag_error_metadata"):
-> 1147               raise e.ag_error_metadata.to_exception(e)
1148             else:
1149               raise

TypeError: in user code:

File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1021, in train_function  *
    return step_function(self, iterator)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1010, in step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1000, in run_step  **
    outputs = model.train_step(data)
File "<ipython-input-61-219884772d38>", line 52, in train_step
    student_loss = self.student_loss_fn(y, student_predictions)
File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 141, in __call__
    losses = call_fn(y_true, y_pred)
File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 245, in call  **
    return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 1860, in sparse_categorical_crossentropy
    y_pred = tf.convert_to_tensor(y_pred)

TypeError: Expected any non-tensor type, but got a tensor instead.

I apologize if the question and mistakes are trivial; unfortunately I still have a lot to learn. Better yet, if you can give me a step-by-step answer that takes nothing for granted, it will probably be easier for me to understand.

Thanks in advance to everyone.
