Some questions about TensorFlow Keras layers
I am a beginner in NLP, and I ran into some problems while going through someone's code on GitHub. The code is for NL2SQL. The author preprocesses the dataset like this:
Output token_ids:
[101, 753, 7439, 671, 736, 2399, 5018, 1724, 1453, 1920, 7942, 6044, 1469, 2166, 2147, 6845, 4495, 6821, 697, 6956, 2512, 4275, 4638, 4873, 2791, 2600, 1304, 3683, 3221, 1914, 2208, 1435, 102, 11, 2512, 4275, 1399, 4917, 102, 12, 1453, 4873, 2791, 102, 12, 4873, 2791, 1304, 3683, 102, 12, 1767, 1772, 782, 3613, 102]
Output segment_ids:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Output header_ids:
[33 39 44 50]
He appended the column names after the question and then fed the whole sequence into BERT to be encoded. After processing, the input dataset looks like this; here is an example with a batch size of 2.
input_token_ids : shape(2, 57)
[[ 101 753 7439 671 736 2399 5018 1724 1453 1920 7942 6044 1469 2166
2147 6845 4495 6821 697 6956 2512 4275 4638 4873 2791 2600 1304 3683
3221 1914 2208 1435 102 12 1767 1772 782 3613 102 12 4873 2791
1304 3683 102 12 1453 4873 2791 102 11 2512 4275 1399 4917 102
[ 101 872 1962 8024 872 4761 6887 791 2399 5018 1724 1453 2166 2147
6845 4495 8024 6820 3300 6929 6956 1920 7942 6044 2124 812 4873 2791
2600 4638 1304 3683 1408 102 12 1767 1772 782 3613 102 12 4873
2791 1304 3683 102 12 1453 4873 2791 102 11 2512 4275 1399 4917
input_segment_ids : shape(2, 57)
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
input_header_ids : shape(2, 4)
[[33 39 45 50]
[34 40 46 51]]
input_header_mask : shape(2, 4)
[[1 1 1 1]
[1 1 1 1]]
output_sel_agg : shape(2, 4, 1)
output_cond_conn_op : shape(2, 1)
output_cond_op : shape(2, 4, 1)
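The single-example layout shown earlier (question tokens followed by one segment per column) can be sketched in plain Python. This is only an illustration, not the author's preprocessing code: `build_inputs` is a hypothetical helper, and reading the ids 11 and 12 as per-column marker tokens is an assumption based on where `header_ids` point in the printed output.

```python
# Hypothetical helper: the name and structure are illustrative only.
def build_inputs(question_ids, headers):
    """Concatenate question tokens with one segment per column.

    question_ids: question token ids, including the leading 101 and trailing 102.
    headers: list of (marker_id, column_token_ids) pairs, one per column.
    Returns (token_ids, segment_ids, header_ids).
    """
    token_ids = list(question_ids)
    header_ids = []
    for marker_id, column_ids in headers:
        header_ids.append(len(token_ids))  # position of this column's marker token
        token_ids.append(marker_id)
        token_ids.extend(column_ids)
        token_ids.append(102)              # separator closing this column
    segment_ids = [0] * len(token_ids)
    return token_ids, segment_ids, header_ids


# Values copied from the single-example output in the question.
question = [101, 753, 7439, 671, 736, 2399, 5018, 1724, 1453, 1920, 7942,
            6044, 1469, 2166, 2147, 6845, 4495, 6821, 697, 6956, 2512, 4275,
            4638, 4873, 2791, 2600, 1304, 3683, 3221, 1914, 2208, 1435, 102]
headers = [
    (11, [2512, 4275, 1399, 4917]),
    (12, [1453, 4873, 2791]),
    (12, [4873, 2791, 1304, 3683]),
    (12, [1767, 1772, 782, 3613]),
]

token_ids, segment_ids, header_ids = build_inputs(question, headers)
print(header_ids)  # [33, 39, 44, 50]
```

Each entry of `header_ids` is the position of a column's marker token, which is the position whose BERT output vector is later picked out for that column.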
In the prediction part, he extracts the column embedding vectors using the preprocessed ids, multiplies them by the header mask, and then sends them into the Dense layer. The model structure looks like this:
def seq_gather(x):
    seq, idxs = x
    idxs = K.cast(idxs, 'int32')
    return tf.gather_nd(seq, idxs)

bert_model = load_trained_model_from_checkpoint(paths.config, paths.checkpoint, seq_len=None)
for l in bert_model.layers:
    l.trainable = True

inp_token_ids = Input(shape=(None,), name='input_token_ids', dtype='int32')
inp_segment_ids = Input(shape=(None,), name='input_segment_ids', dtype='int32')
inp_header_ids = Input(shape=(None,), name='input_header_ids', dtype='int32')
inp_header_mask = Input(shape=(None, ), name='input_header_mask')

x = bert_model([inp_token_ids, inp_segment_ids]) # (None, seq_len, 768)

# predict cond_conn_op
x_for_cond_conn_op = Lambda(lambda x: x[:, 0])(x) # (None, 768)
p_cond_conn_op = Dense(num_cond_conn_op, activation='softmax', name='output_cond_conn_op')(x_for_cond_conn_op)

# predict sel_agg
x_for_header = Lambda(seq_gather)([x, inp_header_ids]) # (None, h_len, 768)
header_mask = Lambda(lambda x: K.expand_dims(x, axis=-1))(inp_header_mask) # (None, h_len, 1)
#x_for_header = tf.keras.layers.Multiply()([x_for_header,header_mask])
#x_for_header = Masking()(x_for_header)

p_sel_agg = Dense(num_sel_agg, activation='softmax', name='output_sel_agg')(x_for_header)

x_for_cond_op = Concatenate(axis=-1)([x_for_header, p_sel_agg])
p_cond_op = Dense(num_cond_op, activation='softmax', name='output_cond_op')(x_for_cond_op)

model = Model(
    [inp_token_ids, inp_segment_ids, inp_header_ids, inp_header_mask],
    [p_cond_conn_op, p_sel_agg, p_cond_op]
)
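For reference, the result the gather-and-mask step in the model seems meant to produce can be written out with nested Python lists. This is a sketch of the intended semantics only, on toy data with a hidden size of 2 instead of 768; `batch_gather` and `apply_mask` are hypothetical names, not functions from the original repository or from TensorFlow.

```python
def batch_gather(seq, idxs):
    """Per-example gather: out[b][h] = seq[b][idxs[b][h]]."""
    return [[seq_b[i] for i in idxs_b] for seq_b, idxs_b in zip(seq, idxs)]


def apply_mask(vectors, mask):
    """Multiply each gathered vector by its mask entry (0 marks a padded slot)."""
    return [
        [[v * m for v in vec] for vec, m in zip(vecs_b, mask_b)]
        for vecs_b, mask_b in zip(vectors, mask)
    ]


# Toy batch: 2 examples, sequence length 5, hidden size 2.
seq = [
    [[float(10 * b + t), float(10 * b + t) + 0.5] for t in range(5)]
    for b in range(2)
]
header_ids = [[1, 3], [2, 4]]
header_mask = [[1, 1], [1, 0]]  # the second example has one padded header slot

gathered = batch_gather(seq, header_ids)
masked = apply_mask(gathered, header_mask)
print(gathered[0])  # [[1.0, 1.5], [3.0, 3.5]]
print(masked[1])    # [[12.0, 12.5], [0.0, 0.0]]
```

Each example uses its own row of `header_ids`, so the output keeps the layout (batch, h_len, hidden), which is what the `# (None, h_len, 768)` comment in the model code expects.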
But when I ran the code, it raised an error saying that the Multiply layer needs a specific input shape, while the number of columns varies across examples. When I removed that line and continued, to check whether the results would be the same without the masking, it raised another error: the last dimension of the inputs to the Dense layer should be defined.
ValueError Traceback (most recent call last)
<ipython-input-30-b0a3d0700bdf> in <module>()
10 #x_for_header = Masking()(x_for_header)
---> 12 p_sel_agg = Dense(num_sel_agg, activation='softmax', name='output_sel_agg')(x_for_header)
14 x_for_cond_op = Concatenate(axis=-1)([x_for_header, p_sel_agg])
1 frames
/usr/local/lib/python3.7/dist-packages/keras/layers/core/dense.py in build(self, input_shape)
137 last_dim = tf.compat.dimension_value(input_shape[-1])
138 if last_dim is None:
--> 139 raise ValueError('The last dimension of the inputs to a Dense layer '
140 'should be defined. Found None. '
141 f'Full input shape received: {input_shape}')
ValueError: The last dimension of the inputs to a Dense layer should be defined. Found None. Full input shape received: <unknown>
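One possible reading of the `<unknown>` shape in the traceback, offered as a sketch and not a confirmed diagnosis: `tf.gather_nd` treats the last dimension of its indices as the index depth, and that dimension is `None` for `inp_header_ids`, so the static output shape cannot be inferred. The pure-Python functions below only model the shape rules given in the TensorFlow documentation; they are hypothetical helpers and do not call TensorFlow.

```python
def gather_nd_static_shape(params_shape, indices_shape):
    """Model of the tf.gather_nd shape rule:
    indices.shape[:-1] + params.shape[indices.shape[-1]:].
    Returns None when the output rank cannot be determined statically."""
    depth = indices_shape[-1]
    if depth is None:
        return None  # index depth unknown, so the output rank is unknown
    if depth > len(params_shape):
        raise ValueError("index depth exceeds the rank of params")
    return tuple(indices_shape[:-1]) + tuple(params_shape[depth:])


def batch_gather_static_shape(params_shape, indices_shape, batch_dims=1):
    """Model of the tf.gather shape rule with axis == batch_dims:
    params.shape[:axis] + indices.shape[batch_dims:] + params.shape[axis + 1:]."""
    axis = batch_dims
    return (tuple(params_shape[:axis])
            + tuple(indices_shape[batch_dims:])
            + tuple(params_shape[axis + 1:]))


seq_shape = (None, None, 768)  # BERT output: (batch, seq_len, hidden)
idx_shape = (None, None)       # header ids: (batch, h_len)

print(gather_nd_static_shape(seq_shape, idx_shape))     # None
print(batch_gather_static_shape(seq_shape, idx_shape))  # (None, None, 768)
```

Under this reading, replacing the body of `seq_gather` with `tf.gather(seq, idxs, batch_dims=1)` would keep the last dimension at 768, so the Dense layer could be built. That is an assumption to verify against the installed TensorFlow version, not a tested fix for this code.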
So I wonder whether something in the code is wrong, maybe the gather function, or whether it is just the Keras version, since this code was pushed to GitHub in 2019. Is there another way to achieve this?
Thanks in advance.