Can you mix keras.layers outputs with TensorFlow operations?
I know that the outputs of Keras layers (like keras.layers.Dense()) are so-called 'Keras tensors'. There are also 'TensorFlow tensors', which are produced by TensorFlow operations (like tf.reduce_sum()).
Some functionality is only available through one of these approaches, so it is obvious that I will sometimes have to mix them to do a calculation. In my opinion, mixing Keras layer code with tf ops looks far from beautiful, but that's another topic.
My question is: is it doable to mix Keras layers and TensorFlow ops? And will there be any problems related to mixing the Keras and TensorFlow tensors they produce?
Let's consider an example with a custom class; don't look too deeply into the calculation, it makes no sense. The point is just to show what I'm talking about.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, activations


class Calculation(layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.units = units
        self.conv3d = layers.Conv3D(units, kernel_size=3, strides=2)
        # LayerNormalization's first positional argument is `axis`, so the
        # default (normalize over the last axis) is used here
        self.norm = layers.LayerNormalization()

    def call(self, x):
        x = activations.gelu(x)                     # plain activation function
        x = self.norm(x)                            # keras layer
        x = tf.transpose(x, (0, 4, 1, 2, 3))        # tensorflow op
        x = self.conv3d(x)                          # keras layer
        x = tf.transpose(x, (0, 2, 3, 4, 1))        # tensorflow op
        x = tf.reshape(x, (-1, 1, 2 * self.units))  # tensorflow op
        return x
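As a quick eager sanity check, the layer can be called directly on a tensor; the units value and the input shape below are arbitrary, chosen only so the transposes and the final reshape line up (activations.gelu needs TF 2.4+):
layer = Calculation(units=4)
out = layer(tf.random.normal((2, 9, 9, 9, 5)))
print(out.shape)  # (32, 1, 8)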
Another example, this time not inside a custom layer:
encoder_input = keras.Input(shape=(28, 28, 1), name="img")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = tf.einsum('...ijk->...jik', x) # tensorflow op, just swap height and width for no reason
x = layers.Conv2D(16, 3, activation="relu")(x)
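For completeness, the mixed graph above can be closed into a model as usual; this is a sketch assuming a TF 2.x release where raw tf ops applied to Keras symbolic inputs are auto-wrapped into op layers:
encoder = keras.Model(encoder_input, x, name="encoder")
encoder.summary()  # the einsum step shows up as an auto-generated op layer
                   # (TFOpLambda / TensorFlowOpLayer, depending on the TF version)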
I know that there is such a thing as a keras.layers.Lambda layer, but sometimes I need something like 90% of the computation done by TensorFlow ops and only 10% by Keras layers (because there is no tf op alternative and I don't want to implement my own every time). In that case, using Lambda layers makes little sense.
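For comparison, wrapping the einsum step from the example above in a Lambda layer would look roughly like this (the name "swap_hw" is just an illustration):
x = layers.Lambda(lambda t: tf.einsum('...ijk->...jik', t), name="swap_hw")(x)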
Is there a good approach to writing complex models (where the implementation goes beyond just stacking existing Keras layers) in TensorFlow 2.x?
1 Answer
IMO, it does not really matter how you mix TensorFlow operations with Keras layers, as long as you preserve the batch dimension and, generally, the tensor shape. You might want to, for example, wrap the tf operations inside Lambda layers to set some metadata such as layer names, but that depends on your taste. For example, if you use tf.reduce_sum with axis=0, you lose the batch dimension (None), which is problematic for a Keras model:
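Here is a minimal sketch (the 10-feature input and the Dense(1) head are arbitrary, just for illustration):
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inp = keras.Input(shape=(10,))                  # symbolic shape: (None, 10)

bad = tf.reduce_sum(inp, axis=0)                # sums over the batch axis -> shape (10,)
ok = tf.reduce_sum(inp, axis=1, keepdims=True)  # keeps the batch axis     -> shape (None, 1)

print(bad.shape)  # (10,)  -- the leading None is gone, so the outputs can no longer
                  # be matched to individual input samples during fit/predict
print(ok.shape)   # (None, 1)

model = keras.Model(inp, layers.Dense(1)(ok))   # builds and trains like any Keras model
model.summary()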