Deep Dream with a "guide image" in TensorFlow

Posted 2025-01-23 16:41:27

I'm trying to modify the DeepDream code from the TensorFlow docs here:
https://www.tensorflow.org/tutorials/generative/deepdream

Specifically, I want to use a "guide image" to produce the dream features. This was originally shown in Caffe in this notebook (at the bottom):
https://github.com/google/deepdream/blob/master/dream.ipynb

In their example, they used an image of flowers to produce flower-like features on top of an image of clouds. To do this, they provided an alternate loss function. From the Caffe notebook:

Instead of maximizing the L2-norm of current image activations, we try to maximize the dot-products between activations of current image, and their best matching correspondences from the guide image.
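
In plain NumPy, that objective amounts to roughly the following toy sketch (my own illustration with made-up shapes, just to show the matching):

import numpy as np

# x: current-image activations, y: guide activations,
# both flattened to (channels, positions).
ch, n, m = 4, 6, 5
rng = np.random.default_rng(0)
x = rng.standard_normal((ch, n))    # current image: n spatial positions
y = rng.standard_normal((ch, m))    # guide image: m spatial positions

A = x.T.dot(y)                      # (n, m) dot-products between all position pairs
best = A.argmax(1)                  # best-matching guide position for each image position
objective = (x * y[:, best]).sum()  # the quantity being maximized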

In Caffe, it looks like this:

end = 'inception_3b/output'          # layer whose activations guide the dream
h, w = guide.shape[:2]
src, dst = net.blobs['data'], net.blobs[end]
src.reshape(1,3,h,w)                 # fit the input blob to the guide image
src.data[0] = preprocess(net, guide)
net.forward(end=end)                 # forward pass up to the target layer
guide_features = dst.data[0].copy()  # cache the guide's activations

def objective_guide(dst):
    x = dst.data[0].copy()  # current image activations at the target layer
    y = guide_features      # cached guide activations
    ch = x.shape[0]
    x = x.reshape(ch,-1)    # flatten both to (channels, positions)
    y = y.reshape(ch,-1)
    A = x.T.dot(y) # compute the matrix of dot-products with guide features
    dst.diff[0].reshape(ch,-1)[:] = y[:,A.argmax(1)] # select ones that match best

Note that the Caffe version writes its backward signal directly into dst.diff instead of returning a scalar loss, so for TensorFlow I expressed the objective as a scalar loss to differentiate. I translated it like so:

def get_activations(img, model):
    # Pass the image forward through the model to retrieve the activations.
    # Converts the image into a batch of size 1.
    img_batch = tf.expand_dims(img, axis=0)
    layer_activations = model(img_batch)
    if len(layer_activations) == 1:
        layer_activations = [layer_activations]
    return layer_activations

# Activations of the guide image, computed once up front (img is the guide here).
guide_activations = get_activations(img, model)
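
For context, I build model as the tutorial's InceptionV3 feature extractor, roughly like this (layer names taken from the tutorial):

import tensorflow as tf

# InceptionV3 feature extractor, as in the TensorFlow DeepDream tutorial.
base_model = tf.keras.applications.InceptionV3(include_top=False, weights='imagenet')
layers = [base_model.get_layer(name).output for name in ['mixed3', 'mixed5']]
model = tf.keras.Model(inputs=base_model.input, outputs=layers)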

def maximize_to_guide(img, model):
    layer_activations = get_activations(img, model)
    losses = []
    for guide_activation in guide_activations:
        for layer_activation in layer_activations:
            # Reshape both activation maps to (channels, -1).
            ch = layer_activation.shape[-1]
            layer_activation = tf.reshape(layer_activation, (ch, -1))
            guide_activation = tf.reshape(guide_activation, (ch, -1))
            # Matrix of dot-products between image and guide positions.
            dot = tf.matmul(tf.transpose(layer_activation), guide_activation)
            # For each image position, take the best-matching guide vector.
            max_act_idx = tf.math.argmax(dot, axis=1)
            max_act = tf.gather(guide_activation, max_act_idx, axis=1)
            loss = tf.math.reduce_mean(max_act)
            losses.append(loss)
    return tf.reduce_sum(losses)
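
I drive this loss with a GradientTape, roughly following the tutorial's pattern:

with tf.GradientTape() as tape:
    tape.watch(img)  # img: float32 image tensor, e.g. shape (h, w, 3)
    loss = maximize_to_guide(img, model)

gradients = tape.gradient(loss, img)  # expected: d(loss)/d(img)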

However, tape.gradient(loss, img) returns None. I thought this was because argmax is not differentiable. However, if I gather from layer_activation instead -- tf.gather(layer_activation, max_act_idx, axis=1) -- then it does produce a gradient (though not the desired image). So TensorFlow is clearly able to step back through the tape, from the returned loss value to the input image, but only in that second case. What's going on here?
