Deep Dream with a "guide image" in TensorFlow

Posted 2025-01-23 16:41:27

I'm trying to modify the DeepDream code from the TensorFlow docs here:
https://www.tensorflow.org/tutorials/generative/deepdream

Specifically, I want to use a "guide image" to produce the dream features. This was originally shown in Caffe in this notebook (at the bottom):
https://github.com/google/deepdream/blob/master/dream.ipynb

In their example, they used an image of flowers to produce flower-like features on top of an image of clouds. To do this, they provided an alternate loss function. From the Caffe notebook:

Instead of maximizing the L2-norm of current image activations, we try to maximize the dot-products between activations of current image, and their best matching correspondences from the guide image.
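
In plain NumPy, that objective amounts to roughly the following toy sketch (my own illustration with made-up shapes, just to show the matching):

import numpy as np

# x: current-image activations, y: guide activations,
# both flattened to (channels, positions).
ch, n, m = 4, 6, 5
rng = np.random.default_rng(0)
x = rng.standard_normal((ch, n))    # current image: n spatial positions
y = rng.standard_normal((ch, m))    # guide image: m spatial positions

A = x.T.dot(y)                      # (n, m) dot-products between all position pairs
best = A.argmax(1)                  # best-matching guide position for each image position
objective = (x * y[:, best]).sum()  # the quantity being maximized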

In Caffe, it looks like this:

end = 'inception_3b/output'          # layer whose activations guide the dream
h, w = guide.shape[:2]
src, dst = net.blobs['data'], net.blobs[end]
src.reshape(1,3,h,w)                 # fit the input blob to the guide image
src.data[0] = preprocess(net, guide)
net.forward(end=end)                 # forward pass up to the target layer
guide_features = dst.data[0].copy()  # cache the guide's activations

def objective_guide(dst):
    x = dst.data[0].copy()  # current image activations at the target layer
    y = guide_features      # cached guide activations
    ch = x.shape[0]
    x = x.reshape(ch,-1)    # flatten both to (channels, positions)
    y = y.reshape(ch,-1)
    A = x.T.dot(y) # compute the matrix of dot-products with guide features
    dst.diff[0].reshape(ch,-1)[:] = y[:,A.argmax(1)] # select ones that match best

Note that the Caffe version writes its backward signal directly into dst.diff instead of returning a scalar loss, so for TensorFlow I expressed the objective as a scalar loss to differentiate. I translated it like so:

def get_activations(img, model):
    # Pass the image forward through the model to retrieve the activations.
    # Converts the image into a batch of size 1.
    img_batch = tf.expand_dims(img, axis=0)
    layer_activations = model(img_batch)
    if len(layer_activations) == 1:
        layer_activations = [layer_activations]
    return layer_activations

# Activations of the guide image, computed once up front (img is the guide here).
guide_activations = get_activations(img, model)
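
For context, I build model as the tutorial's InceptionV3 feature extractor, roughly like this (layer names taken from the tutorial):

import tensorflow as tf

# InceptionV3 feature extractor, as in the TensorFlow DeepDream tutorial.
base_model = tf.keras.applications.InceptionV3(include_top=False, weights='imagenet')
layers = [base_model.get_layer(name).output for name in ['mixed3', 'mixed5']]
model = tf.keras.Model(inputs=base_model.input, outputs=layers)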

def maximize_to_guide(img, model):
    layer_activations = get_activations(img, model)
    losses = []
    for guide_activation in guide_activations:
        for layer_activation in layer_activations:
            # Reshape both activation maps to (channels, -1).
            ch = layer_activation.shape[-1]
            layer_activation = tf.reshape(layer_activation, (ch, -1))
            guide_activation = tf.reshape(guide_activation, (ch, -1))
            # Matrix of dot-products between image and guide positions.
            dot = tf.matmul(tf.transpose(layer_activation), guide_activation)
            # For each image position, take the best-matching guide vector.
            max_act_idx = tf.math.argmax(dot, axis=1)
            max_act = tf.gather(guide_activation, max_act_idx, axis=1)
            loss = tf.math.reduce_mean(max_act)
            losses.append(loss)
    return tf.reduce_sum(losses)
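
I drive this loss with a GradientTape, roughly following the tutorial's pattern:

with tf.GradientTape() as tape:
    tape.watch(img)  # img: float32 image tensor, e.g. shape (h, w, 3)
    loss = maximize_to_guide(img, model)

gradients = tape.gradient(loss, img)  # expected: d(loss)/d(img)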

However, tape.gradient(loss, img) returns None. I thought this was because argmax is not differentiable. However, if I gather from layer_activation instead -- tf.gather(layer_activation, max_act_idx, axis=1) -- then it does produce a gradient (though not the desired image). So TensorFlow is clearly able to step back through the tape, from the returned loss value to the input image, but only in that second case. What's going on here?
