Batch size > 1 gives an error with TensorFlow 1.x

Posted 2025-01-24 07:52:02

I am using this example of a VAE.

The only change I made was to switch the loss from binary cross-entropy to MSE, like this:

class OptimizerVAE(object):

    def __init__(self, model, learning_rate=1e-3):
        """
        OptimizerVAE initializer
        :param model: a model object
        :param learning_rate: float, learning rate of the optimizer
        """

        # mean squared error (replaces the original binary cross entropy;
        # the attribute keeps its old name `bce`)
        self.bce = tf.keras.losses.mse(model.x, model.logits)
        self.reconstruction_loss = tf.reduce_mean(tf.reduce_sum(self.bce, axis=-1))

        if model.distribution == 'normal':
            # KL divergence between normal approximate posterior and standard normal prior
            self.p_z = tf.distributions.Normal(tf.zeros_like(model.z), tf.ones_like(model.z))
            kl = model.q_z.kl_divergence(self.p_z)
            self.kl = tf.reduce_mean(tf.reduce_sum(kl, axis=-1)) * 0.1
        elif model.distribution == 'vmf':
            # KL divergence between vMF approximate posterior and uniform hyper-spherical prior
            self.p_z = HypersphericalUniform(model.z_dim - 1, dtype=model.x.dtype)
            kl = model.q_z.kl_divergence(self.p_z)
            self.kl = tf.reduce_mean(kl) * 0.1
        else:
            raise NotImplementedError

        self.ELBO = - self.reconstruction_loss - self.kl

        self.train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(-self.ELBO)

        self.print = {'recon loss': self.reconstruction_loss, 'ELBO': self.ELBO, 'KL': self.kl}

With the original architecture (two MLP layers), the model runs perfectly regardless of the batch size (the batch dimension is specified as "None" in the GitHub code).

I am trying to turn this into a convolutional model, but when I change only the encoder to this:

def _encoder(self, x):
    """
    Encoder network
    :param x: placeholder for input
    :return: tuple `(z_mean, z_var)` with mean and concentration around the mean
    """
    
    # convolutional encoder (replaces the original 2 hidden dense layers)
    # h0 = tf.layers.dense(x, units=self.h_dim * 2, activation=self.activation)
    # h1 = tf.layers.dense(h0, units=self.h_dim, activation=self.activation)
    h1 = tf.layers.conv1d(x, filters=32, kernel_size=7, activation=tf.nn.relu)
    h1 = tf.layers.conv1d(h1, filters=64, kernel_size=7, activation=tf.nn.relu)
    h1 = tf.layers.conv1d(h1, filters=64, kernel_size=7, activation=tf.nn.relu)
    h1 = tf.layers.flatten(h1)
    h1 = tf.layers.dense(h1, 32, activation=tf.nn.relu)

    if self.distribution == 'normal':
        # compute mean and std of the normal distribution
        z_mean = tf.layers.dense(h1, units=self.z_dim, activation=None, name = 'z_output')
        z_var = tf.layers.dense(h1, units=self.z_dim, activation=tf.nn.softplus)
    elif self.distribution == 'vmf':
        # compute mean and concentration of the von Mises-Fisher
        z_mean = tf.layers.dense(h1, units=self.z_dim, activation=lambda x: tf.nn.l2_normalize(x, axis=-1))
        # the `+ 1` prevent collapsing behaviors
        z_var = tf.layers.dense(h1, units=1, activation=tf.nn.softplus) + 1
    else:
        raise NotImplementedError

    return z_mean, z_var

When I run the model, I get this error:

InvalidArgumentError: Incompatible shapes: [32,1] vs. [32,512,1]
 [[{{node gradients/SquaredDifference_grad/BroadcastGradientArgs}}]]

Here 32 is the batch_size used when running the model. What confuses me is that with batch_size = 1 the model runs!

Where is this going wrong? Is it the optimizer and the way it averages?


Comment by 倦话, 2025-01-31 07:52:02

I solved the problem by reshaping the decoder's output to the shape (win_size, 1), because the MLP decoder was not adding that extra dimension.
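
The shape mismatch behind this fix can be illustrated with a minimal sketch. It uses NumPy rather than TensorFlow 1.x, since element-wise TensorFlow ops follow the same broadcasting rules. The shapes are assumptions, not taken from the original model: the input is assumed to be (batch, win_size, 1) with win_size = 512, and the MLP decoder is assumed to return (batch, win_size). The exact shapes TensorFlow reports in the question differ, so treat this only as an illustration of why batch_size = 1 runs while batch_size = 32 fails. The helper names `mse_last_axis` and `try_loss` are hypothetical.

```python
import numpy as np

WIN_SIZE = 512  # assumed window length (the 512 in the error message)


def mse_last_axis(x, x_hat):
    """Mean squared error over the last axis, with NumPy-style broadcasting."""
    return np.mean((x - x_hat) ** 2, axis=-1)


def try_loss(batch_size):
    """Return the loss shape, or None if the two shapes cannot be broadcast."""
    x = np.zeros((batch_size, WIN_SIZE, 1))   # conv1d-style input (assumed)
    x_hat = np.zeros((batch_size, WIN_SIZE))  # decoder output without the last dim (assumed)
    try:
        return mse_last_axis(x, x_hat).shape
    except ValueError:
        return None


# batch_size = 1: (1, 512, 1) and (1, 512) broadcast silently to (1, 512, 512),
# so the loss is computed, but over the wrong pairs of elements.
print(try_loss(1))   # (1, 512)

# batch_size = 32: the batch axis (32) lines up against the window axis (512).
print(try_loss(32))  # None

# The fix described above: give the decoder output the same shape as the input.
x = np.zeros((32, WIN_SIZE, 1))
x_hat = np.zeros((32, WIN_SIZE)).reshape(-1, WIN_SIZE, 1)
fixed_shape = mse_last_axis(x, x_hat).shape
print(fixed_shape)   # (32, 512)
```

In the TensorFlow 1.x graph, the equivalent step would presumably be a tf.reshape of the decoder output to [-1, win_size, 1] before the loss is computed. Note that under these assumptions the batch_size = 1 run only works because of silent broadcasting, so its loss values should not be trusted either.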
