TensorFlow GradientTape "Unknown value for unconnected gradients"

Posted on 2025-01-11 12:27:21

I'm trying to understand why I'm getting an error when using GradientTape to take the derivative of a function. I'm trying to take the derivative of Power with respect to T, defined as:

    import tensorflow as tf
    import numpy as np
    from scipy.fft import fft, fftfreq, fftn
    import tensorflow.python.ops.numpy_ops.np_config as np_config
    np_config.enable_numpy_behavior()

    #####Initialize Values######

    s1 = np.array([[0,1,0],
                   [1,0,1],
                   [0,1,0]])

    s2 = np.array([[0,-1j,0],
                   [1j,0,-1j],
                   [0,1j,0]])

    s3 = np.array([[1,0,0],
                  [0,0,0],
                  [0,0,-1]])

    spin1 = (1/np.sqrt(2))*s1
    spin2 = (1/np.sqrt(2))*s2
    spin3 = (1/np.sqrt(2))*s3

    spin1 = tf.constant(spin1)
    spin2 = tf.constant(spin2)
    spin3 = tf.constant(spin3)

    a = tf.constant(1.0)
    b = tf.constant(1.0)
    c = tf.constant(1.0)
    d = tf.constant(1.0)

    v = tf.constant(1.0)     # ~N(0,sigma_v)
    w = tf.constant(1.0)     # ~N(0,sigma_w)

    c0_0 = tf.complex(tf.constant(1.0), tf.constant(0.0))
    c1_0 = tf.complex(tf.constant(1.0), tf.constant(0.0))

    # N (number of time samples) and t (the time grid) are used below but were
    # never defined in this snippet; the values here are an assumption:
    N = 100
    t = np.linspace(0, 1, N)

    ###### Define Functions########

    def getDE(T):
        D = a*T+b+v
        E = c*T+d+w
        return D,E

    def H(D,E):
        return D*(spin3**2 - 2/3) + E*(spin1**2-spin2**2)

    def psi(t,eigenvalues,eigenvec1, eigenvec2):
        c_0 = np.array(np.exp(-1j*(eigenvalues[0])*t)*c0_0)
        c_0.shape = (N,1)
        c_1 = np.array(np.exp(-1j*(eigenvalues[1])*t)*c1_0)
        c_1.shape = (N,1)
        return c_0*(eigenvec1.T)+c_1*(eigenvec2.T)

    def forward(T):
        T = tf.Variable(T)
        with tf.GradientTape() as tape:
            D,E = getDE(T)
            H_tf = H(D,E)
            eigenvalues, eigenstates = tf.linalg.eig(H_tf)
            eigenvec1 = eigenstates[:,0]
            eigenvec2 = eigenstates[:,1]
            wave = psi(t,eigenvalues,eigenvec1, eigenvec2)
            a = np.abs(tf.signal.fft2d(wave))**2
            Power = np.full([100,1], None)
            for i in range(N):
                Power[i,:] = a[i,:].conj().T@a[i,:]
        
        return tape.gradient(Power,T)

Could someone tell me whether I'm doing this correctly, or whether there is a better way to do it? I am not very familiar with automatic differentiation in Python.

In the forward function, taking the derivative of wave with respect to T seems to work, but as soon as I do the FFT I get the following error:

WARNING:tensorflow:The dtype of the target tensor must be floating (e.g. tf.float32) when calling GradientTape.gradient, got dtype('O')

    AttributeError                            Traceback (most recent call last)
    ~\AppData\Local\Temp/ipykernel_352/3452884380.py in <module>
    ----> 1 T_hat = forward(17.0)
          2 print(T_hat)

    ~\AppData\Local\Temp/ipykernel_352/2053063608.py in forward(T)
         13             Power[i,:] = a[i,:].conj().T@a[i,:]
         14 
    ---> 15     return tape.gradient(Power,T)

    ~\anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\backprop.py in gradient(self, target, sources, output_gradients, unconnected_gradients)
       1072                           for x in nest.flatten(output_gradients)]
       1073 
    -> 1074     flat_grad = imperative_grad.imperative_grad(
       1075         self._tape,
       1076         flat_targets,

    ~\anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\imperative_grad.py in imperative_grad(tape, target, sources, output_gradients, sources_raw, unconnected_gradients)
         69         "Unknown value for unconnected_gradients: %r" % unconnected_gradients)
         70 
    ---> 71   return pywrap_tfe.TFE_Py_TapeGradient(
         72       tape._tape,  # pylint: disable=protected-access
         73       target,

    AttributeError: 'numpy.ndarray' object has no attribute '_id'

Comments (1)

眼睛会笑 2025-01-18 12:27:21

I hope you have already found an answer to your question, but if you haven't, maybe this will shed some light.

The problem you are seeing is that TensorFlow can't compute the gradient of the overall forward pass: every NumPy call inside the tape returns a plain array that the tape never records. I would recommend not using NumPy methods there.

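For instance, here is a tiny reproduction of the same failure (a hypothetical example, not taken from the question's code):

    x = tf.Variable([3.0])
    with tf.GradientTape() as tape:
        y = np.abs(x) ** 2      # np.abs silently converts x to a plain NumPy array
    print(tape.gradient(y, x))  # AttributeError: 'numpy.ndarray' object has no attribute '_id'
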
As far as I can see, you can replace all of those NumPy methods with TensorFlow equivalents.

Examples:

To calculate the magnitude of a complex tensor:

    magnitude = tf.math.abs(complex_tensor)

To use the complex exponential:

    complex_tensor = tf.math.exp(tf.complex(0.0, -1.0) * tf.cast(phase, "complex64"))

To extract elements along a given dimension:

    elm1, elm2 = tf.unstack(x, num=2, axis=-1)

To calculate the conjugate of a complex tensor:

    a_conj = tf.math.conj(a)

To transpose or permute tensor dimensions:

    x_T = tf.transpose(x, perm=[1, 0])
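
And the row-wise inner product from the question's Python loop can stay on the tape as a single reduction (illustrative; a_sq here stands for the real tensor of squared FFT magnitudes):

    Power = tf.reduce_sum(a_sq * a_sq, axis=1)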

To summarize: stop using NumPy methods inside the tape and use the TensorFlow alternatives; that will solve your problem.
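
Putting it together, here is a minimal sketch of forward written with TensorFlow ops only. It is a sketch under assumptions, not a drop-in fix: it assumes N and t are defined as in the question, that np_config.enable_numpy_behavior() handles the mixed float/complex dtypes as in the original code, and that your TensorFlow version registers a gradient for tf.linalg.eig (which also needs the eigenvalues to be distinct):

    def forward(T):
        T = tf.Variable(T, dtype=tf.float64)
        with tf.GradientTape() as tape:
            D, E = getDE(T)
            H_tf = H(D, E)                                  # (3, 3) complex Hamiltonian
            eigenvalues, eigenstates = tf.linalg.eig(H_tf)  # complex128
            eigenvec1 = eigenstates[:, 0]
            eigenvec2 = eigenstates[:, 1]

            # psi with TF ops only: phase factors exp(-i * lambda_k * t), shape (N, 1)
            t_c = tf.cast(tf.reshape(tf.constant(t), [N, 1]), tf.complex128)
            c_0 = tf.exp(-1j * eigenvalues[0] * t_c) * tf.cast(c0_0, tf.complex128)
            c_1 = tf.exp(-1j * eigenvalues[1] * t_c) * tf.cast(c1_0, tf.complex128)
            wave = c_0 * eigenvec1[tf.newaxis, :] + c_1 * eigenvec2[tf.newaxis, :]  # (N, 3)

            a_sq = tf.math.abs(tf.signal.fft2d(wave)) ** 2  # real, (N, 3)
            Power = tf.reduce_sum(a_sq * a_sq, axis=1)      # real, (N,); replaces the loop
        return tape.gradient(Power, T)

With everything recorded on the tape and Power a real floating-point tensor, both the dtype('O') warning and the '_id' AttributeError should go away, and forward(17.0) should return an actual gradient.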
