TensorFlow GradientTape: "Unknown value for unconnected_gradients"
I'm trying to understand why I'm getting an error when using GradientTape to take the derivative of a function. I want the derivative of Power with respect to T, defined as:
import tensorflow as tf
import numpy as np
from scipy.fft import fft, fftfreq, fftn
import tensorflow.python.ops.numpy_ops.np_config as np_config
np_config.enable_numpy_behavior()
#####Initialize Values######
s1 = np.array([[0,1,0],
               [1,0,1],
               [0,1,0]])
s2 = np.array([[0,-1j,0],
               [1j,0,-1j],
               [0,1j,0]])
s3 = np.array([[1,0,0],
               [0,0,0],
               [0,0,-1]])
spin1 = (1/np.sqrt(2))*s1
spin2 = (1/np.sqrt(2))*s2
spin3 = (1/np.sqrt(2))*s3
spin1 = tf.constant(spin1)
spin2 = tf.constant(spin2)
spin3 = tf.constant(spin3)
a = tf.constant(1.0)
b = tf.constant(1.0)
c = tf.constant(1.0)
d = tf.constant(1.0)
v = tf.constant(1.0) # ~N(0,sigma_v)
w = tf.constant(1.0) # ~N(0,sigma_w)
c0_0 = tf.complex(tf.constant(1.0), tf.constant(0.0))
c1_0 = tf.complex(tf.constant(1.0), tf.constant(0.0))
###### Define Functions########
def getDE(T):
    D = a*T+b+v
    E = c*T+d+w
    return D,E
def H(D,E):
    return D*(spin3**2 - 2/3) + E*(spin1**2-spin2**2)
def psi(t,eigenvalues,eigenvec1, eigenvec2):
    c_0 = np.array(np.exp(-1j*(eigenvalues[0])*t)*c0_0)
    c_0.shape = (N,1)
    c_1 = np.array(np.exp(-1j*(eigenvalues[1])*t)*c1_0)
    c_1.shape = (N,1)
    return c_0*(eigenvec1.T)+c_1*(eigenvec2.T)
def forward(T):
    T = tf.Variable(T)
    with tf.GradientTape() as tape:
        D,E = getDE(T)
        H_tf = H(D,E)
        eigenvalues, eigenstates = tf.linalg.eig(H_tf)
        eigenvec1 = eigenstates[:,0]
        eigenvec2 = eigenstates[:,1]
        wave = psi(t,eigenvalues,eigenvec1, eigenvec2)
        a = np.abs(tf.signal.fft2d(wave))**2
        Power = np.full([100,1], None)
        for i in range(N):
            Power[i,:] = a[i,:].conj().T@a[i,:]

    return tape.gradient(Power,T)
I would appreciate it if someone could tell me whether I'm doing this correctly or whether there is a better way to do it, as I am not very familiar with automatic differentiation in Python.
In the forward function, taking the derivative of wave with respect to T seems to work, but as soon as I do the FFT I get the following error:
WARNING:tensorflow:The dtype of the target tensor must be floating (e.g. tf.float32) when calling GradientTape.gradient, got dtype('O')
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_352/3452884380.py in <module>
----> 1 T_hat = forward(17.0)
2 print(T_hat)
~\AppData\Local\Temp/ipykernel_352/2053063608.py in forward(T)
13 Power[i,:] = a[i,:].conj().T@a[i,:]
14
---> 15 return tape.gradient(Power,T)
~\anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\backprop.py in gradient(self, target, sources, output_gradients, unconnected_gradients)
1072 for x in nest.flatten(output_gradients)]
1073
-> 1074 flat_grad = imperative_grad.imperative_grad(
1075 self._tape,
1076 flat_targets,
~\anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\imperative_grad.py in imperative_grad(tape, target, sources, output_gradients, sources_raw, unconnected_gradients)
69 "Unknown value for unconnected_gradients: %r" % unconnected_gradients)
70
---> 71 return pywrap_tfe.TFE_Py_TapeGradient(
72 tape._tape, # pylint: disable=protected-access
73 target,
AttributeError: 'numpy.ndarray' object has no attribute '_id'
Comments (1)
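I hope you have already found an answer to your question, but if you haven't, maybe this will shed some light.
The problem you are seeing arises because TensorFlow can't calculate the gradient through the whole forward function: Power is a NumPy object array (np.full([100,1], None)), not a tensor, so the tape has no record of how it depends on T. I would recommend that you stop using NumPy methods inside the tape.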
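To illustrate why leaving TensorFlow breaks the gradient, here is a minimal sketch that is unrelated to the physics in the question. The variable x and both toy computations are made up for illustration. It shows a milder symptom than the AttributeError above (a gradient of None rather than a crash), but the cause is the same: the tape only records TensorFlow ops.

```python
import tensorflow as tf

x = tf.Variable(3.0)

# Case 1: every step is a TensorFlow op, so the tape can trace y back to x.
with tf.GradientTape() as tape:
    y = x * x
grad_tf = tape.gradient(y, x)  # dy/dx = 2x

# Case 2: the value leaves TensorFlow through .numpy(), so the squaring is
# done by NumPy and never recorded. Wrapping the result in a tensor again
# does not restore the link back to x.
with tf.GradientTape() as tape:
    y_np = tf.constant(x.numpy() ** 2)
grad_np = tape.gradient(y_np, x)

print(grad_tf)  # a tensor holding 2 * 3.0
print(grad_np)  # None: the tape sees no path from y_np to x
```

In the question the target passed to tape.gradient is itself a NumPy array, which is why TensorFlow fails outright instead of returning None.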
As far as I can see, you can replace all of those NumPy methods with their TensorFlow equivalents.
For example, there are TensorFlow functions:
To calculate the magnitude of a complex tensor
To compute the complex exponential
To extract elements along a given dimension
To calculate the conjugate of a complex tensor
To transpose or permute the tensor dimensions
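The function names behind the list above did not survive in this copy of the answer, so the mapping below is my own assumption about which TensorFlow calls fit each item, not necessarily the ones the answer originally pointed to. The tensor z and the phase values are invented for the example.

```python
import tensorflow as tf

z = tf.constant([[1 + 2j, 3 - 1j],
                 [0 + 1j, 2 + 0j]], dtype=tf.complex64)

# Magnitude of a complex tensor (the result is real-valued).
magnitude = tf.abs(z)

# Complex exponential exp(-i * phase), built from a real phase tensor.
phase = tf.constant([0.0, 1.0])
rotation = tf.exp(tf.complex(tf.zeros_like(phase), -phase))

# Elements along a given dimension: plain slicing works on tensors.
first_column = z[:, 0]

# Complex conjugate.
conjugate = tf.math.conj(z)

# Transpose; tf.transpose also accepts perm=... for higher-rank tensors.
transposed = tf.transpose(z)
```

All of these are differentiable TensorFlow ops, so a GradientTape can follow a computation that uses them.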
To summarize: stop using NumPy methods and find the TensorFlow alternatives. That should solve your problem.
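As a rough sketch of what a TensorFlow-only forward pass could look like, the example below differentiates the power in one FFT bin with respect to T. It is deliberately a toy: N, a, b, t and the function spectral_power are hypothetical stand-ins, and the eigen-decomposition from the question is replaced by a single frequency omega = a*T + b, so treat it as a pattern rather than a drop-in fix.

```python
import tensorflow as tf

# Hypothetical stand-ins, not the question's actual model.
N = 8
a, b = 2.0, 1.0
t = tf.cast(tf.range(N), tf.float64) / N  # sample times in [0, 1)

def spectral_power(T):
    """Power in each FFT bin of a toy signal whose frequency depends on T."""
    omega = a * T + b                      # plays the role of an eigenvalue
    phase = omega * t                      # real tensor, shape (N,)
    signal = tf.exp(tf.complex(tf.zeros_like(phase), -phase))
    spectrum = tf.signal.fft(signal)
    # |spectrum|^2 written with conj so the result is an explicitly real tensor.
    return tf.math.real(spectrum * tf.math.conj(spectrum))

def forward(T_value):
    T = tf.Variable(T_value, dtype=tf.float64)
    with tf.GradientTape() as tape:
        power = spectral_power(T)
        # Differentiate a single bin. The sum over all bins is constant here
        # (Parseval, since |signal| = 1), so its gradient would be zero.
        target = power[1]
    return tape.gradient(target, T)

print(forward(1.3))
```

The target handed to tape.gradient is a floating-point tensor produced only by TensorFlow ops, which addresses both the dtype warning and the AttributeError from the question.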