How to make predictions when training with TensorFlow nce_loss

Posted on 2025-01-29 06:56:02


https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss
Here it says to compute the full sigmoid loss for evaluation or inference. Can anyone explain in some detail how to predict the label at inference time?

As I understand it, the model's last-layer output has shape (batch, num_class); during training it goes directly into the NCE loss and is treated as a binary classification problem. During inference, is it right that I simply take the sigmoid over the last-layer output, with entry i representing the probability of class i? Or can I treat the largest entry as the class label, just as with softmax?

I don't quite understand this, and I haven't found any practical example related to it online. Any help is appreciated! Thanks so much in advance!
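In code, the two options I have in mind would look roughly like this (just a sketch; logits here is a toy stand-in for the real last-layer output of shape (batch, num_class)):

import tensorflow as tf

# Toy stand-in for the model's last-layer output, shape (1, 3).
logits = tf.constant([[0.3, -1.2, 2.5]])

# Option 1: sigmoid each entry, reading entry i as the probability of class i.
probs = tf.sigmoid(logits)

# Option 2: treat the largest entry as the predicted class, as with softmax.
pred = tf.argmax(logits, axis=-1)

print(probs.numpy(), pred.numpy())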

Comments (1)

苏别ゝ 2025-02-05 06:56:02


It is possible. When you consider the sequence input, nce_loss is noise-contrastive estimation: it varies the negatives used to form the output by selecting them with a candidate sampler (see the sketch after the references).

Ref 0: https://www.tensorflow.org/api_docs/python/tf/nn/nce_loss

Ref 1: https://github.com/yl-1993/tensorflow/blob/master/tensorflow/examples/tutorials/mnist/mnist_deep.py

Ref 2: https://www.programcreek.com/python/example/90447/tensorflow.nce_loss
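To make the candidate-sampler point concrete: by default nce_loss draws its negatives internally with a log-uniform sampler, and you can instead pass an explicit sampler output through its sampled_values argument. A minimal sketch, reusing the names (labels, embed, nce_weights, nce_biases, n_sample, vocabulary_size) defined in the sample below:

# Explicit candidate sampler: draws n_sample negative class ids.
# (true_classes must be int64 with shape (batch, num_true).)
sampled_values = tf.random.log_uniform_candidate_sampler(
    true_classes=tf.cast(labels, tf.int64),
    num_true=1,
    num_sampled=n_sample,
    unique=True,
    range_max=vocabulary_size,
)

# Same NCE loss as in the sample, but with the sampler made explicit.
loss = tf.reduce_mean(
    tf.nn.nce_loss(nce_weights, nce_biases,
                   labels=labels,
                   inputs=embed,
                   num_sampled=n_sample,
                   num_classes=vocabulary_size,
                   sampled_values=sampled_values)
)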

[ Sample ]:

import math

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
None
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
config = tf.config.experimental.set_memory_growth(physical_devices[0], True)
print(physical_devices)
print(config)   

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Variables
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
learning_rate = 0.001
global_step = 0
vocabulary_size = 5000
embedding_size = 16
n_sample = 16

tf.compat.v1.disable_eager_execution()

# Input data.

inputs = tf.compat.v1.placeholder(tf.int32, shape=[1], name='X')
labels = tf.compat.v1.placeholder(tf.int32, shape=[1, 1], name='Y')

# Look up embeddings for inputs.
embeddings = tf.Variable(
    tf.random.uniform([vocabulary_size, embedding_size], -1.0, 1.0)
)
embed = tf.nn.embedding_lookup(embeddings, inputs)

# Construct the variables for the NCE loss
# Standard word2vec-style initialization: stddev = 1 / sqrt(embedding_size).
nce_weights = tf.Variable(
    tf.random.truncated_normal([vocabulary_size, embedding_size],
                               stddev=1.0 / math.sqrt(embedding_size))
)
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

# Compute the average NCE loss for the batch.
# tf.nce_loss automatically draws a new sample of the negative labels each
# time we evaluate the loss.
loss = tf.reduce_mean(
    tf.nn.nce_loss(nce_weights, nce_biases,
                   labels=labels,
                   inputs=embed,
                   num_sampled=n_sample,
                   num_classes=vocabulary_size),
    name='loss'
)
optimizer = tf.compat.v1.train.ProximalAdagradOptimizer(
    learning_rate,
    initial_accumulator_value=0.1,
    l1_regularization_strength=0.2,
    l2_regularization_strength=0.1,
    use_locking=False,
    name='ProximalAdagrad'
)
training_op = optimizer.minimize(loss, name='minimize') 


"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: DataSet / Input
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
X = np.reshape([ 500 ], (1))
Y = np.reshape([ 15 ], (1, 1))
history = [ ] 

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Training / Optimize
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    
    for i in range(1000):
        global_step = global_step + 1
        train_loss, _ = sess.run([loss, training_op], feed_dict={inputs: X, labels: Y})
        history.append(train_loss)

        # Report progress; the with-block closes the session automatically.
        print('step: ' + str(i) + ', loss: ' + str(train_loss))

print( history )
plt.plot(history)
plt.show()
plt.close()

input('...')
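Back to the original question about inference: the doc's advice to compute the full sigmoid loss for evaluation or inference boils down to scoring all classes with the same NCE weights and biases. A minimal sketch under that reading (in practice, run these ops inside the training session above, or after restoring a checkpoint, since the parameters are only meaningful once trained):

# Full logits over every class, from the trained NCE parameters:
# all_logits[b, i] = dot(embed[b], nce_weights[i]) + nce_biases[i]
all_logits = tf.matmul(embed, nce_weights, transpose_b=True) + nce_biases

# Per-class probabilities under NCE's binary (sigmoid) view.
all_probs = tf.sigmoid(all_logits)      # shape (batch, vocabulary_size)

# Predicted label: the largest entry, exactly as with a softmax argmax.
pred_label = tf.argmax(all_logits, axis=-1)

with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    probs, label = sess.run([all_probs, pred_label], feed_dict={inputs: X})
    print(probs.shape, label)

So yes: sigmoid over the full output gives the per-class scores, and since sigmoid and softmax are both monotonic in the logits, taking the largest entry as the label gives the same prediction either way.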