Tensorflow Sampled Softmax Loss Correct Usage
Solution 1:
In your softmax layer you are multiplying your network predictions, which have dimension (num_classes,), by your w matrix, which has dimension (num_classes, num_hidden_1), so you end up trying to compare your target labels of size (num_classes,) to something that is now size (num_hidden_1,). Change your tiny perceptron to output layer_1 instead, then change the definition of your cost. The code below might do the trick.
def tiny_perceptron(x, weights, biases):
    # Hidden layer with ReLU activation; its output is what gets fed to the loss.
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    return layer_1

layer_1 = tiny_perceptron(x, weights, biases)
loss_function = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=weights['h1'],      # must have shape [num_classes, num_hidden_1]
    biases=biases['b1'],        # must have shape [num_classes]
    labels=labels,              # class indices of shape [batch_size, num_true], not one-hot
    inputs=layer_1,             # hidden activations of shape [batch_size, num_hidden_1]
    num_sampled=num_sampled,
    num_true=num_true,
    num_classes=num_classes))
When you train your network with some optimizer, you will tell it to minimize loss_function, which should mean that it will adjust both sets of weights and biases.
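For context, here is a minimal end-to-end sketch of how the pieces could fit together. The placeholder shapes, the illustrative sizes, and the separate 'out' softmax variables are assumptions made for this example rather than the asker's actual code; the point is that tf.nn.sampled_softmax_loss expects weights of shape [num_classes, dim], biases of shape [num_classes], inputs of shape [batch_size, dim], and integer labels of shape [batch_size, num_true].

import tensorflow as tf  # TensorFlow 1.x graph-mode API assumed

num_features, num_hidden_1, num_classes = 100, 256, 10000  # illustrative sizes
num_sampled, num_true = 64, 1

x = tf.placeholder(tf.float32, [None, num_features])
labels = tf.placeholder(tf.int64, [None, num_true])  # class indices, not one-hot

weights = {
    'h1': tf.Variable(tf.random_normal([num_features, num_hidden_1])),
    'out': tf.Variable(tf.random_normal([num_classes, num_hidden_1])),  # [num_classes, dim]
}
biases = {
    'b1': tf.Variable(tf.zeros([num_hidden_1])),
    'out': tf.Variable(tf.zeros([num_classes])),
}

layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['h1']), biases['b1']))

loss_function = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=weights['out'],
    biases=biases['out'],
    labels=labels,
    inputs=layer_1,
    num_sampled=num_sampled,
    num_true=num_true,
    num_classes=num_classes))

train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss_function)

# Sampled softmax is a training-time approximation; for evaluation you would
# compute the full softmax from the same variables.
eval_logits = tf.matmul(layer_1, tf.transpose(weights['out'])) + biases['out']
eval_probs = tf.nn.softmax(eval_logits)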
Solution 2:
The key point is to pass the right shapes for the weights, biases, inputs, and labels. The weight matrix passed to sampled_softmax_loss does not have the same shape as the one you would use for an ordinary softmax layer.
For example, if logits = xw + b, call sampled_softmax_loss like this: sampled_softmax_loss(weights=tf.transpose(w), biases=b, inputs=x), NOT sampled_softmax_loss(weights=w, biases=b, inputs=logits)!
Besides, the labels are not a one-hot representation; they are class indices. If your labels are one-hot encoded, pass labels=tf.reshape(tf.argmax(labels_one_hot, 1), [-1, 1]).
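Putting both points together, a minimal sketch could look like the following; the names w, b, x, and labels_one_hot and the sizes are illustrative stand-ins, assuming w has shape [dim, num_classes] so that logits = xw + b:

import tensorflow as tf  # TensorFlow 1.x graph-mode API assumed

dim, num_classes, num_sampled = 128, 10000, 64  # illustrative sizes
x = tf.placeholder(tf.float32, [None, dim])
labels_one_hot = tf.placeholder(tf.float32, [None, num_classes])
w = tf.Variable(tf.random_normal([dim, num_classes]))  # the "usual" softmax weight
b = tf.Variable(tf.zeros([num_classes]))

logits = tf.matmul(x, w) + b  # ordinary logits, still useful for evaluation

loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=tf.transpose(w),                                   # [num_classes, dim], not w itself
    biases=b,                                                  # [num_classes]
    labels=tf.reshape(tf.argmax(labels_one_hot, 1), [-1, 1]),  # indices, not one-hot
    inputs=x,                                                  # the softmax layer's input, not the logits
    num_sampled=num_sampled,
    num_classes=num_classes))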