Yes, the cell output is equal to the hidden state. In the case of an LSTM, it is the short-term part of the state tuple, i.e. the second element of LSTMStateTuple (h, as opposed to the long-term part c).
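Here is a minimal sketch (TF 1.x, no padding involved) to check that claim for an LSTM; the shapes and the random input are arbitrary, chosen only for the comparison:

import numpy as np
import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 2, 3])  # [batch, time, features]
lstm_cell = tf.nn.rnn_cell.LSTMCell(num_units=5)
outputs, states = tf.nn.dynamic_rnn(lstm_cell, X, dtype=tf.float32)
# states is an LSTMStateTuple: c (long-term) and h (short-term)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    o, s = sess.run([outputs, states],
                    feed_dict={X: np.random.rand(4, 2, 3)})
    print(np.allclose(o[:, -1], s.h))  # True: last output == hidden state h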
But with tf.nn.dynamic_rnn, the returned state can be different when some sequences are shorter (via the sequence_length argument). Take a look at this example:
import numpy as np
import tensorflow as tf

n_steps = 2    # time steps per sequence
n_inputs = 3   # features per time step
n_neurons = 5  # RNN cell size

X = tf.placeholder(dtype=tf.float32, shape=[None, n_steps, n_inputs])
seq_length = tf.placeholder(tf.int32, [None])

basic_cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X,
                                    sequence_length=seq_length,
                                    dtype=tf.float32)

X_batch = np.array([
    # t = 0      t = 1
    [[0, 1, 2], [9, 8, 7]],  # instance 0
    [[3, 4, 5], [0, 0, 0]],  # instance 1 (short, zero-padded)
    [[6, 7, 8], [6, 5, 4]],  # instance 2
    [[9, 0, 1], [3, 2, 1]],  # instance 3
])
seq_length_batch = np.array([2, 1, 2, 2])

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs_val, states_val = sess.run([outputs, states],
                                       feed_dict={X: X_batch,
                                                  seq_length: seq_length_batch})
    print(outputs_val)
    print()
    print(states_val)
Here the input batch contains 4 sequences, one of which is short and zero-padded. After running it, you should see something like this:
[[[ 0.2315362  -0.37939444 -0.625332   -0.80235624  0.2288385 ]
  [ 0.9999524   0.99987394  0.33580178 -0.9981791   0.99975705]]

 [[ 0.97374666  0.8373545  -0.7455188  -0.98751736  0.9658986 ]
  [ 0.          0.          0.          0.          0.        ]]

 [[ 0.9994331   0.9929737  -0.8311569  -0.99928087  0.9990415 ]
  [ 0.9984355   0.9936006   0.3662448  -0.87244385  0.993848  ]]

 [[ 0.9962312   0.99659646  0.98880637  0.99548346  0.9997809 ]
  [ 0.9915743   0.9936939   0.4348318   0.8798458   0.95265496]]]

[[ 0.9999524   0.99987394  0.33580178 -0.9981791   0.99975705]
 [ 0.97374666  0.8373545  -0.7455188  -0.98751736  0.9658986 ]
 [ 0.9984355   0.9936006   0.3662448  -0.87244385  0.993848  ]
 [ 0.9915743   0.9936939   0.4348318   0.8798458   0.95265496]]
...which indeed shows that state == output[1] for the full sequences and state == output[0] for the short one. output[1] for that sequence is also a zero vector. The same holds for LSTM and GRU cells.
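To confirm the GRU case, here is a quick variation of the example above (again just a sketch under TF 1.x; the random inputs are for illustration only):

import numpy as np
import tensorflow as tf

tf.reset_default_graph()
X = tf.placeholder(tf.float32, [None, 2, 3])
seq_length = tf.placeholder(tf.int32, [None])
gru_cell = tf.nn.rnn_cell.GRUCell(num_units=5)
outputs, states = tf.nn.dynamic_rnn(gru_cell, X,
                                    sequence_length=seq_length,
                                    dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    o, s = sess.run([outputs, states],
                    feed_dict={X: np.random.rand(4, 2, 3),
                               seq_length: np.array([2, 1, 2, 2])})
    print(np.allclose(s[1], o[1, 0]))  # short sequence: state == output at t=0
    print(np.allclose(s[0], o[0, 1]))  # full sequence:  state == output at t=1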
So state is a convenient tensor that holds the last actual RNN state, ignoring the zeros. The output tensor holds the outputs of all time steps, so it does not ignore the zeros. That is the reason both of them are returned.
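If states were not returned, you would have to gather the last valid output per instance yourself. A minimal sketch of what that looks like, assuming outputs and seq_length from the example above are still in scope:

# Build [i, seq_length[i] - 1] index pairs, one per instance in the batch.
batch_size = tf.shape(outputs)[0]
last_step = tf.stack([tf.range(batch_size), seq_length - 1], axis=1)

# Pick outputs[i, seq_length[i] - 1, :] for every instance i.
last_relevant_output = tf.gather_nd(outputs, last_step)
# last_relevant_output equals states for every instance, padded or not.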