I've been trying to train an RNN on a set of time-series data. The goal is to predict one of six categorical outputs. The input is given as 5 time steps of 14 features each, 6 of which are a one-hot encoding of the output. Each time step has an output, but the goal is to assign an output to the most recent event using the previously recorded time points and their manually assigned outputs.
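For concreteness, here is roughly how one training example is laid out (the column ordering and names below are just my illustration, not something the model depends on):

import numpy as np

# One training example: 5 time steps x 14 features. In this sketch,
# columns 0-7 hold the 8 raw features and columns 8-13 hold the one-hot
# encoding of that step's (manually assigned) output.
n_steps, n_input, n_classes = 5, 14, 6

x = np.zeros((n_steps, n_input), dtype=np.float32)
x[:, :8] = np.random.randn(n_steps, 8)                   # raw features
step_labels = np.random.randint(0, n_classes, n_steps)   # per-step outputs
x[np.arange(n_steps), 8 + step_labels] = 1.0             # one-hot outputs

# The training target is the output of the most recent event:
y = np.eye(n_classes, dtype=np.float32)[step_labels[-1]]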
Confusingly, the RNN cannot even learn that one of the inputs is actually the categorical output. This was only meant as a sanity check, but it seems to point to a larger underlying problem.
The data is heavily imbalanced (91%, 4%, 2%, 1%, <1%, <1%), but the cost function weights misclassifications inversely to the class proportions in the dataset. Could the imbalance be causing this problem? I'm currently using 60,000 training examples; is that not enough?
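For reference, the weighting I'm describing looks roughly like this (a sketch against the same legacy TF 0.x API as the code below; the exact frequencies for the two <1% classes are my placeholders):

import numpy as np
import tensorflow as tf

n_classes = 6

# Class frequencies from the dataset (the two <1% values are placeholders).
freq = np.array([0.91, 0.04, 0.02, 0.01, 0.005, 0.005])
inv_freq = 1.0 / freq
class_weights = tf.constant(inv_freq / inv_freq.sum(), dtype=tf.float32)

y = tf.placeholder(tf.float32, [None, n_classes])     # one-hot targets
pred = tf.placeholder(tf.float32, [None, n_classes])  # logits from the RNN

# Weight each example's loss by the inverse frequency of its true class.
sample_weight = tf.reduce_sum(class_weights * y, 1)
xent = tf.nn.softmax_cross_entropy_with_logits(pred, y)  # TF 0.x positional args
cost = tf.reduce_mean(xent * sample_weight)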
I'm working from this dynamic RNN model: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/dynamic_rnn.py. Note the additional matrix multiplication for the hidden layer. Was this done correctly?
# Legacy TF 0.x API (tf.pack, tf.split(0, ...)), matching the linked example
import tensorflow as tf
from tensorflow.python.ops import rnn_cell

def dynamicRNN(x, seqlen, weights, biases):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: list of 'n_steps' tensors of shape (batch_size, n_input)

    # Permute batch_size and n_steps
    x = tf.transpose(x, [1, 0, 2])
    # Reshape to (n_steps*batch_size, n_input)
    x = tf.reshape(x, [-1, n_input])
    # Extra input-to-hidden projection (my addition to the example)
    x = tf.matmul(x, weights['hidden']) + biases['hidden']
    # Split to get a list of 'n_steps' tensors, one per time step
    x = tf.split(0, n_steps, x)

    # Define an LSTM cell with TensorFlow
    lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)

    # Get the LSTM cell output; providing 'sequence_length' performs dynamic
    # calculation.
    outputs, states = tf.nn.rnn(lstm_cell, x, dtype=tf.float32,
                                sequence_length=seqlen)

    # When performing dynamic calculation, we must retrieve the last
    # dynamically computed output, i.e. if a sequence length is 10, we need
    # to retrieve the 10th output.
    # However TensorFlow doesn't support advanced indexing yet, so we build
    # a custom op that, for each sample in the batch, gets its length and
    # retrieves the corresponding relevant output.

    # 'outputs' is a list of outputs at every timestep; pack them into a
    # Tensor and swap the dimensions back to [batch_size, n_steps, n_hidden]
    outputs = tf.pack(outputs)
    outputs = tf.transpose(outputs, [1, 0, 2])

    # Hack to build the indexing and retrieve the right output.
    batch_size = tf.shape(outputs)[0]
    # Start indices for each sample
    index = tf.range(0, batch_size) * n_steps + (seqlen - 1)
    # Indexing
    outputs = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index)

    # Linear activation, using outputs computed above
    return tf.matmul(outputs, weights['out']) + biases['out']
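And this is how I wire it up (the shape of the extra 'hidden' matrix and the value of n_hidden are my own choices; the rest follows the linked example):

n_steps = 5       # time steps per example
n_input = 14      # features per step
n_hidden = 64     # LSTM units (my choice)
n_classes = 6

x = tf.placeholder(tf.float32, [None, n_steps, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
seqlen = tf.placeholder(tf.int32, [None])

weights = {
    # The extra input-to-hidden projection fed into the LSTM
    'hidden': tf.Variable(tf.random_normal([n_input, n_hidden])),
    'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
}
biases = {
    'hidden': tf.Variable(tf.random_normal([n_hidden])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

pred = dynamicRNN(x, seqlen, weights, biases)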