Predicting time series data with TensorFlow

data-mining python tensorflow time-series

2022-02-24 05:43:33

I have inputs and outputs in the following format:

(X) = [[ 0  1  2]
       [ 1  2  3]] 

  y   = [ 3  4 ]

This is time series data, and the task is to predict the next number. The inputs are built with the following snippet:

 import numpy as np

 def split_sequence(arr, timesteps):
     """Split a sequence into windows of `timesteps` values and the value that follows each."""
     arr_len = len(arr)
     X, y = [], []
     for i in range(arr_len):
         end_idx = i + timesteps
         # Stop once no value is left after the window to serve as the target.
         if end_idx > arr_len - 1:
             break
         X.append(arr[i:end_idx])
         y.append(arr[end_idx])
     return np.array(X), np.array(y)
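
For reference, the same windowing can probably be done without an explicit loop. The sketch below is not from the original post: it assumes NumPy 1.20 or newer (for `sliding_window_view`), and the name `split_sequence_vectorized` is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def split_sequence_vectorized(arr, timesteps):
    """Hypothetical loop-free variant: rows of X are windows, y is the value after each window."""
    arr = np.asarray(arr)
    # Every window of length `timesteps`; the last one is dropped because nothing follows it.
    X = sliding_window_view(arr, timesteps)[:-1]
    y = arr[timesteps:]
    return X, y

X, y = split_sequence_vectorized(range(6), 3)
print(X)  # rows [0 1 2], [1 2 3], [2 3 4]
print(y)  # [3 4 5]
```

Note that `sliding_window_view` returns a read-only view of the input, so copy `X` before modifying it in place.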

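Now I want to train a model on these inputs and predict the next number. For example, for x = [81,82,83] the predicted output should be y = 84. I have learned how to do this in Keras, but I would also like to try it in TensorFlow.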

Here is the TensorFlow code:

 # Data generator
 def generate_batch(X,y,batch_size):
    m = X.shape[0]
    indexes = range(m)
    n_batches = m // batch_size
    for batch_index in np.array_split(indexes,n_batches):
       yield X[batch_index],y[batch_index] 

 # parameters
 n_inputs = 3
 n_epochs = 1000
 batch_size = 40
 learning_rate = 0.01
 n_steps = 3

 # generate the input and output using split_sequence method
 input, output = split_sequence(range(1000),n_steps)


 # Define the input variables
 X = tf.placeholder(tf.int32,shape=(None,n_inputs),name='X')
 y = tf.placeholder(tf.float32,shape=(None),name='y')
 theta = tf.Variable(tf.random_uniform([n_steps,1],-1.0,1.0),name='theta')

 # predictions and error 
 y_predictions = tf.matmul(X,theta,name='predictions')
 error = y_predictions - y
 mse = tf.reduce_mean(tf.square(error),name='mse')



 # train the model          
 optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
 training_op = optimizer.minimize(mse)

 init = tf.global_variables_initializer()

 with tf.Session() as session:      
   session.run(init)
   for epoch in range(n_epochs):
      for X_batch,y_batch in generate_batch(input,output,batch_size):
         if epoch % 10 == 0:
             print('epoch',epoch,'MSE=',mse.eval())
         session.run(training_op,feed_dict={X:X_batch,y:y_batch})

Honestly, I am completely stuck on the following error:

You must feed a value for placeholder tensor 'X' with dtype float and shape [?,3].

My input is integer-valued, which is why I defined the placeholder like this:

 X = tf.placeholder(tf.int32,shape=(None,n_inputs),name='X')

Can someone help me fix this? Also, if I want to add a bias variable, how can I do that with the input above?

1 Answer

The error is caused by this line:

print('epoch',epoch,'MSE=',mse.eval())

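This happens because the tensor mse also depends on the placeholders X and y, and mse.eval() is called without a feed_dict, so the placeholders have no values. One way to fix this is to change the training loop to: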

for X_batch,y_batch in generate_batch(input,output,batch_size):
     mse_val, _ = session.run([mse, training_op],feed_dict={X:X_batch,y:y_batch})
     if epoch % 10 == 0:
         print('epoch',epoch,'MSE=',mse_val)
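
To see the fix in context, here is a minimal end-to-end sketch rather than the asker's exact script. It makes several assumptions that are not in the original post: TensorFlow 2.x run through `tensorflow.compat.v1`, a small series scaled into [0, 1) so that plain gradient descent stays stable, full-batch updates, and `y` fed as a column of shape (None, 1). The last point matters because subtracting a shape (N,) tensor from the (N, 1) output of `tf.matmul` would broadcast to (N, N).

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF1-style graphs and sessions

n_steps = 3
# Assumed toy data: a scaled ramp instead of range(1000).
series = np.arange(200, dtype=np.float32) / 200.0
X_data = np.stack([series[i:i + n_steps] for i in range(len(series) - n_steps)])
y_data = series[n_steps:].reshape(-1, 1)  # column vector, matches the matmul output

X = tf.placeholder(tf.float32, shape=(None, n_steps), name='X')
y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
theta = tf.Variable(tf.random_uniform([n_steps, 1], -1.0, 1.0), name='theta')

mse = tf.reduce_mean(tf.square(tf.matmul(X, theta) - y), name='mse')
training_op = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(mse)

losses = []
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for epoch in range(200):
        # Fetch mse together with training_op so both receive the same feed_dict.
        mse_val, _ = session.run([mse, training_op],
                                 feed_dict={X: X_data, y: y_data})
        losses.append(float(mse_val))
        if epoch % 50 == 0:
            print('epoch', epoch, 'MSE=', mse_val)
```

Every tensor that depends on a placeholder needs a feed_dict whenever it is evaluated, so calling `mse.eval(feed_dict={X: X_data, y: y_data})` inside the session would be another option.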

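In addition, you need to switch X back to tf.float32, because tf.matmul requires both operands to have the same dtype and theta is float32. The integer input data is converted automatically when it is fed.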

To add a bias variable, define it the same way as theta:

b = tf.Variable(0.0, dtype=tf.float32, name='b')
...
y_predictions += b
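
As a quick sanity check on what theta and b can represent, the sketch below fits the same linear model, `X @ theta + b`, in closed form with NumPy least squares. It is not the TensorFlow graph from the answer, and the series 0..99 and all variable names are assumptions chosen for illustration.

```python
import numpy as np

n_steps = 3
series = np.arange(100, dtype=np.float64)  # assumed toy series 0, 1, ..., 99
X = np.stack([series[i:i + n_steps] for i in range(len(series) - n_steps)])
y = series[n_steps:]

# Append a column of ones so that the last coefficient plays the role of the bias b.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
theta, b = coef[:-1], coef[-1]

prediction = np.array([81.0, 82.0, 83.0]) @ theta + b
print(prediction)  # approximately 84
```

Because the windows of an arithmetic series are collinear, theta and b are not unique here (theta = [0, 0, 1] with b = 1 is one exact solution), but every exact solution gives the same prediction on windows of this form.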
