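I need to read data from a CSV file, first split it into features and labels, and then split those into training and test sets. However, a few problems keep coming up. Below is the code I tried; it fails with this error: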
ValueError: could not convert string to float: 'mon'
on line
Y: train_y})
Code for linear regression:
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt  # needed for the plt.plot calls at the end
learning_rate = 0.01
training_epochs = 1000
display_step = 50
data = pd.read_csv('forestfires.csv')
y = data.temp
x = data.drop('temp', axis=1)
train_x, test_x, train_y, test_y = train_test_split(x, y,test_size=0.2)
n_samples = train_x.shape[0]
n_features = train_x.shape[1]
X = tf.placeholder('float', [None, n_features])
Y = tf.placeholder('float', [None, 1])
# Model weights.
W = tf.Variable(np.random.randn(n_features, 1), dtype='float32')
b = tf.Variable(np.random.randn(1), dtype='float32')
# Construct linear model.
prediction = tf.matmul(X, W) + b
loss = tf.reduce_sum(tf.pow(prediction - Y, 2))/(2 * n_samples)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Start training.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(training_epochs):
        for (x, y) in zip(train_x, train_y):
            sess.run(optimizer, feed_dict={X: train_x,
                                           Y: train_y})
        # Display logs per epoch step.
        if (epoch + 1) % display_step == 0:
            c = sess.run(loss, feed_dict={X: train_x,
                                          Y: train_y})
            print('Epoch:', '%04d' % (epoch+1), 'cost=', '{:.9f}'.format(c),
                  'W=', sess.run(W), 'b=', sess.run(b))
    print('Training Done!')
    training_cost = sess.run(loss, feed_dict={X: train_x,
                                              Y: train_y})
    print('Training cost=', training_cost, 'W=', sess.run(W), 'b=', sess.run(b), '\n')
    # Graphic display.
    plt.plot(train_x, train_y, 'ro', label='Original data')
    plt.plot(train_x, sess.run(W) * train_x + sess.run(b), label='Fitted line')
    plt.legend()
    plt.show()
Can anyone help me read the data correctly in a fairly general way? Data snapshot:
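
For reference, the 'mon' in the error message looks like a day-of-week string, which suggests the CSV has text columns that cannot be fed to a float placeholder as they are. Below is a minimal sketch of one possible way to read such a file, assuming that one-hot encoding the text columns with `pd.get_dummies` is acceptable. The helper name `load_features_and_labels` and the inline sample rows are hypothetical, invented only to make the sketch runnable; they are not the real contents of forestfires.csv. The labels are also reshaped to a column so they match a placeholder of shape `[None, 1]`.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for forestfires.csv: made-up rows, used only so the
# sketch runs on its own. Replace io.StringIO(SAMPLE_CSV) with the real path.
SAMPLE_CSV = """month,day,RH,wind,temp
mar,fri,51,6.7,8.2
oct,tue,33,0.9,18.0
oct,sat,33,1.3,14.6
mar,fri,97,4.0,8.3
mar,sun,99,1.8,11.4
aug,sun,29,5.4,22.2
aug,mon,27,3.1,24.1
sep,tue,40,2.7,21.2
"""


def load_features_and_labels(csv_source, label_column):
    """Read a CSV and return (features, labels) as float32 NumPy arrays."""
    data = pd.read_csv(csv_source)
    # Labels as a column vector of shape (n_samples, 1).
    labels = data[label_column].to_numpy(dtype="float32").reshape(-1, 1)
    features = data.drop(label_column, axis=1)
    # Replace every text column (e.g. 'month', 'day') with 0/1 indicator columns.
    features = pd.get_dummies(features)
    return features.astype("float32").to_numpy(), labels


features, labels = load_features_and_labels(io.StringIO(SAMPLE_CSV), "temp")
train_x, test_x, train_y, test_y = train_test_split(
    features, labels, test_size=0.25, random_state=0
)
print(train_x.shape, train_y.shape)
```

Encoding before the split keeps the training and test sets on the same set of columns; the resulting arrays are purely numeric, so they can be passed to `feed_dict` directly.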
