TensorFlow - Logistic Regression - OneHotEncoder - transformed arrays have different sizes for train and test

data-mining tensorflow logistic-regression
2021-10-03 14:52:06
x_train = tr1.loc[:, ['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width']]
# x_train.shape - (120 x 4)

y_train = tr1.loc[:, ['Species']]
# shape - 120 x 3

x_test = test1.loc[:, ['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width']]
# shape 30 x 4
y_test = test1.loc[:, ['Species']]
# shape 30 x 3

oneHot = OneHotEncoder()
oneHot.fit(x_train)
# transform
x_train = oneHot.transform(x_train).toarray()
# fit our y to oneHot encoder
oneHot.fit(y_train)
# transform
y_train = oneHot.transform(y_train).toarray()

oneHot.fit(x_test)
# transform
x_test = oneHot.transform(x_test).toarray()
# fit our y to oneHot encoder
oneHot.fit(y_test)
# transform
y_test = oneHot.transform(y_test).toarray()

print("Our features X_test1 in one-hot format")
print(x_test)

Shape of x_train: (120, 15), shape of y_train: (120, 3), shape of x_test: (30, 14), shape of y_test: (30, 3)

a) Why is x_test of size 30 x 14 after the transform? I assumed it would have to be 30 x 15.

# hyperparameters
learning_rate = 0.0001
num_epochs = 100
display_step = 1

# for visualization purposes in TensorBoard we use tf.name_scope
with tf.name_scope("Declaring_placeholder"):
    # x is the placeholder for iris features. We will feed data later on
    x = tf.placeholder(tf.float32, shape=[None, 15])
    # y is the placeholder for iris labels. We will feed data later on
    y = tf.placeholder(tf.float32, shape=[None, 3])

with tf.name_scope("Declaring_variables"):
    # W is our weights. This will be updated during training
    W = tf.Variable(tf.zeros([15, 3]))
    # b is our bias. This will also be updated during training
    b = tf.Variable(tf.zeros([3]))

with tf.name_scope("Declaring_functions"):
    # our prediction function
    y_ = tf.nn.softmax(tf.add(tf.matmul(x, W), b))

b) Have I defined x, y, W, and b correctly? When I run the accuracy computation I get this error: "ValueError: Cannot feed value of shape (30, 14) for Tensor 'Declaring_placeholder_10/Placeholder:0', which has shape '(?, 15)'"

1 Answer

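Your shape is (30, 14) rather than (30, 15) because your test set contains only 14 unique values (one of the values seen in training is missing). In any case, you should not fit the encoder on the test set; fit it on the training set only. Then simply transform the test set with that same fitted encoder and you will get the right dimensions.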
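A minimal sketch of that fit-on-train, transform-on-test pattern, using made-up toy values rather than the iris data from the question. The `handle_unknown="ignore"` argument is my own addition and an assumption about what you want: without it, a test value that never appeared in training raises an error, whereas with it such a value is encoded as all zeros.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Toy stand-ins for one feature column (hypothetical values, not the iris data).
x_train = np.array([[5.1], [4.9], [6.3], [5.8]])
x_test = np.array([[4.9], [6.3]])  # 5.1 and 5.8 never appear in the test set

# Fitting a separate encoder on the test set reproduces the mismatch:
# it only sees 2 unique values, so it only produces 2 columns.
wrong = OneHotEncoder().fit(x_test).transform(x_test).toarray()

# Fit once on the training data, then reuse the same encoder for both sets.
encoder = OneHotEncoder(handle_unknown="ignore")  # assumption: ignore unseen values
encoder.fit(x_train)
x_train_enc = encoder.transform(x_train).toarray()
x_test_enc = encoder.transform(x_test).toarray()

print(x_train_enc.shape)  # (4, 4)
print(x_test_enc.shape)   # (2, 4)
print(wrong.shape)        # (2, 2)
```

Applied to the question's code, this would mean dropping the `oneHot.fit(x_test)` and `oneHot.fit(y_test)` calls and keeping one fitted encoder for the features and another for the labels.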

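As far as I can tell, W and b are declared correctly. That said, I would ask you to take a little more time next time to format your question better.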
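To see why the declarations are consistent, here is a rough NumPy sketch of the same shapes (NumPy is used only as a stand-in for the TensorFlow graph; the `softmax` helper and the all-zero inputs are illustrative assumptions). A batch of shape (N, 15) multiplied by W of shape (15, 3), plus b of shape (3,), gives (N, 3), while a (30, 14) batch cannot be multiplied by W at all.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax; subtracting the row max keeps exp() numerically stable.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

n_features, n_classes = 15, 3
W = np.zeros((n_features, n_classes))  # same shape as W in the question
b = np.zeros(n_classes)                # same shape as b in the question

# A batch with 15 encoded columns matches W.
x_batch = np.zeros((30, n_features))
probs = softmax(x_batch @ W + b)
print(probs.shape)  # (30, 3)

# A batch with only 14 columns does not.
x_bad = np.zeros((30, 14))
try:
    x_bad @ W
except ValueError as err:
    print("shape mismatch:", err)
```

So the error in the question comes from the 14-column test array, not from how x, y, W, or b are declared.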