I am trying to split my dataset into 70% training, 15% test, and 15% validation.
train_X, test_X, train_Y, test_Y = train_test_split(data, labels, test_size=0.3, train_size=0.7, random_state=1, stratify=labels)
test_X, val_X, test_Y, val_Y = train_test_split(test_X, test_Y, test_size=0.5, random_state=1, stratify=labels)
But I'm not sure whether this code actually splits the test set in half. In addition, I keep getting this error:
29 def main():
30 data, labels = load_data()
---> 31 train_X, train_Y, val_X, val_Y, test_X, test_Y = process_data(data, labels)
32
33 best_model, best_k = select_knn_model(train_X, val_X, train_Y, val_Y)
/tmp/ipykernel_50/3409802801.py in process_data(data, labels)
45 X_counts = vectorizer.fit_transform(train_X)
46 X_count = vectorizer.transform(test_X)
---> 47 Xval = vectorizer.transform(Val_X)
48 # Return the training, validation, and test set inputs and labels
49
NameError: name 'Val_X' is not defined
How can I fix this?
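
For reference, here is a minimal sketch of what a working process_data might look like. Two things stand out in the code above. First, Python names are case-sensitive, so Val_X in the traceback is a different name from the val_X produced by the split, which is what raises the NameError. Second, the second split passes stratify = labels, but stratify has to have the same length as the arrays being split, so it should probably be the held-out labels instead. In this sketch, rest_X and rest_Y are names I introduced for the held-out 30%, the vectorizer is assumed to be a CountVectorizer (the original definition is not shown), and the toy data at the bottom is made up purely to check the split sizes.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split


def process_data(data, labels):
    # First split: 70% train, 30% held out, stratified on the full labels.
    train_X, rest_X, train_Y, rest_Y = train_test_split(
        data, labels, test_size=0.3, random_state=1, stratify=labels
    )
    # Second split: halve the held-out 30% into test and validation.
    # stratify must be rest_Y here, since it has to match the length
    # of the arrays being split (rest_X / rest_Y), not the full labels.
    test_X, val_X, test_Y, val_Y = train_test_split(
        rest_X, rest_Y, test_size=0.5, random_state=1, stratify=rest_Y
    )

    # Assumption: the vectorizer is a CountVectorizer fitted on training text only.
    vectorizer = CountVectorizer()
    train_X = vectorizer.fit_transform(train_X)
    val_X = vectorizer.transform(val_X)    # lowercase val_X, not Val_X
    test_X = vectorizer.transform(test_X)

    # Same return order that main() unpacks in the traceback.
    return train_X, train_Y, val_X, val_Y, test_X, test_Y


# Hypothetical toy data, only used to inspect the resulting split sizes.
data = [f"sample text number {i}" for i in range(100)]
labels = [i % 2 for i in range(100)]

train_X, train_Y, val_X, val_Y, test_X, test_Y = process_data(data, labels)
print(train_X.shape[0], val_X.shape[0], test_X.shape[0])
```

Printing the row counts is a quick way to confirm that test_size=0.5 on the held-out part really does cut it in half, giving roughly 70/15/15 overall.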