Data mining - Is it normal for different hidden-layer architectures to give the same classification results?

I have a dataset with 600 data points and about 10 (binary) attributes. The dataset is standardized:

from sklearn.preprocessing import StandardScaler
Xnormalized = StandardScaler().fit_transform(X)

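The output is binary (2 classes). I tried different MLP architectures, using 1 or 2 hidden layers with different numbers of nodes, and ran repeated stratified cross-validation as follows: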

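import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier

names = []
classifiersmlp = []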
for i in range(5, 26,5):
    names.append("mlp-"+str(i))
    classifiersmlp.append(MLPClassifier(solver='sgd', 
                                        random_state=1, 
                                        activation='tanh', 
                                        hidden_layer_sizes=[i]))
    for j in range(5,26,5):
        names.append("mlp-"+str(i)+"_"+str(j))
        classifiersmlp.append(MLPClassifier(solver='sgd', 
                                            random_state=1, 
                                            activation='tanh', 
                                            hidden_layer_sizes=[i,j]))

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=1)
scoring = {'accuracy': 'accuracy',
           'recall': 'recall',
           'precision': 'precision',
           'f1_score':'f1'}    
mlpResults = []
for name, clf in zip(names, classifiersmlp):
    print(name)
    cvresultMLP = cross_validate(clf, Xnormalized, y, cv=cv, scoring=scoring)
    mlpResults.append(cvresultMLP)
    print(np.mean(cvresultMLP['test_recall']))
    print(np.mean(cvresultMLP['test_precision']))
    print(np.mean(cvresultMLP['test_f1_score']))
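
To make explicit which architectures the nested loop above actually evaluates, here is a minimal sketch that only enumerates the names and layer sizes, without training anything. The helper name `architecture_grid` is my own and hypothetical; it is not part of scikit-learn.

```python
def architecture_grid(widths):
    """List (name, hidden_layer_sizes) pairs in the same order as the nested loop."""
    widths = list(widths)
    grid = []
    for i in widths:
        # one hidden layer with i nodes
        grid.append(("mlp-" + str(i), (i,)))
        for j in widths:
            # two hidden layers with i and j nodes
            grid.append(("mlp-" + str(i) + "_" + str(j), (i, j)))
    return grid


grid = architecture_grid(range(5, 26, 5))
print(len(grid))  # 5 single-layer models + 25 two-layer models
for name, sizes in grid[:3]:
    print(name, sizes)
```

So the comparison covers 30 models, each scored over 5 x 10 = 50 cross-validation folds.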

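The results are very similar for all architectures: each of the 3 evaluation metrics (for example, recall is about 78% and precision about 74%) differs by only 1-2%. Is it normal for the architectures to be equally good, or should some of them give better results? What does it mean that the results are so similar?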
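
One rough way to put the 1-2% differences in context is to compare the gap between the best and worst mean score with the fold-to-fold spread of each model. The sketch below is an informal heuristic, not a statistical test (folds from repeated cross-validation are not independent), and the helper names `summarize_scores` and `gap_versus_spread` are hypothetical.

```python
from statistics import mean, stdev


def summarize_scores(fold_scores):
    """Return (mean, sample standard deviation) of the per-fold scores."""
    scores = [float(s) for s in fold_scores]
    return mean(scores), stdev(scores)


def gap_versus_spread(results, metric):
    """Compare the best-minus-worst mean with the largest per-model std.

    results: dict mapping model name -> dict of metric name -> per-fold scores
    Returns (gap, largest_std, gap <= largest_std).
    """
    summary = {name: summarize_scores(res[metric]) for name, res in results.items()}
    means = [m for m, _ in summary.values()]
    largest_std = max(s for _, s in summary.values())
    gap = max(means) - min(means)
    return gap, largest_std, gap <= largest_std
```

With the variables from the code above, this could be called as `gap_versus_spread(dict(zip(names, mlpResults)), 'test_recall')`. If the third value is True, the spread across folds is at least as large as the difference between architectures.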

After @yohanesalfredo commented that the scaling was being done outside the cross-validation, I updated the code:

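import numpy as np
from sklearn import preprocessing
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

names = []
classifiersmlp = []
for i in [5, 25, 50]:
    names.append("mlp-" + str(i))
    # one hidden layer; note that (i,) is a tuple while (i) is just an int
    classifiersmlp.append(MLPClassifier(solver='sgd',
                                        learning_rate_init=0.01,
                                        random_state=1,
                                        activation='tanh',
                                        hidden_layer_sizes=(i,)))
    for j in [5, 20]:
        names.append("mlp-" + str(i) + "_" + str(j))
        # two hidden layers with i and j nodes
        classifiersmlp.append(MLPClassifier(solver='sgd',
                                            random_state=1,
                                            learning_rate_init=0.01,
                                            activation='tanh',
                                            hidden_layer_sizes=(i, j)))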


scoring = {'accuracy': 'accuracy',
           'recall': 'recall',
           'precision': 'precision',
           'f1_score': 'f1',  # according to the docs, scored for the positive (1) label only
           'roc_auc': 'roc_auc'}
mlpResults = []
rand_state=1
for name, clf_temp in zip(names, classifiersmlp):
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=rand_state)
    rand_state+=1        
    classifier_pipeline = make_pipeline(preprocessing.StandardScaler(), clf_temp)
    cvresultMLP = cross_validate(classifier_pipeline, X, y, cv=cv, scoring=scoring)
    mlpResults.append(cvresultMLP)
    print(np.mean(cvresultMLP['test_recall']))
    print(np.mean(cvresultMLP['test_precision']))
    print(np.mean(cvresultMLP['test_f1_score'])) 
    print(np.mean(cvresultMLP['test_roc_auc']))
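
For reference, what putting `StandardScaler` inside the pipeline changes is that the mean and standard deviation are estimated from the training part of each fold only and then reused on the held-out part. A minimal pure-Python sketch of that per-fold step follows; the helper names `fit_scaler` and `apply_scaler` are hypothetical, and the sketch assumes, as `StandardScaler` does, the population standard deviation and a scale of 1 for constant columns.

```python
from statistics import mean, pstdev


def fit_scaler(train_rows):
    """Estimate per-column mean and population std from the training rows only."""
    columns = list(zip(*train_rows))
    means = [mean(col) for col in columns]
    # a constant column has std 0; use 1.0 instead to avoid dividing by zero
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows with statistics that were fitted elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# tiny made-up fold, for illustration only
train_fold = [[0, 1], [1, 1], [1, 1], [0, 1]]
test_fold = [[1, 0]]
means, stds = fit_scaler(train_fold)              # fitted on the training fold
scaled_test = apply_scaler(test_fold, means, stds)  # held-out fold reuses those statistics
print(scaled_test)
```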