Automating logistic regression

data-mining logistic-regression
2022-02-20 10:07:16

I have three datasets, and I want to run a logistic regression on each of them after a train/test split:

  1. X1,y1

  2. X2,y2

  3. X3,y3

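How can I write a loop that splits each dataset with scikit-learn's train_test_split, fits a logistic regression on it automatically, and prints 3 separate accuracy results, one per dataset?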

To run it on a single dataset X, y, I do the following:

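from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Keep the row order: the last 25% of the rows become the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=False)

logreg = LogisticRegression()
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)

print('Accuracy:', metrics.accuracy_score(y_test, y_pred))

1 Answer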

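Store the pairs (X1, y1), (X2, y2), (X3, y3) in a dictionary. Arrays and DataFrames are not hashable, so they cannot be used as dictionary keys; key the dictionary by name and store each (X, y) pair as the value:

# Imports are the same as in the question
datasets = {'X1': (X1, y1), 'X2': (X2, y2), 'X3': (X3, y3)}
accuracies = []
for name, (X, y) in datasets.items():
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, shuffle=False)
    logreg = LogisticRegression()
    logreg.fit(X_train, y_train)
    y_pred = logreg.predict(X_test)
    accuracies.append(metrics.accuracy_score(y_test, y_pred))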

However, this is not a good approach, especially when the datasets are very large.


So, instead:

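accuracies = []
for i in range(1, 4):
    # Build the variable names 'X1', 'y1', ... as strings
    X = 'X{}'.format(i)
    y = 'y{}'.format(i)
    # vars() with no argument returns the current namespace as a dict,
    # so at module level vars()['X1'] is the object bound to X1
    X_train, X_test, y_train, y_test = train_test_split(
        vars()[X], vars()[y], test_size=0.25, shuffle=False)
    logreg = LogisticRegression()
    logreg.fit(X_train, y_train)
    y_pred = logreg.predict(X_test)
    accuracies.append(metrics.accuracy_score(y_test, y_pred))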
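
The loop above collects the accuracies but never prints them, which is what the question asked for. Below is a minimal, self-contained sketch of the whole workflow. It assumes synthetic data from make_classification as stand-ins for the real (X1, y1), (X2, y2), (X3, y3), and the helper name evaluate_datasets is hypothetical, not something from the original answer:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def evaluate_datasets(datasets, test_size=0.25):
    """Fit one LogisticRegression per (X, y) pair and return the test accuracies."""
    accuracies = []
    for X, y in datasets:
        # shuffle=False keeps the row order, as in the question
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, shuffle=False)
        logreg = LogisticRegression()
        logreg.fit(X_train, y_train)
        y_pred = logreg.predict(X_test)
        accuracies.append(accuracy_score(y_test, y_pred))
    return accuracies


# Assumption: three synthetic binary-classification datasets replace the real ones
datasets = [
    make_classification(n_samples=200, n_features=5, random_state=seed)
    for seed in range(3)
]

accuracies = evaluate_datasets(datasets)
for i, acc in enumerate(accuracies, start=1):
    print('Accuracy for dataset {}: {:.3f}'.format(i, acc))
```

With real data, the list would simply be [(X1, y1), (X2, y2), (X3, y3)].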

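Note: to access a handful of variables whose names differ only by a number, such as X1, X2, ... here, we can look them up by name as in the method above: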

import numpy as np
X1 = np.arange(1, 10)
y1 = [i**2 for i in X1]
X2 = np.arange(-5, 5)
y2 = [i**2 for i in X2]
for i in range(1, 3):
    X = 'X{}'.format(i)
    y = 'y{}'.format(i)
    print('X_{}'.format(i), vars()[X])
    print('y_{}'.format(i), vars()[y])

Output:

X_1 [1 2 3 4 5 6 7 8 9]
y_1 [1, 4, 9, 16, 25, 36, 49, 64, 81]
X_2 [-5 -4 -3 -2 -1  0  1  2  3  4]
y_2 [25, 16, 9, 4, 1, 0, 1, 4, 9, 16]
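
One caveat with the name lookup: vars() called with no argument behaves like locals(), so inside a function it would not see module-level variables such as X1. A sketch of an alternative that does not depend on scope is an explicit mapping from the dataset number to its pair; the name datasets is a hypothetical choice for illustration:

```python
import numpy as np

X1 = np.arange(1, 10)
y1 = (X1 ** 2).tolist()
X2 = np.arange(-5, 5)
y2 = (X2 ** 2).tolist()

# Hypothetical container: dataset number -> (X, y) pair
datasets = {1: (X1, y1), 2: (X2, y2)}

for i, (X, y) in datasets.items():
    print('X_{}'.format(i), X)
    print('y_{}'.format(i), y)
```

The same loop body works unchanged when moved into a function, because the data is passed around explicitly instead of being looked up by name.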