Why do the eigenvectors produced by np.linalg.eig differ from the PCA components stored in an instance of the PCA object?

data-mining machine-learning scikit-learn pca
2022-02-24 14:23:33

I am trying to understand why eVec (produced by np.linalg.eig) differs from pca.components_.T of an instance of the PCA class. As I understand it, the eigenvectors of the covariance matrix, sorted in descending order of their eigenvalues, are the principal components.

A simple explanation would be much appreciated.

import pandas as pd
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

df = pd.read_csv(
'https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv'
)

df = df.drop(['model', 'vs', 'am'], axis = 1)

df = df.apply(lambda x: pd.to_numeric(x))

M = df.to_numpy()

Mnorm = M-np.mean(M, axis=0)

Mnorm = Mnorm/np.std(M, axis=0)
#  This is the normalized source data.
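#  Note: np.std defaults to ddof=0 (population std), which is also what StandardScaler uses.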

C = (Mnorm.T @ Mnorm) / (Mnorm.shape[0] - 1)
#  This is the Covariance Matrix without bias.
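#  Equivalently, np.cov(Mnorm, rowvar=False) produces the same matrix.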

eVal1, eVec1 = np.linalg.eig(C)

eVal = eVal1[np.flip(np.argsort(eVal1))]
#  eVal holds the eigenvalues sorted in descending order.

eVec = eVec1[np.flip(np.argsort(eVal1))]
#  The same sort order as above is applied to the eigenvectors.

### From sklearn:

scaler = StandardScaler()

scaler = scaler.fit(df.to_numpy())

Anorm = scaler.transform(df.to_numpy())   

pca = PCA(n_components=9)

pca_transform = pca.fit_transform(Anorm)

assert (Mnorm == Anorm).all().all()
#  This tests that Mnorm was probably constructed correctly.

assert (C.round(10) == pca.get_covariance().round(10)).all().all()
#  This indicates that the Covariance Matrix (C) was constructed correctly - the rounding is arbitrary.

assert (eVec.round(5) == pca.components_.T.round(5)).all().all()
#  However, eVec and pca.components_.T are not equal.
1 Answer

Two issues:

  1. Your sort is incorrect:

eVec = eVec1[np.flip(np.argsort(eVal1))]

sorts the rows of the matrix, but you want to sort the columns. Replacing it with

eVec = eVec1[:, np.flip(np.argsort(eVal1))]

fixes this.

  2. The signs of the eigenvectors are sometimes flipped. (That's fine: an eigenvector is only defined up to scale, and while both np.linalg and sklearn return unit-length eigenvectors, that still leaves two possible choices per vector. I'm not sure how each package ultimately picks one.) See the sketch below for a comparison that accounts for this.
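
A minimal sketch combining both fixes (reusing eVec1, eVal1, and pca from the question; np is NumPy):

order = np.flip(np.argsort(eVal1))

eVec = eVec1[:, order]
#  Columns are now ordered by descending eigenvalue.

signs = np.sign(np.sum(eVec * pca.components_.T, axis=0))
#  Column-wise dot products reveal which columns point opposite to sklearn's; signs flips those columns.

assert np.allclose(eVec * signs, pca.components_.T)
#  After the column sort and sign alignment, the eigenvectors match the principal components up to numerical precision.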