机器算法验证 - 估计二维协方差误差椭圆的正确方法 - 吾爱随笔录

我知道这个问题，但我的问题是关于在 StackOverflow 上的两个竞争答案中获得 2D 协方差误差椭圆的两种竞争方式。

第一个答案获得椭圆的宽度和高度：

w = 2 \times n s t d \times \sqrt{λ_{1}}; h = 2 \times n s t d \times \sqrt{λ_{2}}

$w=2\times nstd\times \sqrt{\lambda_1} \;\;;\;\; h=2\times nstd\times \sqrt{\lambda_2}$

而第二个答案说：

w = 2 \times \sqrt{λ_{1} \times r 2}; h = 2 \times \sqrt{λ_{2} \times r 2}

$w=2\times \sqrt{\lambda_1\times r2} \;\;;\;\; h=2\times \sqrt{\lambda_2\times r2}$

在哪里 $(\lambda_1, \lambda_2)$ 是二维数据的协方差矩阵的特征值， $nstd$ 是我为椭圆设置的标准偏差（例如： $nstd=2$ 如果我想要 2-sigma 椭圆），以及 $r2$ 是它的卡方百分比点函数 $nstd$ .

下面蓝色的第一个答案总是给出一个较小的椭圆（两个答案的旋转角度相等）。获得协方差椭圆的正确方法是什么？

import numpy as np
from scipy.stats import norm, chi2
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse


def main(nstd=2.):
    """
    Generate an `nstd` sigma ellipse based on the mean and covariance of a
    point "cloud".
    """

    # Generate some random, correlated data
    cov = np.random.uniform(0., 10., (2, 2))
    points = np.random.multivariate_normal(mean=(0, 0), cov=cov, size=1000)

    # The 2x2 covariance matrix to base the ellipse on.
    cov = np.cov(points, rowvar=False)

    # Location of the center of the ellipse.
    mean_pos = points.mean(axis=0)

    # METHOD 1
    width1, height1, theta1 = cov_ellipse(points, cov, nstd)

    # METHOD 2
    width2, height2, theta2 = cov_ellipse2(points, cov, nstd)

    # Plot the raw points.
    x, y = points.T
    ax = plt.gca()
    plt.scatter(x, y, c='k', s=1, alpha=.5)
    # First ellipse
    ellipse1 = Ellipse(xy=mean_pos, width=width1, height=height1, angle=theta1,
                       edgecolor='b', fc='None', lw=2, zorder=4)
    ax.add_patch(ellipse1)
    # Second ellipse
    ellipse2 = Ellipse(xy=mean_pos, width=width2, height=height2, angle=theta2,
                       edgecolor='r', fc='None', lw=.8, zorder=4)
    ax.add_patch(ellipse2)
    plt.show()


def eigsorted(cov):
    '''
    Eigenvalues and eigenvectors of the covariance matrix.
    '''
    vals, vecs = np.linalg.eigh(cov)
    order = vals.argsort()[::-1]
    return vals[order], vecs[:, order]


def cov_ellipse(points, cov, nstd):
    """
    Source: http://stackoverflow.com/a/12321306/1391441
    """

    vals, vecs = eigsorted(cov)
    theta = np.degrees(np.arctan2(*vecs[:, 0][::-1]))

    # Width and height are "full" widths, not radius
    width, height = 2 * nstd * np.sqrt(vals)

    return width, height, theta


def cov_ellipse2(points, cov, nstd):
    """
    Source: https://stackoverflow.com/a/39749274/1391441
    """

    vals, vecs = eigsorted(cov)
    theta = np.degrees(np.arctan2(*vecs[::-1, 0]))

    # Confidence level
    q = 2 * norm.cdf(nstd) - 1
    r2 = chi2.ppf(q, 2)

    width, height = 2 * np.sqrt(vals * r2)

    return width, height, theta


if __name__ == '__main__':
    main()