How to choose the number of components in independent component analysis?

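In the absence of a good a priori guess for the number of components to request in independent component analysis, I am looking to automate the selection process. I think a reasonable criterion might be the number that minimizes the global evidence for correlation among the computed components. Here is pseudocode for this approach: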

for each candidate number of components, n:
    run ICA specifying n as requested number of components
    for each pair (c1,c2) of resulting components:
        compute a model, m1: lm(c1 ~ 1)
        compute a model, m2: lm(c1 ~ c2)
        compute penalized log likelihood ratio ( AIC(m1)-AIC(m2) ) representing the evidence for a correlation between c1 & c2
    compute mean log likelihood ratio across pairs
Choose the final number of components as that which minimizes the mean log likelihood ratio of component relatedness
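
The inner loop of the pseudocode can be sketched in plain Python as below. This is only an illustration under stated assumptions: `lm` is taken to mean ordinary least squares with Gaussian errors, AIC counts the error variance as a parameter, each pair is scored as AIC(m1) - AIC(m2) so that larger values mean stronger evidence that c2 helps predict c1, and no pair of components is perfectly collinear. The function names `gaussian_aic`, `pair_evidence` and `mean_pair_evidence` are hypothetical, and the ICA step itself is left out.

```python
import math
from itertools import combinations


def gaussian_aic(rss, n, k):
    """AIC of a least-squares fit with Gaussian errors.

    k counts every estimated parameter, including the error variance.
    """
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return 2 * k - 2 * log_lik


def pair_evidence(c1, c2):
    """Return AIC(m1) - AIC(m2) for m1: c1 ~ 1 and m2: c1 ~ c2."""
    n = len(c1)
    mean1 = sum(c1) / n
    mean2 = sum(c2) / n
    sxx = sum((x - mean2) ** 2 for x in c2)
    sxy = sum((x - mean2) * (y - mean1) for x, y in zip(c2, c1))
    rss1 = sum((y - mean1) ** 2 for y in c1)  # residuals of the intercept-only model
    rss2 = rss1 - sxy * sxy / sxx             # residuals after regressing c1 on c2
    # Assumption: rss2 > 0, i.e. c1 and c2 are not perfectly collinear.
    return gaussian_aic(rss1, n, k=2) - gaussian_aic(rss2, n, k=3)


def mean_pair_evidence(components):
    """Average the pairwise evidence over all unordered pairs of components."""
    scores = [pair_evidence(a, b) for a, b in combinations(components, 2)]
    return sum(scores) / len(scores)
```

To use it, one would pass the estimated components as a list of equal-length sequences, one per component, for each candidate n and keep the n with the smallest mean. With scikit-learn, for instance, the rows of `FastICA(n_components=n).fit_transform(X).T` could play that role; that pairing is an assumption of this sketch, not something stated in the question.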

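I think this should automatically penalize candidates larger than the "true" number of components, because the ICAs resulting from such candidates should be forced to distribute information from a single true component across multiple estimated components, thereby increasing the average evidence of correlation across pairs of components.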

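Does this make sense? If so, is there a faster way to compute an aggregate measure of relatedness across the estimated components than the mean log likelihood ratio approach suggested above (which can be rather slow computationally)? If this approach doesn't make sense, what might a good alternative procedure look like?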
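
On the speed question, one observation may be worth writing down. Assuming that m1 and m2 are ordinary least-squares fits with Gaussian errors and that AIC counts the error variance as a parameter, the per-pair AIC difference depends on the data only through the sample Pearson correlation r between c1 and c2. Here N is the number of samples (an added symbol, to avoid clashing with n, the number of components), S_11 is the sum of squared deviations of c1 from its mean, and k_m is the parameter count of model m:

```latex
\begin{align}
\mathrm{AIC}(m) &= N \log\frac{\mathrm{RSS}_m}{N} + N\,(1 + \log 2\pi) + 2 k_m, \\
\mathrm{RSS}_{m_1} &= S_{11}, \qquad \mathrm{RSS}_{m_2} = (1 - r^2)\, S_{11}, \qquad k_{m_1} = 2,\; k_{m_2} = 3, \\
\mathrm{AIC}(m_1) - \mathrm{AIC}(m_2) &= -N \log\!\left(1 - r^2\right) - 2 .
\end{align}
```

Under those assumptions, the scores for all pairs could be read off a single correlation matrix of the components instead of fitting two models per pair, and AIC(m2) - AIC(m1) is simply the negative of the last line.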