Probabilistic matrix factorization (PMF) in PyMC3: MAP produces all 0s

2022-04-14 10:22:47

Over the past few days I've started working with pymc3, and after getting a feel for the basics, I tried implementing a probabilistic matrix factorization model.

For validation, I'm using a subset of the Jester dataset: the first 100 users who rated all 100 jokes. I keep only the first 20 jokes and leave the ratings unchanged; all ratings lie in [-10, 10]. For easy reference, I've made this subset available here.

import pymc3 as pm
import numpy as np
import pandas as pd
import theano

data = pd.read_csv('jester-dense-subset-100x20.csv')    
n, m = data.shape
test_size = m // 10  # integer division so the slices below get int bounds
train_size = m - test_size

train = data.copy()
train.iloc[:,train_size:] = np.nan  # remove test set data
train[train.isnull()] = train.mean().mean()  # mean value imputation
train = train.values

test = data.copy()
test.iloc[:,:train_size] = np.nan  # remove train set data
test = test.values    

# Low precision reflects uncertainty; prevents overfitting
alpha_u = alpha_v = 1/np.var(train)
alpha = np.ones((n,m)) * 2  # fixed precision for likelihood function
dim = 10  # dimensionality

# Specify the model.
with pm.Model() as pmf:
    pmf_U = pm.MvNormal('U', mu=0, tau=alpha_u * np.eye(dim),
                        shape=(n, dim))
    pmf_V = pm.MvNormal('V', mu=0, tau=alpha_v * np.eye(dim),
                        shape=(m, dim))
    pmf_R = pm.Normal('R', mu=theano.tensor.dot(pmf_U, pmf_V.T),
                      tau=alpha, observed=train)

    # Find mode of posterior using optimization
    start = pm.find_MAP()  # Find starting values by optimization

This all seems to work fine, but find_MAP ends up producing values of 0 for both U and V, as can be seen by running:

(start['U'] == 0).all()
(start['V'] == 0).all()

I'm fairly new to both Bayesian modeling and pymc, so I could easily be missing something obvious here. Why am I getting all zeros?

1 Answer

I did two things to fix your code. One is initializing the model away from zero (with small random test values instead of the all-zero default), and the other is using a gradient-free optimizer:

import pymc3 as pm
import numpy as np
import pandas as pd
import theano
import scipy as sp

data = pd.read_csv('jester-dense-subset-100x20.csv')    
n, m = data.shape
test_size = m // 10  # integer division so the slices below get int bounds
train_size = m - test_size

train = data.copy()
train.iloc[:,train_size:] = np.nan  # remove test set data
train[train.isnull()] = train.mean().mean()  # mean value imputation
train = train.values

test = data.copy()
test.iloc[:,:train_size] = np.nan  # remove train set data
test = test.values    

# Low precision reflects uncertainty; prevents overfitting
alpha_u = alpha_v = 1/np.var(train)
alpha = np.ones((n,m)) * 2  # fixed precision for likelihood function
dim = 10  # dimensionality

# Specify the model.
with pm.Model() as pmf:
    pmf_U = pm.MvNormal('U', mu=0, tau=alpha_u * np.eye(dim),
                        shape=(n, dim), testval=np.random.randn(n, dim)*.01)
    pmf_V = pm.MvNormal('V', mu=0, tau=alpha_v * np.eye(dim),
                        shape=(m, dim), testval=np.random.randn(m, dim)*.01)
    pmf_R = pm.Normal('R', mu=theano.tensor.dot(pmf_U, pmf_V.T),
                      tau=alpha, observed=train)

    # Find mode of posterior using optimization
    start = pm.find_MAP(fmin=sp.optimize.fmin_powell)  # Find starting values by optimization

    step = pm.NUTS(scaling=start)
    trace = pm.sample(500, step, start=start)
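To see why both fixes help, note that U = V = 0 is a stationary point of the log-posterior: the likelihood gradient with respect to U is proportional to (R − UVᵀ)V, which vanishes when V = 0 (and symmetrically for V), and the zero-mean Gaussian priors also have their mode at zero, so a gradient-based optimizer started at the origin never moves. A minimal numeric sketch of this, using a toy 2×2 rating matrix rather than the Jester data:

```python
import numpy as np

# Toy 2x2 rating matrix (NOT the Jester data) and the likelihood precision
R = np.array([[1.0, 2.0],
              [3.0, 4.0]])
alpha = 2.0

def neg_log_lik(U, V):
    # Gaussian likelihood term of the PMF model; the zero-mean Gaussian
    # priors on U and V also have their mode at the origin
    return 0.5 * alpha * np.sum((R - U @ V.T) ** 2)

U0 = np.zeros((2, 1))  # rank-1 factors, started at the origin
V0 = np.zeros((2, 1))

# Finite-difference gradient w.r.t. one entry of U at the origin:
# perturbing U changes nothing while V is exactly zero
eps = 1e-6
U_eps = U0.copy()
U_eps[0, 0] = eps
grad = (neg_log_lik(U_eps, V0) - neg_log_lik(U0, V0)) / eps
print(grad)  # 0.0 -- the origin is a stationary (saddle) point
```

Random `testval`s move the starting point off this stationary point, and Powell's method doesn't rely on the (zero) gradient at all, so either change alone already helps.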

This is an interesting model and would make a great contribution. Once you're confident it works correctly, please consider adding it to the examples folder and submitting a pull request.
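As a follow-up, one way to sanity-check the fitted factors is to reconstruct the rating matrix from the MAP estimate and score it against the held-out columns. The snippet below uses small random stand-ins for `start['U']`, `start['V']`, and `test` purely so it runs standalone; in the real script you'd use those variables directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins, for illustration only: in the real script these are
# start['U'] and start['V'] from find_MAP, and the `test` matrix built above
U_map = 0.1 * rng.standard_normal((100, 10))
V_map = 0.1 * rng.standard_normal((20, 10))
test = np.full((100, 20), np.nan)
test[:, 18:] = rng.uniform(-10, 10, size=(100, 2))  # held-out columns

# Reconstruct ratings and clip to the Jester scale [-10, 10]
R_hat = np.clip(U_map @ V_map.T, -10, 10)

# RMSE over the held-out cells only (the non-NaN entries of `test`)
mask = ~np.isnan(test)
rmse = np.sqrt(np.mean((R_hat[mask] - test[mask]) ** 2))
print('test RMSE:', round(rmse, 2))
```

For a fully Bayesian evaluation you could instead average U·Vᵀ over the posterior samples in `trace` rather than using the MAP point alone.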