Over the past few days I've started using pymc3, and after getting a handle on the basics I tried to implement a probabilistic matrix factorization (PMF) model.
For validation I'm using a subset of the Jester dataset. I took the first 100 users who rated all 100 jokes, and kept only the first 20 jokes, leaving the ratings unchanged; all ratings are on the [-10, 10] scale. For reference, I've included this subset here.
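In case it helps to reproduce the setup, here is a rough sketch of how a subset like this could be extracted; the raw file name ('jester_ratings_full.csv') and its layout (one row per user, one column per joke, 99 marking an unrated joke) are assumptions for illustration, not the exact script I used.

import numpy as np
import pandas as pd

# Assumed raw layout: one row per user, one column per joke,
# ratings in [-10, 10], with 99 meaning "not rated" (hypothetical file).
raw = pd.read_csv('jester_ratings_full.csv')
raw = raw.replace(99, np.nan)

dense = raw.dropna()            # users who rated every joke
subset = dense.iloc[:100, :20]  # first 100 such users, first 20 jokes
subset.to_csv('jester-dense-subset-100x20.csv', index=False)

The code for the model itself follows.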
import pymc3 as pm
import numpy as np
import pandas as pd
import theano
data = pd.read_csv('jester-dense-subset-100x20.csv')
n, m = data.shape
test_size = m // 10  # integer division so the slice index is an int
train_size = m - test_size
train = data.copy()
train.iloc[:,train_size:] = np.nan # remove test set data
train[train.isnull()] = train.mean().mean() # mean value imputation
train = train.values
test = data.copy()
test.iloc[:,:train_size] = np.nan # remove train set data
test = test.values
# Low precision reflects uncertainty; prevents overfitting
alpha_u = alpha_v = 1/np.var(train)
alpha = np.ones((n,m)) * 2 # fixed precision for likelihood function
dim = 10 # dimensionality
# Specify the model.
with pm.Model() as pmf:
    pmf_U = pm.MvNormal('U', mu=0, tau=alpha_u * np.eye(dim),
                        shape=(n, dim))
    pmf_V = pm.MvNormal('V', mu=0, tau=alpha_v * np.eye(dim),
                        shape=(m, dim))
    pmf_R = pm.Normal('R', mu=theano.tensor.dot(pmf_U, pmf_V.T),
                      tau=alpha, observed=train)
    # Find mode of posterior using optimization
    start = pm.find_MAP()  # Find starting values by optimization
This all seems to run fine, but find_MAP ends up producing values that are all 0 for both U and V, which can be seen by running:
(start['U'] == 0).all()
(start['V'] == 0).all()
I'm fairly new to both Bayesian modeling and pymc, so I may well be missing something obvious here. Why am I getting all zeros?