数据挖掘 - MCMC分层线性模型的不同强度先验 - 吾爱随笔录

我正在使用 Pymc3 训练 MCMC 模型。

我的目标是建立一系列线性回归模型，这些模型将根据要卸载的板条箱数量来预测卡车的卸载时间。

我有 2000 个位置的数据，它们分为 4 类位置。同一类别的地点往往具有相似的卸载时间。

对于每个位置，我都有一系列记录的数据点：

卡车上的板条箱数量
卸载需要多长时间

我想为这种形式的每个位置训练一个线性模型：

y = α + βx 其中 y 是时间，x 是箱子的数量。

对于某些位置，我可能只有 20 个数据点，而其他位置我可能有 10,000 个。

显然，当我有 10,000 个数据点时，有足够的数据点在该位置训练一个简单的线性回归，而无需其他输入。

但是，当我有少量数据时，例如 20 个点，我希望在该类别中位置的所有数据点上训练一个总体模型，并将其 α 和 β 作为该位置的 α 和 β 的先验.

我一直在关注https://github.com/parsing-science/pymc3_quickstart_guide/blob/master/Quickstart_Guide_no_errors.ipynb上的非常有用的示例。

我的问题是：

如果我的数据点很少，我如何改变先验的强度，使其成为一个非常强大的先验，并且对于我有足够数据点的位置，它会减弱为零？
我应该为 sigma_alpha 和 sigma_beta（α 和 β 的先验分布的标准差）设置什么值？Pymc3 示例使用均匀分布，最好的办法是通过在每个位置训练线性模型并获取参数的标准差来计算近似分布？

这是我的 MCMC 代码：

import theano.tensor as T
import numpy as np
import pymc3 as pm
import pandas as pd

# load pandas dataframe df_train which has columns location_id, crates, unloading_time
# load location_category_lookup_dict which gives the category of each location
# load category_model_lookup which gives the pre-trained linear model for each of the 4 categories

num_features = 1

grouped = df_train.groupby(["location_id"])

location_no_to_mcmc_models = {}

for location_id, drops_at_this_location in grouped:
    location_category_id = location_category_lookup_dict[location_id]

    # Look up the intercept and coefficient in the previously trained linear model for this category
    model_for_this_locations_category = category_model_lookup[location_category_id]

    alpha_this_category = model_for_this_locations_category.intercept_
    beta_this_category = model_for_this_locations_category.coef_

    # Get the features (number of crates) and regressand (unloading time)
    features = np.asarray([[c] for c in drops_at_this_location["crates"]])
    y = np.asarray(drops_at_this_location["unloading_time"])

    # Set up the PyMC model
    hlm = pm.Model()
    with hlm:

        # Both alpha and beta are drawn from similar distributions
        sigma_alpha = pm.Uniform("sigma_alpha", .0, 20, testval=2.)

        sigma_beta = pm.Uniform("sigma_beta", .0, 4, testval=2.)

        # Simulate the alphas
        alpha = pm.Normal("alpha", alpha_this_category, sigma_alpha, shape=(1))

        # Simulate the betas
        beta = pm.Normal('beta', beta_this_category, sigma_beta, shape=(1, num_features))

        # Simulate the noise
        s = pm.Uniform("s", .01, 10, shape=1)

        yd = pm.Normal('y', mu=alpha[0] + T.sum(beta[0]*features, 1), tau=s[0] ** -2, observed=y)

        step = pm.NUTS()

    tr = pm.sample(int(1e4), step, model=hlm)

    mcmc_summary = pm.summary(tr)

    location_no_to_mcmc_models[location_no] = mcmc_summary

    print (location_id, "intercept:", mcmc_summary["mean"]["alpha__0"], "weight:", mcmc_summary["mean"]["beta__0_0"])