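You can think about your data in terms of a bivariate Markov chain. You have two variables, $X$ for the female and $Y$ for the male, which describe a stochastic process in which $X$ and $Y$ move, at each time $t$, into one of four states. Let $X_{t-1,i} \to X_{t,j}$ denote the transition of $X$ from the $i$-th state at time $t-1$ to the $j$-th state at time $t$. In this model, the transition to the next state depends on the previous state of $X$ and also on the previous state of $Y$: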
$$\Pr(X_{t-1,i} \to X_{t,j}) = \Pr(X_{t,j} \mid X_{t-1,i}, Y_{t-1,k})$$
$$\Pr(Y_{t-1,h} \to Y_{t,k}) = \Pr(Y_{t,k} \mid Y_{t-1,h}, X_{t-1,i})$$
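If $n^{X}_{k,i,j}$ (a symbol introduced only for this note) denotes the number of times $X$ moved from state $i$ to state $j$ while $Y$ was in state $k$ at the earlier time step, one standard way to estimate the conditional probability is to divide each count by the total for the same pair of previous states:

```latex
\widehat{\Pr}(X_{t,j} \mid X_{t-1,i}, Y_{t-1,k})
  = \frac{n^{X}_{k,i,j}}{\sum_{j'} n^{X}_{k,i,j'}}
```

The estimate for $Y$ is analogous, with the roles of the two sequences swapped. By construction, the estimates for a fixed pair $(k, i)$ sum to 1 over $j$.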
The transition probabilities are easy to estimate: count the observed transitions and normalize the counts afterwards:
states <- c("absent", "present", "attack", "threat")
# data is stored in 3-dimensional array, initialized with
# a very small "default" non-zero count to avoid zeros.
female_counts <- male_counts <- array(1e-16, c(4,4,4), list(states, states, states))
n <- length(male_seq)
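# Start at 2: every step needs the previous state (index i - 1).
# Dimension order: [other's previous state, own previous state, own next state].
for (i in 2:n) {
  male_counts[female_seq[i-1], male_seq[i-1], male_seq[i]] <- male_counts[female_seq[i-1], male_seq[i-1], male_seq[i]] + 1
  female_counts[male_seq[i-1], female_seq[i-1], female_seq[i]] <- female_counts[male_seq[i-1], female_seq[i-1], female_seq[i]] + 1
}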
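# Normalize over the next state (3rd dimension), so that each
# combination of previous states gets a distribution that sums to 1.
prop.table(male_counts, margin = c(1, 2))
prop.table(female_counts, margin = c(1, 2))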
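To make the counting and normalization step concrete, here is a minimal self-contained sketch on two short made-up sequences. `female_toy` and `male_toy` are invented for illustration, and `count_transitions` is a hypothetical helper, not part of the code above. It normalizes over the next state, so each combination of previous states gets its own distribution:

```r
# Toy sequences, made up for illustration (not the data from the question)
states <- c("absent", "present", "attack", "threat")
female_toy <- c("absent", "present", "present", "attack",
                "threat", "present", "absent", "present")
male_toy   <- c("absent", "absent", "present", "present",
                "attack", "threat", "present", "present")

# Hypothetical helper: counts[other_prev, own_prev, own_next],
# started at a tiny non-zero value as in the code above.
count_transitions <- function(own, other, states, eps = 1e-16) {
  k <- length(states)
  counts <- array(eps, c(k, k, k), list(states, states, states))
  for (t in 2:length(own)) {
    counts[other[t - 1], own[t - 1], own[t]] <-
      counts[other[t - 1], own[t - 1], own[t]] + 1
  }
  counts
}

male_toy_counts <- count_transitions(male_toy, female_toy, states)

# Conditional probabilities: divide by the total over the next state
male_toy_probs <- prop.table(male_toy_counts, margin = c(1, 2))

# Each (other_prev, own_prev) slice should sum to 1
row_sums <- apply(male_toy_probs, c(1, 2), sum)
range(row_sums)
```

Combinations of previous states that never occur in the data keep only the tiny default count, so their estimated distribution comes out uniform over the four states.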
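The chain is also easy to simulate, by drawing each next state with weights conditional on the two previous states: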
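nsim <- n  # assumption: simulate as many steps as were observed (nsim is not defined above)
male_sim <- female_sim <- "absent"
for (i in 2:nsim) {
  # sample() treats prob as weights, so the raw counts can be used directly
  male_sim[i] <- sample(states, 1, prob = male_counts[female_sim[i-1], male_sim[i-1], ])
  female_sim[i] <- sample(states, 1, prob = female_counts[male_sim[i-1], female_sim[i-1], ])
}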
The results of such a simulation are shown in the plot below.
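For readers without the original sequences, the simulation step can be sketched in a self-contained form. Everything here is an assumption for illustration: `make_sticky` builds hypothetical transition weights in which each animal tends to keep its own previous state, and `simulate_pair` is a hypothetical wrapper around the same sampling rule as the loop above.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
states <- c("absent", "present", "attack", "threat")

# Hypothetical weights, indexed [other_prev, own_prev, own_next]:
# staying in the same state is `stay` times as likely as each move.
make_sticky <- function(states, stay = 5) {
  k <- length(states)
  w <- array(1, c(k, k, k), list(states, states, states))
  for (s in states) w[, s, s] <- stay
  w
}
male_w <- female_w <- make_sticky(states)

# Both sequences are updated from the states at time t - 1
simulate_pair <- function(male_w, female_w, states, nsim, start = "absent") {
  male <- female <- rep(start, nsim)
  for (t in 2:nsim) {
    male[t]   <- sample(states, 1, prob = male_w[female[t - 1], male[t - 1], ])
    female[t] <- sample(states, 1, prob = female_w[male[t - 1], female[t - 1], ])
  }
  data.frame(t = seq_len(nsim), male = male, female = female)
}

sim <- simulate_pair(male_w, female_w, states, nsim = 50)
head(sim)
```

Replacing `male_w` and `female_w` with count arrays estimated from real sequences gives the same kind of simulation as in the answer.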
Moreover, the model can be used to make one-step-ahead predictions:
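male_pred <- female_pred <- rep(NA_character_, n)  # no prediction for the first step
for (i in 2:n) {
  # counts of next states, given both previous states
  curr_m <- male_counts[female_seq[i-1], male_seq[i-1], ]
  curr_f <- female_counts[male_seq[i-1], female_seq[i-1], ]
  # predict the most frequent next state; ties are broken at random
  male_pred[i] <- sample(names(curr_m)[curr_m == max(curr_m)], 1)
  female_pred[i] <- sample(names(curr_f)[curr_f == max(curr_f)], 1)
}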
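One consequence of this prediction rule is worth seeing in isolation: when all candidate states are tied, as happens for a combination of previous states that was never observed, the prediction is just a uniform random draw. A minimal sketch, using made-up weight rows and a hypothetical helper `predict_state` that applies the same rule:

```r
set.seed(42)  # arbitrary seed, only so the draws are reproducible

# Same rule as in the loop above: most frequent next state, ties broken at random
predict_state <- function(w) sample(names(w)[w == max(w)], 1)

# Hypothetical weight rows (made up, not taken from the question's data)
row_seen   <- c(absent = 1e-16, present = 3, attack = 1, threat = 1e-16)
row_unseen <- c(absent = 1e-16, present = 1e-16, attack = 1e-16, threat = 1e-16)

# A row with a clear maximum gives a deterministic prediction
predict_state(row_seen)

# A row with only default counts gives a different state from draw to draw
table(replicate(1000, predict_state(row_unseen)))
```

Because of this, repeated runs of the prediction loop can give slightly different accuracy figures unless a seed is set.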
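On the data you provided, this gives an accuracy of about 86% for the male sequence and 69% for the female sequence (measured on the same data that were used to estimate the counts):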
> mean(male_seq == male_pred, na.rm = TRUE)
[1] 0.8611111
> mean(female_seq == female_pred, na.rm = TRUE)
[1] 0.6944444
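If the transitions occurred at random, the transition probabilities would follow a discrete uniform distribution. This is not a proof, but it can serve as a way of thinking about your data using a simple model.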
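One rough way to compare a set of transition counts with the uniform benchmark is a chi-squared goodness-of-fit test on the next-state counts for a single combination of previous states. The sketch below uses hypothetical counts (`next_counts` is made up, not taken from your data), and the test is only an approximation that becomes unreliable when the expected counts are small:

```r
# Hypothetical counts of next states observed after one particular
# combination of previous states (made up for illustration)
next_counts <- c(absent = 2, present = 25, attack = 6, threat = 3)

# Goodness-of-fit test against the discrete uniform distribution
# over the four states
uniform_test <- chisq.test(next_counts, p = rep(1 / 4, 4))

uniform_test$statistic  # chi-squared statistic
uniform_test$p.value    # small values speak against uniform transitions
```

With real data the same call would be applied to one slice of the count array at a time, for example the counts for a given pair of previous states, after removing the tiny default values.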