凸包重叠概率的蒙特卡罗估计

机器算法验证 蒙特卡洛
2022-04-02 14:49:12

这是我的Math.SE帖子的统计版本。

给定自然数,均匀随机选择单位正方形内的点称为蓝点,将点称为红点(红点的凸包)与(蓝点的凸包)重叠的概率。如何通过蒙特卡洛(或者更好的)方法brb+rbrp(b,r)HrHbp(b,r)

我可以考虑对大量随机选择的测试用例的概率进行平均,但是如何以给我错误界限的方式对搜索空间进行采样?

1个回答

我知道@Ganesh 发布这个问题已经有一段时间了,但希望你仍然对回复感兴趣。

我已经编写了一些 R 代码,可以满足您的要求,我认为:

library(ggplot2)
library(MASS)

#################################################
#################################################
## Steps:                                      ##
##  1. draw b 'blue' points and r 'red' points ##
##  2. perform LDA                             ##
##  3. check error rate                        ##
#################################################
#################################################

##################################################
# function: drawPoints                           #
#                                                #
# description: draws b + r points uniformly from #
#              a unit square                     #
#                                                #
# inputs: b - number of 'blue' points to draw    #
#         r - number of 'red' points to draw     #
#                                                #
# outputs: data frame containing points          #
##################################################

drawPoints <- function(b,r) {
    x <- runif(b+r)
    y <- runif(b+r)
    class <- c(rep('b',b),rep('r',r))
    return(data.frame(x = x, y = y, class = class))
}

##################################################
# function: checkOverlap                         #
#                                                #
# description: if the data is linearly separable #
#              there is no overlap in the convex #
#              hulls of the different classes    #
#                                                #
# inputs: df - data frame containing classified  #
#              points                            #
#                                                #
# outputs: FALSE if 0 error rate                 #
#          TRUE otherwise                        #
##################################################

checkOverlap <- function(df) {
    disc.anal <- lda(class ~ x + y, data = df)
    return(!identical(predict(disc.anal)$class,df$class))
}

######################################################
######################################################
## Simulate many trials to estimate rate of overlap ##
######################################################
######################################################

trials <- 1000
performance <- rep(as.numeric(NA),10)
for(i in 1:10) {
    results <- replicate(n = trials, expr = checkOverlap(drawPoints(10,i)))
    performance[i] <- prop.table(table(results))['TRUE']
}
qplot(x = 1:length(performance), y = performance,
  xlab = 'Number of Red Points',
  ylab = 'Proportion of Simulations with Overlap',
  main = 'Proportion of Simulations with Overlap, 10 Blue Points')