机器算法验证 - R中的蒙特卡罗模拟 - 吾爱随笔录

R中的蒙特卡罗模拟

机器算法验证 r 蒙特卡洛排队

2022-03-01 23:42:24

我正在尝试解决以下练习，但实际上我不知道如何开始执行此操作。我在我的书中找到了一些看起来像的代码，但这是一个完全不同的练习，我不知道如何将它们相互关联。如何开始模拟到达以及如何知道它们何时完成？我知道如何存储它们并据此计算 a、b、c、d。但我不知道我实际上需要如何模拟蒙特卡罗模拟。有人可以帮我开始吗？我知道这不是为您解答问题的地方，而只是解决了问题。但问题是我不知道如何开始。

IT 支持服务台代表一个排队系统，其中有五名助理接听客户的电话。调用按照泊松过程发生，平均每 45 秒调用一次。1、2、3、4、5 助手的服务时间均为指数型随机变量，参数分别为 λ1 = 0.1、λ2 = 0.2、λ3 = 0.3、λ4 = 0.4 和 λ5 = 0.5 min−1（即第 j 个帮助台助理有 λk = k/10 min−1)。除了正在接受帮助的客户外，最多可以暂停其他十位客户。有时，当达到此容量时，新呼叫者会收到忙音信号。使用蒙特卡罗方法估计以下性能特征，

(a) 收到忙碌信号的客户比例；

(b) 预期响应时间；

(d) 每个服务台助理服务的客户比例；

编辑：到目前为止我所拥有的是（不多）：

pa = 1/45sec-1

jobs = rep(1,5); onHold = rep(1,10);

jobsIndex = 0;

onHoldIndex = 0;

u = runif(1)
for (i in 1:1000) {

    if(u  <= pa){ # new arrival

        if(jobsIndex < 5) # assistant is free, #give job to assistant

            jobsIndex++;

        else #add to onHold array

            onHoldIndex++;
    }
}

1个回答

这是最有启发性和最有趣的模拟类型之一：您在计算机中创建独立的代理，让他们互动，跟踪他们所做的事情，并研究发生了什么。这是了解复杂系统的绝妙方法，尤其是（但不限于）那些无法通过纯数学分析来理解的系统。

构建此类模拟的最佳方法是自上而下的设计。

在最高级别，代码应该看起来像

initialize(...)
while (process(get.next.event())) {}

（这个和所有后续示例都是可执行 代码，而R不仅仅是伪代码。）循环是一个事件驱动的模拟：get.next.event()找到任何感兴趣的“事件”并将其描述传递给process有关它的信息）。TRUE只要事情运行良好，它就会返回；在识别错误或模拟结束时，它返回FALSE，结束循环。

如果我们想象这个队列的物理实现，例如人们在纽约市等待结婚证或几乎在任何地方等待驾驶执照或火车票，我们会想到两种代理：客户和“助手”（或服务器） . 客户通过出现来宣布自己；助手通过打开灯或标志或打开窗户来宣布他们的可用性。这是要处理的两种事件。

这种模拟的理想环境是真正面向对象的环境，其中对象是可变的：它们可以改变状态以独立响应周围的事物。 R这绝对是可怕的（即使是 Fortran 也会更好！）。但是，如果我们小心一点，我们仍然可以使用它。诀窍是将所有信息保存在一组通用数据结构中，这些数据结构可以由许多单独的交互过程访问（和修改）。我将采用对此类数据使用全部大写的变量名的约定。

自上而下设计的下一个层次process是编码。它响应单个事件描述符e：

process <- function(e) {
  if (is.null(e)) return(FALSE)
  if (e$type == "Customer") {
    i <- find.assistant(e$time)
    if (is.null(i)) put.on.hold(e$x, e$time) else serve(i, e$x, e$time)
  } else {
    release.hold(e$time)
  }
  return(TRUE)
}

get.next.event当没有要报告的事件时，它必须响应空事件。否则，process执行系统的“业务规则”。它实际上是根据问题中的描述编写自己的。它的工作原理几乎不需要评论，除了指出最终我们将需要编写子例程put.on.hold和release.hold（实现客户持有队列）和serve（实现客户助理交互）。

什么是“事件”？ 它必须包含有关谁在行动、他们正在采取什么样的行动以及何时发生的信息。因此，我的代码使用了一个包含这三种信息的列表。但是，get.next.event只需要检查次。它只负责维护一个事件队列，其中

任何事件都可以在收到时放入队列中，并且
可以轻松提取队列中最早的事件并将其传递给调用者。

这个优先级队列的最佳实现是堆，但这在R. 根据 Norman Matloff 的The Art of R Programming（提供更灵活、抽象但有限的队列模拟器）中的建议，我使用数据框来保存事件并在其记录中搜索最短时间。

get.next.event <- function() {
  if (length(EVENTS$time) <= 0) new.customer()               # Wait for a customer$
  if (length(EVENTS$time) <= 0) return(NULL)                 # Nothing's going on!$
  if (min(EVENTS$time) > next.customer.time()) new.customer()# See text
  i <- which.min(EVENTS$time)
  e <- EVENTS[i, ]; EVENTS <<- EVENTS[-i, ]
  return (e)
}

有很多方法可以对此进行编码。此处显示的最终版本反映了我在编写如何process对“助手”事件做出反应以及如何new.customer工作时做出的选择：get.next.event只需将客户从等待队列中取出，然后坐下来等待另一个事件。有时需要通过两种方式寻找新客户：首先，查看是否有人在门口等候（实际上），其次，是否有人在我们不寻找的时候进来了。

显然，new.customer和next.customer.time是重要的例程，所以让我们接下来处理它们。

new.customer <- function() {  
  if (CUSTOMER.COUNT < dim(CUSTOMERS)[2]) {
    CUSTOMER.COUNT <<- CUSTOMER.COUNT + 1
    insert.event(CUSTOMER.COUNT, "Customer", 
                 CUSTOMERS["Arrived", CUSTOMER.COUNT])
  }
  return(CUSTOMER.COUNT)
}
next.customer.time <- function() {
  if (CUSTOMER.COUNT < dim(CUSTOMERS)[2]) {
    x <- CUSTOMERS["Arrived", CUSTOMER.COUNT]
  } else {x <- Inf}
  return(x) # Time when the next customer will arrive
}

CUSTOMERS是一个二维数组，每个客户的数据列在列中。它有四行（作为字段）描述客户并记录他们在模拟过程中的体验：“Arrived”、“Served”、“Duration”和“Assistant”（服务人员的正数字标识符，如果有的话）他们，否则-1对于繁忙的信号）。在高度灵活的模拟中，这些列将是动态生成的，但由于R喜欢工作，一开始就在一个大矩阵中生成所有客户很方便，他们的到达时间已经随机生成。 next.customer.time可以偷看这个矩阵的下一列，看看谁是下一个。全局变量CUSTOMER.COUNT表示最后一位顾客到达。通过这个指针可以非常简单地管理客户，推进它以获得新客户并超越它（不推进）以窥视下一个客户。

serve实现模拟中的业务规则。

serve <- function(i, x, time.now) {
  #
  # Serve customer `x` with assistant `i`.
  #
  a <- ASSISTANTS[i, ]
  r <- rexp(1, a$rate)                       # Simulate the duration of service
  r <- round(r, 2)                           # (Make simple numbers)
  ASSISTANTS[i, ]$available <<- time.now + r # Update availability
  #
  # Log this successful service event for later analysis.
  #
  CUSTOMERS["Assistant", x] <<- i
  CUSTOMERS["Served", x] <<- time.now
  CUSTOMERS["Duration", x] <<- r
  #
  # Queue the moment the assistant becomes free, so they can check for
  # any customers on hold.
  #
  insert.event(i, "Assistant", time.now + r)
  if (VERBOSE) cat(time.now, ": Assistant", i, "is now serving customer", 
                   x, "until", time.now + r, "\n")
  return (TRUE)
}

这很简单。 ASSISTANTS是一个具有两个字段的数据帧：（capabilities给出他们的服务率）和available，它标记下一次助手将空闲的时间。通过根据助手的能力生成随机服务持续时间、更新助手下一次可用的时间并在CUSTOMERS数据结构中记录服务间隔来为客户提供服务。该VERBOSE标志便于测试和调试：当为真时，它会发出一串描述关键处理点的英文句子。

如何将助手分配给客户是重要且有趣的。 可以想象几个过程：随机分配，按固定顺序分配，或根据谁空闲时间最长（或最短）。其中许多在注释掉的代码中说明：

find.assistant <- function(time.now) {
  j <- which(ASSISTANTS$available <= time.now)
  #if (length(j) > 0) {
  #  i <- j[ceiling(runif(1) * length(j))]
  #} else i <- NULL                                    # Random selection
  #if (length(j) > 0) i <- j[1] else i <- NULL         # Pick first assistant
  #if (length(j) > 0) i <- j[length(j)] else i <- NULL # Pick last assistant
  if (length(j) > 0) {
    i <- j[which.min(ASSISTANTS[j, ]$available)]
  } else i <- NULL                                     # Pick most-rested assistant
  return (i)
}

其余的模拟实际上只是说服R实现标准数据结构的例行练习，主要是用于等待队列的循环缓冲区。因为你不想在全局变量中乱跑，所以我把所有这些都放在一个过程sim中。它的参数描述了这个问题：要模拟的客户数量 ( n.events)、客户到达率、助手的能力以及等待队列的大小（可以将其设置为零以完全消除队列）。

r <- sim(n.events=250, arrival.rate=60/45, capabilities=1:5/10, hold.queue.size=10)

它返回模拟期间维护的数据结构列表；最令人感兴趣的是CUSTOMERS数组。 R使得以有趣的方式在这个数组中绘制基本信息变得相当容易。这是一个输出，显示了客户的较长模拟中个客户。 $50$ $250$

每个客户的体验都被绘制为一条水平时间线，到达时带有圆形符号，任何等待等待的黑色实线以及与助手交互持续时间的彩色线（颜色和线类型区分助手）。在此客户图下方是显示助手体验的图，标记了他们与客户互动的时间和未与客户互动的时间。每个活动间隔的端点由竖线分隔。

使用运行时verbose=TRUE，模拟的文本输出如下所示：

...
160.71 : Customer 211 put on hold at position 1 
161.88 : Customer 212 put on hold at position 2 
161.91 : Assistant 3 is now serving customer 213 until 163.24 
161.91 : Customer 211 put on hold at position 2 
162.68 : Assistant 4 is now serving customer 212 until 164.79 
162.71 : Assistant 5 is now serving customer 211 until 162.9 
163.51 : Assistant 5 is now serving customer 214 until 164.05 
...

（左边的数字是每条消息发出的时间。）您可以将这些描述与客户图的位于时间和之间的部分相匹配。 $160$ $165$

我们可以通过按客户标识符绘制等待持续时间来研究客户的等待体验，使用特殊（红色）符号表示客户收到忙音。

图 2

（对于管理此服务队列的任何人来说，所有这些图不会成为一个出色的实时仪表板！）

比较通过改变传递给sim. 当客户到达太快而无法处理时会发生什么？当保持队列变小或消除时会发生什么？当以不同的方式选择助手时会发生什么变化？助手的数量和能力如何影响客户体验？一些客户开始被拒之门外或开始长时间搁置的关键点是什么？

通常，对于像这样一个明显的自学问题，我们会在这里停下来，把剩下的细节留作练习。但是，我不想让那些可能已经走到这一步并且有兴趣自己尝试一下（并且可能会对其进行修改并基于其他目的进行构建）的读者失望，因此下面附上了完整的工作代码。

（此站点上的处理会弄乱包含符号的任何行的缩进，但是在将代码粘贴到文本文件时应该恢复可读的缩进。） $\TeX$ $\$$

sim <- function(n.events, verbose=FALSE, ...) {
  #
  # Simulate service for `n.events` customers.
  #
  # Variables global to this simulation (but local to the function):
  #
  VERBOSE <- verbose         # When TRUE, issues informative message
  ASSISTANTS <- list()       # List of assistant data structures
  CUSTOMERS <- numeric(0)    # Array of customers that arrived
  CUSTOMER.COUNT <- 0        # Number of customers processed
  EVENTS <- list()           # Dynamic event queue   
  HOLD <- list()             # Customer on-hold queue
  #............................................................................#
  #
  # Start.
  #
  initialize <- function(arrival.rate, capabilities, hold.queue.size) {
    #
    # Create common data structures.
    #
    ASSISTANTS <<- data.frame(rate=capabilities,     # Service rate
                              available=0            # Next available time
    )
    CUSTOMERS <<- matrix(NA, nrow=4, ncol=n.events, 
                         dimnames=list(c("Arrived",  # Time arrived
                                         "Served",   # Time served
                                         "Duration", # Duration of service
                                         "Assistant" # Assistant id
                         )))
    EVENTS <<- data.frame(x=integer(0),              # Assistant or customer id
                          type=character(0),         # Assistant or customer
                          time=numeric(0)            # Start of event
    )
    HOLD <<- list(first=1,                           # Index of first in queue
                  last=1,                            # Next available slot
                  customers=rep(NA, hold.queue.size+1))
    #
    # Generate all customer arrival times in advance.
    #
    CUSTOMERS["Arrived", ] <<- cumsum(round(rexp(n.events, arrival.rate), 2))
    CUSTOMER.COUNT <<- 0
    if (VERBOSE) cat("Started.\n")
    return(TRUE)
  }
  #............................................................................#
  #
  # Dispatching.
  #
  # Argument `e` represents an event, consisting of an assistant/customer 
  # identifier `x`, an event type `type`, and its time of occurrence `time`.
  #
  # Depending on the event, a customer is either served or an attempt is made
  # to put them on hold.
  #
  # Returns TRUE until no more events occur.
  #
  process <- function(e) {
    if (is.null(e)) return(FALSE)
    if (e$type == "Customer") {
      i <- find.assistant(e$time)
      if (is.null(i)) put.on.hold(e$x, e$time) else serve(i, e$x, e$time)
    } else {
      release.hold(e$time)
    }
    return(TRUE)
  }#$
  #............................................................................#
  #
  # Event queuing.
  #
  get.next.event <- function() {
    if (length(EVENTS$time) <= 0) new.customer()
    if (length(EVENTS$time) <= 0) return(NULL)
    if (min(EVENTS$time) > next.customer.time()) new.customer()
    i <- which.min(EVENTS$time)
    e <- EVENTS[i, ]; EVENTS <<- EVENTS[-i, ]
    return (e)
  }
  insert.event <- function(x, type, time.occurs) {
    EVENTS <<- rbind(EVENTS, data.frame(x=x, type=type, time=time.occurs))
    return (NULL)
  }
  # 
  # Customer arrivals (called by `get.next.event`).
  #
  # Updates the customers pointer `CUSTOMER.COUNT` and returns the customer
  # it newly points to.
  #
  new.customer <- function() {  
    if (CUSTOMER.COUNT < dim(CUSTOMERS)[2]) {
      CUSTOMER.COUNT <<- CUSTOMER.COUNT + 1
      insert.event(CUSTOMER.COUNT, "Customer", 
                   CUSTOMERS["Arrived", CUSTOMER.COUNT])
    }
    return(CUSTOMER.COUNT)
  }
  next.customer.time <- function() {
    if (CUSTOMER.COUNT < dim(CUSTOMERS)[2]) {
      x <- CUSTOMERS["Arrived", CUSTOMER.COUNT]
    } else {x <- Inf}
    return(x) # Time when the next customer will arrive
  }
  #............................................................................#
  #
  # Service.
  #
  find.assistant <- function(time.now) {
    #
    # Select among available assistants.
    #
    j <- which(ASSISTANTS$available <= time.now) 
    #if (length(j) > 0) {
    #  i <- j[ceiling(runif(1) * length(j))]
    #} else i <- NULL                                    # Random selection
    #if (length(j) > 0) i <- j[1] else i <- NULL         # Pick first assistant
    #if (length(j) > 0) i <- j[length(j)] else i <- NULL # Pick last assistant
    if (length(j) > 0) {
      i <- j[which.min(ASSISTANTS[j, ]$available)]
    } else i <- NULL # Pick most-rested assistant
    return (i)
  }#$
  serve <- function(i, x, time.now) {
    #
    # Serve customer `x` with assistant `i`.
    #
    a <- ASSISTANTS[i, ]
    r <- rexp(1, a$rate)                       # Simulate the duration of service
    r <- round(r, 2)                           # (Make simple numbers)
    ASSISTANTS[i, ]$available <<- time.now + r # Update availability
    #
    # Log this successful service event for later analysis.
    #
    CUSTOMERS["Assistant", x] <<- i
    CUSTOMERS["Served", x] <<- time.now
    CUSTOMERS["Duration", x] <<- r
    #
    # Queue the moment the assistant becomes free, so they can check for
    # any customers on hold.
    #
    insert.event(i, "Assistant", time.now + r)
    if (VERBOSE) cat(time.now, ": Assistant", i, "is now serving customer", 
                     x, "until", time.now + r, "\n")
    return (TRUE)
  }
  #............................................................................#
  #
  # The on-hold queue.
  #
  # This is a cicular buffer implemented by an array and two pointers,
  # one to its head and the other to the next available slot.
  #
  put.on.hold <- function(x, time.now) {
    #
    # Try to put customer `x` on hold.
    #
    if (length(HOLD$customers) < 1 || 
          (HOLD$first - HOLD$last %% length(HOLD$customers) == 1)) {
      # Hold queue is full, alas.  Log this occurrence for later analysis.
      CUSTOMERS["Assistant", x] <<- -1 # Busy signal
      if (VERBOSE) cat(time.now, ": Customer", x, "got a busy signal.\n")
      return(FALSE)
    }
    #
    # Add the customer to the hold queue.
    #
    HOLD$customers[HOLD$last] <<- x
    HOLD$last <<- HOLD$last %% length(HOLD$customers) + 1
    if (VERBOSE) cat(time.now, ": Customer", x, "put on hold at position", 
                 (HOLD$last - HOLD$first - 1) %% length(HOLD$customers) + 1, "\n")
    return (TRUE)
  }
  release.hold <- function(time.now) {
    #
    # Pick up the next customer from the hold queue and place them into
    # the event queue.
    #
    if (HOLD$first != HOLD$last) {
      x <- HOLD$customers[HOLD$first]   # Take the first customer
      HOLD$customers[HOLD$first] <<- NA # Update the hold queue
      HOLD$first <<- HOLD$first %% length(HOLD$customers) + 1
      insert.event(x, "Customer", time.now)
    }
  }$
  #............................................................................#
  #
  # Summaries.
  #
  # The CUSTOMERS array contains full information about the customer experiences:
  # when they arrived, when they were served, how long the service took, and
  # which assistant served them.
  #
  summarize <- function() return (list(c=CUSTOMERS, a=ASSISTANTS, e=EVENTS,
                                       h=HOLD))
  #............................................................................#
  #
  # The main event loop.
  #
  initialize(...)
  while (process(get.next.event())) {}
  #
  # Return the results.
  #
  return (summarize())
}
#------------------------------------------------------------------------------#
#
# Specify and run a simulation.
#
set.seed(17)
n.skip <- 200  # Number of initial events to skip in subsequent summaries
system.time({
  r <- sim(n.events=50+n.skip, verbose=TRUE, 
           arrival.rate=60/45, capabilities=1:5/10, hold.queue.size=10)
})
#------------------------------------------------------------------------------#
#
# Post processing.
#
# Skip the initial phase before equilibrium.
#
results <- r$c
ids <- (n.skip+1):(dim(results)[2])
arrived <- results["Arrived", ]
served <- results["Served", ]
duration <- results["Duration", ]
assistant <- results["Assistant", ]
assistant[is.na(assistant)] <- 0   # Was on hold forever
ended <- served + duration
#
# A detailed plot of customer experiences.
#
n.events <- length(ids)
n.assistants <- max(assistant, na.rm=TRUE) 
colors <- rainbow(n.assistants + 2)
assistant.color <- colors[assistant + 2]
x.max <- max(results["Served", ids] + results["Duration", ids], na.rm=TRUE)
x.min <- max(min(results["Arrived", ids], na.rm=TRUE) - 2, 0)
#
# Lay out the graphics.
#
layout(matrix(c(1,1,2,2), 2, 2, byrow=TRUE), heights=c(2,1))
#
# Set up the customers plot.
#
plot(c(x.min, x.max), range(ids), type="n",
     xlab="Time", ylab="Customer Id", main="Customers")
#
# Place points at customer arrival times.
#
points(arrived[ids], ids, pch=21, bg=assistant.color[ids], col="#00000070")
#
# Show wait times on hold.
#
invisible(sapply(ids, function(i) {
  if (!is.na(served[i])) lines(x=c(arrived[i], served[i]), y=c(i,i))
}))
#
# More clearly show customers getting a busy signal.
#
ids.not.served <- ids[is.na(served[ids])]
ids.served <- ids[!is.na(served[ids])]
points(arrived[ids.not.served], ids.not.served, pch=4, cex=1.2)
#
# Show times of service, colored by assistant id.
#
invisible(sapply(ids.served, function(i) {
  lines(x=c(served[i], ended[i]), y=c(i,i), col=assistant.color[i], lty=assistant[i])
}))
#
# Plot the histories of the assistants.
#
plot(c(x.min, x.max), c(1, n.assistants)+c(-1,1)/2, type="n", bty="n",
     xlab="", ylab="Assistant Id", main="Assistants")
abline(h=1:n.assistants, col="#808080", lwd=1)
invisible(sapply(1:(dim(results)[2]), function(i) {
  a <- assistant[i]
  if (a > 0) {
    lines(x=c(served[i], ended[i]), y=c(a, a), lwd=3, col=colors[a+2])
    points(x=c(served[i], ended[i]), y=c(a, a), pch="|", col=colors[a+2])
  }
}))
#
# Plot the customer waiting statistics.
#
par(mfrow=c(1,1))
i <- is.na(served)
plot(served - arrived, xlab="Customer Id", ylab="Minutes",
     main="Service Wait Durations")
lines(served - arrived, col="Gray")
points(which(i), rep(0, sum(i)), pch=16, col="Red")
#
# Summary statistics.
#
mean(!is.na(served)) # Proportion of customers served
table(assistant)

其它你可能感兴趣的问题

上一篇为什么添加解释变量时残差平方和不增加？下一篇去除异常值的好方法？