Simulating responses to an item response theory test (Cross Validated)

Tags: r, simulation, item-response-theory, rasch

2022-04-09 11:51:58

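I am developing an online assessment system and need to calibrate an item bank, but I do not have enough people to run a pilot test. That is why I decided to simulate responses to the item bank. I will use the Rasch model to develop a computerized adaptive test (CAT); all I have is the questions in the bank and the number of options for each one. How can I run a simulation with only this data?

I thought of generating a random number $X$ between 0 and 1: if $X < 1/N$, where $N$ is the number of options in the given question, the person has answered correctly; otherwise the answer is incorrect.

Is this a correct way to calibrate an item bank to the Rasch model through simulation?

PS: What I want to do is compute the difficulty level of each question. My understanding is that, to calibrate an item bank, the bank has to be administered to a sample of subjects, but I do not have enough people for that, so I decided to try a simulation approach. I tested the sim.rasch function in the psych package, and I found that I need to take the number of options of each question into account, since it may affect their difficulty.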

4 Answers

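I have included a function in R, simdata() in the mirt package, for simulating IRT data given a variety of known conditions for several different classes of unidimensional and multidimensional IRT models. So if you need something more flexible, this may be a good option, and it should save you from reinventing the wheel.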

Here is a first version of this simulation in R, as an example of what not to do. Untested code.

# simulation settings (illustrative values; the original left these undefined)
nQ <- 20        # number of questions
nP <- 100       # number of test-takers
max_opts <- 5   # maximum number of answer options per question

# we're making a table of three columns: person, question, and correct or not
resps <- data.frame(person=integer(), question=integer(), correct=integer())

# do the simulation
for (qu in 1:nQ) { # loop over questions
  # how many possible answers does this question have?
  # draw uniformly from 2..max_opts (a one-option question would always be "correct")
  opts <- 1 + sample.int(max_opts - 1, 1)

  for (pe in 1:nP) { # loop over test-takers
    # did this person answer correctly? flip a coin with 1/opts probability of success
    resp <- rbinom(1, 1, 1/opts)
    resps <- rbind(resps, data.frame(person=pe, question=qu, correct=resp))
  }
}

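Some reasons why this is a bad idea:

It does not allow for missing responses
It assumes that all persons have the same ability
It assumes that all questions have the same difficulty (given the number of answer options)

You can probably come up with a few more...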
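The second and third points can be made precise. In standard Rasch notation (the symbols $\theta_p$ for the ability of person $p$, $b_i$ for the difficulty of item $i$, and $N_i$ for the number of options of item $i$ are assumptions here, not taken from the question), the Rasch model and the coin-flip scheme give, respectively:

$$
P(X_{pi} = 1 \mid \theta_p, b_i) = \frac{\exp(\theta_p - b_i)}{1 + \exp(\theta_p - b_i)}
\qquad \text{versus} \qquad
P(X_{pi} = 1) = \frac{1}{N_i}
$$

The right-hand side depends on neither $\theta_p$ nor $b_i$, so data simulated that way can only reproduce the option counts that were fed in.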

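The point is that you want to learn something about the people you are testing and the questions you are testing them with, and you want the latent properties of those people and questions to be reflected in your simulated data.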
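As a rough illustration of what "latent properties reflected in the data" could look like, here is a minimal base-R sketch. Everything numeric in it is an assumption made for the example: the sample sizes, the standard-normal abilities and difficulties, the 3 to 5 options per item, the optional guessing floor of 1/options, and the 5% missing rate. None of these come from the question.

```r
set.seed(42)                      # reproducible draws

n_persons <- 500                  # hypothetical number of simulated test-takers
n_items   <- 20                   # hypothetical number of questions in the bank

theta  <- rnorm(n_persons)                      # person abilities (assumed N(0, 1))
b      <- rnorm(n_items)                        # item difficulties (assumed N(0, 1))
n_opts <- sample(3:5, n_items, replace = TRUE)  # options per item (assumed 3 to 5)

# Rasch success probability for every person-item pair: plogis(theta_p - b_i)
p_rasch <- plogis(outer(theta, b, "-"))

# Optional guessing floor g_i = 1 / n_opts_i: P = g + (1 - g) * p_rasch.
# With this floor the model is no longer a pure Rasch model; it is a
# 3PL-type model with every discrimination fixed at 1.
g_mat     <- matrix(1 / n_opts, n_persons, n_items, byrow = TRUE)
p_correct <- g_mat + (1 - g_mat) * p_rasch

# Draw 0/1 responses; rows are persons, columns are items
resp <- matrix(rbinom(n_persons * n_items, 1, p_correct), n_persons, n_items)

# Knock out 5% of responses completely at random to mimic missing answers
resp[runif(length(resp)) < 0.05] <- NA

# Harder items should show lower proportions correct
round(colMeans(resp, na.rm = TRUE), 2)
```

Note that the difficulties are still invented inputs: fitting a model to this data only recovers the values of `b` that went in, which is why it cannot replace a real pilot sample.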

My advice is to try to get some real people to actually take your test. That can be quite cheap and easy on Amazon Mechanical Turk.

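Probably the easiest way to do what you want is to use the psych package for R. It includes functions such as sim.rasch and sim.irt, which will simulate appropriate data of any size for you.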

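I know this is a bit late, but I want to answer this question for the record.

Simulating a CAT with an R package

Nowadays the catR package can simulate CATs, which seems to be exactly what you want to do. It has many options, such as the number of starting items, the method for selecting the next item, the configuration of the stopping rule, and so on.

Here is an old piece of code from my computer. All credit goes to the catR manual.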

# load the catR package, installing it first if necessary
if (!require('catR')) install.packages('catR')
library('catR')

# create a bank of 500 items under the 1PL (Rasch) model
Bank <- genDichoMatrix(items = 500, cbControl = NULL, model = "1PL",
                       seed = 1)

# four lists characterize a CAT: start, test, stop, final
# these lists feed the randomCAT function to generate a response pattern

# one first item selected, ability level starts at 0, criterion for
# selecting first items is maximum Fisher information (MFI)
Start <- list(nrItems = 1, theta = 0, startSelect = "MFI")

# estimate ability by weighted likelihood (WL), select next items through MFI
Test <- list(method = "WL", itemSelect = "MFI")

# stopping rule by precision, meaning that the test stops once the standard
# error of the ability estimate drops to the threshold (0.4) or below
Stop <- list(rule = "precision", thr = 0.4, alpha = 0.05)

# how the final ability estimate is calculated
Final <- list(method = "WL", alpha = 0.05)

# draw the true ability from a standard normal, then run the CAT with the lists above
res <- randomCAT(trueTheta = rnorm(n=1,mean=0,sd=1), itemBank = Bank,
                 start = Start, test = Test,
                 stop = Stop, final = Final)

# plot the ability estimates with confidence intervals, the true ability,
# and a reference line at theta = 2 (classThr)
plot(res, ci = TRUE, trueTh = TRUE, classThr = 2)
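To make the "MFI" setting less opaque, here is a small base-R sketch of maximum Fisher information selection under the Rasch model. It illustrates the idea only and is not catR's own code; the function names `rasch_info` and `next_item_mfi` and the difficulty values are made up for this example.

```r
# Item information under the Rasch model: I(theta) = P * (1 - P),
# where P = plogis(theta - b). It peaks when difficulty b equals theta.
rasch_info <- function(theta, b) {
  p <- plogis(theta - b)
  p * (1 - p)
}

# Pick the not-yet-administered item with maximum information at theta_hat
next_item_mfi <- function(theta_hat, b, administered = integer(0)) {
  info <- rasch_info(theta_hat, b)
  info[administered] <- -Inf   # never re-select an item that was already given
  which.max(info)
}

b <- c(-2, -1, 0, 0.4, 1.5)    # hypothetical item difficulties

next_item_mfi(0.5, b)                    # picks item 4 (b = 0.4 is closest to 0.5)
next_item_mfi(0.5, b, administered = 4)  # picks item 3 (b = 0 is next closest)
```

Under the Rasch model this reduces to "give the item whose difficulty is closest to the current ability estimate", which is why a CAT needs calibrated difficulties before it can run.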

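Bonus: extracting item parameters from response patterns

From the question, I gather that you already have response patterns from which you want to extract item parameters (discrimination, difficulty, guessing, etc.). For that, I recommend the mirt package, which does exactly this (and more). You can find examples of how to use this package here and here.

The only extra work I foresee is converting mirt's output matrix into the input format used by catR.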
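That conversion might look roughly like the base-R sketch below. It assumes the item estimates are in mirt's traditional IRT parameterization, which, as far as I know, is what `coef(mod, IRTpars = TRUE, simplify = TRUE)$items` returns for dichotomous items: columns `a`, `b`, `g`, `u`. It also assumes a catR bank is a numeric matrix with columns `a`, `b`, `c`, `d`. The matrix here is typed in by hand with made-up numbers so that the example runs without either package.

```r
# Hypothetical item estimates laid out like mirt's IRT-parameterized output:
# a = discrimination, b = difficulty, g = lower asymptote, u = upper asymptote.
# The values are invented for illustration (a Rasch-type fit: a = 1, g = 0, u = 1).
mirt_items <- matrix(
  c(1, -0.8, 0, 1,
    1,  0.1, 0, 1,
    1,  1.3, 0, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("Item_", 1:3), c("a", "b", "g", "u"))
)

# catR-style bank: the same four quantities in the same order,
# with the asymptote columns renamed to c and d
cat_bank <- mirt_items[, c("a", "b", "g", "u"), drop = FALSE]
colnames(cat_bank) <- c("a", "b", "c", "d")

cat_bank
```

With real estimates, `cat_bank` would take the place of `Bank` in the randomCAT call above; it is worth checking the column order against the catR documentation for the installed version before relying on it.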
