TNT-NN
For TNT-NN, see:
Myre, Joe M., et al. "TNT-NN: A fast active set method for solving large non-negative least squares problems." Procedia Computer Science 108 (2017): 755-764.
https://doi.org/10.1016/j.procs.2017.05.194
To form the active set, TNT-NN first solves an unconstrained least squares problem. Variables that violate the non-negativity constraint are added to the active set. Once a feasible solution is found in which no non-negativity constraint is violated, the 2-norm of the residual is used as the measure of fitness, and that solution is saved as the current "best" solution.
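In R, that initialization step might look like this (a minimal sketch of my own, not code from the paper):

### sketch of the initialization described above (my own illustration):
### solve the unconstrained least squares problem and put every variable
### that violates non-negativity into the active set
init_active_set <- function(X, y) {
  ls_solution <- solve(crossprod(X), crossprod(X, y)) ### unconstrained LS via the normal equations
  as.vector(ls_solution) < 0                          ### TRUE = constraint violated, goes into the active set
}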
TNT-NN then tries to modify the active set by iteratively moving some variables from the active set back to the unconstrained set. The active-set variables are ranked according to their gradient components. The variables showing the largest positive gradient components are tested by moving some of them from the active set into the unconstrained set. Importantly, a large number of variables may be moved in a single test initially. If the fitness of the new solution does not improve, the solution is rejected and a smaller set of variables is tested. If a group of variables can be removed from the active set and a "better" new feasible solution is found, that solution is saved and the algorithm starts a new iteration. The algorithm has converged when the active set can no longer be modified.
Your implementation is only half of this solution. It is the part that searches for a new feasible set when testing the removal of variables from the active set. This is demonstrated in the code below, where I have copied your fastnnls function and turned it into the feasible_set part of the algorithm.
Myre et al. speak of moving "some of them from the active set to the unconstrained set. It is important to note that initially a large number of variables may be moved in a single test." But I am not sure how they do this in their code, and I have been adding them one by one. There may be some additional trick to quickly select a large group to move at once, instead of a for-loop that tries all variables.
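One guess at such a trick (my own speculation, not necessarily what Myre et al. implement) is to test geometrically shrinking groups: free the k top-ranked candidates in one shot and halve k whenever the trial does not improve the fit. A sketch, reusing the feasible_set helper defined below:

### speculative group-wise test (my own sketch, not code from the paper)
try_group_removal <- function(a, b, X, y, ord, current_active, current_loss) {
  k <- length(ord)
  while (k >= 1) {
    new_active <- current_active
    new_active[ord[seq_len(k)]] <- FALSE ### free the top-k candidates at once
    new_s <- feasible_set(a, b, ind = which(!new_active))
    new_loss <- sum((y - X %*% new_s)^2)
    if (new_loss < current_loss) {
      return(list(s = new_s, active = new_active, loss = new_loss)) ### accept the whole group
    }
    k <- k %/% 2 ### shrink the group and try again
  }
  NULL ### no group improved the fit
}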
The difference between multiway::fnnls and nnls::nnls
The difference you see is due to a small mistake in the comparison. nnls::nnls requires the matrix X and the vector y, while multiway::fnnls requires the cross products XᵀX and Xᵀy. You have been using the latter pair for both functions. In the code below I give an example with the correct inputs and its output.
Example code
### finding the feasible set
feasible_set <- function(a, b, ind){
  ### unconstrained least squares on the free variables (ind),
  ### solved via the normal equations with a = XtX and b = Xty
  x <- rep(0, length(b))
  x[ind] <- solve(a[ind, ind], b[ind])
  ### while infeasible, move the negative variables back into
  ### the active set (i.e. keep only the positive ones) and re-solve
  while(any(x < 0)){
    ind <- which(x > 0)
    x <- rep(0, length(b))
    x[ind] <- solve(a[ind, ind], b[ind])
  }
  as.vector(x)
}
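As a quick toy check (my own illustration, not part of your question): starting with all variables free, feasible_set returns a non-negative vector whose free entries solve the reduced normal equations.

### toy run of feasible_set (illustration only)
set.seed(42)
Xs <- matrix(rnorm(60), 20, 3)
ys <- rnorm(20)
feasible_set(crossprod(Xs), as.vector(crossprod(Xs, ys)), ind = 1:3) ### all variables free at the start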
### finding the gradients
gradients <- function(b, y, X) {
  current_y <- X %*% b
  d_y <- y - current_y
  t(X) %*% d_y
}
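Note that this returns the gradient of the residual sum of squares only up to a factor -2. At the NNLS optimum, t(X) %*% (y - X %*% b) is (near) zero for the free variables and non-positive for the variables pinned at zero, which is why the algorithm below scans the active variables with positive components.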
### The algorithm that repeatedly updates the active set
### The updates are done by removing the variable with the highest positive gradient
fastnnls <- function(y, X) {
  ### Initiation
  a <- crossprod(X)
  b <- as.vector(crossprod(X, y))
  current_active <- rep(TRUE, ncol(X)) ### start with all variables in the active set
  current_s <- rep(0, ncol(X))         ### initial conditions
  current_y <- X %*% current_s
  current_loss <- sum((y - current_y)^2)
  ### loop that runs until no improvement can be made
  cont <- TRUE
  while (cont) {
    ### add variables based on gradients
    ### in these four lines the gradients are found and ordered
    grads <- gradients(current_s, y, X)
    testing <- which(grads * current_active > 0) ### variables that are active and have a positive gradient
    ord <- order(grads, decreasing = TRUE)
    ord <- ord[ord %in% testing] ### strip the negative or non-active variables
    ### keep adding variables in a loop while this improves the solution
    addition <- 0 ### iteration variable keeping track of the additions
    new_active <- current_active
    for (i in seq_along(ord)) {
      ### try out a new active set with one more variable removed
      new_active[ord[i]] <- FALSE ### remove 'ord[i]' from the active set
      new_s <- feasible_set(a, b, ind = which(!new_active))
      new_y <- X %*% new_s
      new_loss <- sum((y - new_y)^2)
      ### update the solution if the new trial is better
      if (new_loss < current_loss) {
        addition <- i
        current_active <- new_active
        current_loss <- new_loss
        current_s <- new_s
        current_y <- new_y
      } else {
        break ### skip to the end of the loop
      }
    }
    if (addition == 0) { ### quit the while loop when no addition is made
      cont <- FALSE
    } else {
      ### in the for loop we only ever shrank the active set, but
      ### feasible_set can also grow it, so re-derive it from the solution
      current_active <- current_s == 0
    }
    if (sum(current_active) == 0) { ### quit if the active set is empty (all variables positive)
      cont <- FALSE
    }
    ### the while loop continues by recomputing the gradients
  }
  return(current_s)
}
set.seed(123)
X <- matrix(rnorm(2000),100,20)
y <- X %*% runif(20) + rnorm(100)*5
library(nnls)
library(multiway)
a <- crossprod(X)               ### multiway::fnnls needs the cross products
b <- as.vector(crossprod(X, y))
data.frame(multiway = multiway::fnnls(a, b),
           nnls = nnls::nnls(X, y)$x,
           manual = fastnnls(y, X))
Output
> data.frame(multiway = multiway::fnnls(a, b),
+ nnls = nnls::nnls(X, y)$x,
+ manual = fastnnls(y,X))
multiway nnls manual
1 0.610802720 0.610802720 0.610802720
2 0.146121047 0.146121047 0.146121047
3 0.841809005 0.841809005 0.841809005
4 1.131040740 1.131040740 1.131040740
5 0.000000000 0.000000000 0.000000000
6 1.093652478 1.093652478 1.093652478
7 0.725590111 0.725590111 0.725590111
8 0.211525228 0.211525228 0.211525228
9 0.000000000 0.000000000 0.000000000
10 1.472333600 1.472333600 1.472333600
11 0.005740395 0.005740395 0.005740395
12 2.131277775 2.131277775 2.131277775
13 0.000000000 0.000000000 0.000000000
14 0.590923989 0.590923989 0.590923989
15 0.652530944 0.652530944 0.652530944
16 0.717713755 0.717713755 0.717713755
17 1.115162378 1.115162378 1.115162378
18 0.603304661 0.603304661 0.603304661
19 0.000000000 0.000000000 0.000000000
20 0.218073317 0.218073317 0.218073317