Extra output layer in a neural network (decimal to binary)

data-mining neural-networks
2021-10-04 23:12:23

I am working through a problem from the online book.

I can understand that if the extra output layer had 5 output neurons, I could set the bias to 0.5 and the weights to 0.5 each for the previous layer. But the question asks for a new layer of four output neurons, which is more than enough to represent the 10 possible outputs, since 2^4 = 16.

Could someone walk me through the steps involved in understanding and solving this problem?

Exercise:

There is a way of determining the bitwise representation of a digit by adding an extra layer to the three-layer network above. The extra layer converts the output from the previous layer into a binary representation, as illustrated in the figure below. Find a set of weights and biases for the new output layer. Assume that the first 3 layers of neurons are such that the correct output in the third layer (i.e., the old output layer) has activation at least 0.99, and incorrect outputs have activation less than 0.01.

[Figure: the three-layer network with an extra four-neuron binary output layer]

4 Answers

The question is asking you to make the following mapping between the old representation and the new representation:

Represent    Old                     New
0            1 0 0 0 0 0 0 0 0 0     0 0 0 0
1            0 1 0 0 0 0 0 0 0 0     0 0 0 1
2            0 0 1 0 0 0 0 0 0 0     0 0 1 0
3            0 0 0 1 0 0 0 0 0 0     0 0 1 1
4            0 0 0 0 1 0 0 0 0 0     0 1 0 0
5            0 0 0 0 0 1 0 0 0 0     0 1 0 1
6            0 0 0 0 0 0 1 0 0 0     0 1 1 0
7            0 0 0 0 0 0 0 1 0 0     0 1 1 1
8            0 0 0 0 0 0 0 0 1 0     1 0 0 0
9            0 0 0 0 0 0 0 0 0 1     1 0 0 1
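The "New" column is simply the 4-bit binary encoding of each digit, so the table can be regenerated mechanically (a quick sketch; the formatting is my own):

```python
# Print the one-hot "Old" and 4-bit binary "New" encodings for digits 0-9.
for d in range(10):
    old = " ".join("1" if i == d else "0" for i in range(10))
    new = " ".join(format(d, "04b"))  # e.g. 3 -> "0 0 1 1"
    print(f"{d}    {old}    {new}")
```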

Because the old output layer has a simple one-hot form, this is easy to achieve. Each old output neuron should have a positive weight to each new output neuron that should be on in order to represent it, and a negative weight to each new output neuron that should be off. The magnitudes should be large enough to cleanly turn the new neurons on or off, so I would use large weights, such as +10 and -10.

If you have sigmoid activations here, the bias is not that important. You simply want each neuron to saturate towards on or off. The question lets you assume very clean signals in the old output layer.
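A minimal sketch of this scheme (the ±10 weights, zero bias, and most-significant-bit-first ordering are my own assumptions, consistent with the table above):

```python
import numpy as np

# Row i of W holds the four weights from old neuron i: +10 where bit j of
# the digit i is 1 (most significant bit first), -10 where it is 0.
W = np.array([[10 if (i >> (3 - j)) & 1 else -10 for j in range(4)]
              for i in range(10)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Noisy old-layer activations: the correct neuron at 0.99, the rest at 0.01.
# Each new-layer sigmoid still saturates to the correct side for every digit.
for d in range(10):
    a_old = np.full(10, 0.01)
    a_old[d] = 0.99
    bits = (sigmoid(a_old @ W) > 0.5).astype(int)
    assert "".join(map(str, bits)) == format(d, "04b")
print("all 10 digits saturate to the correct binary code")
```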

So, for example, taking the representation of 3, and using zero-based indexing for the neurons in the order I have shown them (these choices are not fixed by the question), my weights going from the activation of old output i=3, A_3^Old, to the logits of the new outputs, Z_j^New, where Z_j^New = Σ_{i=0}^{9} W_{i,j} * A_i^Old, could be:

W_{3,0} = -10
W_{3,1} = -10
W_{3,2} = +10
W_{3,3} = +10

This should clearly produce an output close to 0 0 1 1 when only the old-layer neuron representing "3" is active. In this question, you can assume one neuron has activation 0.99, and the competing neurons in the old layer have activation < 0.01. So, provided you use weights of the same magnitude throughout, the relatively small contributions of ±0.1 (0.01 * 10) from the other old-layer activations will not seriously affect the ±9.9 value, and the outputs in the new layer will saturate very close to 0 or 1.
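Plugging in the numbers for the digit-3 example (a sketch; the full ±10 weight matrix is assumed, with the row for old neuron i=3 matching the four weights shown above):

```python
import numpy as np

# Old-layer activations: neuron 3 at 0.99, the other nine at 0.01.
a_old = np.full(10, 0.01)
a_old[3] = 0.99

# Full +/-10 weight matrix; row i holds W_{i,0..3} (most significant bit
# first), so row 3 is [-10, -10, +10, +10] as above.
W = np.array([[10 if (i >> (3 - j)) & 1 else -10 for j in range(4)]
              for i in range(10)])

z_new = a_old @ W          # the logits Z_j^New: the +/-9.9 terms dominate
print(np.round(z_new, 1))  # the +/-0.1 noise from the other nine neurons
print((1 / (1 + np.exp(-z_new)) > 0.5).astype(int))  # -> binary code for 3
```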

The code below from SaturnAPI answers this question. See and run the code at https://saturnapi.com/artitiw/neural-network-decimal-digits-to-binary-bitwise-conversion

% Welcome to Saturn's MATLAB-Octave API.
% Delete the sample code below these comments and write your own!

% Exercise from http://neuralnetworksanddeeplearning.com/chap1.html
% There is a way of determining the bitwise representation of a digit by adding an extra layer to the three-layer network above. The extra layer converts the output from the previous layer into a binary representation, as illustrated in the figure below. Find a set of weights and biases for the new output layer. Assume that the first 3 layers of neurons are such that the correct output in the third layer (i.e., the old output layer) has activation at least 0.99, and incorrect outputs have activation less than 0.01.

% Inputs from 3rd layer
xj = eye(10,10)

% Weights matrix
wj = [0 0 0 0 0 0 0 0 1 1 ;
      0 0 0 0 1 1 1 1 0 0 ;
      0 0 1 1 0 0 1 1 0 0 ;
      0 1 0 1 0 1 0 1 0 1 ]';

% Results: rows are the binary codes of 0-9, matching dec2bin below.
% (xj is 10x10 and wj is 10x4, so the product must be xj*wj.)
xj*wj


% Confirm results
integers = 0:9;
dec2bin(integers)

A Pythonic proof of the above exercise:

"""
NEURAL NETWORKS AND DEEP LEARNING by Michael Nielsen

Chapter 1

http://neuralnetworksanddeeplearning.com/chap1.html#exercise_513527

Exercise:

There is a way of determining the bitwise representation of a digit by adding an extra layer to the three-layer network above. The extra layer converts the output from the previous layer into a binary representation, as illustrated in the figure below. Find a set of weights and biases for the new output layer. Assume that the first 3 layers of neurons are such that the correct output in the third layer (i.e., the old output layer) has activation at least 0.99, and incorrect outputs have activation less than 0.01.

"""
import numpy as np


def sigmoid(x):
    return(1/(1+np.exp(-x)))


def new_representation(activation_vector):
    a_0 = np.sum(w_0 * activation_vector)
    a_1 = np.sum(w_1 * activation_vector)
    a_2 = np.sum(w_2 * activation_vector)
    a_3 = np.sum(w_3 * activation_vector)

    return a_3, a_2, a_1, a_0


def new_repr_binary_vec(new_representation_vec):
    sigmoid_op = np.apply_along_axis(sigmoid, 0, new_representation_vec)
    return (sigmoid_op > 0.5).astype(int)


w_0 = np.full(10, -1, dtype=np.int8)
w_0[[1, 3, 5, 7, 9]] = 1
w_1 = np.full(10, -1, dtype=np.int8)
w_1[[2, 3, 6, 7]] = 1
w_2 = np.full(10, -1, dtype=np.int8)
w_2[[4, 5, 6, 7]] = 1
w_3 = np.full(10, -1, dtype=np.int8)
w_3[[8, 9]] = 1

activation_vec = np.full(10, 0.01, dtype=float)  # the np.float alias was removed in NumPy 1.24
# correct number is 5
activation_vec[5] = 0.99

new_representation_vec = new_representation(activation_vec)
print(new_representation_vec)
# (-1.04, 0.96, -1.0, 0.98)
print(new_repr_binary_vec(new_representation_vec))
# [0 1 0 1]

# if you wish to convert binary vector to int
b = new_repr_binary_vec(new_representation_vec)
print(b.dot(2**np.arange(b.size)[::-1]))
# 5

A slight modification of FullStack's answer, following Neil Slater's comments, using Octave:

% gzanellato
% Octave

% 3rd layer:
A = eye(10,10);

% Weights matrix:

fprintf('\nSet of weights:\n\n')

wij = [-10 -10 -10 -10 -10 -10 -10 -10 10 10;
       -10 -10 -10 -10 10 10 10 10 -10 -10;
       -10 -10 10 10 -10 -10 10 10 -10 -10;
       -10 10 -10 10 -10 10 -10 10 -10 10]

% Any bias between -9.999.. and +9.999.. runs ok

bias=5

Z=wij*A+bias;

% Sigmoid function:

for j=1:10;
  for i=1:4;
    Sigma(i,j)=int32(1/(1+exp(-Z(i,j))));
  end
end

fprintf('\nBitwise representation of digits:\n\n')

disp(Sigma')
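The comment about the bias range can be sanity-checked numerically (a sketch in Python rather than Octave): with exact one-hot inputs, the logits are ±10 + bias, so any bias of magnitude below 10 leaves the rounded sigmoid on the correct side.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# With exact one-hot inputs (A = eye in the Octave code), the logits are
# +10 + bias for "on" bits and -10 + bias for "off" bits, so rounding the
# sigmoid keeps every bit correct as long as |bias| < 10.
for bias in (-9.9, -5.0, 0.0, 5.0, 9.9):
    assert round(sigmoid(10 + bias)) == 1   # "on" bit survives the bias
    assert round(sigmoid(-10 + bias)) == 0  # "off" bit survives the bias
print("bias range check passed")
```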