1D CNN Variational Autoencoder Conv1D Size

data-mining keras tensorflow cnn autoencoder vae
2021-10-06 18:11:01

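I'm trying to create a 1D variational autoencoder that takes a 931x1 vector as input, but I'm having trouble with two things:

  1. Getting an output size of 931, since max pooling and upsampling give even sizes
  2. Getting the proper layer sizes

This is what I have so far. Before training, I added zero padding on either side of my input array (this is why you'll see h+2 for the input: 931 + 2 = 933), and then cropped the output to get an output size of 933. Using a 931 input yields a 928 output, and I'm not sure what the best way is to get 931 from there without cropping.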


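from tensorflow import keras
from tensorflow.keras.layers import (Input, Conv1D, MaxPooling1D, AveragePooling1D,
                                     Flatten, Dense, Reshape, UpSampling1D, Cropping1D)
from tensorflow.keras.models import Model

# Defined elsewhere: w (batch size, 1 in the summary), h (931), latent_dim (2),
# and the Sampling layer (presumably the one from the Keras VAE example).
input_sig = Input(batch_shape=(w,h+2, 1))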
x = Conv1D(8,3, activation='relu', padding='same',dilation_rate=2)(input_sig)
# x = ZeroPadding1D((2,1))(x)
x1 = MaxPooling1D(2)(x)
x2 = Conv1D(4,3, activation='relu', padding='same',dilation_rate=2)(x1)
x3 = MaxPooling1D(2)(x2)
x4 = AveragePooling1D()(x3)
flat = Flatten()(x4)
encoder = Dense(2)(flat)
x = encoder
z_mean = Dense(latent_dim, name="z_mean")(x)
z_log_var = Dense(latent_dim, name="z_log_var")(x)
z = Sampling()([z_mean, z_log_var])
encoder = Model(input_sig, [z_mean, z_log_var, z], name="encoder")
encoder.summary()

latent_inputs = keras.Input(shape=(latent_dim,))
# d1 = Dense(464)(latent_inputs)
d1 = Dense(468)(latent_inputs)
# d2 = Reshape((117,4))(d1)
d2 = Reshape((117,4))(d1)
d3 = Conv1D(4,1,strides=1, activation='relu', padding='same')(d2)
d4 = UpSampling1D(2)(d3)
d5 = Conv1D(8,1,strides=1, activation='relu', padding='same')(d4)
d6 = UpSampling1D(2)(d5)
d7 = UpSampling1D(2)(d6)
d8 = Conv1D(1,1, strides=1, activation='sigmoid', padding='same')(d7)
decoded = Cropping1D(cropping=(1,2))(d8) # this is the added step

decoder = Model(latent_inputs, decoded, name="decoder")
decoder.summary()

Here is the printed summary:

Model: "encoder"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_99 (InputLayer)           [(1, 933, 1)]        0                                            
__________________________________________________________________________________________________
conv1d_209 (Conv1D)             (1, 933, 8)          32          input_99[0][0]                   
__________________________________________________________________________________________________
max_pooling1d_90 (MaxPooling1D) (1, 466, 8)          0           conv1d_209[0][0]                 
__________________________________________________________________________________________________
conv1d_210 (Conv1D)             (1, 466, 4)          100         max_pooling1d_90[0][0]           
__________________________________________________________________________________________________
max_pooling1d_91 (MaxPooling1D) (1, 233, 4)          0           conv1d_210[0][0]                 
__________________________________________________________________________________________________
average_pooling1d_45 (AveragePo (1, 116, 4)          0           max_pooling1d_91[0][0]           
__________________________________________________________________________________________________
flatten_45 (Flatten)            (1, 464)             0           average_pooling1d_45[0][0]       
__________________________________________________________________________________________________
dense_89 (Dense)                (1, 2)               930         flatten_45[0][0]                 
__________________________________________________________________________________________________
z_mean (Dense)                  (1, 2)               6           dense_89[0][0]                   
__________________________________________________________________________________________________
z_log_var (Dense)               (1, 2)               6           dense_89[0][0]                   
__________________________________________________________________________________________________
sampling_45 (Sampling)          (1, 2)               0           z_mean[0][0]                     
                                                                 z_log_var[0][0]                  
==================================================================================================
Total params: 1,074
Trainable params: 1,074
Non-trainable params: 0
__________________________________________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_100 (InputLayer)       [(None, 2)]               0         
_________________________________________________________________
dense_90 (Dense)             (None, 468)               1404      
_________________________________________________________________
reshape_44 (Reshape)         (None, 117, 4)            0         
_________________________________________________________________
conv1d_211 (Conv1D)          (None, 117, 4)            20        
_________________________________________________________________
up_sampling1d_117 (UpSamplin (None, 234, 4)            0         
_________________________________________________________________
conv1d_212 (Conv1D)          (None, 234, 8)            40        
_________________________________________________________________
up_sampling1d_118 (UpSamplin (None, 468, 8)            0         
_________________________________________________________________
up_sampling1d_119 (UpSamplin (None, 936, 8)            0         
_________________________________________________________________
conv1d_213 (Conv1D)          (None, 936, 1)            9         
_________________________________________________________________
cropping1d_18 (Cropping1D)   (None, 933, 1)            0         
=================================================================
Total params: 1,473
Trainable params: 1,473
Non-trainable params: 0
______________________________
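
The sequence lengths in both summaries can be reproduced by hand, which also accounts for the 928 figure: with the default `padding='valid'`, each pooling layer of size 2 floors the length to `n // 2`, and each `UpSampling1D(2)` doubles it. The sketch below is plain-Python bookkeeping only (no Keras). The helper names `pooled_length`, `round_trip` and `pad_to_multiple`, and the idea of padding up to a multiple of 8, are illustrative assumptions and not part of the original code.

```python
def pooled_length(n, stages=3, pool=2):
    """Length after `stages` pooling layers of size `pool` with 'valid' padding."""
    for _ in range(stages):
        n //= pool  # an odd trailing sample is dropped
    return n


def round_trip(n, stages=3, pool=2):
    """Length after pooling `stages` times and then upsampling `stages` times."""
    return pooled_length(n, stages, pool) * pool ** stages


def pad_to_multiple(n, multiple=8):
    """Return (left, right) zero padding that lifts n to the next multiple."""
    total = -n % multiple
    return total // 2, total - total // 2


print(pooled_length(933))    # 933 -> 466 -> 233 -> 116
print(round_trip(931))       # 931 -> 465 -> 232 -> 116 -> 928
print(pad_to_multiple(931))  # padding that lifts 931 to 936 = 117 * 8
```

Under that assumption, padding the input with (2, 3) zeros gives a length of 936, which pools to exactly 117 and upsamples back to 936, so the encoder and decoder would agree on 117 steps instead of 116 versus 117. A final `Cropping1D((2, 3))` would then return 931; cropping is still present, but only to undo the padding.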

When I try to fit my model, though, I get the following exception:

ValueError: Invalid reduction dimension 2 for input with 2 dimensions. for '{{node Sum}} = Sum[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=false](Mean, Sum/reduction_indices)' with input shapes: [1,933], [2] and with computed input tensors: input[1] = <1 2>.

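Has anyone run into this error, or can anyone see what I'm doing wrong in building the model? I'm new to this and not sure where I went wrong.

Note that I adapted this from the 28x28 MNIST VAE in the Keras documentation.

Thanks in advance.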

1 Answer

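I think the input dimensions of your autoencoder and its output dimensions are different. The input is (1, 933, 1), while the output is (933, 1). These should actually be the same.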
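
The error message may also point at the loss rather than at the layers: a `Sum` over axes `<1 2>` is applied to the output of a `Mean` node with shape `[1, 933]`, which has only two dimensions. That pattern would match the reconstruction loss of the Keras MNIST VAE example, which sums `binary_crossentropy` over `axis=(1, 2)`. The training step is not shown in the question, so this is an assumption. `binary_crossentropy` averages over the last axis, so a `(1, 933, 1)` batch leaves only axis 1 to sum over. The sketch below mimics that shape bookkeeping in plain Python; `reduced_shape` is a hypothetical helper, not a TensorFlow function.

```python
def reduced_shape(shape, axes):
    """Shape left after reducing `shape` over `axes` (no keep_dims)."""
    axes = (axes,) if isinstance(axes, int) else tuple(axes)
    for ax in axes:
        if not -len(shape) <= ax < len(shape):
            raise ValueError(
                f"Invalid reduction dimension {ax} for input with {len(shape)} dimensions"
            )
    drop = {ax % len(shape) for ax in axes}
    return tuple(d for i, d in enumerate(shape) if i not in drop)


# MNIST case: images are (batch, 28, 28, 1).
per_pixel_2d = reduced_shape((128, 28, 28, 1), -1)  # mean over the channel axis
print(reduced_shape(per_pixel_2d, (1, 2)))          # one loss value per sample

# 1D case from the question: signals are (1, 933, 1).
per_step_1d = reduced_shape((1, 933, 1), -1)        # two dimensions remain
print(reduced_shape(per_step_1d, 1))                # summing over axis 1 works
try:
    reduced_shape(per_step_1d, (1, 2))              # axis 2 no longer exists
except ValueError as err:
    print(err)
```

If that is indeed the loss in use, changing the reduction to `axis=1` in the custom `train_step` would be the corresponding adjustment for 1D inputs.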