How do I use the arctan2 function in a Keras model?

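I am trying to add an arctan2 function to the end of a Keras model, but training does not seem to get anywhere near even a local minimum. Here is my contrived but minimal working code, which uses the native Keras Add() layer instead of the arctan2 function: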

import numpy as np
import matplotlib.pyplot as plt
import scipy.signal as ss

from tensorflow.keras.models import Model
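# Lambda and Multiply are needed for the atan2 layer and the commented-out variants below
from tensorflow.keras.layers import Input, Conv1D, Add, Multiply, Lambda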
import tensorflow as tf

kernel_size = 64
epochs = 1000

def atan2(tensors):
    Q = tensors[0]
    I = tensors[1]
    return tf.math.atan2(Q, I)

def atan2_output_shape(input_shapes):
    return tuple(input_shapes[0])

atan2_layer = Lambda(atan2, output_shape=atan2_output_shape)

## Data generation for training

x_train = np.random.randn(1024, 512)

t = np.linspace(0, x_train.shape[1], x_train.shape[1], endpoint=False)
sine = np.sin(2*np.pi*t/32)
cosine = np.cos(2*np.pi*t/32)

x_I = np.multiply(x_train, cosine)
x_Q = np.multiply(x_train, sine)

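# Tukey window via scipy.signal.windows (the top-level ss.tukey alias was removed in newer SciPy releases)
b_I = ss.windows.tukey(kernel_size)
b_Q = ss.windows.tukey(kernel_size)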

x_I_filt = np.array([np.convolve(b_I, x_I_i, mode='valid') for x_I_i in x_I])
x_Q_filt = np.array([np.convolve(b_Q, x_Q_i, mode='valid') for x_Q_i in x_Q])

y_train = x_Q_filt + x_I_filt
# y_train = x_Q_filt * x_I_filt
# y_train = np.arctan2(x_Q_filt, x_I_filt)

x_I = np.expand_dims(x_I, axis=2)
x_Q = np.expand_dims(x_Q, axis=2)
y_train = np.expand_dims(y_train, axis=2)

## Keras model

input_I = Input(shape=(x_I.shape[1], 1))
input_Q = Input(shape=(x_Q.shape[1], 1))

conv_I_1D = Conv1D(filters=1, kernel_size=kernel_size, activation=None, padding='valid', use_bias=False)(input_I)
conv_Q_1D = Conv1D(filters=1, kernel_size=kernel_size, activation=None, padding='valid', use_bias=False)(input_Q)

out_I_Q = Add()([conv_I_1D, conv_Q_1D])
# out_I_Q = Multiply()([conv_I_1D, conv_Q_1D])
# out_I_Q = atan2_layer([conv_Q_1D, conv_I_1D])

model_1D = Model([input_I, input_Q], out_I_Q)

model_1D.compile(optimizer='sgd', loss='mean_squared_error') 
history_1D = model_1D.fit([x_I, x_Q], y_train, epochs=epochs, verbose=0)

After 100 epochs, I recover the original filter kernels almost perfectly:

plt.semilogy(history_1D.history['loss'])
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.show()

It also works with the Multiply() layer. But if I change y_train to y_train = np.arctan2(x_Q_filt, x_I_filt) and out_I_Q to out_I_Q = atan2_layer([conv_Q_1D, conv_I_1D]), I get this sad loss plot:

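I even initialized the weights to the values they should have, plus a small offset, before running model_1D.fit(...). Note that b_I and b_Q are the same array.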

offset = 1e-5
array_for_I_weights = np.array(model_1D.layers[2].get_weights())
array_for_I_weights[0,:,0,0] = list(b_I+offset)
model_1D.layers[2].set_weights(array_for_I_weights)
model_1D.layers[3].set_weights(array_for_I_weights)

The loss plot after training then looks like this:

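However, if I change offset to 1e-4, I get this loss plot: it only gets worse after the first epoch. What goes wrong when the arctan2 function is placed inside a Keras model? Why does the model jump out of a good minimum? Should I perhaps use a different custom loss/metric function?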
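One thing that may be worth checking when thinking about a custom loss is the 2π wrap-around of arctan2. The sketch below is only an illustration in plain Python, not a diagnosis of the model above: it shows that two nearly identical (Q, I) points on either side of the negative I axis produce angles that differ by almost 2π, so a plain squared error becomes very large, while an error folded back into [-π, π] stays tiny. The helper name `wrapped_angle_error` is hypothetical and was introduced only for this example.

```python
import math

def wrapped_angle_error(pred, target):
    """Signed angular difference folded back into the interval [-pi, pi]."""
    diff = pred - target
    return math.atan2(math.sin(diff), math.cos(diff))

# Two (Q, I) points just above and just below the negative I axis.
angle_above = math.atan2(1e-6, -1.0)    # slightly less than +pi
angle_below = math.atan2(-1e-6, -1.0)   # slightly more than -pi

# Plain squared error treats the two angles as almost 2*pi apart.
plain_sq_err = (angle_above - angle_below) ** 2

# The wrapped error recognises that they are nearly the same direction.
wrapped_sq_err = wrapped_angle_error(angle_above, angle_below) ** 2

print(f"plain squared error:   {plain_sq_err:.4f}")
print(f"wrapped squared error: {wrapped_sq_err:.3e}")
```

Away from the wrap-around the two errors agree, so the folding only changes the loss where the angles straddle ±π. The same expression could presumably be built from tf.math.atan2, tf.sin and tf.cos inside a Keras loss function; whether that resolves the behaviour described above is untested.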

Changing the optimizer type does not help at all. I am using tensorflow 2.0 and Keras 2.2.5.
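Since the behaviour does not depend on the optimizer, it might also help to look at the gradient of arctan2 itself. The partial derivatives of atan2(Q, I) are I/(I² + Q²) with respect to Q and -Q/(I² + Q²) with respect to I, so the gradient norm is 1/r, where r is the distance of (Q, I) from the origin. The following is a minimal sketch in plain Python (the function names are hypothetical) showing how the gradient grows as the filtered I/Q pair approaches zero. Whether such samples actually dominate the training run above is an assumption that would need to be checked.

```python
import math

def atan2_grad(q, i):
    """Analytic partial derivatives of atan2(q, i) with respect to q and i."""
    r2 = q * q + i * i
    return i / r2, -q / r2

def grad_norm(q, i):
    """Euclidean norm of the atan2 gradient, which equals 1 / sqrt(q^2 + i^2)."""
    dq, di = atan2_grad(q, i)
    return math.hypot(dq, di)

# Sanity check of the analytic derivative with a central finite difference.
q0, i0, h = 0.3, 0.4, 1e-6
numeric_dq = (math.atan2(q0 + h, i0) - math.atan2(q0 - h, i0)) / (2 * h)
analytic_dq, _ = atan2_grad(q0, i0)

# The gradient norm grows like 1/r as the point moves towards the origin.
for r in (1.0, 1e-1, 1e-2, 1e-3):
    q, i = r * math.sin(0.3), r * math.cos(0.3)
    print(f"r = {r:g}: gradient norm = {grad_norm(q, i):.3f}")
```

If a sizeable share of the filtered samples lies close to the origin, gradient clipping or a smaller learning rate would be obvious things to try, though neither is confirmed to help here.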