
Classifying signals using features extracted from wavelet scattering coefficients

Tags: signal-processing, wavelet, classification, scattering

2022-02-15 08:10:49

I have a set of RF signal samples, each 2s long and recorded at a 2MHz sampling rate, for example this IQ file: https://www.dropbox.com/s/dd6fr4va4alpazj/move_x_movedist_1_speed_25k_sample_6.wav?dl=0

Using kymatio, I can efficiently compute the zeroth-, first-, and second-order coefficients and plot them with the following code:

import scipy.io.wavfile
import numpy as np
import matplotlib.pyplot as plt
from kymatio.numpy import Scattering1D

path = r"move_x_movedist_1_speed_25k_sample_6.wav"

# Read in the sample WAV file
fs, x = scipy.io.wavfile.read(path)
x = x.T
print(fs)
# Once the recording is in memory, we normalise it to +1/-1
x = x / np.max(np.abs(x))
print(x)
# Set up parameters for the scattering transform
## Number of samples, N
N = x.shape[-1]
print(N)
## Averaging scale as power of 2, 2**J, for averaging
## scattering scale of 2**6 = 64 samples
J = 6
## No. of wavelets per octave (resolve frequencies at
## a resolution of 1/16 octaves)
Q = 16
# Create object to compute scattering transform
scattering = Scattering1D(J, N, Q)
# Compute scattering transform of our signal sample
Sx = scattering(x)
# Extract meta information to identify scattering coefficients
meta = scattering.meta()
# Zeroth-order
order0 = np.where(meta['order'] == 0)
# First-order
order1 = np.where(meta['order'] == 1)
# Second-order
order2 = np.where(meta['order'] == 2)

#%%
# Plot original signal
plt.figure(figsize=(8, 2))
plt.plot(x)
plt.title('Original Signal')
plt.show()

# Plot zeroth-order scattering coefficient (average of
# original signal at scale 2**J) for the first channel
plt.figure(figsize=(8, 8))
plt.subplot(3, 1, 1)
plt.plot(Sx[0][order0][0])
plt.title('Zeroth-order scattering [0]')
# Plot first-order scattering coefficients of both channels
# (arranged along time and log-frequency)
plt.subplot(3, 2, 3)
plt.imshow(Sx[0][order1], aspect='auto')
plt.title('First-order scattering [0]')
plt.subplot(3, 2, 4)
plt.imshow(Sx[1][order1], aspect='auto')
plt.title('First-order scattering [1]')
# Plot second-order scattering coefficients of both channels
# (arranged along time, but with two log-frequency indices:
# one first-order and one second-order frequency, both mixed
# along the vertical axis)
plt.subplot(3, 2, 5)
plt.imshow(Sx[0][order2], aspect='auto')
plt.title('Second-order scattering [0]')
plt.subplot(3, 2, 6)
plt.imshow(Sx[1][order2], aspect='auto')
plt.title('Second-order scattering [1]')
plt.tight_layout()
plt.show()

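However, my task is to classify each sample with a neural network architecture. The problem I have is that these coefficients are so large in shape and size that simply storing them in memory is not feasible (there are about 2k samples in total).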

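So I would like to know whether there is a good way of extracting features to represent this (for example, if I were using MFCCs, I could create feature columns from MFCC coefficients 1-13 via mfcc = librosa.feature.mfcc(y=x, sr=fs, n_mfcc=14, hop_length=hop_length, n_fft=M)[1:]). Or perhaps there is some other way of shortening this data so that it is actually usable for training a neural network (for example, taking the spatial average of each coefficient, $\bar{S}_{m,J} = \sum_x \tilde{S}_{m,J}((\lambda_1, \dots, \lambda_m), x)$, although this loses spatial information while reducing the dimensionality).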

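Any help would be great!

Edit: here are the power spectrum and the time-frequency plot from the MATLAB Signal Analyzer. How can I use them to identify the spectral occupancy, so that I can downsample and minimise the data?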

1 Answer

Expanding on my related answer, but for wavelet scattering:

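Higher T -> greater time-shift invariance.
Higher Q -> greater frequency resolution, but lower time resolution and lower time-warp stability.
Higher J -> greater largest scale of features (largest convolution kernel). It should be set according to what we believe is the largest scale of relevant structure: for example, a sentence is longer than a word, and a word is longer than a character.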

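Global averaging, i.e. T == len(x), can perform very well while cutting the input size drastically. It also helps to split the input along time and then join the scatterings ("joining convolutions without overlap"), since that is faster than one big convolution.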
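To make the two ideas concrete, here is a minimal sketch, not a definitive implementation. The function name `chunked_global_features` and the `transform` argument are my own placeholders: `transform` is assumed to be any callable that maps a 1-D chunk to an array of shape (n_paths, n_frames), such as a Scattering1D object built for inputs of length `chunk_len`. A toy stand-in is used so the snippet runs without kymatio, and averaging the output over its time axis is used to emulate global averaging.

```python
import numpy as np

def chunked_global_features(x, transform, chunk_len):
    """Scatter non-overlapping chunks of x and average each over time.

    transform: callable mapping a 1-D chunk to an array of shape
               (n_paths, n_frames); assumed to accept length chunk_len.
    Returns an array of shape (n_chunks, n_paths).
    """
    n_chunks = len(x) // chunk_len  # any ragged tail is dropped
    feats = []
    for i in range(n_chunks):
        chunk = x[i * chunk_len:(i + 1) * chunk_len]
        coeffs = np.asarray(transform(chunk))
        # Averaging over the time axis emulates global averaging
        feats.append(coeffs.mean(axis=-1))
    return np.stack(feats)

def toy_transform(chunk):
    """Stand-in for a scattering object: two 'paths' per chunk."""
    return np.stack([np.abs(chunk), chunk ** 2])

rng = np.random.default_rng(0)
x = rng.standard_normal(2 ** 14)
features = chunked_global_features(x, toy_transform, chunk_len=2 ** 12)
print(features.shape)  # (4, 2)
```

Keeping one row per chunk retains some coarse time information; taking `features.mean(axis=0)` instead would collapse everything to a single vector per recording.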

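A pressing question is whether such a high sampling rate is actually needed (as Dan pointed out), and whether the relevant features span the entire 0-2MHz bandwidth. If the effective bandwidth is only, say, 1-1.5MHz, then the occupied band is a quarter of the total, so the data can be downsampled by a factor of four before being fed to the scattering network, which cuts processing time considerably.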
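As a rough illustration of that factor-of-four reduction, the sketch below mixes the band of interest down to 0 Hz, low-pass filters it, and keeps every fourth sample. It assumes the recording is a complex IQ stream covering 0-2MHz as described above; `extract_band`, the tap count, and the windowed-sinc filter are my own choices for the example, and the 1-1.5MHz band edges are just the hypothetical figures from the paragraph.

```python
import numpy as np

def extract_band(x, fs, f_lo, f_hi, num_taps=129):
    """Shift the band [f_lo, f_hi] to baseband and decimate it.

    Returns the complex baseband signal and its new sampling rate.
    """
    f_c = 0.5 * (f_lo + f_hi)   # centre of the band
    bw = f_hi - f_lo            # width of the band
    factor = int(fs // bw)      # integer decimation factor
    n = np.arange(len(x))
    # Mix the band centre down to 0 Hz
    baseband = x * np.exp(-2j * np.pi * f_c * n / fs)
    # Windowed-sinc low-pass filter with cutoff at bw / 2
    k = np.arange(num_taps) - (num_taps - 1) / 2
    h = np.sinc(bw / fs * k) * np.hamming(num_taps)
    h /= h.sum()                # unit gain at 0 Hz
    filtered = np.convolve(baseband, h, mode="same")
    # Keep every factor-th sample
    return filtered[::factor], fs / factor

fs = 2e6
n = np.arange(4000)
x = np.exp(2j * np.pi * 1.2e6 * n / fs)   # test tone inside the band
y, fs_new = extract_band(x, fs, 1.0e6, 1.5e6)
print(len(x), len(y), fs_new)
```

The filter's transition band is not sharp, so content near the band edges will alias slightly; a longer filter or a dedicated resampling routine would be the next step if this matters.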

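Re: downsampling. From what I can tell from the plots, downsampling by 2 shouldn't hurt, but 1) that's not much, and 2) it could indeed hurt if what is discarded matters, which depends on whether there is, for example, power-law scaling as in EEG. The frequencies appear fairly uniformly distributed, so without other knowledge of their "importance" there may not be much to do. One approach: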

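Downsample by 8, train, test.
Repeat with downsampling by 4 and see whether it improves.
Yes: repeat with downsampling by 2. No: repeat with downsampling by 16.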
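The three steps above can be sketched as a small search routine. This is only an illustration: `search_decimation` and the `evaluate` callback are hypothetical names, and `evaluate(factor)` is assumed to downsample by `factor`, train, and return a test score where higher is better. The scores in the demo are made up purely to exercise the branches.

```python
def search_decimation(evaluate):
    """Try factor 8, then 4, then either 2 or 16 depending on the result.

    evaluate: hypothetical callback, factor -> test score (higher is better).
    Returns the best factor found and all scores that were measured.
    """
    scores = {8: evaluate(8)}
    scores[4] = evaluate(4)
    if scores[4] > scores[8]:
        # Less downsampling helped, so the discarded content mattered
        scores[2] = evaluate(2)
    else:
        # No gain from keeping more data, so try discarding more
        scores[16] = evaluate(16)
    # On ties, prefer the larger factor (smaller input)
    best = max(scores, key=lambda f: (scores[f], f))
    return best, scores

# Made-up scores, only to show the control flow
fake_scores = {2: 0.90, 4: 0.85, 8: 0.80, 16: 0.70}
best, tried = search_decimation(fake_scores.get)
print(best, sorted(tried))  # 2 [2, 4, 8]
```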

P.S. This should also help, although it has no meta.
