信息处理 - 实时分区卷积不起作用 - 吾爱随笔录

实时分区卷积不起作用

信息处理 fft 卷积即时的混响

2022-02-06 20:48:35

编辑：向下滚动查看实际工作代码。

我正在努力在 C 上实现一个实时卷积混响 JACK 客户端，并且我一直在尝试遵循许多来源（包括Gardner和Wefers（第 110-111 页））但无济于事。我已经尽我所能地按照文本进行操作，老实说，我无法摆脱可怕的声音伪影。我在波特线（频谱图/波形可视化器）上查找了它们，您可以看到输出为零的不连续性和跳跃的组合。我尝试使用窗口来消除 Wefers 对统一分区算法的描述所产生的不连续性，即使我摆脱了不连续性，我仍然会得到最奇怪的混叠。在 Internet 上的任何地方，我都找不到清楚地概述此任务所需代码的单一来源。

我正在创建这样的分区：

// ir is being read from a .wav file using libsndfile and is zero padded to
// have a length that is a multiple of nframes (nframes*partitions, exactly).
for (int k = 0; k < partitions; k++){
            for (int i = 0; i < nframes; i++){
                    // create zero padded partitions
                    ir_time[i]           = ir[k*nframes + i];
                    ir_time[nframes + i] = 0.0;
            }

            fftw_execute(ir_forward);
            for (int i = 0; i < 2*nframes; i++){
                    // write filter partitions
                    fir[k][i] = ir_fft[i];
                    // initialize FDL to zero
                    fdl[k][i] = 0.0 + I*0.0;
            }

    }

进行处理的 jack_callback() 函数如下所示（从 Wefers p.111 复制流处理指令）：

（请注意，b[1] 和 b[0] 缓冲区的长度为 3*nsamples，因此它们在中间重叠添加，我知道这很奇怪，但我知道它可以与其他基于 fftw 的代码一起使用。我知道我应该将它们更改为 2*nframes 长度的缓冲区，我会在某个时候解决这个问题。不过，它们在这里仅用作输入缓冲区。）

int jack_callback (jack_nframes_t nframes, void *arg){
    jack_default_audio_sample_t *in, *out;
    int i, j;

    in  = (jack_default_audio_sample_t *)jack_port_get_buffer (input_port , nframes);
    out = (jack_default_audio_sample_t *)jack_port_get_buffer (output_port, nframes);

    for (i = 0; i < nframes; i++){
            // 1. Input buffer acts as a 2B-point sliding sliding window of
            //    the input signal. With each new input block, the right half of the input
            //    buffer is shifted to the left and the new block is stored in the right
            //    half.
            
            // tried windowing the 2*nframes length buffer but it
            // didn't work
            // shift right half of input buffer to the left
            b[1][nframes + i    ] = b[1][nframes + i];

            // store input to right hand side of buffer
            b[1][two_nframes + i] = in[i];

            // prepare buffer for FFT
            
            // tried writing in[i] to in_time[i] and zero padding the rest but
            // it also didn't work
            i_time[i]         = b[1][nframes + i];
            i_time[nframes+i] = b[1][two_nframes + i];
    }
    // 2. All conttents (DFT spectra) in the FDL are shifted up by one slot.
    for (int k = 0; k < partitions - 1; k++){
            for (int i = 0; i < two_nframes; i++){
                    fdl[k+1][i] = fdl[k][i];
            }
    }

    // 3. a 2B-point real-to-complex FFT is computed from the input buffer,
    //    resulting in 2B DFT coefficients. The result is stored in the 
    //    first FDL slot.

    // taking R2C-FFT
    fftw_execute(i_forward);

    // 4. The P sub-filter spectra are pairwisely multiplied with the input
    //    spectra in the FDL. The results are accumulated in the frequency
    //    domain.
    
    for (int i = 0; i < two_nframes; i++){
            fdl[0][i] = i_fft[i];
    }

    for (int i = 0;  i < two_nframes; i++){
            // reset o_fft[i] to erase previous callback buffer
            o_fft[i] = 0;
            for (int k = 0; k < partitions; k++){
                    // accumulation stage
                    o_fft[i] += fir[k][i] * fdl[k][i];
            }
    }

    // 5. Of the accumulated spectral convolutions, an 2B-point C2R-IFFT
    //    is computed. From the resulting 2B samples, the left half is 
    //    discarded and the right half is returned as the next output block 

    fftw_execute(o_inverse);
    for (i = 0; i < nframes; i++){
            // tried this with and without windowing but it didn't work
            //b[1][    nframes + i] = creal( o_time[i]        ) / two_nframes;
            //b[1][two_nframes + i] = creal( o_time[nframes+i]) / two_nframes;
            
            // tried to do overlap-add like this but it also didn't work
            out[i] = b[0][nframes + i]+b[1][nframes + i]; // + conv[i];

            // discarding left side of IFFT and returning rigth half as output.
            //out[i] = creal(o_time[nframes + i]) / two_nframes;
             
            // more overlap-add stuff
            // b[0][i]           = b[1][    nframes + i];
            // b[0][nframes + i] = b[1][two_nframes + i];

            //b[1][nframes+i] = in[i];
    }
    return 0;
}

如果您想查看其余代码，我很乐意将其上传到 github 并在此处添加链接。我对此束手无策。任何帮助将不胜感激！

编辑：我发现我的代码有问题。我做错了循环移位。FDL 的索引在覆盖它时应该减少以避免不必要的循环。这是实际的工作代码：

int jack_callback (jack_nframes_t nframes, void *arg){
    jack_default_audio_sample_t *in, *out;
    int i, j, k;

    in = (jack_default_audio_sample_t *)jack_port_get_buffer (input_port, nframes);
    out = (jack_default_audio_sample_t *)jack_port_get_buffer (output_port, nframes);

    for (i = 0; i < nframes; i++){
            // nframes come in and are then saved in the right part of the input buffer
            b[1][nframes + i] = in[i];
            i_time[i] = b[1][i];
            i_time[nframes+i] = b[1][nframes+i];
    }
    // take the FFT of the input:
    fftw_execute(i_forward);

    // circular shift (done right!):
    for (i = 0; i < two_nframes; i++){
            for (k = partitions - 1; k > 0; k--){
                    fdl[k][i] = fdl[k-1][i];
            }
    }

    // write the most recent FFT to the first slot of the FDL:
    for (i = 0; i < two_nframes; i++){
            fdl[0][i] = i_fft[i];
            o_fft[i] = 0.0 + I*0.0;
    }
    // Processing block: multiply-add stage across the FDL
    for (i = 0; i < two_nframes; i++){
            for (k = 0; k < partitions; k++){
                    o_fft[i] += fdl[k][i] * fir[k][i];
            }
    }

    // Taking the ifft and returning to the time domain.
    fftw_execute(o_inverse);
    for (i = 0; i < nframes; i++){
            out[i] = vol*creal(o_time[nframes+i])/two_nframes;
            // shift the input buffer to the left.
            b[1][i] = in[i];
    }

    return 0;

}

2个回答

看来您正在实现一个统一的分区卷积，这并不难实现。您应该坚持使用 Wefers p110 的图 5.2 的信号流和 p111 上的流处理。分区卷积不需要加窗。一个简单的matlab代码可以是：

NFFT = blockSize * 2;
X_fdl = zeros(NFFT / 2 + 1, hBlocks); % input blocks frequency delay line
H = zeros(NFFT / 2 + 1, hBlocks); % impulse response blocks FFT results
x_tdl = zeros(NFFT, 1); % temp buffer for FFT

%% split very long impulse response into small blocks and convert to FFT domain
startIdx = 1;
endIdx = blockSize;
for hBlock = 1:hBlocks
    H(:, hBlock) = fftr(h(startIdx:endIdx), NFFT);
    startIdx = endIdx + 1;
    endIdx = endIdx + blockSize;
end
    
%% process signal stream x frame by frame and apply UPOLS
startIdx = 1;
endIdx = blockSize;

while endIdx <= length(x)
    % 1. right half of the input buffer is shifted to the left
    x_tdl(1:blockSize) = x_tdl(blockSize + 1:end); 
    
    % and the new block is stored in the right half
    x_tdl(blockSize + 1:end) = x(startIdx:endIdx); 
    
    % 2. circular shift X_fdl, so that 1st column is current block
    X_fdl = circshift(X_fdl, [0, 1]); 

    % 3. take FFT for current block input
    X_fdl(:, 1) = fftr(x_tdl, NFFT);
    
    % 4. point-wise multiplication and accumulate the result
    Y = sum(X_fdl .* H, 2);
    
    % 5. take IFFT for current block output
    yifft = ifftr(Y, NFFT); 
    
    % left half is discarded and the right half is returned
    y(startIdx:endIdx) = yifft(blockSize + 1:end);
    
    % update loop variable
    startIdx = endIdx + 1;
    endIdx = endIdx + blockSize;
end
```

在第 5 步中，您似乎正在实施“重叠保存”而不是“重叠添加”。在这种情况下，您不要对输入进行零填充，而是对输入的两个完整帧进行 FFT。

重叠添加并不是那么复杂，但在缓冲、索引、处理 DC 和 Nyquist 的实际值等方面有很多细节。

我建议标准的调试程序。从具有已知答案的简单信号开始，而不是增加复杂性直到出现问题。你可以单步到达中断发生的位置。

添加一些断言：检查所有光谱是否共轭对称，逆 FFT 的实部是否实际上为 0（或足够小）等。
从信号和脉冲响应 (IR) 的单位脉冲开始
将每个单位脉冲延迟不到半帧
延迟它，单位脉冲我超过半帧但不到一帧
将单位脉冲放入第一帧的最后一个样本
将它们延迟一帧以上
在 IR 的第一帧中再添加两个单位脉冲
在多个帧上传播 IR 单元脉冲
也向信号添加一些单位脉冲。
使用单位脉冲作为 IR，使用正弦波作为信号。确保周期不是帧大小的整数除法器。
使用正弦波信号重复步骤 3-8

我建议为所有这些情况编写自动化单元测试。在所有测试用例中，正确的结果都可以很容易地手动计算出来。您还可以使用 Matlab、Octave 或 Python 作为参考。

在您通过当前步骤的测试之前，不要进入下一步。进行更改后，重新运行所有测试以确保您没有破坏其他任何内容。

如果您成功完成了第 11 步，您的代码可能工作正常，因此中间会出现问题。您的下一步取决于它究竟在哪里中断，但该过程应该缩小范围。

其它你可能感兴趣的问题

上一篇使用 scipy.signal.butter 设计带通时，SOS 矩阵的顺序与给定参数不对应下一篇什么可能导致 FPGA 实现的 FFT 和 numpy.fft.rfft 之间的相位差