您需要的是滤除较低的频率:您可以使用高通滤波器来做到这一点。
由于您实际上并不需要在非常窄的频带中从“非常阻塞”到“完全让所有东西通过”的过滤器,因此该过滤器将非常短(这意味着每个样本只需要很少的乘法即可计算其输出。
我已经安装pyfda
了一个快速而肮脏的设计¹,使用与您相同的采样率、12 kHz 的截止频率和 4 kHz 的过渡宽度:
就“这个过滤器实际上看起来如何”而言:
b = [0.04889, -0.1332, 0.2142, -0.2142, 0.1332, -0.04889]
a = [1, 1.217, 1.996, 1.56, 0.9704, 0.3965 ]
就是这样。那就是过滤器。
你可以这样应用它:注意这段代码是徒手编写的,甚至没有通过 Python 运行来检查语法上的非法拼写错误。换句话说,您必须对其进行测试。
from scipy import signal
import numpy
THRESHOLD = 3000
# … the rest of your code up to the while-loop:
"""
Your filter: these are the coefficients for the feedforward part (b) and feedback part (a) of an IIR filter. I got these from pyfda.
"""
b = [0.04889, -0.1332, 0.2142, -0.2142, 0.1332, -0.04889]
a = [1, 1.217, 1.996, 1.56, 0.9704, 0.3965 ]
"""
This is yet another filter, a low-pass filter used for smoothing.
Design specs were fs=44100 Hz, f_bp = 60 Hz, f_sb = 100 Hz,
A_bp = 3 dB, A_sb = 30 db, it's also an elliptic IIR.
The 60 Hz cutoff should get rid of anything that's shorter than
ca 1/60 of a second – we don't care about things shorter than that,
was my assumption.
"""
b_lpf = [0.0005067106574837601, -0.000506612969002681, -0.0005066129690026811, 0.00050671065748376]
a_lpf = [1.0 , -2.9949235962214953, 2.9899182657208225, -0.9949944741223649]
"""
Initialize the filter state so that it starts "eingeschwungen" (in steady state)
This is optional, but without it, chances are high you always trigger when you first enter the while loop.
"""
filter_state = signal.lfilter_zi(b, a)
filter_state_lpf = signal.lfilter_zi(b_lpf, a_lpf)
left_to_record = 0
wavfile = None
leftover_data = []
while nighttime:
data=stream.read(CHUNK)
data_chunk=array('h',data)
"""
Filter your signal!
Afterwards, it doesn't contain lower frequencies anymore
"""
signal_filtered, filter_state = signal.lfilter(b, a, data_chunk, zi=filter_state)
"""
Convert amplitude to square amplitude, which is proportional to power,
which is proportional to volume.
"""
power = numpy.multiply(signal_filtered, signal_filtered)
"""
Now, we'll also need to filter this – but this time with
a low-pass filter. We'll use the second filter we've designed:
"""
power_filtered, filter_state = signal.lfilter(b_lpf, a_lpf, power, zi=filter_state_lpf)
"""
Fun fact! We've now reduced the *actually occupied spectrum*
significantly at the output of this rather hefty filter.
That means of the 22.050 kHz that this signal originally could
represent (says Nyquist), only maybe 1/200 actually still
contain 'change'. I.e. we can now more than safely simply ignore
99 out of 100 samples - the others are definitely just nice and
smoothly interpolated between these.
Now. That saves *a lot* of calculation. We only need to find the
maximum on that subset! The real maximum power can't be much
higher than what we see there.
"""
maximum_power = max(power_filtered[::100])
if maximum_power >= THRESHOLD:
left_to_record = RECORD_SECONDS * RATE // CHUNK * CHUNK
# I'm **REALLY** not versed with wavfile, and
# I think you shouldn't be using that, but pysndfile
# instead to write losslessly compressed (FLAC) audio
# instead. Anyways, I haven't tried this, so debugging
# is on you, sorry :)
if not wavfile:
timestring = time.strftime("%Y%m%d-%H%M%S")
filename = f"RECORDING-{timestring}.wav"
wavfile=wave.open(filename,'wb')
wavfile.setnchannels(CHANNELS)
wavfile.setsampwidth(audio.get_sample_size(FORMAT))
wavfile.setframerate(RATE)
if left_to_record > 0:
"""
Write available full chunks!
the // CHUNK * CHUNK trick just rounds down to the
next multiple of CHUNK
"""
write_now = (len(leftover_data) + len(data)) // CHUNK * CHUNK
write_now = min(write_now, left_to_record)
write_from_leftover = min(write_now, len(leftover_data))
if write_from_leftover:
wavfile.writeframes(bytes(leftover_data[:write_from_leftover]))
leftover_data = [write_from_leftover]
left_to_record -= write_from_leftover
write_now -= write_from_leftover
wavfile.writeframes(bytes(data[:write_now]))
left_to_record -= write_now
if left_to_record < CHUNK:
wavfile.close()
wavfile = None
else:
leftover_data = data[:write_now]
明显的错误:
高通和低通滤波器都有我们所说的群延迟。所以,触发器是在声音开始之后才出现的。那不好吗?不,我认为延迟时间少于 50 毫秒,您可能会接受。而且,我们实际上并没有在触发时捕捉到 5 秒,而是在帧开始时捕捉到 5 秒,所以平均而言,我们实际上会早一点。
事实上,如果在一个记录时间内出现新的超阈值音量,这实际上是为了“保持记录活动”另外 5 秒。我不确定这是你的意图,但我认为它是。
¹ 设计参数:
- 高通
- IIR(因为我们不关心相位或精度,这只是为了检测活动,所以它应该是省力的!IIR 需要更少的计算来获得相同的陡度,但它们不如 FIR 好信号)
- 椭圆(在快速抑制阻带的能力和仍然具有可靠通带之间的良好权衡)
- 最小阶数(我们想要通过这些规范获得的“最便宜”的过滤器,而不是我们为计算复杂度支付的“价格”获得的最佳过滤器)
- 频率单位 kHz, ,fs=44.1
- fSB(阻带结束的频率)8
- ASB阻带的最小衰减(以分贝为单位):50
- fPB(通带开始的频率)11
- APB我们在通带中允许的纹波,即通带允许的不平坦程度(分贝):2.85
- fC 11
² 第二个滤波器,低通滤波器,如下所示: