我正在做一些关于 BatchNormalization 的研究: https ://towardsdatascience.com/batch-normalization-8a2e585775c9
在文章中,它说:
Using batch normalization allows us to use much higher learning rates, which further increases the speed at which networks train.
任何人都可以分享他们对为什么批量标准化允许更高的学习率的想法吗?谢谢!