My notes for the paper: Adaptive Computation Time for Recurrent Neural Networks [1].
Additive vs multiplicative halting probability

Multiplicative: In the paper (footnote 1), the authors thoroughly discuss their considerations for deciding the computation time.
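To make the two options concrete, here is a minimal Python sketch. The additive rule follows the paper's definition: halt once the running sum of halting values reaches 1 - eps, and assign the remainder to the final step. The multiplicative rule is my assumption of what the alternative looks like, where each halting value is read as the probability of stopping at that step given that no earlier step stopped. The function names, the `eps` default, and the example values are made up for illustration.

```python
def additive_halting(h, eps=0.01):
    """Additive rule: stop at the first step where the running sum of
    halting values reaches 1 - eps; the last step gets the remainder."""
    total = 0.0
    probs = []
    for i, h_n in enumerate(h):
        is_last = i == len(h) - 1  # hard cap on steps (assumption: cap = len(h))
        if total + h_n >= 1.0 - eps or is_last:
            probs.append(1.0 - total)  # remainder, so the weights sum to 1
            break
        probs.append(h_n)
        total += h_n
    return probs


def multiplicative_halting(h):
    """Multiplicative rule (assumed form): p_n = h_n * prod_{k<n} (1 - h_k).
    Returns the per-step probabilities and the mass left for 'never halted'."""
    probs = []
    survive = 1.0  # probability that no earlier step has halted
    for h_n in h:
        probs.append(h_n * survive)
        survive *= 1.0 - h_n
    return probs, survive


h = [0.1, 0.3, 0.8, 0.5]  # hypothetical halting-unit outputs
add_p = additive_halting(h)
mul_p, leftover = multiplicative_halting(h)
print(add_p)
print(mul_p, leftover)
```

Note the difference in behaviour: the additive rule cuts off after a finite number of steps and its weights always sum to 1, while the multiplicative rule assigns some probability to every step and leaves residual mass unless some halting value equals 1.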