We flush these closures only when the connection goes IDLE.
As a result, a continuous stream of bytes that never stops means
no completions are ever sent. Memory bloats because the callbacks
of the ops are never called.
For example, we use hundreds of GiB of memory after a minute of
exchanging 1MiB RPCs with the callback API.
This patch runs the closures as soon as we have finished running
a write action, instead of waiting for the connection to go IDLE.
After this change, memory remains stable for the 1MiB benchmark.
QPS increases by roughly 230 (520 -> 749) and latency drops
by 70ms, because we were basically page-faulting on every RPC.
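
The difference between the two flush policies can be illustrated with a small standalone model. This is only a sketch, not the transport code touched by this patch: `ClosureList`, `FlushPolicy`, `RunStats` and `SimulateBusyConnection` are hypothetical names invented for the example, and it assumes each write action queues exactly one completion closure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the transport's list of pending closures.
class ClosureList {
 public:
  void Add(std::function<void()> closure) {
    pending_.push_back(std::move(closure));
  }

  std::size_t size() const { return pending_.size(); }

  // Runs every pending closure and leaves the list empty.
  void Flush() {
    std::vector<std::function<void()>> to_run;
    to_run.swap(pending_);
    for (auto& closure : to_run) closure();
  }

 private:
  std::vector<std::function<void()>> pending_;
};

enum class FlushPolicy { kOnIdleOnly, kAfterEachWrite };

struct RunStats {
  std::size_t peak_pending = 0;  // largest number of closures held at once
  std::size_t completed = 0;     // closures that actually ran
};

// Models `num_writes` back-to-back write actions with no idle gap between
// them. Assumption: each write action queues one completion closure.
RunStats SimulateBusyConnection(std::size_t num_writes, FlushPolicy policy) {
  ClosureList closures;
  RunStats stats;
  for (std::size_t i = 0; i < num_writes; ++i) {
    closures.Add([&stats] { ++stats.completed; });
    if (closures.size() > stats.peak_pending) {
      stats.peak_pending = closures.size();
    }
    // New behavior: run the closures once the write action is done.
    if (policy == FlushPolicy::kAfterEachWrite) closures.Flush();
  }
  // The connection only goes idle here, after the stream has ended.
  closures.Flush();
  assert(closures.size() == 0);
  return stats;
}
```

In this model, `SimulateBusyConnection(1000, FlushPolicy::kOnIdleOnly)` holds all 1000 closures until the stream ends, while `FlushPolicy::kAfterEachWrite` never holds more than one. Both policies eventually run every closure; only the peak amount of retained state differs.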