Batching heuristics are used in multiple layers of the TCP/IP stack, attempting to improve performance by amortizing overheads. When performance is defined as average latency and throughput, making optimal batching decisions can be infeasible if the end-to-end performance perceived by the application is unknown, as is commonly the case in general-purpose setups. We propose to address this problem by occasionally adding several easily maintained counters to TCP metadata exchanges and using them to estimate end-to-end performance via Little's law.
We contend that this approach can yield reasonable estimates without application support, and that applications can optionally further improve accuracy through a trivial interface we introduce. We demonstrate experimentally that such estimates have the potential to significantly improve the quality of batching decisions, extending Redis's range of sustainable throughput with tolerable latencies by more than 1.6x and improving the latencies within this range by up to 3.9x.
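
To illustrate how counters can yield latency and throughput estimates via Little's law (L = λW, so W = L/λ), the following is a minimal sketch and not the implementation evaluated here. The class `QueueCounters`, its fields, and the function `estimate` are hypothetical names introduced for illustration; the sketch assumes a single queue whose counters track the time-integral of occupancy and the number of departed items, and that two snapshots of these counters are compared.

```python
from dataclasses import dataclass


@dataclass
class QueueCounters:
    """Hypothetical per-queue counters; all names are illustrative assumptions."""
    now: float = 0.0             # time of the last update (seconds)
    in_queue: int = 0            # items currently queued
    occupancy_time: float = 0.0  # integral of in_queue over time (item-seconds)
    departed: int = 0            # total items that have left the queue

    def _advance(self, t: float) -> None:
        # Accumulate occupancy for the interval since the last update.
        self.occupancy_time += self.in_queue * (t - self.now)
        self.now = t

    def enqueue(self, t: float, n: int = 1) -> None:
        self._advance(t)
        self.in_queue += n

    def dequeue(self, t: float, n: int = 1) -> None:
        self._advance(t)
        self.in_queue -= n
        self.departed += n

    def snapshot(self, t: float) -> tuple:
        # Values that could be exchanged with the peer (an assumption).
        self._advance(t)
        return (self.now, self.occupancy_time, self.departed)


def estimate(prev: tuple, curr: tuple):
    """Return (throughput, average latency) between two snapshots, or None."""
    dt = curr[0] - prev[0]
    d_occupancy = curr[1] - prev[1]
    d_departed = curr[2] - prev[2]
    if dt <= 0 or d_departed <= 0:
        return None  # not enough information in this interval
    throughput = d_departed / dt      # lambda
    avg_occupancy = d_occupancy / dt  # L
    return throughput, avg_occupancy / throughput  # W = L / lambda


# Two items, each queued for 2 seconds, observed over a 4-second window.
q = QueueCounters()
start = q.snapshot(0.0)
q.enqueue(0.0)
q.enqueue(1.0)
q.dequeue(2.0)
q.dequeue(3.0)
end = q.snapshot(4.0)
result = estimate(start, end)  # (0.5, 2.0): 0.5 items/s, 2.0 s average latency
```

Note that the estimate needs no per-item timestamps: the average latency reduces to the change in accumulated occupancy divided by the change in departures, which is why a few counters suffice in this sketch.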