cgroup: Avoid false cacheline sharing of read mostly rstat_cpu
The rstat_cpu and also rstat_css_list of the cgroup structure are read mostly variables. However, they may share the same cacheline as the subsequent rstat_flush_next and *bstat variables which can be updated frequently. That will slow down the cgroup_rstat_cpu() call which is called pretty frequently in the rstat code. Add a CACHELINE_PADDING() line in between them to avoid false cacheline sharing. A parallel kernel build on a 2-socket x86-64 server is used as the benchmarking tool for measuring the lock hold time. Below were the lock hold time frequency distribution before and after the patch: Run time Before patch After patch -------- ------------ ----------- 0-01 us 9,928,562 9,820,428 01-05 us 110,151 50,935 05-10 us 270 93 10-15 us 273 146 15-20 us 135 76 20-25 us 0 2 25-30 us 1 0 It can be seen that the patch further pushes the lock hold time towards the lower end. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
Showing
Please register or sign in to comment