    x86: re-introduce support for ERMS copies for user space accesses · 47ee3f1d
    Linus Torvalds authored
    I tried to streamline our user memory copy code fairly aggressively in
    commit adfcf423 ("x86: don't use REP_GOOD or ERMS for user memory
    copies"), in order to then be able to clean up the code and inline the
    modern FSRM case in commit 577e6a7f ("x86: inline the 'rep movs' in
    user copies for the FSRM case").
    
    We had reports [1] of that causing regressions earlier with blogbench,
    but that turned out to be a horrible benchmark for that case, and not a
    sufficient reason for re-instating "rep movsb" on older machines.
    
    However, now Eric Dumazet reported [2] a performance regression in what
    seems to be a rather more real benchmark, where due to the removal of
    "rep movs" a TCP stream over a 100Gbps network no longer reaches line
    speed.
    
    And it turns out that with the simplified calling convention for the
    non-FSRM case in commit 427fda2c ("x86: improve on the non-rep
    'copy_user' function"), re-introducing the ERMS case is actually fairly
    simple.
    
    Of course, that "fairly simple" is glossing over several missteps due to
    having to fight our assembler alternative code.  This code really wanted
    to rewrite a conditional branch to have two different targets, but that
    made objtool sufficiently unhappy that this instead just ended up doing
    a choice between "jump to the unrolled loop, or use 'rep movsb'
    directly".
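
    As a rough user-space illustration of that choice (not the kernel code
    itself): the sketch below picks between a plain copy loop and a direct
    "rep movsb". The names cpu_has_erms, copy_unrolled, copy_rep_movsb and
    copy_sketch are hypothetical, and the runtime flag only stands in for
    what the kernel does by patching the instruction stream with its
    alternatives mechanism. On non-x86-64 builds the "rep movsb" path
    falls back to the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a CPU feature check such as X86_FEATURE_ERMS. */
static int cpu_has_erms = 1;

/* Simple loop copy: 8 bytes at a time, then the byte tail. */
static void copy_unrolled(unsigned char *dst, const unsigned char *src,
                          size_t len)
{
    while (len >= 8) {
        memcpy(dst, src, 8);
        dst += 8;
        src += 8;
        len -= 8;
    }
    while (len--)
        *dst++ = *src++;
}

/* Direct "rep movsb"; assumes the direction flag is clear (per the ABI). */
static void copy_rep_movsb(void *dst, const void *src, size_t len)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(len)
                     :
                     : "memory");
#else
    copy_unrolled(dst, src, len);
#endif
}

/* The two-way choice: use "rep movsb" directly, or go to the loop. */
static void copy_sketch(void *dst, const void *src, size_t len)
{
    if (cpu_has_erms)
        copy_rep_movsb(dst, src, len);
    else
        copy_unrolled(dst, src, len);
}

/* Checks that both paths produce the same bytes as the source. */
static void copy_sketch_selftest(void)
{
    unsigned char src[100], dst[100];
    int saved = cpu_has_erms;

    for (int i = 0; i < 100; i++)
        src[i] = (unsigned char)i;

    for (int erms = 0; erms <= 1; erms++) {
        memset(dst, 0, sizeof(dst));
        cpu_has_erms = erms;
        copy_sketch(dst, src, sizeof(src));
        assert(memcmp(dst, src, sizeof(src)) == 0);
    }
    cpu_has_erms = saved;
}
```

    Calling copy_sketch_selftest() exercises both paths. The real code makes
    this decision once, at patch time, rather than with a branch on every
    call.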
    
    Let's see if somebody finds a case where the kernel memory copies also
    care (see commit 68674f94: "x86: don't use REP_GOOD or ERMS for
    small memory copies").  But Eric does argue that the user copies are
    special because networking tries to copy up to 32KB at a time, if
    order-3 page allocations are possible.
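
    For reference, the 32KB figure follows from the allocation order,
    assuming the usual 4KB x86 page size: an order-n allocation is 2^n
    contiguous pages. The helper name order_to_bytes and the SKETCH_PAGE_SIZE
    constant below are illustrative, not kernel identifiers.

```c
#include <stddef.h>

/* Assumed base page size on x86 (4KB). */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* An order-n allocation covers 2^n contiguous pages. */
static size_t order_to_bytes(unsigned int order)
{
    return SKETCH_PAGE_SIZE << order;
}
```

    With these assumptions, order_to_bytes(3) is 8 pages, or 32768 bytes.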
    
    In-kernel memory copies are typically small, unless they are the special
    "copy pages at a time" kind that still use "rep movs".
    
    Link: https://lore.kernel.org/lkml/202305041446.71d46724-yujie.liu@intel.com/ [1]
    Link: https://lore.kernel.org/lkml/CANn89iKUbyrJ=r2+_kK+sb2ZSSHifFZ7QkPLDpAtkJ8v4WUumA@mail.gmail.com/ [2]
    Reported-and-tested-by: Eric Dumazet <edumazet@google.com>
    Fixes: adfcf423 ("x86: don't use REP_GOOD or ERMS for user memory copies")
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
copy_user_64.S 2.69 KB