    RISC-V: Use Zicboz in clear_page when available · ab0f7746
    Andrew Jones authored
    Using memset() to zero a 4K page takes 563 instructions in total, of
    which 20 are branches. clear_page(), with Zicboz and a 64-byte block
    size, takes 169 instructions in total, of which 4 are branches and 33
    are nops.
    Even though the block size is a variable, thanks to alternatives, we
    can still implement a Duff device without having to do any preliminary
    calculations. This is achieved by using the alternatives' cpufeature
    value (the upper 16 bits of patch_id). The value used is the maximum
    Zicboz block size order accepted at the patch site. This enables us
    to stop patching / unrolling when 4K bytes have been zeroed (we would
    loop and continue after 4K if the page size were larger).
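
As a rough illustration of the patch-site check described above, here is a minimal sketch in C. The helper names are hypothetical and are not taken from the kernel. It assumes a 32-bit patch_id whose lower 16 bits identify the feature (an assumption; the message only states that the upper 16 bits carry the value), and it reads "maximum block size order accepted" as: the site is patched only when the CPU's block size is at most 2^order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers; the names are illustrative, not the kernel's. */

/* Assumption: the lower 16 bits of patch_id identify the cpufeature. */
static uint16_t patch_feature_id(uint32_t patch_id)
{
    return (uint16_t)(patch_id & 0xffffu);
}

/* From the commit message: the upper 16 bits carry the cpufeature value. */
static uint16_t patch_feature_value(uint32_t patch_id)
{
    return (uint16_t)(patch_id >> 16);
}

/*
 * A patch site accepts the CPU's Zicboz block size when that size does
 * not exceed 2^max_order, where max_order is the value encoded at the site.
 */
static bool site_accepts_block_size(uint32_t patch_id, uint32_t block_size)
{
    uint16_t max_order = patch_feature_value(patch_id);

    if (max_order >= 32)    /* any 32-bit block size fits; avoids an oversized shift */
        return true;
    return block_size <= (UINT32_C(1) << max_order);
}
```

With a site value of 7, for example, block sizes of 64 and 128 bytes would be accepted and 256 bytes would not, so that site would be left unpatched on a CPU with 256-byte blocks.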
    
    For 4K pages, unrolling 16 times allows block sizes of 64 and 128 to
    only loop a few times and larger block sizes to not loop at all. Since
    cbo.zero doesn't take an offset, we also need an 'add' after each
    instruction, making the loop body 112 to 160 bytes. Hopefully this
    is small enough to not cause icache misses.
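
The loop counts mentioned above can be checked with a small model in C. This is only a sketch under stated assumptions: memset() stands in for cbo.zero, the pointer increment stands in for the 'add', and the early exit inside the unrolled body is a runtime check here, whereas the real code resolves it at patch time through alternatives. The function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096u   /* 4K page, as in the commit message */
#define UNROLL 16u              /* unroll factor from the commit message */

/* Stand-in for one cbo.zero: zero a single block. */
static void zero_block(unsigned char *p, size_t block_size)
{
    memset(p, 0, block_size);
}

/*
 * Zero one page with up to UNROLL (zero, add) pairs per pass, stopping
 * as soon as the whole page is covered. Returns the number of passes
 * through the unrolled body.
 */
static unsigned clear_page_model(unsigned char *page, size_t block_size)
{
    unsigned char *p = page;
    unsigned char *end = page + PAGE_SIZE_BYTES;
    unsigned passes = 0;

    /* Block size must be a power of two no larger than the page. */
    assert(block_size >= 1 && block_size <= PAGE_SIZE_BYTES);
    assert((block_size & (block_size - 1)) == 0);

    while (p < end) {
        for (unsigned i = 0; i < UNROLL && p < end; i++) {
            zero_block(p, block_size);  /* models cbo.zero */
            p += block_size;            /* models the following add */
        }
        passes++;
    }
    return passes;
}
```

In this model a 64-byte block size takes four passes and a 128-byte block size takes two, while 256 bytes and larger finish within a single pass, which is consistent with the 4 branches quoted for the 64-byte case.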

    Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
    Acked-by: Conor Dooley <conor.dooley@microchip.com>
    Link: https://lore.kernel.org/r/20230224162631.405473-7-ajones@ventanamicro.com
    Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
insn-def.h 5.6 KB