    swiotlb: optimize get_max_slots() · d069ed28
    Petr Tesarik authored
    Use a simple logical shift and increment to calculate the number of slots
    taken by the DMA segment boundary.
    
    At least GCC-13 is not able to optimize the expression, producing this
    horrible assembly code on x86:
    
    	cmpq	$-1, %rcx
    	je	.L364
    	addq	$2048, %rcx
    	shrq	$11, %rcx
    	movq	%rcx, %r13
    .L331:
    	// rest of the function here...
    
    	// after function epilogue and return:
    .L364:
    	movabsq $9007199254740992, %r13
    	jmp	.L331
    
    After the optimization, the code looks more reasonable:
    
    	shrq	$11, %r11
    	leaq	1(%r11), %rbx
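
The two listings can be related back to C with a small sketch. It is not the kernel source: the "before" form is reconstructed from the first assembly listing (the constant 9007199254740992 is 1 << 53, and the add of 2048 followed by a shift by 11 rounds boundary_mask + 1 up to whole slots), and the names max_slots_branchy, max_slots_shift and check_equivalence are hypothetical. IO_TLB_SHIFT is assumed to be 11, matching the shift count in the listings, and unsigned long is assumed to be 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: slot size of 1 << 11 = 2048 bytes, as the shift count suggests. */
#define IO_TLB_SHIFT 11
#define IO_TLB_SIZE  (UINT64_C(1) << IO_TLB_SHIFT)

/* "Before" form, reconstructed from the first assembly listing. */
static uint64_t max_slots_branchy(uint64_t boundary_mask)
{
	/* Special case: boundary_mask + 1 would wrap around to 0. */
	if (boundary_mask == UINT64_MAX)
		return UINT64_C(1) << (64 - IO_TLB_SHIFT);
	/* Round (boundary_mask + 1) up to a whole number of slots. */
	return (boundary_mask + 1 + IO_TLB_SIZE - 1) >> IO_TLB_SHIFT;
}

/* "After" form: one logical shift and one increment, no branch. */
static uint64_t max_slots_shift(uint64_t boundary_mask)
{
	return (boundary_mask >> IO_TLB_SHIFT) + 1;
}

/* Compare both forms for every mask of the shape 2^k - 1, k = 0..64. */
static void check_equivalence(void)
{
	for (int k = 0; k <= 64; k++) {
		uint64_t mask = (k == 64) ? UINT64_MAX
					  : (UINT64_C(1) << k) - 1;
		assert(max_slots_branchy(mask) == max_slots_shift(mask));
	}
}
```

Calling check_equivalence() from a test harness exercises all 65 masks of that shape. The shift form needs no special case because the shift happens before the increment, so nothing can overflow: (2^64 - 1) >> 11 is 2^53 - 1, and adding 1 gives the same 2^53 that the branch returned.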
    
    Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
swiotlb.c 45.6 KB