    lguest: have example Launcher service all devices in separate threads · 659a0e66
    Rusty Russell authored
    Currently lguest has three threads: the main Launcher thread, a Waker
    thread, and a thread for the block device (because synchronous block
    was simply too painful to bear).
    
    The Waker calls select() on all the input file descriptors (e.g. stdin, net
    devices, pipe to the block thread) and when one becomes readable it calls
    into the kernel to kick the Launcher thread out into userspace, which
    repeats the poll, services the device(s), and then tells the kernel to
    release the Waker before re-entering the kernel to run the Guest.
    
    Also, to make a slightly-decent network transmit routine, the Launcher
    would suppress further network interrupts while it set a timer: the
    timer's signal handler would write to a pipe, which would rouse the
    Waker, which would prod the Launcher out of the kernel to check the
    network device again.
    
    Now we can convert all our virtqueues to separate threads: each one has
    a separate eventfd for when the Guest pokes the device, and can trigger
    interrupts in the Guest directly.
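
To illustrate the per-virtqueue thread model, here is a minimal sketch of one service thread blocking in read() on its own eventfd. It assumes ordinary userspace code stands in for the Guest's notification; `struct vq_thread`, `service_vq` and `kick_and_service` are hypothetical names, not the ones used in lguest.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical per-virtqueue state: one eventfd, one service thread. */
struct vq_thread {
	int efd;           /* written when the "Guest" pokes the device */
	uint64_t expected; /* notifications to service before exiting */
	uint64_t serviced; /* notifications serviced so far */
};

static void *service_vq(void *arg)
{
	struct vq_thread *vq = arg;
	uint64_t count;

	while (vq->serviced < vq->expected) {
		/* Blocks until the counter is non-zero, then returns its
		 * value and resets it to zero, so several pokes may
		 * arrive as one read. */
		if (read(vq->efd, &count, sizeof(count)) != sizeof(count))
			break;
		/* A real device thread would process the ring here. */
		vq->serviced += count;
	}
	return NULL;
}

/* Poke the eventfd `kicks` times; return how many pokes were serviced. */
uint64_t kick_and_service(unsigned int kicks)
{
	struct vq_thread vq = { .expected = kicks, .serviced = 0 };
	uint64_t one = 1;
	pthread_t tid;
	unsigned int i;

	vq.efd = eventfd(0, 0);
	if (vq.efd < 0)
		return 0;
	if (pthread_create(&tid, NULL, service_vq, &vq) != 0) {
		close(vq.efd);
		return 0;
	}
	for (i = 0; i < kicks; i++) {
		/* Each write adds 1 to the eventfd counter. */
		if (write(vq.efd, &one, sizeof(one)) != sizeof(one)) {
			pthread_cancel(tid); /* read() is a cancellation point */
			break;
		}
	}
	pthread_join(tid, NULL);
	close(vq.efd);
	return vq.serviced;
}
```

Because the eventfd counter accumulates, the thread may see several pokes in a single read(); the total serviced still matches the number of pokes. No Waker thread or select() loop is needed to hand the event to the right thread.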
    
    The linecount shows how much this simplifies, but to really bring it
    home, here's an strace analysis of a single Guest->Host ping before:
    
    * Guest sends packet, notifies xmit vq, returns control to Launcher
    * Launcher clears notification flag on xmit ring
    * Launcher writes packet to TUN device
    	writev(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"\366\r\224`\2058\272m\224vf\274\10\0E\0\0T\0\0@\0@\1\265"..., 98}], 2) = 108
    * Launcher sets up interrupt for Guest (xmit ring is empty)
    	write(10, "\2\0\0\0\3\0\0\0", 8) = 0
    * Launcher sets up timer for interrupt mitigation
    	setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 505}}, NULL) = 0
    * Launcher re-runs guest
    	pread64(10, 0xbfa5f4d4, 4, 0) ...
    * Waker notices reply packet in tun device (it was in select)
    	select(12, [0 3 4 6 11], NULL, NULL, NULL) = 1 (in [4])
    * Waker kicks Launcher out of guest:
    	pwrite64(10, "\3\0\0\0\1\0\0\0", 8, 0) = 0
    * Launcher returns from running guest:
    	... = -1 EAGAIN (Resource temporarily unavailable)
    * Launcher looks at input fds:
    	select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 1 (in [4], left {0, 0})
    * Launcher reads pong from tun device:
    	readv(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"\272m\224vf\274\366\r\224`\2058\10\0E\0\0T\364\26\0\0@"..., 1518}], 2) = 108
    * Launcher injects guest notification:
    	write(10, "\2\0\0\0\2\0\0\0", 8) = 0
    * Launcher rechecks fds:
    	select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 0 (Timeout)
    * Launcher clears Waker:
    	pwrite64(10, "\3\0\0\0\0\0\0\0", 8, 0) = 0
    * Launcher reruns Guest:
    	pread64(10, 0xbfa5f4d4, 4, 0) = ? ERESTARTSYS (To be restarted)
    * Signal comes in, uses pipe to wake up Launcher:
    	--- SIGALRM (Alarm clock) @ 0 (0) ---
    	write(8, "\0", 1)       = 1
    	sigreturn()             = ? (mask now [])
    * Waker sees write on pipe:
    	select(12, [0 3 4 6 11], NULL, NULL, NULL) = 1 (in [6])
    * Waker kicks Launcher out of Guest:
    	pwrite64(10, "\3\0\0\0\1\0\0\0", 8, 0) = 0
    * Launcher exits from kernel:
    	pread64(10, 0xbfa5f4d4, 4, 0) = -1 EAGAIN (Resource temporarily unavailable)
    * Launcher looks to see what fd woke it:
    	select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 1 (in [6], left {0, 0})
    * Launcher reads timeout fd, sets notification flag on xmit ring
    	read(6, "\0", 32)       = 1
    * Launcher rechecks fds:
    	select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 0 (Timeout)
    * Launcher clears Waker:
    	pwrite64(10, "\3\0\0\0\0\0\0\0", 8, 0) = 0
    * Launcher resumes Guest:
    	pread64(10, "\0p\0\4", 4, 0) ....
    
    strace analysis of a single Guest->Host ping after:
    
    * Guest sends packet, notifies xmit vq, creates event on eventfd.
    * Network xmit thread wakes from read on eventfd:
    	read(7, "\1\0\0\0\0\0\0\0", 8)          = 8
    * Network xmit thread writes packet to TUN device
    	writev(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"J\217\232FI\37j\27\375\276\0\304\10\0E\0\0T\0\0@\0@\1\265"..., 98}], 2) = 108
    * Network recv thread wakes up from read on tunfd:
    	readv(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"j\27\375\276\0\304J\217\232FI\37\10\0E\0\0TiO\0\0@\1\214"..., 1518}], 2) = 108
    * Network recv thread sets up interrupt for the Guest
    	write(6, "\2\0\0\0\2\0\0\0", 8) = 0
    * Network recv thread goes back to reading tunfd
    	13:39:42.460285 readv(4,  <unfinished ...>
    * Network xmit thread sets up interrupt for Guest (xmit ring is empty)
    	write(6, "\2\0\0\0\3\0\0\0", 8) = 0
    * Network xmit thread goes back to reading from eventfd
    	read(7, <unfinished ...>
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lguest.c 52.7 KB