- 23 Jul, 2013 40 commits
-
-
Peter Hurley authored
Separate the head and tail ptrs to avoid cache-line contention (so called 'false-sharing') between concurrent threads. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Now that dropping the buffer lock is not necessary (as result of converting the spin lock to a mutex), the flip buffer flush no longer needs to be handled by the buffer work. Simply signal a flush is required; the buffer work will exit the i/o loop, which allows tty_buffer_flush() to proceed. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The buffer work may race with parallel tty_buffer_flush. Use a mutex to guarantee exclusive modify access to the head flip buffer. Remove the unneeded spin lock. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Driver-side flip buffer input is already single-threaded; 'publish' the .next link as the last operation on the tail buffer so the 'consumer' sees the already-completed flip buffer. The commit buffer index is already 'published' by driver-side functions. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Lockless flip buffers require atomically updating the bytes-in-use watermark. The pty driver also peeks at the watermark value to limit memory consumption to a much lower value than the default; query the watermark with new fn, tty_buffer_space_avail(). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Use a 0-sized sentinel to avoid assigning the head ptr from the driver side thread. This also eliminates testing head/tail for NULL. When the sentinel is first 'consumed' by the buffer work (or by tty_buffer_flush()), it is detached from the list but not freed nor added to the free list. Both buffer work and tty_buffer_flush() continue to preserve at least 1 flip buffer to which head & tail is pointed. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
In preparation for lockless flip buffers, make the flip buffer free list lockless. NB: using llist is not the optimal solution, as the driver and buffer work may contend over the llist head unnecessarily. However, test measurements indicate this contention is low. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
tty_buffer_find() implements a simple free list lookaside cache. Merge this functionality into tty_buffer_alloc() to reflect the more traditional alloc/free symmetry. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Factor shared code; prepare for adding 0-sized sentinel flip buffer. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Since flip buffers are size-aligned to 256 bytes and all flip buffers 512-bytes or larger are not added to the free list, the free list only contains 256-byte flip buffers. Remove the list search when allocating a new flip buffer. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The char_buf_ptr and flag_buf_ptr values are trivially derived from the .data field offset; compute values as needed. Fixes a long-standing type-mismatch with the char and flag ptrs. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Scheduling buffer work on the same cpu as the read() thread limits the parallelism now possible between the receive_buf path and the n_tty_read() path. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The pty driver forces ldisc flow control on, regardless of available receive buffer space, so the writer can be woken whenever unthrottle is called. However, this 'forced throttle' has performance consequences, as multiple atomic operations are necessary to unthrottle and perform the write wakeup for every input line (in canonical mode). Instead, short-circuit the unthrottle if the tty is a pty and perform the write wakeup directly. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Prepare to special case pty flow control; avoid forward declaration. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Prepare for special handling of pty throttle/unthrottle; factor flow control into helper functions. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Prepare to factor throttle and unthrottle into helper functions; relocate chars_in_buffer() to avoid forward declaration. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
No tty driver modifies termios during throttle() or unthrottle(). Therefore, only read safety is required. However, tty_throttle_safe and tty_unthrottle_safe must still be mutually exclusive; introduce throttle_mutex for that purpose. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
If the read buffer indices are in the same cache-line, cpus will contended over the cache-line (so called 'false sharing'). Separate the producer-published fields from the consumer-published fields; document the locks relevant to each field. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
User-space read() can run concurrently with receiving from device; waiting for receive_buf() to complete is not required. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
lnext escapes the next input character as a literal, and must be reset when canonical mode changes (to avoid misinterpreting a special character as a literal if canonical mode is changed back again). lnext is specifically not reset on a buffer flush so as to avoid misinterpreting the next input character as a special character. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
n_tty has a single-producer/single-consumer input model; use lockless publish instead. Use termios_rwsem to exclude both consumer and producer while changing or resetting buffer indices, eg., when flushing. Also, claim exclusive termios_rwsem to safely retrieve the buffer indices from a thread other than consumer or producer (eg., TIOCINQ ioctl). Note the read_tail is published _after_ clearing the newline indicator in read_flags to avoid racing the producer. Drop read_lock spinlock. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
canon_data represented the # of lines which had been copied to the receive buffer but not yet copied to the user buffer. The value was tested to determine if input was available in canonical mode (and also to force input overrun if the receive buffer was full but a newline had not been received). However, the actual count was irrelevent; only whether it was non-zero (meaning 'is there any input to transfer?'). This shared count is unnecessary and unsafe with a lockless algorithm. The same check is made by comparing canon_head with read_tail instead. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Use termios_rwsem to guarantee safe access to the termios values. This is particularly important for N_TTY as changing certain termios settings alters the mode of operation. termios_rwsem must be dropped across throttle/unthrottle since those functions claim the termios_rwsem exclusively (to guarantee safe access to the termios and for mutual exclusion). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
termios is commonly accessed unsafely (especially by N_TTY) because the existing mutex forces exclusive access. Convert existing usage. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Storing the read_cnt creates an unnecessary shared variable between the single-producer (n_tty_receive_buf()) and the single-consumer (n_tty_read()). Compute read_cnt from head & tail instead of storing. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Wrap read_buf indices (read_head, read_tail, canon_head) at max representable value, instead of at the N_TTY_BUF_SIZE. This step is necessary to allow lockless reads of these shared variables (by updating the variables atomically). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Prepare for replacing read_cnt field with computed value. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
N_TTY .chars_in_buffer() method requires serialized access if the current thread is not the single-consumer, n_tty_read(). Separate the internal interface; prepare for lockless read-side. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Instead of pushing one char per loop, pre-compute the data length to copy and copy all at once. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Simplify n_tty_read(); extract complex copy algorithm into separate function, canon_copy_to_user(). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Although line discipline receiving is single-producer/single-consumer, using tty->receive_room to manage flow control creates unnecessary critical regions requiring additional lock use. Instead, introduce the optional .receive_buf2() ldisc method which returns the # of bytes actually received. Serialization is guaranteed by the caller. In turn, the line discipline should schedule the buffer work item whenever space becomes available; ie., when there is room to receive data and receive_room() previously returned 0 (the buffer work item stops processing if receive_buf2() returns 0). Note the 'no room' state need not be atomic despite concurrent use by two threads because only the buffer work thread can set the state and only the read() thread can clear the state. Add n_tty_receive_buf2() as the receive_buf2() method for N_TTY. Provide a public helper function, tty_ldisc_receive_buf(), to use when directly accessing the receive_buf() methods. Line disciplines not using input flow control can continue to set tty->receive_room to a fixed value and only provide the receive_buf() method. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Ldisc interface functions must be called with interrupts enabled. Separating the ldisc calls into a helper function simplies the eventual removal of the spinlock. Note that access to the buf->head ptr outside the spinlock is safe here because; * __tty_buffer_flush() is prevented from running while buffer work performs i/o, * tty_buffer_find() only assigns buf->head if the flip buffer list is empty (which is never the case in flush_to_ldisc() since at least one buffer is always left in the list after use) Access to the read index outside the spinlock is safe here for the same reasons. Update the buffer's read index _after_ the data has been received by the ldisc. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
tty_set_ldisc() is guaranteed exclusive use of the line discipline by tty_ldisc_lock_pair_timeout(); shutting off input by resetting receive_room is unnecessary. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The hangup may already have happened; check for that state also. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Rename o_ldisc to avoid confusion with the ldisc of the 'other' tty. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Line discipline locking was performed with a combination of a mutex, a status bit, a count, and a waitqueue -- basically, a rw semaphore. Replace the existing combination with an ld_semaphore. Fixes: 1) the 'reference acquire after ldisc locked' bug 2) the over-complicated halt mechanism 3) lock order wrt. tty_lock() 4) dropping locks while changing ldisc 5) previously unidentified deadlock while locking ldisc from both linked ttys concurrently 6) previously unidentified recursive deadlocks Adds much-needed lockdep diagnostics. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-