Commits · 0bbed3beb4f208eb7771607e67586149d70be8d0 · Kirill Smelkov / linux

25 Jul, 2002 1 commit

[PATCH] Thread-Local Storage (TLS) support · 0bbed3be

Ingo Molnar authored Jul 25, 2002

the following patch implements proper x86 TLS support in the Linux kernel,
via a new system-call, sys_set_thread_area():

   http://redhat.com/~mingo/tls-patches/tls-2.5.28-C6

a TLS test utility can be downloaded from:

    http://redhat.com/~mingo/tls-patches/tls_test.c

what is TLS? Thread Local Storage is a concept used by threading
abstractions - a fast and efficient way to store per-thread local (but not
on-stack local) data. The __thread extension is already supported by gcc.
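
As an illustration of the __thread extension mentioned above, here is a minimal user-space sketch. It is not part of the patch; the helper names (tls_counter, bump_counter, tls_counters_are_independent) are hypothetical, and it assumes a GCC-compatible compiler with POSIX threads.

```c
#include <pthread.h>
#include <stddef.h>

/* One copy of this variable exists per thread (GCC __thread extension). */
static __thread int tls_counter;

/* Thread body: bump this thread's own counter n times and report it back. */
static void *bump_counter(void *arg)
{
    int n = *(int *)arg;

    for (int i = 0; i < n; i++)
        tls_counter++;
    *(int *)arg = tls_counter;  /* hand the per-thread result to the caller */
    return NULL;
}

/*
 * Hypothetical demo helper: run two threads with different iteration
 * counts. Returns 1 if each thread saw only its own increments.
 */
int tls_counters_are_independent(void)
{
    pthread_t a, b;
    int na = 3, nb = 5;

    if (pthread_create(&a, NULL, bump_counter, &na) != 0)
        return 0;
    if (pthread_create(&b, NULL, bump_counter, &nb) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    /* The calling thread's copy of tls_counter was never touched. */
    return na == 3 && nb == 5 && tls_counter == 0;
}
```

If the counter were an ordinary static variable, the two threads would share it and the per-thread results would depend on scheduling.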

proper TLS support in compilers (and glibc/pthreads) is a bit problematic
on the x86 platform. There are only 8 general purpose registers available,
so on x86 we have to use segments to access the TLS. The approach used by
glibc so far was to set up a per-thread LDT entry to describe the TLS.
Besides the generic unrobustness of LDTs, this also introduced a limit:
the maximum number of LDT entries is 8192, so the maximum number of
threads per application is 8192.

this patch does it differently - the kernel keeps a specific per-thread
GDT entry that can be set up and modified by each thread:

     asmlinkage int sys_set_thread_area(unsigned int base,
               unsigned int limit, unsigned int flags)

the kernel, upon context-switch, modifies this GDT entry to match that of
the thread's TLS setting. This way user-space threaded code can access
per-thread data via this descriptor - by using the same, constant %gs (or
%fs) selector. The number of TLS areas is unlimited, and there is no
additional allocation overhead associated with TLS support.


the biggest problem preventing the introduction of this concept was
Linux's global shared GDT on SMP systems. The patch fixes this by
implementing a per-CPU GDT, which is also a nice context-switch speedup,
2-task lat_ctx context-switching got faster by about 5% on a dual Celeron
testbox. [ Could it be that a shared GDT is fundamentally suboptimal on
SMP? perhaps updating the 'accessed' bit in the DS/CS descriptors causes
some sort of locked memory cycle overhead? ]
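
The per-CPU GDT idea can be sketched in plain C as follows. This is a user-space model, not the patch's code: struct gdt_desc, struct thread, cpu_gdt, NR_CPUS and switch_tls() are hypothetical names, and the TLS slot index of 1 is taken from the GDT layout listed in this message.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_ENTRIES 20  /* entries 0..19, as in the layout in this message */
#define TLS_ENTRY    1  /* index of the TLS segment in that layout */
#define NR_CPUS      2  /* assumption: small fixed CPU count for the sketch */

struct gdt_desc { uint32_t a, b; };          /* one 8-byte descriptor */
struct thread { struct gdt_desc tls_desc; }; /* hypothetical per-thread state */

/* Each CPU owns a private table, so updating one never touches another. */
static struct gdt_desc cpu_gdt[NR_CPUS][GDT_ENTRIES];

/* On context switch, install the incoming thread's TLS descriptor. */
void switch_tls(int cpu, const struct thread *next)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpu_gdt[cpu][TLS_ENTRY] = next->tls_desc;
}
```

Because the selector stays constant and only the descriptor contents change, user space never needs to reload a different segment register value per thread.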

the GDT layout got simplified:

 *   0 - null
 *   1 - Thread-Local Storage (TLS) segment
 *   2 - kernel code segment
 *   3 - kernel data segment
 *   4 - user code segment              <==== new cacheline
 *   5 - user data segment
 *   6 - TSS
 *   7 - LDT
 *   8 - APM BIOS support               <==== new cacheline
 *   9 - APM BIOS support
 *  10 - APM BIOS support
 *  11 - APM BIOS support
 *  12 - PNPBIOS support                <==== new cacheline
 *  13 - PNPBIOS support
 *  14 - PNPBIOS support
 *  15 - PNPBIOS support
 *  16 - PNPBIOS support                <==== new cacheline
 *  17 - not used
 *  18 - not used
 *  19 - not used
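
To relate the layout above to selector values, here is a small sketch using the standard x86 selector format (descriptor index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0). The function names are hypothetical, and the 32-byte cacheline size is an assumption inferred from the "new cacheline" marks appearing every four 8-byte entries.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_SIZE       8  /* bytes per x86 segment descriptor */
#define CACHELINE_SIZE 32  /* assumption: implied by the "new cacheline" marks */

/* Build a GDT selector: index << 3, TI bit (bit 2) = 0 for GDT, RPL in bits 1..0. */
uint16_t gdt_selector(unsigned index, unsigned rpl)
{
    assert(index < 8192 && rpl <= 3);
    return (uint16_t)((index << 3) | rpl);
}

/* Which cacheline of the table a given entry falls into. */
unsigned gdt_cacheline(unsigned index)
{
    return (index * DESC_SIZE) / CACHELINE_SIZE;
}
```

Under these assumptions the TLS entry (index 1, RPL 3) corresponds to selector 0x0b, and entries 0..3 share the first cacheline.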

set_thread_area() currently recognizes the following flags:

  #define TLS_FLAG_LIMIT_IN_PAGES         0x00000001
  #define TLS_FLAG_WRITABLE               0x00000002
  #define TLS_FLAG_CLEAR                  0x00000004

- in theory we could avoid the 'limit in pages' bit, but i wanted to
  preserve the flexibility to potentially enable the setting of
  byte-granularity stack segments for example. And unlimited segments
  (granularity = pages, limit = 0xfffff) might have a performance
  advantage on some CPUs. We could also automatically figure out the best
  possible granularity for a given limit - but i wanted to avoid this kind
  of guesswork. Some CPUs might have a plus for page-limit segments - who
  knows.

- The 'writable' flag is straightforward and could be useful to some
  applications.

- The 'clear' flag clears the TLS. [note that a base 0 limit 0 TLS is in
  fact legal, it's a single-byte segment at address 0.]

(the system-call does not expose any other segment options to user-space,
privilege level is 3, the segment is 32-bit, etc. - it's using safe and
sane defaults.)
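
The flags and defaults described above can be illustrated by packing base, limit and flags into the architectural 8-byte x86 segment descriptor. This is a sketch derived from the descriptor format, not the code in the patch: struct tls_desc and encode_tls_desc() are hypothetical names, and treating the segment as a data segment that is read-only unless TLS_FLAG_WRITABLE is set is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define TLS_FLAG_LIMIT_IN_PAGES 0x00000001
#define TLS_FLAG_WRITABLE       0x00000002
#define TLS_FLAG_CLEAR          0x00000004

struct tls_desc { uint32_t lo, hi; };  /* low and high descriptor words */

struct tls_desc encode_tls_desc(uint32_t base, uint32_t limit, uint32_t flags)
{
    struct tls_desc d = { 0, 0 };
    uint32_t access;

    if (flags & TLS_FLAG_CLEAR)
        return d;                      /* all-zero descriptor: not present */

    assert(limit <= 0xfffff);          /* the limit field is 20 bits wide */

    access = 0x80 | 0x60 | 0x10;       /* present, DPL 3, code/data segment */
    if (flags & TLS_FLAG_WRITABLE)
        access |= 0x02;                /* data segment: writable bit */

    d.lo = ((base & 0xffff) << 16) | (limit & 0xffff);
    d.hi = (base & 0xff000000)         /* base bits 31..24 */
         | ((base >> 16) & 0xff)       /* base bits 23..16 */
         | (limit & 0x000f0000)        /* limit bits 19..16 */
         | (access << 8)
         | (1u << 22);                 /* D/B = 1: 32-bit segment */
    if (flags & TLS_FLAG_LIMIT_IN_PAGES)
        d.hi |= 1u << 23;              /* G = 1: limit counted in 4K pages */
    return d;
}
```

With base 0 and limit 0 and no flags, the result is still a present descriptor covering one byte at address 0, which is why a separate 'clear' flag is needed.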

NOTE: the interface does not allow the changing of the TLS of another
thread on purpose - that would just complicate the interface (and
implementation) unnecessarily. Is there any good reason to allow the
setting of another thread's TLS?

NOTE2: non-pthreads glibc applications can call set_thread_area() to set
up a GDT entry just below the end of stack. We could use some sort of
default TLS area as well, but that would hard-code a given segment.

0bbed3be

24 Jul, 2002 37 commits

Linux v2.5.28 · 2c0a3925
Linus Torvalds authored Jul 24, 2002

2c0a3925
Merge http://linuxusb.bkbits.net/linus-2.5 · 0cd3abca
Linus Torvalds authored Jul 23, 2002
```
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
```
0cd3abca

[PATCH] scheduler fixes · 97db62cc

Ingo Molnar authored Jul 23, 2002

 - introduce new type of context-switch locking, this is a must-have for
   ia64 and sparc64.

 - load_balance() bug noticed by Scott Rhine and myself: scan the
   whole list to find the imbalance in the number of tasks, not just the tail
   of the list.

 - sched_yield() fix: use current->array not rq->active.

97db62cc

Merge bk://linuxusb.bkbits.net/pci_hp-2.5 · 9e7cec88
Linus Torvalds authored Jul 23, 2002
```
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
```
9e7cec88
Merge http://linuxusb.bkbits.net/agpgart-2.5 · b73a365a
Linus Torvalds authored Jul 23, 2002
```
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
```
b73a365a
Merge kroah.com:/home/greg/linux/BK/bleeding_edge-2.5 · 01bb3440
Greg Kroah-Hartman authored Jul 23, 2002
```
into kroah.com:/home/greg/linux/BK/pci_hp-2.5
```
01bb3440
Merge bk://bkbits.ras.ucalgary.ca/rgooch-2.5 · a26a5ed8
Linus Torvalds authored Jul 23, 2002
```
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
```
a26a5ed8
No commit message · 0d3555f1
Richard Gooch authored Jul 24, 2002
```
No commit message
```
0d3555f1
Merge atnf.csiro.au:/workaholix1/kernel/v2.5/linus · c8f7cc1e
Richard Gooch authored Jul 24, 2002
```
into atnf.csiro.au:/workaholix1/kernel/v2.5/rgooch-2.5
```
c8f7cc1e

Switched to ISO C structure field initialisers. · 2c4b185c

Richard Gooch authored Jul 24, 2002

Switch to set_current_state() and move before add_wait_queue().
Updated README from master HTML file.
Fixed devfs entry leak in <devfs_readdir> when *readdir fails.

2c4b185c

[PATCH] consolidate task->mm code + fix · 3b89dbbd

John Levon authored Jul 23, 2002

The patch below consolidates some duplicate code, reduces some
indentation, and adds a freeing of a page in mem_read() that could be left
unfreed, as far as I can see.

3b89dbbd

Merge bk://bk.arm.linux.org.uk:14691 · d4ea8ebe
Linus Torvalds authored Jul 23, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
d4ea8ebe
Merge http://fbdev.bkbits.net/fbdev-2.5 · 82e6c293
Linus Torvalds authored Jul 23, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
82e6c293
Merge master.kernel.org:/home/axboe/BK/linux-2.5-block · 6885f788
Linus Torvalds authored Jul 23, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
6885f788
add __blk_stop_queue() as locked variant of blk_stop_queue() and · eeda2160
Jens Axboe authored Jul 24, 2002
```
make cpqarray and cciss use these
```
eeda2160
[PATCH] NFSD - new struct initialisers for nfsd · 44ce4c74
Neil Brown authored Jul 23, 2002
```
Heading Rusty off at the pass...

This also changes an array initialiser...
```
44ce4c74
[PATCH] MD - Remove get_spare declaration and associated warning · 3d49915a
Neil Brown authored Jul 23, 2002
```
get_spare recently became static and no-one told md_k.h
```
3d49915a
[PATCH] MD - Convert struct initialised in md to "the new way" · 06a61909
Neil Brown authored Jul 23, 2002

06a61909

[PATCH] MD - Fix two bugs that would cause sync_sbs to Oops · d1cde62a

Neil Brown authored Jul 23, 2002

Sync_sbs tries to access the ->sb for the first rdev of an mddev.
This can oops as the wrong arg is given to list_entry, and also
if a device was found to be failed, as failed devices have their ->sb
removed.  But that removal isn't necessary, so now an rdev will always
have an ->sb.

d1cde62a

[PATCH] type safe(r) list_entry replacement: container_of · ec4f2142

Neil Brown authored Jul 23, 2002

Define container_of which casts from member to struct with some type checking.

This is much like list_entry but is clearly for things other than lists.

List_entry now uses container_of.
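
For readers unfamiliar with the idiom, here is a sketch of how such a macro is commonly written. It is not copied from this patch; it relies on GNU C extensions (statement expressions and __typeof__), and struct list_node, struct item and item_from_node() are hypothetical names for the example.

```c
#include <stddef.h>

/*
 * Recover a pointer to the enclosing struct from a pointer to one of its
 * members. The temporary __mptr makes the compiler complain if ptr does
 * not point to the member's type, which is the "some type checking" part.
 */
#define container_of(ptr, type, member) ({                      \
    const __typeof__(((type *)0)->member) *__mptr = (ptr);      \
    (type *)((char *)__mptr - offsetof(type, member)); })

struct list_node { struct list_node *next; };

struct item {
    int id;
    struct list_node node;  /* embedded member, not at offset 0 */
};

/* Hypothetical helper: map an embedded node back to its item. */
struct item *item_from_node(struct list_node *n)
{
    return container_of(n, struct item, node);
}
```

Passing a pointer of the wrong type, for example an int pointer for the node member, triggers an incompatible-pointer diagnostic at the __mptr initialisation.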

ec4f2142

[PATCH] USB: rtl8150 updated · f5dbd9a0
Petko Manolov authored Jul 23, 2002
```
  new vendor/device ID;
  redundant check removed from probe();
```
f5dbd9a0
Merge bk://jfs.bkbits.net/linux-2.5 · 6f84f62a
Linus Torvalds authored Jul 23, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
6f84f62a

[PATCH] drivers/hotplug designated initializers · a36a2ed3

Rusty Russell authored Jul 23, 2002

 The old form of designated initializers is obsolete: we need to
 replace them with the ISO C forms before 2.6.  Gcc has always supported
 both forms anyway.
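
 The two forms can be compared in a short sketch. The struct and function names here are hypothetical and not taken from the hotplug drivers; the old GNU form is shown only in a comment.

```c
/* Hypothetical operations table, in the style drivers commonly use. */
struct example_ops {
    const char *name;
    int (*open)(void);
    int (*release)(void);
    int flags;
};

static int example_open(void)    { return 0; }
static int example_release(void) { return 1; }

/*
 * Old GNU form (obsolete):
 *     name:    "example",
 *     open:    example_open,
 *
 * ISO C (C99) form, as used below. Members that are not named, such as
 * flags here, are zero-initialised.
 */
static const struct example_ops example_fops = {
    .name    = "example",
    .open    = example_open,
    .release = example_release,
};
```

 Both forms name the member being initialised, so the initialiser stays correct when members are reordered or added.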

a36a2ed3

[PATCH] shmem_getpage_locked missing unlock · 4a6fdb2d

Hugh Dickins authored Jul 23, 2002

Dawson Engler's Stanford Checker reported this missing unlock to
LKML 11 July (amongst "56 potential lock/unlock bugs in 2.5.8").

4a6fdb2d

[PATCH] shmem_file_write double kunmap · ee9e4c9c

Hugh Dickins authored Jul 23, 2002

Found by Simon Trimmer <simon@veritas.com>: shmem_file_write
failure path duplicates kunmap, causing oops holding kmap_lock.

ee9e4c9c

[PATCH] shmem_link duplicated test · 3489f24f

Hugh Dickins authored Jul 23, 2002

Trivial: vfs_link in 2.5 checks S_ISDIR first, shmem_link
need not repeat it, but test crept back in at some stage.

3489f24f

[PATCH] shm_destroy lock hang · 960d4b34

Hugh Dickins authored Jul 23, 2002

Martin Schwidefsky <schwidefsky@de.ibm.com> reported "Bug with shared
memory" to LKML 14 May: hang due to schedule in truncate_list_pages
called from .... shm_destroy holding shm_lock spinlock. shm_destroy
needs that lock for shm_rmid, but it can be safely unlocked once link
from id to shp has been removed.

960d4b34

Fix up irqlock removal patch, avoid compiler warnings · 7e9b34ab
Linus Torvalds authored Jul 23, 2002

7e9b34ab

[PATCH] IDE-101 · 02114f71

Martin Dalecki authored Jul 23, 2002

Here is a quick fix. I would like to synchronize with the irq handler
changes as well. Because right now I know that preemption is killing
the disk subsystem when moving data between disks using different
request queues... In particular, it gets me into do_request() with a queue
in unplugged state. (Not everything is my fault, after all :-).

02114f71

Merge jfs@jfs.bkbits.net:linux-2.5 · 2b0c7536
Dave Kleikamp authored Jul 23, 2002
```
into kleikamp.austin.ibm.com:/home/shaggy/bk/jfs-2.5
```
2b0c7536
Remove unused variable · 10f024fd
Linus Torvalds authored Jul 23, 2002

10f024fd

[PATCH] irqlock patch 2.5.27-H6 · a6efb709

Ingo Molnar authored Jul 23, 2002

 - init thread needs to have preempt_count of 1 until sched_init().
   (William Lee Irwin III)
 - clean up the irq-mask macros. (Linus)
 - add barrier() to irq_enter() and irq_exit(). (based on Oleg Nesterov's
   comment.)
 - move the irqs-off check into preempt_schedule() and remove
   CONFIG_DEBUG_IRQ_SCHEDULE.
 - remove spin_unlock_no_resched() and comment the affected places more
   aggressively.
 - slab.c needs to spin_unlock_no_resched(), instead of spin_unlock(). (It
   also has to check for preemption in the right spot.) This should fix
   the memory corruption.
 - irq_exit() needs to run softirqs if interrupts not active - in the
   previous patch it ran them when preempt_count() was 0, which is
   incorrect.
 - spinlock macros are updated to enable preemption after enabling
   interrupts. Besides avoiding false positive warnings, this also
 - fork.c has to call scheduler_tick() with preemption disabled -
   otherwise scheduler_tick()'s spin_unlock can preempt!
 - irqs_disabled() macro introduced.
 - [ all other local_irq_enable() or sti instances conditional on
     CONFIG_DEBUG_IRQ_SCHEDULE are to fix false positive warnings. ]
 - fix buggy in_softirq(). Fortunately the bug made the test broader,
   which didn't result in algorithmic breakage, just suboptimal
   performance.
 - move do_softirq() processing into irq_exit() => this also fixes the
   softirq processing bugs present in apic.c IRQ handlers that did not
   test for softirqs after irq_exit().
 - simplify local_bh_enable().

a6efb709

[PATCH] AGP designated initializer update. · 5e88f91f

Rusty Russell authored Jul 23, 2002

The old form of designated initializers is obsolete: we need to
replace them with the ISO C forms before 2.6.  Gcc has always supported
both forms anyway.

5e88f91f

[PATCH] USB: changed the interface name to be a bit more unique. · a7e9ed60

Greg Kroah-Hartman authored Jul 23, 2002

This is needed as long as we have the directory of symlinks in the bus
subdir in driverfs to point to the unique interfaces.

a7e9ed60

Synced up to m68k changes · 0fec01ba
James Simmons authored Jul 23, 2002

0fec01ba
Use local files for now. Then add in m68k changes · 89863ff4
James Simmons authored Jul 23, 2002

89863ff4
Merge http://linus.bkbits.net/linux-2.5 · 9b4651a2
James Simmons authored Jul 23, 2002
```
into maxwell.earthlink.net:/usr/src/linus-2.5
```
9b4651a2

23 Jul, 2002 2 commits

[SERIAL] Fix sa1100 serial driver stop function parameters. · 1c36ac5d
Russell King authored Jul 24, 2002

1c36ac5d
Removed all old fbgen code. Small cleanups. · 928c6224
James Simmons authored Jul 23, 2002

928c6224