    [PATCH] cpumask_t: allow more than BITS_PER_LONG CPUs · bf8cb61f
    Andrew Morton authored
    From: William Lee Irwin III <wli@holomorphy.com>
    
    Contributions from:
    	Jan Dittmer <jdittmer@sfhq.hn.org>
    	Arnd Bergmann <arnd@arndb.de>
    	"Bryan O'Sullivan" <bos@serpentine.com>
    	"David S. Miller" <davem@redhat.com>
    	Badari Pulavarty <pbadari@us.ibm.com>
    	"Martin J. Bligh" <mbligh@aracnet.com>
    	Zwane Mwaikambo <zwane@linuxpower.ca>
    
    It has been tested on x86, sparc64, x86_64, ia64 (I think), ppc and ppc64.
    
    cpumask_t enables systems with NR_CPUS > BITS_PER_LONG to utilize all their
    cpus by creating an abstract data type dedicated to representing cpu
    bitmasks, similar to fd sets from userspace, and sweeping the appropriate
    code to update callers to the access API.  The fd set-like structure is
    according to Linus' own suggestion; the macro calling convention, which hides
    the underlying representation from callers with minimal code impact, is my
    own invention.
    
    Specifically, a new set of inline functions for manipulating arbitrary-width
    bitmaps is introduced with a relatively simple implementation, in tandem with
    a new data type representing bitmaps of width NR_CPUS, cpumask_t, whose
    accessor functions are defined in terms of the bitmap manipulation inlines.
    This bitmap ADT found an additional use in the i386 arch code that handles
    sparse physical APIC IDs, where it was convenient because the accounting
    structure had to be widened to accommodate the physids consumed by larger
    numbers of cpus.
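    
    As a rough illustration of the layering described above (not the code from
    this patch), the sketch below builds a fixed-width cpu mask on top of generic
    bitmap helpers.  All names (bitmap_set_bit, cpumask_sketch_t, sketch_cpu_set
    and so on) and the NR_CPUS value of 130 are hypothetical, chosen only so the
    example compiles on its own in userspace.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical values; the kernel derives these from its configuration. */
#define NR_CPUS          130
#define BITS_PER_LONG    ((int)(sizeof(unsigned long) * CHAR_BIT))
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Generic helpers for bitmaps of arbitrary width, stored as arrays of longs. */
static inline void bitmap_set_bit(unsigned long *map, int bit)
{
    map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

static inline int bitmap_test_bit(const unsigned long *map, int bit)
{
    return (int)((map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL);
}

static inline int bitmap_count_bits(const unsigned long *map, int nbits)
{
    int bit, count = 0;

    for (bit = 0; bit < nbits; bit++)
        count += bitmap_test_bit(map, bit);
    return count;
}

/* fd_set-like wrapper: a bitmap whose width is fixed at NR_CPUS. */
typedef struct {
    unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
} cpumask_sketch_t;

/* The mask accessors are thin wrappers around the bitmap helpers. */
static inline void sketch_cpus_clear(cpumask_sketch_t *mask)
{
    memset(mask->bits, 0, sizeof(mask->bits));
}

static inline void sketch_cpu_set(int cpu, cpumask_sketch_t *mask)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    bitmap_set_bit(mask->bits, cpu);
}

static inline int sketch_cpu_isset(int cpu, const cpumask_sketch_t *mask)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return bitmap_test_bit(mask->bits, cpu);
}

static inline int sketch_cpus_weight(const cpumask_sketch_t *mask)
{
    return bitmap_count_bits(mask->bits, NR_CPUS);
}
```

    With this layout, a cpu number above BITS_PER_LONG (129 here) lands in a
    later word of the array instead of being silently lost.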
    
    For the sake of simplicity and low code impact, these cpu bitmasks are passed
    primarily by value; however, an additional set of accessors along with an
    auxiliary data type with const call-by-reference semantics is provided to
    address performance concerns raised in connection with very large systems,
    such as SGI's larger models, where copying and call-by-value overhead would
    be prohibitive.  Few (if any) users of the call-by-reference API are
    immediately introduced.
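    
    The trade-off between the two calling styles can be sketched as follows.
    This is an assumption-laden illustration, not the API added by the patch:
    cpumask_sketch_t, cpumask_const_sketch_t and the sketch_* functions are
    hypothetical names, and NR_CPUS is set to 512 only to make the copies
    visibly large.

```c
#include <assert.h>
#include <limits.h>

#define NR_CPUS       512  /* hypothetical large configuration */
#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))
#define MASK_LONGS    ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

typedef struct {
    unsigned long bits[MASK_LONGS];
} cpumask_sketch_t;

/* Hypothetical auxiliary type: a read-only reference to a mask. */
typedef const cpumask_sketch_t *cpumask_const_sketch_t;

/* Build a mask with a single cpu set; returned by value. */
static inline cpumask_sketch_t sketch_cpumask_of(int cpu)
{
    cpumask_sketch_t mask = { { 0 } };

    assert(cpu >= 0 && cpu < NR_CPUS);
    mask.bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
    return mask;
}

/* By value: simple to call, but both operands and the result are copied. */
static inline cpumask_sketch_t sketch_cpus_and(cpumask_sketch_t a,
                                               cpumask_sketch_t b)
{
    cpumask_sketch_t dst;
    int i;

    for (i = 0; i < MASK_LONGS; i++)
        dst.bits[i] = a.bits[i] & b.bits[i];
    return dst;
}

/* By const reference: only a pointer is passed, nothing is copied. */
static inline int sketch_cpu_isset_const(int cpu, cpumask_const_sketch_t mask)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (int)((mask->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL);
}
```

    On a 64-bit build with these values each mask occupies 64 bytes, so the
    by-value sketch_cpus_and() moves three of them per call, while the
    const-reference test passes a single pointer.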
    
    Also, in order to avoid calling convention overhead on architectures where
    structures are required to be passed by value, NR_CPUS <= BITS_PER_LONG is
    special-cased so that cpumask_t falls back to an unsigned long and the
    accessors perform the usual bit twiddling on unsigned longs as opposed to
    arrays thereof.  Audits were done with the structure overhead in-place,
    restoring this special-casing only afterward so as to ensure a more complete
    API conversion while undergoing the majority of its end-user exposure in -mm.
    More -mm's were shipped after its restoration to be sure that was tested,
    too.
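    
    A minimal sketch of how such special-casing might look, assuming accessor
    macros that take the mask by name so that callers compile unchanged against
    either representation.  The names (cpumask_sketch_t, sketch_cpu_set, and so
    on), the NR_CPUS default of 8 and the ULONG_MAX-based BITS_PER_LONG test are
    assumptions made for a standalone userspace example, not the definitions
    from this patch.

```c
#include <limits.h>
#include <string.h>

#ifndef NR_CPUS
#define NR_CPUS 8  /* hypothetical; small enough to select the fast path */
#endif

/* Preprocessor-visible word size, assuming long is either 32 or 64 bits. */
#if ULONG_MAX == 0xffffffffUL
#define BITS_PER_LONG 32
#else
#define BITS_PER_LONG 64
#endif

#if NR_CPUS <= BITS_PER_LONG

/* Fast path: the mask is a plain unsigned long. */
typedef unsigned long cpumask_sketch_t;

#define sketch_cpus_clear(map)     do { (map) = 0UL; } while (0)
#define sketch_cpu_set(cpu, map)   do { (map) |= 1UL << (cpu); } while (0)
#define sketch_cpu_isset(cpu, map) ((int)(((map) >> (cpu)) & 1UL))

#else

/* General path: the mask is a structure wrapping an array of longs. */
typedef struct {
    unsigned long bits[(NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG];
} cpumask_sketch_t;

#define sketch_cpus_clear(map) \
    memset((map).bits, 0, sizeof((map).bits))
#define sketch_cpu_set(cpu, map) \
    do { (map).bits[(cpu) / BITS_PER_LONG] |= 1UL << ((cpu) % BITS_PER_LONG); } while (0)
#define sketch_cpu_isset(cpu, map) \
    ((int)(((map).bits[(cpu) / BITS_PER_LONG] >> ((cpu) % BITS_PER_LONG)) & 1UL))

#endif

/* Caller code is written once and works with either representation. */
static inline int sketch_count_set(cpumask_sketch_t map)
{
    int cpu, count = 0;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        count += sketch_cpu_isset(cpu, map);
    return count;
}
```

    Building the same file with a larger NR_CPUS (for example -DNR_CPUS=256)
    switches to the structure-based path without touching sketch_count_set().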
    
    The immediate users of this functionality are Sun sparc64 systems, SGI mips64
    and ia64 systems, and IBM ia32, ppc64, and s390 systems.  Of these, only the
    ppc64 machines needing the functionality have yet to be released; all others
    have had systems requiring it for full functionality for at least 6 months,
    and in some cases, since the initial Linux port to the affected architecture.