1. 04 Oct, 2006 40 commits
    • Eric W. Biederman's avatar
      [PATCH] genirq: x86_64 irq: make vector_irq per cpu · 550f2299
      Eric W. Biederman authored
      This refactors the irq handling code to make the vectors a per cpu resource so
      the same vector number can be simultaneously used on multiple cpus for
      different irqs.
      
      This should make systems that were hitting limits on the total number of irqs
      much more livable.
      
      [akpm@osdl.org: build fix]
      [akpm@osdl.org: __target_IO_APIC_irq is unneeded on UP]
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      550f2299
    • Eric W. Biederman's avatar
      [PATCH] genirq: x86_64 irq: Make the external irq handlers report their vector, not the irq number · e500f574
      Eric W. Biederman authored
      This is a small pessimization but it paves the way for making this information
      per cpu.  Which allows the the maximum number of IRQS to become NR_CPUS*224.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      e500f574
    • Eric W. Biederman's avatar
      [PATCH] genirq: irq: generalize the check for HARDIRQ_BITS · 23d0b8b0
      Eric W. Biederman authored
      This patch adds support for systems that cannot receive every interrupt on a
      single cpu simultaneously, in the check to see if we have enough HARDIRQ_BITS.
      
      MAX_HARDIRQS_PER_CPU becomes the count of the maximum number of hardare
      generated interrupts per cpu.
      
      On architectures that support per cpu interrupt delivery this can be a
      significant space savings and scalability bonus.
      
      This patch adds support for systems that cannot receive every interrupt on
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      23d0b8b0
    • Eric W. Biederman's avatar
      [PATCH] genirq: irq: remove msi hacks · 323a01c5
      Eric W. Biederman authored
      Because of the nasty way that CONFIG_PCI_MSI was implemented we wound up with
      set_irq_info and set_native_irq_info, with move_irq and move_native_irq.  Both
      functions did the same thing but they were built and called under different
      circumstances.  Now that the msi hacks are gone we can kill move_irq and
      set_irq_info.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      323a01c5
    • Eric W. Biederman's avatar
      [PATCH] genirq: i386 irq: Remove the msi assumption that irq == vector · ace80ab7
      Eric W. Biederman authored
      This patch removes the change in behavior of the irq allocation code when
      CONFIG_PCI_MSI is defined.  Removing all instances of the assumption that irq
      == vector.
      
      create_irq is rewritten to first allocate a free irq and then to assign that
      irq a vector.
      
      assign_irq_vector is made static and the AUTO_ASSIGN case which allocates an
      vector not bound to an irq is removed.
      
      The ioapic vector methods are removed, and everything now works with irqs.
      
      The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI
      
      [akpm@osdl.org: cleanup]
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      ace80ab7
    • Eric W. Biederman's avatar
      [PATCH] genirq: x86_64 irq: Remove the msi assumption that irq == vector · 04b9267b
      Eric W. Biederman authored
      This patch removes the change in behavior of the irq allocation code when
      CONFIG_PCI_MSI is defined.  Removing all instances of the assumption that irq
      == vector.
      
      create_irq is rewritten to first allocate a free irq and then to assign that
      irq a vector.
      
      assign_irq_vector is made static and the AUTO_ASSIGN case which allocates an
      vector not bound to an irq is removed.
      
      The ioapic vector methods are removed, and everything now works with irqs.
      
      The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      04b9267b
    • Eric W. Biederman's avatar
      [PATCH] genirq: msi: only build msi-apic.c on ia64 · 4b2fabb9
      Eric W. Biederman authored
      After the previous changes ia64 is the only architecture useing msi-apic.c
      
      [akpm@osdl.org: unbreak MSI on ia64]
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      4b2fabb9
    • Eric W. Biederman's avatar
      [PATCH] genirq: i386 irq: Move msi message composition into io_apic.c · 2d3fcc1c
      Eric W. Biederman authored
      This removes the hardcoded assumption that irq == vector in the msi
      composition code, and it allows the msi message composition to setup logical
      mode, or lowest priorirty delivery mode as we do for other apic interrupts,
      and with the same selection criteria.
      
      Basically this moves the problem of what is in the msi message into the
      architecture irq management code where it belongs.  Not in a generic layer
      that doesn't have enough information to compose msi messages properly.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      2d3fcc1c
    • Eric W. Biederman's avatar
      [PATCH] genirq: x86_64 irq: Move msi message composition into io_apic.c · 589e367f
      Eric W. Biederman authored
      This removes the hardcoded assumption that irq == vector in the msi
      composition code, and it allows the msi message composition to setup logical
      mode, or lowest priorirty delivery mode as we do for other apic interrupts,
      and with the same selection criteria.
      
      Basically this moves the problem of what is in the msi message into the
      architecture irq management code where it belongs.  Not in a generic layer
      that doesn't have enough information to compose msi messages properly.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      589e367f
    • Eric W. Biederman's avatar
      [PATCH] genirq: msi: make the msi code irq based and not vector based · 1ce03373
      Eric W. Biederman authored
      The msi currently allocates irqs backwards.  First it allocates a platform
      dependent routing value for an interrupt the ``vector'' and then it figures
      out from the vector which irq you are on.
      
      For ia64 this is fine.  For x86 and x86_64 this is complete nonsense and makes
      an enourmous mess of the irq handling code and prevents some pretty
      significant cleanups in the code for handling large numbers of irqs.
      
      This patch refactors msi.c to work in terms of irqs and create_irq/destroy_irq
      for dynamically managing irqs.
      
      Hopefully this is finally a version of msi.c that is useful on more than just
      x86 derivatives.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      1ce03373
    • Eric W. Biederman's avatar
      [PATCH] genirq: x86_64 irq: Dynamic irq support · c4fa0bbf
      Eric W. Biederman authored
      The current implementation of create_irq() is a hack but it is the current
      hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
      this hack.  Thus we are this hack of assuming irq == vector until the
      depencencies in the generic irq code are removed.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      c4fa0bbf
    • Eric W. Biederman's avatar
      [PATCH] genirq: i386 irq: Dynamic irq support · 3fc471ed
      Eric W. Biederman authored
      The current implementation of create_irq() is a hack but it is the current
      hack that msi.c uses, and unfortunately the ``generic'' apic msi ops depend on
      this hack.  Thus we are stuck this hack of assuming irq == vector until the
      depencencies in the generic msi code are removed.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      3fc471ed
    • Eric W. Biederman's avatar
      [PATCH] genirq: ia64 irq: Dynamic irq support · b6cf2583
      Eric W. Biederman authored
      [akpm@osdl.org: build fix]
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      b6cf2583
    • Eric W. Biederman's avatar
      [PATCH] genirq: irq: add a dynamic irq creation API · 3a16d713
      Eric W. Biederman authored
      With the msi support comes a new concept in irq handling, irqs that are
      created dynamically at run time.
      
      Currently the msi code allocates irqs backwards.  First it allocates a
      platform dependent routing value for an interrupt the ``vector'' and then it
      figures out from the vector which irq you are on.
      
      This msi backwards allocator suffers from two basic problems.  The allocator
      suffers because it is trying to do something that is architecture specific in
      a generic way making it brittle, inflexible, and tied to tightly to the
      architecture implementation.  The alloctor also suffers from it's very
      backwards nature as it has tied things together that should have no
      dependencies.
      
      To solve the basic dynamic irq allocation problem two new architecture
      specific functions are added: create_irq and destroy_irq.
      
      create_irq takes no input and returns an unused irq number, that won't be
      reused until it is returned to the free poll with destroy_irq.  The irq then
      can be used for any purpose although the only initial consumer is the msi
      code.
      
      destroy_irq takes an irq number allocated with create_irq and returns it to
      the free pool.
      
      Making this functionality per architecture increases the simplicity of the irq
      allocation code and increases it's flexibility.
      
      dynamic_irq_init() and dynamic_irq_cleanup() are added to automate the
      irq_desc initializtion that should happen for dynamic irqs.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      3a16d713
    • Eric W. Biederman's avatar
      [PATCH] genirq: msi: simplify the msi irq limit policy · 92db6d10
      Eric W. Biederman authored
      Currently we attempt to predict how many irqs we will be able to allocate with
      msi using pci_vector_resources and some complicated accounting, and then we
      only allow each device as many irqs as we think are available on average.
      
      Only the s2io driver even takes advantage of this feature all other drivers
      have a fixed number of irqs they need and bail if they can't get them.
      
      pci_vector_resources is inaccurate if anyone ever frees an irq.  The whole
      implmentation is racy.  The current irq limit policy does not appear to make
      sense with current drivers.  So I have simplified things.  We can revisit this
      we we need a more sophisticated policy.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      92db6d10
    • Eric W. Biederman's avatar
      [PATCH] genirq: msi: refactor the msi_ops · 38bc0361
      Eric W. Biederman authored
      The current msi_ops are short sighted in a number of ways, this patch attempts
      to fix the glaring deficiences.
      
      - Report in msi_ops if a 64bit address is needed in the msi message, so we
        can fail 32bit only msi structures.
      
      - Send and receive a full struct msi_msg in both setup and target.  This is
        a little cleaner and allows for architectures that need to modify the data
        to retarget the msi interrupt to a different cpu.
      
      - In target pass in the full cpu mask instead of just the first cpu in case
        we can make use of the full cpu mask.
      
      - Operate in terms of irqs and not vectors, currently there is still a 1-1
        relationship but on architectures other than ia64 I expect this will change.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      38bc0361
    • Eric W. Biederman's avatar
      [PATCH] genirq: msi: implement helper functions read_msi_msg and write_msi_msg · 0366f8f7
      Eric W. Biederman authored
      In support of this I also add a struct msi_msg that captures the the two
      address and one data field ina typical msi message, and I remember the pos and
      if the address is 64bit in struct msi_desc.
      
      This makes the code a little more readable and easier to maintain, and paves
      the way to further simplfications.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      0366f8f7
    • Eric W. Biederman's avatar
      [PATCH] genirq: msi: make the msi boolean tests return either 0 or 1 · dd159eec
      Eric W. Biederman authored
      This allows the output of the msi tests to be stored directly in a bit field.
      If you don't do this a value greater than one will be truncated and become 0.
      Changing true to false with bizare consequences.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      dd159eec
    • Eric W. Biederman's avatar
      [PATCH] genirq: msi: simplify msi enable and disable · 7bd007e4
      Eric W. Biederman authored
      The problem.  Because the disable routines leave the msi interrupts in all
      sorts of half enabled states the enable routines become impossible to
      implement correctly, and almost impossible to understand.
      
      Simplifing this allows me to simply kill the buggy reroute_msix_table, and
      generally makes the code more maintainable.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      7bd007e4
    • Eric W. Biederman's avatar
      [PATCH] genirq: x86_64 irq: Reenable migrating irqs to other cpus · 0be6652f
      Eric W. Biederman authored
      In the latest changes the code for migrating x86_64 irqs was dropped.  This
      reads it in a fashion that will work even if we change the vector on level
      triggered irqs when we migrate them.
      
      [akpm@osdl.org: build fix]
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      0be6652f
    • Eric W. Biederman's avatar
      [PATCH] genirq: irq: add moved_masked_irq · e7b946e9
      Eric W. Biederman authored
      Currently move_native_irq disables and renables the irq we are migrating to
      ensure we don't take that irq when we are actually doing the migration
      operation.  Disabling the irq needs to happen but sometimes doing the work is
      move_native_irq is too late.
      
      On x86 with ioapics the irq move sequences needs to be:
      edge_triggered:
        mask irq.
        move irq.
        unmask irq.
        ack irq.
      level_triggered:
        mask irq.
        ack irq.
        move irq.
        unmask irq.
      
      We can easily perform the edge triggered sequence, with the current defintion
      of move_native_irq.  However the level triggered case does not map well.  For
      that I have added move_masked_irq, to allow me to disable the irqs around both
      the ack and the move.
      
      Q: Why have we not seen this problem earlier?
      
      A: The only symptom I have been able to reproduce is that if we change
         the vector before acknowleding an irq the wrong irq is acknowledged.
         Since we currently are not reprogramming the irq vector during
         migration no problems show up.
      
         We have to mask the irq before we acknowledge the irq or else we could
         hit a window where an irq is asserted just before we acknowledge it.
      
         Edge triggered irqs do not have this problem because acknowledgements
         do not propogate in the same way.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      e7b946e9
    • Eric W. Biederman's avatar
      [PATCH] genirq: irq: convert the move_irq flag from a 32bit word to a single bit · a24ceab4
      Eric W. Biederman authored
      The primary aim of this patchset is to remove maintenances problems caused by
      the irq infrastructure.  The two big issues I address are an artificially
      small cap on the number of irqs, and that MSI assumes vector == irq.  My
      primary focus is on x86_64 but I have touched other architectures where
      necessary to keep them from breaking.
      
      - To increase the number of irqs I modify the code to look at the (cpu,
        vector) pair instead of just looking at the vector.
      
        With a large number of irqs available systems with a large irq count no
        longer need to compress their irq numbers to fit.  Removing a lot of brittle
        special cases.
      
        For acpi guys the result is that irq == gsi.
      
      - Addressing the fact that MSI assumes irq == vector takes a few more
        patches.  But suffice it to say when I am done none of the generic irq code
        even knows what a vector is.
      
      In quick testing on a large Unisys x86_64 machine we stumbled over at least
      one driver that assumed that NR_IRQS could always fit into an 8 bit number.
      This driver is clearly buggy today.  But this has become a class of bugs that
      it is now much easier to hit.
      
      This patch:
      
      This is a minor space optimization.  In practice I don't think this has any
      affect because of our alignment constraints and the other fields but there is
      not point in chewing up an uncessary word and since we already read the flag
      field this should improve the cache hit ratio of the irq handler.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rajesh Shah <rajesh.shah@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      a24ceab4
    • Ingo Molnar's avatar
      [PATCH] genirq: convert the i386 architecture to irq-chips · f5b9ed7a
      Ingo Molnar authored
      This patch converts all the i386 PIC controllers (except VisWS and Voyager,
      which I could not test - but which should still work as old-style IRQ layers)
      to the new and simpler irq-chip interrupt handling layer.
      
      [akpm@osdl.org: build fix]
      [mingo@elte.hu: enable fasteoi handler for i386 level-triggered IO-APIC irqs]
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      f5b9ed7a
    • Ingo Molnar's avatar
      [PATCH] genirq: convert the x86_64 architecture to irq-chips · f29bd1ba
      Ingo Molnar authored
      This patch converts all the x86_64 PIC controllers layers to the new and
      simpler irq-chip interrupt handling layer.
      
      [mingo@elte.hu: The patch also enables the fasteoi handler for x86_64]
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Roland Dreier <rolandd@cisco.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      f29bd1ba
    • Andrew Morton's avatar
      [PATCH] fbdev: riva warning fix · 0271eb94
      Andrew Morton authored
      drivers/video/riva/fbdev.c: In function `riva_get_EDID_OF':
      drivers/video/riva/fbdev.c:1846: warning: assignment discards qualifiers from pointer target type
      
      This code is being bad: copying a pointer to read-only OF data into a
      non-const pointer.
      
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "Antonino A. Daplas" <adaplas@pol.net>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      0271eb94
    • Michael Halcrow's avatar
      [PATCH] ecryptfs: fs/Makefile and fs/Kconfig · 237fead6
      Michael Halcrow authored
      eCryptfs is a stacked cryptographic filesystem for Linux.  It is derived from
      Erez Zadok's Cryptfs, implemented through the FiST framework for generating
      stacked filesystems.  eCryptfs extends Cryptfs to provide advanced key
      management and policy features.  eCryptfs stores cryptographic metadata in the
      header of each file written, so that encrypted files can be copied between
      hosts; the file will be decryptable with the proper key, and there is no need
      to keep track of any additional information aside from what is already in the
      encrypted file itself.
      
      [akpm@osdl.org: updates for ongoing API changes]
      [bunk@stusta.de: cleanups]
      [akpm@osdl.org: alpha build fix]
      [akpm@osdl.org: cleanups]
      [tytso@mit.edu: inode-diet updates]
      [pbadari@us.ibm.com: generic_file_*_read/write() interface updates]
      [rdunlap@xenotime.net: printk format fixes]
      [akpm@osdl.org: make slab creation and teardown table-driven]
      Signed-off-by: default avatarPhillip Hellewell <phillip@hellewell.homeip.net>
      Signed-off-by: default avatarMichael Halcrow <mhalcrow@us.ibm.com>
      Signed-off-by: default avatarErez Zadok <ezk@cs.sunysb.edu>
      Signed-off-by: default avatarAdrian Bunk <bunk@stusta.de>
      Signed-off-by: default avatarStephan Mueller <smueller@chronox.de>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: default avatarBadari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: default avatarRandy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      237fead6
    • Cedric Le Goater's avatar
      [PATCH] Fix linux/nfsd/const.h for make headers_check · f7aa2638
      Cedric Le Goater authored
      make headers_check fails on linux/nfsd/const.h.
      
      Since linux/sunrpc/msg_prot.h does not seem to export anything interesting
      for userspace, this patch moves it in the __KERNEL__ protected section.
      Signed-off-by: default avatarCedric Le Goater <clg@fr.ibm.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      f7aa2638
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: actually use all the pieces to implement referrals · 42ca0993
      J.Bruce Fields authored
      Use all the pieces set up so far to implement referral support, allowing
      return of NFS4ERR_MOVED and fs_locations attribute.
      Signed-off-by: default avatarManoj Naik <manoj@almaden.ibm.com>
      Signed-off-by: default avatarFred Isaman <iisaman@citi.umich.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      42ca0993
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: xdr encoding for fs_locations · 81c3f413
      J.Bruce Fields authored
      Encode fs_locations attribute.
      Signed-off-by: default avatarManoj Naik <manoj@almaden.ibm.com>
      Signed-off-by: default avatarFred Isaman <iisaman@citi.umich.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      81c3f413
    • Manoj Naik's avatar
      [PATCH] knfsd: nfsd4: fslocations data structures · 93346919
      Manoj Naik authored
      Define FS locations structures, some functions to manipulate them, and add
      code to parse FS locations in downcall and add to the exports structure.
      
      [bfields@fieldses.org: bunch of fixes and cleanups]
      Signed-off-by: default avatarManoj Naik <manoj@almaden.ibm.com>
      Signed-off-by: default avatarFred Isaman <iisaman@citi.umich.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      93346919
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd: store export path in export · b009a873
      J.Bruce Fields authored
      Store the export path in the svc_export structure instead of storing only the
      dentry.  This will prevent the need for additional d_path calls to provide
      NFSv4 fs_locations support.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      b009a873
    • NeilBrown's avatar
      [PATCH] knfsd: close a race-opportunity in d_splice_alias · 21c0d8fd
      NeilBrown authored
      There is a possible race in d_splice_alias.  Though __d_find_alias(inode, 1)
      will only return a dentry with DCACHE_DISCONNECTED set, it is possible for it
      to get cleared before the BUG_ON, and it is is not possible to lock against
      that.
      
      There are a couple of problems here.  Firstly, the code doesn't match the
      comment.  The comment describes a 'disconnected' dentry as being IS_ROOT as
      well as DCACHE_DISCONNECTED, however there is not testing of IS_ROOT anythere.
      
      A dentry is marked DCACHE_DISCONNECTED when allocated with d_alloc_anon, and
      remains DCACHE_DISCONNECTED while a path is built up towards the root.  So a
      dentry can have a valid name and a valid parent and even grandparent, but will
      still be DCACHE_DISCONNECTED until a path to the root is created.  Once the
      path to the root is complete, everything in the path gets DCACHE_DISCONNECTED
      cleared.  So the fact that DCACHE_DISCONNECTED isn't enough to say that a
      dentry is free to be spliced in with a given name.  This can only be allowed
      if the dentry does not yet have a name, so the IS_ROOT test is needed too.
      
      However even adding that test to __d_find_alias isn't enough.  As
      d_splice_alias drops dcache_lock before calling d_move to perform the splice,
      it could race with another thread calling d_splice_alias to splice the inode
      in with a different name in a different part of the tree (in the case where a
      file has hard links).  So that splicing code is only really safe for
      directories (as we know that directories only have one link).  For
      directories, the caller of d_splice_alias will be holding i_mutex on the
      (unique) parent so there is no room for a race.
      
      A consequence of this is that a non-directory will never benefit from being
      spliced into a pre-exisiting dentry, but that isn't a problem.  It is
      perfectly OK for a non-directory to have multiple dentries, some anonymous,
      some not.  And the comment for d_splice_alias says that it only happens for
      directories anyway.
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      21c0d8fd
    • NeilBrown's avatar
      [PATCH] knfsd: fix auto-sizing of nfsd request/reply buffers · 44c55600
      NeilBrown authored
      totalram is measured in pages, not bytes, so PAGE_SHIFT must be used when
      trying to find 1/4096 of RAM.
      
      Cc:  "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      44c55600
    • NeilBrown's avatar
      [PATCH] knfsd: lockd: fix refount on nsm · 6b54dae2
      NeilBrown authored
      If nlm_lookup_host finds what it is looking for it exits with an extra
      reference on the matching 'nsm' structure.
      
      So don't actually count the reference until we are (fairly) sure it is going
      to be used.
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      6b54dae2
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: acls: fix handling of zero-length acls · b66285ce
      J.Bruce Fields authored
      It is legal to have zero-length NFSv4 acls; they just deny everything.
      
      Also, nfs4_acl_nfsv4_to_posix will always return with pacl and dpacl set on
      success, so the caller doesn't need to check this.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      b66285ce
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: acls: simplify nfs4_acl_nfsv4_to_posix interface · f3b64eb6
      J.Bruce Fields authored
      There's no need to handle the case where the caller passes in null for pacl or
      dpacl; no caller does that, because it would be a dumb thing to do.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      f3b64eb6
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: acls: fix inheritance · b548edc2
      J.Bruce Fields authored
      We can be a little more flexible about the flags allowed for inheritance (in
      particular, we can deal with either the presence or the absence of
      INHERIT_ONLY), but we should probably reject other combinations that we don't
      understand.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      b548edc2
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: acls: relax the nfsv4->posix mapping · 09229edb
      J.Bruce Fields authored
      Use a different nfsv4->(draft posix) acl mapping which is
      	1. completely backwards compatible,
      	2. accepts any nfsv4 acl, and
      	3. errs on the side of restricting permissions.
      
      In detail:
      
      	1. completely backwards compatible: The new mapping produces the
      	same result on any acl produced by the existing (draft
      	posix)->nfsv4 mapping; the one exception is that we no longer
      	attempt to guess the value of the mask by assuming certain denies
      	represent the mask.  Since the server still keeps track of the mask
      	locally, sequences of chmod's will still be handled fine; the only
      	thing this will change is sequences of chmod's with intervening
      	read-modify-writes of the acl.  That last case just isn't worth the
      	trouble and the possible misrepresentations of the user's intent
      	(if we guess that a certain deny indicates masking is in effect
      	when it really isn't).
      
      	2. accepts any nfsv4 acl: That's not quite true: we still reject
      	acls that use combinations of inheritance flags that we don't
      	support.  We also reject acls that attempt to explicitly deny
      	read_acl or read_attributes permissions, or that attempt to deny
      	write_acl or write_attributes permissions to the owner of the file.
      
      	3.  errs on the side of restricting permissions: one exception to
      	this last rule: we totally ignore some bits (write_owner,
      	synchronize, read_named_attributes, etc.) that are completely alien
      	to our filesystem semantics, in some cases even if that would mean
      	ignoring an explicit deny that we have no intention of enforcing.
      	Excepting that, the posix acl produced should be the most
      	permissive acl that is not more permissive than the given nfsv4
      	acl.
      
      And the new code's shorter, too.  Neato.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      09229edb
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: clean up exp_pseudoroot · d0ebd9c0
      J.Bruce Fields authored
      The previous patch enables some minor simplification here.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      d0ebd9c0
    • J.Bruce Fields's avatar
      [PATCH] knfsd: nfsd4: refactor exp_pseudoroot · f38b20c6
      J.Bruce Fields authored
      We could be using more common code in exp_pseudoroot().  This will also
      simplify some changes we need to make later.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      f38b20c6