1. 22 Oct, 2007 29 commits
    • Christoph Hellwig's avatar
      efs: new export ops · 05da0804
      Christoph Hellwig authored
      Trivial switch over to the new generic helpers.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      05da0804
    • Christoph Hellwig's avatar
      ext4: new export ops · 1b961ac0
      Christoph Hellwig authored
      Trivial switch over to the new generic helpers.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1b961ac0
    • Christoph Hellwig's avatar
      ext3: new export ops · 74af0baa
      Christoph Hellwig authored
      Trivial switch over to the new generic helpers.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      74af0baa
    • Christoph Hellwig's avatar
      ext2: new export ops · 2e4c68e3
      Christoph Hellwig authored
      Trivial switch over to the new generic helpers.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2e4c68e3
    • Christoph Hellwig's avatar
      exportfs: add new methods · 2596110a
      Christoph Hellwig authored
      Add the guts for the new filesystem API to exportfs.
      
      There's now a fh_to_dentry method that returns a dentry for the object looked
      for given a filehandle fragment, and a fh_to_parent operation that returns the
      dentry for the encoded parent directory in case the file handle contains it.
      
      There are default implementations for these methods that only take a callback
      for an nfs-enhanced iget variant and implement the rest of the semantics.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: David Chinner <dgc@sgi.com>
      Cc: Timothy Shimmin <tes@sgi.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Chris Mason <mason@suse.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: "Vladimir V. Saveliev" <vs@namesys.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2596110a
    • Christoph Hellwig's avatar
      exportfs: add fid type · 6e91ea2b
      Christoph Hellwig authored
      This patchset is a medium scale rewrite of the export operations interface.
      The goal is to make the interface less complex, and easier to understand from
      the filesystem side, aswell as preparing generic support for exporting of
      64bit inode numbers.
      
      This touches all nfs exporting filesystems, and I've done testing on all of
      the filesystems I have here locally (xfs, ext2, ext3, reiserfs, jfs)
      
      This patch:
      
      Add a structured fid type so that we don't have to pass an array of u32 values
      around everywhere.  It's a union of possible layouts.
      
      As a start there's only the u32 array and the traditional 32bit inode format,
      but there will be more in one of my next patchset when I start to document the
      various filehandle formats we have in lowlevel filesystems better.
      
      Also add an enum that gives the various filehandle types human- readable
      names.
      
      Note: Some people might think the struct containing an anonymous union is
      ugly, but I didn't want to pass around a raw union type.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: David Chinner <dgc@sgi.com>
      Cc: Timothy Shimmin <tes@sgi.com>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Chris Mason <mason@suse.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: "Vladimir V. Saveliev" <vs@namesys.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6e91ea2b
    • Bernhard Walle's avatar
      kexec: add BSS to resource tree · 00bf4098
      Bernhard Walle authored
      Add the BSS to the resource tree just as kernel text and kernel data are in
      the resource tree.  The main reason behind this is to avoid crashkernel
      reservation in that area.
      
      While it's not strictly necessary to have the BSS in the resource tree (the
      actual collision detection is done in the reserve_bootmem() function before),
      the usage of the BSS resource should be presented to the user in /proc/iomem
      just as Kernel data and Kernel code.
      
      Note: The patch currently is only implemented for x86 and ia64 (because
      efi_initialize_iomem_resources() has the same signature on i386 and ia64).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarBernhard Walle <bwalle@suse.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: <linux-arch@vger.kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      00bf4098
    • FUJITA Tomonori's avatar
      intel-iommu sg chaining support · c03ab37c
      FUJITA Tomonori authored
      x86_64 defines ARCH_HAS_SG_CHAIN. So if IOMMU implementations don't
      support sg chaining, we will get data corruption.
      Signed-off-by: default avatarFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Acked-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c03ab37c
    • Keshavamurthy, Anil S's avatar
      intel-iommu: fix for IOMMU early crash · 358dd8ac
      Keshavamurthy, Anil S authored
      pci_dev's->sysdata is highly overloaded and currently IOMMU is broken due
      to IOMMU code depending on this field.
      
      This patch introduces new field in pci_dev's dev.archdata struct to hold
      IOMMU specific per device IOMMU private data.
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Jeff Garzik <jeff@garzik.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      358dd8ac
    • Keshavamurthy, Anil S's avatar
      intel-iommu: optimize sg map/unmap calls · f76aec76
      Keshavamurthy, Anil S authored
      This patch adds PageSelectiveInvalidation support replacing existing
      DomainSelectiveInvalidation for intel_{map/unmap}_sg() calls and also
      enables to mapping one big contiguous DMA virtual address which is mapped
      to discontiguous physical address for SG map/unmap calls.
      
      "Doamin selective invalidations" wipes out the IOMMU address translation
      cache based on domain ID where as "Page selective invalidations" wipes out
      the IOMMU address translation cache for that address mask range which is
      more cache friendly when compared to Domain selective invalidations.
      
      Here is how it is done.
      1) changes to iova.c
      alloc_iova() now takes a bool size_aligned argument, which
      when when set, returns the io virtual address that is
      naturally aligned to 2 ^ x, where x is the order
      of the size requested.
      
      Returning this io vitual address which is naturally
      aligned helps iommu to do the "page selective
      invalidations" which is IOMMU cache friendly
      over "domain selective invalidations".
      
      2) Changes to driver/pci/intel-iommu.c
      Clean up intel_{map/unmap}_{single/sg} () calls so that
      s/g map/unamp calls is no more dependent on
      intel_{map/unmap}_single()
      
      intel_map_sg() now computes the total DMA virtual address
      required and allocates the size aligned total DMA virtual address
      and maps the discontiguous physical address to the allocated
      contiguous DMA virtual address.
      
      In the intel_unmap_sg() case since the DMA virtual address
      is contiguous and size_aligned, PageSelectiveInvalidation
      is used replacing earlier DomainSelectiveInvalidations.
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Suresh B <suresh.b.siddha@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f76aec76
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: Iommu floppy workaround · 49a0429e
      Keshavamurthy, Anil S authored
      This config option (DMAR_FLPY_WA) sets up 1:1 mapping for the floppy device so
      that the floppy device which does not use DMA api's will continue to work.
      
      Once the floppy driver starts using DMA api's this config option can be turn
      off or this patch can be yanked out of kernel at that time.
      
      [akpm@linux-foundation.org: cleanups, rename things, build fix]
      [jengelh@computergmbh.de: Kconfig fixes]
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarJan Engelhardt <jengelh@gmx.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      49a0429e
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: Iommu Gfx workaround · e820482c
      Keshavamurthy, Anil S authored
      When we fix all the opensource gfx drivers to use the DMA api's, at that time
      we can yank this config options out.
      
      [jengelh@computergmbh.de: Kconfig fixes]
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarJan Engelhardt <jengelh@gmx.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e820482c
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: DMAR fault handling support · 3460a6d9
      Keshavamurthy, Anil S authored
      MSI interrupt handler registrations and fault handling support for Intel-IOMMU
      hadrware.
      
      This patch enables the MSI interrupts for the DMA remapping units and in the
      interrupt handler read the fault cause and outputs the same on to the console.
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3460a6d9
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: Intel iommu cmdline option - forcedac · 7d3b03ce
      Keshavamurthy, Anil S authored
      Introduce intel_iommu=forcedac commandline option.  This option is helpful to
      verify the pci device capability of handling physical dma'able address greater
      than 4G.
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7d3b03ce
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: Avoid memory allocation failures in dma map api calls · eb3fa7cb
      Keshavamurthy, Anil S authored
      Intel IOMMU driver needs memory during DMA map calls to setup its internal
      page tables and for other data structures.  As we all know that these DMA map
      calls are mostly called in the interrupt context or with the spinlock held by
      the upper level drivers(network/storage drivers), so in order to avoid any
      memory allocation failure due to low memory issues, this patch makes memory
      allocation by temporarily setting PF_MEMALLOC flags for the current task
      before making memory allocation calls.
      
      We evaluated mempools as a backup when kmem_cache_alloc() fails
      and found that mempools are really not useful here because
       1) We don't know for sure how much to reserve in advance
       2) And mempools are not useful for GFP_ATOMIC case (as we call
          memory alloc functions with GFP_ATOMIC)
      
      (akpm: point 2 is wrong...)
      
      With PF_MEMALLOC flag set in the current->flags, the VM subsystem avoids any
      watermark checks before allocating memory thus guarantee'ing the memory till
      the last free page.  Further, looking at the code in mm/page_alloc.c in
      __alloc_pages() function, looks like this flag is useful only in the
      non-interrupt context.
      
      If we are in the interrupt context and memory allocation in IOMMU driver fails
      for some reason, then the DMA map api's will return failure and it is up to
      the higher level drivers to retry.  Suppose, if upper level driver programs
      the controller with the buggy DMA virtual address, the IOMMU will block that
      DMA transaction when that happens thus preventing any corruption to main
      memory.
      
      So far in our test scenario, we were unable to create any memory allocation
      failure inside dma map api calls.
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      eb3fa7cb
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: Intel IOMMU driver · ba395927
      Keshavamurthy, Anil S authored
      Actual intel IOMMU driver.  Hardware spec can be found at:
      http://www.intel.com/technology/virtualization
      
      This driver sets X86_64 'dma_ops', so hook into standard DMA APIs.  In this
      way, PCI driver will get virtual DMA address.  This change is transparent to
      PCI drivers.
      
      [akpm@linux-foundation.org: remove unneeded cast]
      [akpm@linux-foundation.org: build fix]
      [bunk@stusta.de: fix duplicate CONFIG_DMAR Makefile line]
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarAdrian Bunk <bunk@stusta.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ba395927
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: IOVA allocation and management routines · f8de50eb
      Keshavamurthy, Anil S authored
      This code implements a generic IOVA allocation and management.  As per Dave's
      suggestion we are now allocating IO virtual address from Higher DMA limit
      address rather than lower end address and this eliminated the need to preserve
      the IO virtual address for multiple devices sharing the same domain virtual
      address.
      
      Also this code uses red black trees to store the allocated and reserved iova
      nodes.  This showed a good performance improvements over previous linear
      linked list.
      
      [akpm@linux-foundation.org: remove inlines]
      [akpm@linux-foundation.org: coding style fixes]
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f8de50eb
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: clflush_cache_range now takes size param · a9c55b3b
      Keshavamurthy, Anil S authored
      Introduce the size param for clflush_cache_range().
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a9c55b3b
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: PCI generic helper function · 994a65e2
      Keshavamurthy, Anil S authored
      When devices are under a p2p bridge, upstream transactions get replaced by the
      device id of the bridge as it owns the PCIE transaction.  Hence its necessary
      to setup translations on behalf of the bridge as well.  Due to this limitation
      all devices under a p2p share the same domain in a DMAR.
      
      We just cache the type of device, if its a native PCIe device
      or not for later use.
      
      [akpm@linux-foundation.org: BUG_ON -> WARN_ON+recover]
      Signed-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      994a65e2
    • Keshavamurthy, Anil S's avatar
      Intel IOMMU: DMAR detection and parsing logic · 10e5247f
      Keshavamurthy, Anil S authored
      This patch supports the upcomming Intel IOMMU hardware a.k.a.  Intel(R)
      Virtualization Technology for Directed I/O Architecture and the hardware spec
      for the same can be found here
      http://www.intel.com/technology/virtualization/index.htm
      
      FAQ! (questions from akpm, answers from ak)
      
      > So...  what's all this code for?
      >
      > I assume that the intent here is to speed things up under Xen, etc?
      
      Yes in some cases, but not this code.  That would be the Xen version of this
      code that could potentially assign whole devices to guests.  I expect this to
      be only useful in some special cases though because most hardware is not
      virtualizable and you typically want an own instance for each guest.
      
      Ok at some point KVM might implement this too; i likely would use this code
      for this.
      
      > Do we
      > have any benchmark results to help us to decide whether a merge would be
      > justified?
      
      The main advantage for doing it in the normal kernel is not performance, but
      more safety.  Broken devices won't be able to corrupt memory by doing random
      DMA.
      
      Unfortunately that doesn't work for graphics yet, for that need user space
      interfaces for the X server are needed.
      
      There are some potential performance benefits too:
      
      - When you have a device that cannot address the complete address range an
        IOMMU can remap its memory instead of bounce buffering.  Remapping is likely
        cheaper than copying.
      
      - The IOMMU can merge sg lists into a single virtual block.  This could
        potentially speed up SG IO when the device is slow walking SG lists.  [I
        long ago benchmarked 5% on some block benchmark with an old MPT Fusion; but
        it probably depends a lot on the HBA]
      
      And you get better driver debugging because unexpected memory accesses from
      the devices will cause a trappable event.
      
      >
      > Does it slow anything down?
      
      It adds more overhead to each IO so yes.
      
      This patch:
      
      Add support for early detection and parsing of DMAR's (DMA Remapping) reported
      to OS via ACPI tables.
      
      DMA remapping(DMAR) devices support enables independent address translations
      for Direct Memory Access(DMA) from Devices.  These DMA remapping devices are
      reported via ACPI tables and includes pci device scope covered by these DMA
      remapping device.
      
      For detailed info on the specification of "Intel(R) Virtualization Technology
      for Directed I/O Architecture" please see
      http://www.intel.com/technology/virtualization/index.htmSigned-off-by: default avatarAnil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      10e5247f
    • Jan Kara's avatar
      ext2: avoid rec_len overflow with 64KB block size · 89910ccc
      Jan Kara authored
      With 64KB blocksize, a directory entry can have size 64KB which does not
      fit into 16 bits we have for entry length.  So we store 0xffff instead and
      convert the value when read from / written to disk.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      89910ccc
    • J. Bruce Fields's avatar
      dcache: don't expose uninitialized memory in /proc/<pid>/fd/<fd> · 321bcf92
      J. Bruce Fields authored
      Well, it's not especially important that target->d_iname get the contents
      of dentry->d_iname, but it's important that it get initialized with
      *something*, otherwise we're just exposing some random piece of memory to
      anyone who reads the link at /proc/<pid>/fd/<fd> for the deleted file, when
      it's still held open by someone.
      
      I've run a test program that copies a short (<36 character) name ontop of a
      long (>=36 character) name and see that the first time I run it, without
      this patch, I get unpredicatable results out of /proc/<pid>/fd/<fd>.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@citi.umich.edu>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      321bcf92
    • Serge E. Hallyn's avatar
      capabilities: clean up file capability reading · b68680e4
      Serge E. Hallyn authored
      Simplify the vfs_cap_data structure.
      
      Also fix get_file_caps which was declaring
      __le32 v1caps[XATTR_CAPS_SZ] on the stack, but
      XATTR_CAPS_SZ is already * sizeof(__le32).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarSerge E. Hallyn <serue@us.ibm.com>
      Cc: Andrew Morgan <morgan@kernel.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b68680e4
    • Yasunori Goto's avatar
      memory hotplug: make kmem_cache_node for SLUB on memory online avoid panic · b9049e23
      Yasunori Goto authored
      Fix a panic due to access NULL pointer of kmem_cache_node at discard_slab()
      after memory online.
      
      When memory online is called, kmem_cache_nodes are created for all SLUBs
      for new node whose memory are available.
      
      slab_mem_going_online_callback() is called to make kmem_cache_node() in
      callback of memory online event.  If it (or other callbacks) fails, then
      slab_mem_offline_callback() is called for rollback.
      
      In memory offline, slab_mem_going_offline_callback() is called to shrink
      all slub cache, then slab_mem_offline_callback() is called later.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: locking fix]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b9049e23
    • Yasunori Goto's avatar
      memory hotplug: rearrange memory hotplug notifier · 7b78d335
      Yasunori Goto authored
      Current memory notifier has some defects yet.  (Fortunately, nothing uses
      it.) This patch is to fix and rearrange for them.
      
        - Add information of start_pfn, nr_pages, and node id if node status is
          changes from/to memoryless node for callback functions.
          Callbacks can't do anything without those information.
        - Add notification going-online status.
          It is necessary for creating per node structure before the node's
          pages are available.
        - Move GOING_OFFLINE status notification after page isolation.
          It is good place for return memory like cache for callback,
          because returned page is not used again.
        - Make CANCEL events for rollingback when error occurs.
        - Delete MEM_MAPPING_INVALID notification. It will be not used.
        - Fix compile error of (un)register_memory_notifier().
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7b78d335
    • Yasunori Goto's avatar
      memory hotplug: document the memory hotplug notifier · 10020ca2
      Yasunori Goto authored
      Add description about event notification callback routine to the document
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      10020ca2
    • Rusty Russell's avatar
      i386: paravirt boot sequence · a24e7851
      Rusty Russell authored
      This patch uses the updated boot protocol to do paravirtualized boot.
      If the boot version is >= 2.07, then it will do two things:
      
       1. Check the bootparams loadflags to see if we should reload the
          segment registers and clear interrupts.  This is appropriate
          for normal native boot and some paravirtualized environments, but
          inapproprate for others.
      
       2. Check the hardware architecture, and dispatch to the appropriate
          kernel entrypoint.  If the bootloader doesn't set this, then we
          simply do the normal boot sequence.
      Signed-off-by: default avatarJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Acked-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a24e7851
    • Rusty Russell's avatar
    • Rusty Russell's avatar
      update boot spec to 2.07 · e5371ac5
      Rusty Russell authored
      Updates for version 2.07 of the boot protocol.  This includes:
      
      load_flags.KEEP_SEGMENTS- flag to request/inhibit segment reloads
      hardware_subarch	- what subarchitecture we're booting under
      hardware_subarch_data	- per-architecture data
      
      The intention of these changes is to make booting a paravirtualized
      kernel work via the normal Linux boot protocol.
      Signed-off-by: default avatarJeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Acked-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e5371ac5
  2. 21 Oct, 2007 11 commits