1. 16 Mar, 2009 10 commits
    • Nicolas Pitre's avatar
      [ARM] add CONFIG_HIGHMEM option · 053a96ca
      Nicolas Pitre authored
      Here it is... HIGHMEM for the ARM architecture.  :-)
      
      If you don't have enough ram for highmem pages to be allocated and still
      want to test this, then the cmdline option "vmalloc=" can be used with
      a value large enough to force the highmem threshold down.
      
      Successfully tested on a Marvell DB-78x00-BP Development Board with
      2 GB of RAM.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      053a96ca
    • Nicolas Pitre's avatar
      [ARM] ignore high memory with VIPT aliasing caches · 3f973e22
      Nicolas Pitre authored
      VIPT aliasing caches have issues of their own which are not yet handled.
      Usage of discard_old_kernel_data() in copypage-v6.c is not highmem ready,
      kmap/fixmap stuff doesn't take account of cache colouring, etc.
      If/when those issues are handled then this could be reverted.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      3f973e22
    • Nicolas Pitre's avatar
      [ARM] xsc3: add highmem support to L2 cache handling code · 3902a15e
      Nicolas Pitre authored
      On xsc3, L2 cache ops are possible only on virtual addresses.  The code
      is rearranged so to have a linear progression requiring the least amount
      of pte setups in the highmem case.  To protect the virtual mapping so
      created, interrupts must be disabled currently up to a page worth of
      address range.
      
      The interrupt disabling is done in a way to minimize the overhead within
      the inner loop.  The alternative would consist in separate code for
      the highmem and non highmem compilation which is less preferable.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      3902a15e
    • Nicolas Pitre's avatar
      [ARM] Feroceon: add highmem support to L2 cache handling code · 1bb77267
      Nicolas Pitre authored
      The choice is between looping over the physical range and performing
      single cache line operations, or to map highmem pages somewhere, as
      cache range ops are possible only on virtual addresses.
      
      Because L2 range ops are much faster, we go with the later by factoring
      the physical-to-virtual address conversion and use a fixmap entry for it
      in the HIGHMEM case.
      
      Possible future optimizations to avoid the pte setup cost:
      
       - do the pte setup for highmem pages only
      
       - determine a threshold for doing a line-by-line processing on physical
         addresses when the range is small
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      1bb77267
    • Nicolas Pitre's avatar
      [ARM] make page_to_dma() highmem aware · 58edb515
      Nicolas Pitre authored
      If a machine class has a custom __virt_to_bus() implementation then it
      must provide a __arch_page_to_dma() implementation as well which is
      _not_ based on page_address() to support highmem.
      
      This patch fixes existing __arch_page_to_dma() and provide a default
      implementation otherwise.  The default implementation for highmem is
      based on __pfn_to_bus() which is defined only when no custom
      __virt_to_bus() is provided by the machine class.
      
      That leaves only ebsa110 and footbridge which cannot support highmem
      until they provide their own __arch_page_to_dma() implementation.
      But highmem support on those legacy platforms with limited memory is
      certainly not a priority.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      58edb515
    • Nicolas Pitre's avatar
      [ARM] introduce dma_cache_maint_page() · 43377453
      Nicolas Pitre authored
      This is a helper to be used by the DMA mapping API to handle cache
      maintenance for memory identified by a page structure instead of a
      virtual address.  Those pages may or may not be highmem pages, and
      when they're highmem pages, they may or may not be virtually mapped.
      When they're not mapped then there is no L1 cache to worry about. But
      even in that case the L2 cache must be processed since unmapped highmem
      pages can still be L2 cached.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      43377453
    • Nicolas Pitre's avatar
      highmem: atomic highmem kmap page pinning · 3297e760
      Nicolas Pitre authored
      Most ARM machines have a non IO coherent cache, meaning that the
      dma_map_*() set of functions must clean and/or invalidate the affected
      memory manually before DMA occurs.  And because the majority of those
      machines have a VIVT cache, the cache maintenance operations must be
      performed using virtual
      addresses.
      
      When a highmem page is kunmap'd, its mapping (and cache) remains in place
      in case it is kmap'd again. However if dma_map_page() is then called with
      such a page, some cache maintenance on the remaining mapping must be
      performed. In that case, page_address(page) is non null and we can use
      that to synchronize the cache.
      
      It is unlikely but still possible for kmap() to race and recycle the
      virtual address obtained above, and use it for another page before some
      on-going cache invalidation loop in dma_map_page() is done. In that case,
      the new mapping could end up with dirty cache lines for another page,
      and the unsuspecting cache invalidation loop in dma_map_page() might
      simply discard those dirty cache lines resulting in data loss.
      
      For example, let's consider this sequence of events:
      
      	- dma_map_page(..., DMA_FROM_DEVICE) is called on a highmem page.
      
      	-->	- vaddr = page_address(page) is non null. In this case
      		it is likely that the page has valid cache lines
      		associated with vaddr. Remember that the cache is VIVT.
      
      		-->	for (i = vaddr; i < vaddr + PAGE_SIZE; i += 32)
      				invalidate_cache_line(i);
      
      	*** preemption occurs in the middle of the loop above ***
      
      	- kmap_high() is called for a different page.
      
      	-->	- last_pkmap_nr wraps to zero and flush_all_zero_pkmaps()
      		  is called.  The pkmap_count value for the page passed
      		  to dma_map_page() above happens to be 1, so the page
      		  is unmapped.  But prior to that, flush_cache_kmaps()
      		  cleared the cache for it.  So far so good.
      
      		- A fresh pkmap entry is assigned for this kmap request.
      		  The Murphy law says this pkmap entry will eventually
      		  happen to use the same vaddr as the one which used to
      		  belong to the other page being processed by
      		  dma_map_page() in the preempted thread above.
      
      	- The kmap_high() caller start dirtying the cache using the
      	  just assigned virtual mapping for its page.
      
      	*** the first thread is rescheduled ***
      
      			- The for(...) loop is resumed, but now cached
      			  data belonging to a different physical page is
      			  being discarded !
      
      And this is not only a preemption issue as ARM can be SMP as well,
      making the above scenario just as likely. Hence the need for some kind
      of pkmap page pinning which can be used in any context, primarily for
      the benefit of dma_map_page() on ARM.
      
      This provides the necessary interface to cope with the above issue if
      ARCH_NEEDS_KMAP_HIGH_GET is defined, otherwise the resulting code is
      unchanged.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      Reviewed-by: default avatarMinChan Kim <minchan.kim@gmail.com>
      Acked-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      3297e760
    • Nicolas Pitre's avatar
      3835f6cb
    • Nicolas Pitre's avatar
      [ARM] kmap support · d73cd428
      Nicolas Pitre authored
      The kmap virtual area borrows a 2MB range at the top of the 16MB area
      below PAGE_OFFSET currently reserved for kernel modules and/or the
      XIP kernel.  This 2MB corresponds to the range covered by 2 consecutive
      second-level page tables, or a single pmd entry as seen by the Linux
      page table abstraction.  Because XIP kernels are unlikely to be seen
      on systems needing highmem support, there shouldn't be any shortage of
      VM space for modules (14 MB for modules is still way more than twice the
      typical usage).
      
      Because the virtual mapping of highmem pages can go away at any moment
      after kunmap() is called on them, we need to bypass the delayed cache
      flushing provided by flush_dcache_page() in that case.
      
      The atomic kmap versions are based on fixmaps, and
      __cpuc_flush_dcache_page() is used directly in that case.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      d73cd428
    • Nicolas Pitre's avatar
      [ARM] fixmap support · 5f0fbf9e
      Nicolas Pitre authored
      This is the minimum fixmap interface expected to be implemented by
      architectures supporting highmem.
      
      We have a second level page table already allocated and covering
      0xfff00000-0xffffffff because the exception vector page is located
      at 0xffff0000, and various cache tricks already use some entries above
      0xffff0000.  Therefore the PTEs covering 0xfff00000-0xfffeffff are free
      to be used.
      
      However the XScale cache flushing code already uses virtual addresses
      between 0xfffe0000 and 0xfffeffff.
      
      So this reserves the 0xfff00000-0xfffdffff range for fixmap stuff.
      
      The Documentation/arm/memory.txt information is updated accordingly,
      including the information about the actual top of DMA memory mapping
      region which didn't match the code.
      Signed-off-by: default avatarNicolas Pitre <nico@marvell.com>
      5f0fbf9e
  2. 12 Mar, 2009 4 commits
  3. 09 Mar, 2009 1 commit
  4. 06 Mar, 2009 3 commits
  5. 05 Mar, 2009 3 commits
  6. 04 Mar, 2009 10 commits
  7. 03 Mar, 2009 9 commits