1. 18 Oct, 2010 8 commits
    • Dave Chinner's avatar
      xfs: remove debug assert for per-ag reference counting · bd32d25a
      Dave Chinner authored
      When we start taking references per cached buffer to the the perag
      it is cached on, it will blow the current debug maximum reference
      count assert out of the water. The assert has never caught a bug,
      and we have tracing to track changes if there ever is a problem,
      so just remove it.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      bd32d25a
    • Dave Chinner's avatar
      xfs: reduce the number of CIL lock round trips during commit · d1583a38
      Dave Chinner authored
      When commiting a transaction, we do a lock CIL state lock round trip
      on every single log vector we insert into the CIL. This is resulting
      in the lock being as hot as the inode and dcache locks on 8-way
      create workloads. Rework the insertion loops to bring the number
      of lock round trips to one per transaction for log vectors, and one
      more do the busy extents.
      
      Also change the allocation of the log vector buffer not to zero it
      as we copy over the entire allocated buffer anyway.
      
      This patch also includes a structural cleanup to the CIL item
      insertion provided by Christoph Hellwig.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarAlex Elder <aelder@sgi.com>
      d1583a38
    • Poyo VL's avatar
      xfs: eliminate some newly-reported gcc warnings · 9c169915
      Poyo VL authored
      Ionut Gabriel Popescu <poyo_vl@yahoo.com> submitted a simple change
      to eliminate some "may be used uninitialized" warnings when building
      XFS.  The reported condition seems to be something that GCC did not
      used to recognize or report.  The warnings were produced by:
      
          gcc version 4.5.0 20100604
          [gcc-4_5-branch revision 160292] (SUSE Linux)
      Signed-off-by: default avatarIonut Gabriel Popescu <poyo_vl@yahoo.com>
      Signed-off-by: default avatarAlex Elder <aelder@sgi.com>
      9c169915
    • Christoph Hellwig's avatar
      xfs: remove the ->kill_root btree operation · c0e59e1a
      Christoph Hellwig authored
      The implementation os ->kill_root only differ by either simply
      zeroing out the now unused buffer in the btree cursor in the inode
      allocation btree or using xfs_btree_setbuf in the allocation btree.
      
      Initially both of them used xfs_btree_setbuf, but the use in the
      ialloc btree was removed early on because it interacted badly with
      xfs_trans_binval.
      
      In addition to zeroing out the buffer in the cursor xfs_btree_setbuf
      updates the bc_ra array in the btree cursor, and calls
      xfs_trans_brelse on the buffer previous occupying the slot.
      
      The bc_ra update should be done for the alloc btree updated too,
      although the lack of it does not cause serious problems.  The
      xfs_trans_brelse call on the other hand is effectively a no-op in
      the end - it keeps decrementing the bli_recur refcount until it hits
      zero, and then just skips out because the buffer will always be
      dirty at this point.  So removing it for the allocation btree is
      just fine.
      
      So unify the code and move it to xfs_btree.c.  While we're at it
      also replace the call to xfs_btree_setbuf with a NULL bp argument in
      xfs_btree_del_cursor with a direct call to xfs_trans_brelse given
      that the cursor is beeing freed just after this and the state
      updates are superflous.  After this xfs_btree_setbuf is only used
      with a non-NULL bp argument and can thus be simplified.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAlex Elder <aelder@sgi.com>
      c0e59e1a
    • Christoph Hellwig's avatar
      xfs: stop using xfs_qm_dqtobp in xfs_qm_dqflush · acecf1b5
      Christoph Hellwig authored
      In xfs_qm_dqflush we know that q_blkno must be initialized already from a
      previous xfs_qm_dqread.  So instead of calling xfs_qm_dqtobp we can
      simply read the quota buffer directly.  This also saves us from a duplicate
      xfs_qm_dqcheck call check and allows xfs_qm_dqtobp to be simplified now
      that it is always called for a newly initialized inode.  In addition to
      that properly unwind all locks in xfs_qm_dqflush when xfs_qm_dqcheck
      fails.
      
      This mirrors a similar cleanup in the inode lookup done earlier.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAlex Elder <aelder@sgi.com>
      acecf1b5
    • Christoph Hellwig's avatar
      xfs: simplify xfs_qm_dqusage_adjust · 52fda114
      Christoph Hellwig authored
      There is no need to have the users and group/project quota locked at the
      same time.  Get rid of xfs_qm_dqget_noattach and just do a xfs_qm_dqget
      inside xfs_qm_quotacheck_dqadjust for the quota we are operating on
      right now.  The new version of xfs_qm_quotacheck_dqadjust holds the
      inode lock over it's operations, which is not a problem as it simply
      increments counters and there is no concern about log contention
      during mount time.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAlex Elder <aelder@sgi.com>
      52fda114
    • Dave Chinner's avatar
      xfs: Introduce XFS_IOC_ZERO_RANGE · 44722352
      Dave Chinner authored
      XFS_IOC_ZERO_RANGE is the equivalent of an atomic XFS_IOC_UNRESVSP/
      XFS_IOC_RESVSP call pair. It enabled ranges of written data to be
      turned into zeroes without requiring IO or having to free and
      reallocate the extents in the range given as would occur if we had
      to punch and then preallocate them separately.  This enables
      applications to zero parts of files very quickly without changing
      the layout of the files in any way.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      44722352
    • Dave Chinner's avatar
      xfs: use range primitives for xfs page cache operations · 3ae4c9de
      Dave Chinner authored
      While XFS passes ranges to operate on from the core code, the
      functions being called ignore the either the entire range or the end
      of the range. This is historical because when the function were
      written linux didn't have the necessary range operations. Update the
      functions to use the correct operations.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      3ae4c9de
  2. 14 Oct, 2010 4 commits
    • Linus Torvalds's avatar
      Linux 2.6.36-rc8 · cd07202c
      Linus Torvalds authored
      cd07202c
    • Linus Torvalds's avatar
      Un-inline the core-dump helper functions · 3aa0ce82
      Linus Torvalds authored
      Tony Luck reports that the addition of the access_ok() check in commit
      0eead9ab ("Don't dump task struct in a.out core-dumps") broke the
      ia64 compile due to missing the necessary header file includes.
      
      Rather than add yet another include (<asm/unistd.h>) to make everything
      happy, just uninline the silly core dump helper functions and move the
      bodies to fs/exec.c where they make a lot more sense.
      
      dump_seek() in particular was too big to be an inline function anyway,
      and none of them are in any way performance-critical.  And we really
      don't need to mess up our include file headers more than they already
      are.
      Reported-and-tested-by: default avatarTony Luck <tony.luck@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3aa0ce82
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · ae42d8d4
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
        ehea: Fix a checksum issue on the receive path
        net: allow FEC driver to use fixed PHY support
        tg3: restore rx_dropped accounting
        b44: fix carrier detection on bind
        net: clear heap allocations for privileged ethtool actions
        NET: wimax, fix use after free
        ATM: iphase, remove sleep-inside-atomic
        ATM: mpc, fix use after free
        ATM: solos-pci, remove use after free
        net/fec: carrier off initially to avoid root mount failure
        r8169: use device model DMA API
        r8169: allocate with GFP_KERNEL flag when able to sleep
      ae42d8d4
    • Linus Torvalds's avatar
      Don't dump task struct in a.out core-dumps · 0eead9ab
      Linus Torvalds authored
      akiphie points out that a.out core-dumps have that odd task struct
      dumping that was never used and was never really a good idea (it goes
      back into the mists of history, probably the original core-dumping
      code).  Just remove it.
      
      Also do the access_ok() check on dump_write().  It probably doesn't
      matter (since normal filesystems all seem to do it anyway), but he
      points out that it's normally done by the VFS layer, so ...
      
      [ I suspect that we should possibly do "vfs_write()" instead of
        calling ->write directly.  That also does the whole fsnotify and write
        statistics thing, which may or may not be a good idea. ]
      
      And just to be anal, do this all for the x86-64 32-bit a.out emulation
      code too, even though it's not enabled (and won't currently even
      compile)
      Reported-by: default avatarakiphie <akiphie@lavabit.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0eead9ab
  3. 13 Oct, 2010 11 commits
  4. 12 Oct, 2010 13 commits
    • Russell King's avatar
      ARM: relax ioremap prohibition (309caa9c) for -final and -stable · 06c10884
      Russell King authored
      ... but produce a big warning about the problem as encouragement
      for people to fix their drivers.
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      06c10884
    • Russell King's avatar
    • Mika Westerberg's avatar
      ARM: 6440/1: ep93xx: DMA: fix channel_disable · 10d48b39
      Mika Westerberg authored
      When channel_disable() is called, it disables per channel interrupts and
      waits until channels state becomes STATE_STALL, and then disables the
      channel. Now, if the DMA transfer is disabled while the channel is in
      STATE_NEXT we will not wait anything and disable the channel immediately.
      This seems to cause weird data corruption for example in audio transfers.
      
      Fix is to wait while we are in STATE_NEXT or STATE_ON and only then
      disable the channel.
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@iki.fi>
      Acked-by: default avatarRyan Mallon <ryan@bluewatersys.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      10d48b39
    • Linus Torvalds's avatar
      Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 0acc1b2a
      Linus Torvalds authored
      * 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: Move TSC reset out of vmcb_init
        KVM: x86: Fix SVM VMCB reset
      0acc1b2a
    • Steven Rostedt's avatar
      ring-buffer: Fix typo of time extends per page · d0134324
      Steven Rostedt authored
      Time stamps for the ring buffer are created by the difference between
      two events. Each page of the ring buffer holds a full 64 bit timestamp.
      Each event has a 27 bit delta stamp from the last event. The unit of time
      is nanoseconds, so 27 bits can hold ~134 milliseconds. If two events
      happen more than 134 milliseconds apart, a time extend is inserted
      to add more bits for the delta. The time extend has 59 bits, which
      is good for ~18 years.
      
      Currently the time extend is committed separately from the event.
      If an event is discarded before it is committed, due to filtering,
      the time extend still exists. If all events are being filtered, then
      after ~134 milliseconds a new time extend will be added to the buffer.
      
      This can only happen till the end of the page. Since each page holds
      a full timestamp, there is no reason to add a time extend to the
      beginning of a page. Time extends can only fill a page that has actual
      data at the beginning, so there is no fear that time extends will fill
      more than a page without any data.
      
      When reading an event, a loop is made to skip over time extends
      since they are only used to maintain the time stamp and are never
      given to the caller. As a paranoid check to prevent the loop running
      forever, with the knowledge that time extends may only fill a page,
      a check is made that tests the iteration of the loop, and if the
      iteration is more than the number of time extends that can fit in a page
      a warning is printed and the ring buffer is disabled (all of ftrace
      is also disabled with it).
      
      There is another event type that is called a TIMESTAMP which can
      hold 64 bits of data in the theoretical case that two events happen
      18 years apart. This code has not been implemented, but the name
      of this event exists, as well as the structure for it. The
      size of a TIMESTAMP is 16 bytes, where as a time extend is only
      8 bytes. The macro used to calculate how many time extends can fit on
      a page used the TIMESTAMP size instead of the time extend size
      cutting the amount in half.
      
      The following test case can easily trigger the warning since we only
      need to have half the page filled with time extends to trigger the
      warning:
      
       # cd /sys/kernel/debug/tracing/
       # echo function > current_tracer
       # echo 'common_pid < 0' > events/ftrace/function/filter
       # echo > trace
       # echo 1 > trace_marker
       # sleep 120
       # cat trace
      
      Enabling the function tracer and then setting the filter to only trace
      functions where the process id is negative (no events), then clearing
      the trace buffer to ensure that we have nothing in the buffer,
      then write to trace_marker to add an event to the beginning of a page,
      sleep for 2 minutes (only 35 seconds is probably needed, but this
      guarantees the bug), and then finally reading the trace which will
      trigger the bug.
      
      This patch fixes the typo and prevents the false positive of that warning.
      Reported-by: default avatarHans J. Koch <hjk@linutronix.de>
      Tested-by: default avatarHans J. Koch <hjk@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Stable Kernel <stable@kernel.org>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      d0134324
    • Deng-Cheng Zhu's avatar
      perf, MIPS: Support cross compiling of tools/perf for MIPS · c1e028ef
      Deng-Cheng Zhu authored
      Changes:
       v4: Fix the cosmetic issue of redundant dot-ops
       v3: Change rmb() to use SYNC
       v2: Include mips unistd.h and define rmb()/cpu_relax() in tools/perf/perf.h
      Signed-off-by: default avatarDeng-Cheng Zhu <dengcheng.zhu@gmail.com>
      Acked-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Cc: David Daney <ddaney@caviumnetworks.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      c1e028ef
    • Jean Delvare's avatar
      drm/radeon/kms: Silent spurious error message · a8c051f0
      Jean Delvare authored
      I see the following error message in my kernel log from time to time:
      radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait
      radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait
      
      After investigation, it turns out that there's nothing to be afraid of
      and everything works as intended. So remove the spurious log message.
      Signed-off-by: default avatarJean Delvare <khali@linux-fr.org>
      Reviewed-by: default avatarJerome Glisse <jglisse@redhat.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      a8c051f0
    • Alex Deucher's avatar
      d31dba58
    • Alex Deucher's avatar
      drm/radeon/kms: make TV/DFP table info less verbose · 40f76d81
      Alex Deucher authored
      Make TV standard and DFP table revisions debug only.
      Signed-off-by: default avatarAlex Deucher <alexdeucher@gmail.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      40f76d81
    • Alex Deucher's avatar
      drm/radeon/kms: leave certain CP int bits enabled · 3555e53b
      Alex Deucher authored
      These bits are used for internal communication and should
      be left enabled.  This may fix s/r issues on some systems.
      Signed-off-by: default avatarAlex Deucher <alexdeucher@gmail.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      3555e53b
    • Jerome Glisse's avatar
      drm/radeon/kms: avoid corner case issue with unmappable vram V2 · c919b371
      Jerome Glisse authored
      We should not allocate any object into unmappable vram if we
      have no means to access them which on all GPU means having the
      CP running and on newer GPU having the blit utility working.
      
      This patch limit the vram allocation to visible vram until
      we have acceleration up and running.
      
      Note that it's more than unlikely that we run into any issue
      related to that as when acceleration is not woring userspace
      should allocate any object in vram beside front buffer which
      should fit in visible vram.
      
      V2 use real_vram_size as mc_vram_size could be bigger than
         the actual amount of vram
      
      [airlied: fixup r700_cp_stop case]
      Signed-off-by: default avatarJerome Glisse <jglisse@redhat.com>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      c919b371
    • John Blackwood's avatar
      perf: Fix incorrect copy_from_user() usage · ad0cf347
      John Blackwood authored
      perf events: repair incorrect use of copy_from_user
      
      This makes the perf_event_period() return 0 instead of
      -EFAULT on success.
      
      Signed-off-by: John Blackwood<john.blackwood@ccur.com>
      Signed-off-by: default avatarJoe Korty <joe.korty@ccur.com>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100928220311.GA18145@tsunami.ccur.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ad0cf347
    • Eric Paris's avatar
      fanotify: disable fanotify syscalls · 7c534773
      Eric Paris authored
      This patch disables the fanotify syscalls by just not building them and
      letting the cond_syscall() statements in kernel/sys_ni.c redirect them
      to sys_ni_syscall().
      
      It was pointed out by Tvrtko Ursulin that the fanotify interface did not
      include an explicit prioritization between groups.  This is necessary
      for fanotify to be usable for hierarchical storage management software,
      as they must get first access to the file, before inotify-like notifiers
      see the file.
      
      This feature can be added in an ABI compatible way in the next release
      (by using a number of bits in the flags field to carry the info) but it
      was suggested by Alan that maybe we should just hold off and do it in
      the next cycle, likely with an (new) explicit argument to the syscall.
      I don't like this approach best as I know people are already starting to
      use the current interface, but Alan is all wise and noone on list backed
      me up with just using what we have.  I feel this is needlessly ripping
      the rug out from under people at the last minute, but if others think it
      needs to be a new argument it might be the best way forward.
      
      Three choices:
      Go with what we got (and implement the new feature next cycle).  Add a
      new field right now (and implement the new feature next cycle).  Wait
      till next cycle to release the ABI (and implement the new feature next
      cycle).  This is number 3.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7c534773
  5. 11 Oct, 2010 4 commits
    • Eric Dumazet's avatar
      tg3: restore rx_dropped accounting · b0057c51
      Eric Dumazet authored
      commit 511d2224 (tg3: 64 bit stats on all arches), overlooked the
      rx_dropped accounting.
      
      We use a full "struct rtnl_link_stats64" to hold rx_dropped value, but
      forgot to report it in tg3_get_stats64().
      
      Use an "unsigned long" instead to shrink "struct tg3" by 176 bytes, and
      report this value to stats readers.
      
      Increment rx_dropped counter for oversized frames.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      CC: Michael Chan <mchan@broadcom.com>
      CC: Matt Carlson <mcarlson@broadcom.com>
      Acked-by: default avatarMatt Carlson <mcarlson@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b0057c51
    • Paul Fertser's avatar
      b44: fix carrier detection on bind · bcf64aa3
      Paul Fertser authored
      For carrier detection to work properly when binding the driver with a cable
      unplugged, netif_carrier_off() should be called after register_netdev(),
      not before.
      Signed-off-by: default avatarPaul Fertser <fercerpav@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bcf64aa3
    • Yinghai Lu's avatar
      x86, numa: For each node, register the memory blocks actually used · 73cf624d
      Yinghai Lu authored
      Russ reported SGI UV is broken recently. He said:
      
      | The SRAT table shows that memory range is spread over two nodes.
      |
      | SRAT: Node 0 PXM 0 100000000-800000000
      | SRAT: Node 1 PXM 1 800000000-1000000000
      | SRAT: Node 0 PXM 0 1000000000-1080000000
      |
      |Previously, the kernel early_node_map[] would show three entries
      |with the proper node.
      |
      |[    0.000000]     0: 0x00100000 -> 0x00800000
      |[    0.000000]     1: 0x00800000 -> 0x01000000
      |[    0.000000]     0: 0x01000000 -> 0x01080000
      |
      |The problem is recent community kernel early_node_map[] shows
      |only two entries with the node 0 entry overlapping the node 1
      |entry.
      |
      |    0: 0x00100000 -> 0x01080000
      |    1: 0x00800000 -> 0x01000000
      
      After looking at the changelog, Found out that it has been broken for a while by
      following commit
      
      |commit 8716273c
      |Author: David Rientjes <rientjes@google.com>
      |Date:   Fri Sep 25 15:20:04 2009 -0700
      |
      |    x86: Export srat physical topology
      
      Before that commit, register_active_regions() is called for every SRAT memory
      entry right away.
      
      Use nodememblk_range[] instead of nodes[] in order to make sure we
      capture the actual memory blocks registered with each node.  nodes[]
      contains an extended range which spans all memory regions associated
      with a node, but that does not mean that all the memory in between are
      included.
      Reported-by: default avatarRuss Anderson <rja@sgi.com>
      Tested-by: default avatarRuss Anderson <rja@sgi.com>
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      LKML-Reference: <4CB27BDF.5000800@kernel.org>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: <stable@kernel.org> 2.6.33 .34 .35 .36
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      73cf624d
    • Kees Cook's avatar
      net: clear heap allocations for privileged ethtool actions · b00916b1
      Kees Cook authored
      Several other ethtool functions leave heap uncleared (potentially) by
      drivers. Some interfaces appear safe (eeprom, etc), in that the sizes
      are well controlled. In some situations (e.g. unchecked error conditions),
      the heap will remain unchanged in areas before copying back to userspace.
      Note that these are less of an issue since these all require CAP_NET_ADMIN.
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarKees Cook <kees.cook@canonical.com>
      Acked-by: default avatarBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b00916b1