1. 01 Jun, 2011 1 commit
    • intel-iommu: Enable super page (2MiB, 1GiB, etc.) support · 6dd9a7c7
      Youquan Song authored
      There are no externally-visible changes with this. In the loop in the
      internal __domain_mapping() function, we simply detect if we are mapping:
        - size >= 2MiB, and
        - virtual address aligned to 2MiB, and
        - physical address aligned to 2MiB, and
        - on hardware that supports superpages.
      
      (and likewise for larger superpages).
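
      The checks listed above can be sketched roughly as follows. This is an
      illustration only, not the code in __domain_mapping(): the helper name
      superpage_level(), its parameters, and the convention that level 1 is
      4KiB, level 2 is 2MiB and level 3 is 1GiB are assumptions made for the
      example.

```c
#include <stdint.h>

/* Each page-table level is assumed to cover 9 more address bits. */
#define LEVEL_STRIDE 9

/*
 * Hypothetical helper: return the largest page-table level (1 = 4KiB,
 * 2 = 2MiB, 3 = 1GiB) usable for a mapping that starts at IOVA page
 * frame iov_pfn and physical page frame phys_pfn and spans nr_pages
 * 4KiB pages. max_level is the largest level the hardware is assumed
 * to support (1 means no superpages).
 */
static int superpage_level(uint64_t iov_pfn, uint64_t phys_pfn,
                           uint64_t nr_pages, int max_level)
{
    /* A low bit set in either address makes the pair misaligned. */
    uint64_t merged = iov_pfn | phys_pfn;
    int level = 1;

    while (level < max_level) {
        /* Number of 4KiB pages covered by one entry at the next level. */
        uint64_t pages_at_next = 1ULL << (LEVEL_STRIDE * level);

        if (merged & (pages_at_next - 1))
            break;              /* virtual or physical address misaligned */
        if (nr_pages < pages_at_next)
            break;              /* mapping too small for that level */
        level++;
    }
    return level;
}
```

      With these assumptions, a 512-page mapping whose IOVA and physical
      page frames are both multiples of 512 yields level 2, while the same
      mapping with either address off by one page falls back to level 1.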
      
      We automatically use a superpage for such mappings. We never have to
      worry about *breaking* superpages, since we trust that we will always
      *unmap* the same range that was mapped. So all we need to do is ensure
      that dma_pte_clear_range() will also cope with superpages.
      
      Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
      it can return a PTE at the appropriate level rather than always
      extending the page tables all the way down to level 1. Again, this is
      simplified by the fact that we should never encounter existing small
      pages when we're creating a mapping; any old mapping that used the same
      virtual range will have been entirely removed and its obsolete page
      tables freed.
      
      Provide an 'intel_iommu=sp_off' argument on the command line as a
      chicken bit. Not that it should ever be required.
      
      ==
      
      The original commit seen in the iommu-2.6.git was Youquan's
      implementation (and completion) of my own half-baked code which I'd
      typed into an email. Followed by half a dozen subsequent 'fixes'.
      
      I've taken the unusual step of rewriting history and collapsing the
      original commits in order to keep the main history simpler, and make
      life easier for the people who are going to have to backport this to
      older kernels. And also so I can give it a more coherent commit comment
      which (hopefully) gives a better explanation of what's going on.
      
      The original sequence of commits leading to identical code was:
      
      Youquan Song (3):
            intel-iommu: super page support
            intel-iommu: Fix superpage alignment calculation error
            intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()
      
      David Woodhouse (4):
            intel-iommu: Precalculate superpage support for dmar_domain
            intel-iommu: Fix hardware_largepage_caps()
            intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
            intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      6dd9a7c7
  2. 24 May, 2011 3 commits
  3. 19 May, 2011 1 commit
  4. 18 May, 2011 22 commits
  5. 17 May, 2011 10 commits
    • cifs: fix cifsConvertToUCS() for the mapchars case · 11379b5e
      Jeff Layton authored
      As Metze pointed out, commit 84cdf74e broke mapchars option:
      
          Commit "cifs: fix unaligned accesses in cifsConvertToUCS"
          (84cdf74e) does multiple steps
          in just one commit (moving the function and changing it without
          testing).
      
          put_unaligned_le16(temp, &target[j]); is never called for any
          codepoint that goes via the 'default' switch statement. As a result
          we put just zero (or maybe uninitialized) bytes into the target
          buffer.
      
      His proposed patch looks correct, but doesn't apply to the current head
      of the tree. This patch should also fix it.
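
      The shape of the problem can be sketched like this. It is a simplified
      illustration, not the cifs code: store_le16() and convert_mapchars()
      are hypothetical names, the input is restricted to single-byte
      characters, and the 0xF000 remap offset is an assumption made for the
      example. The point is that every switch case only computes a value and
      a single store after the switch writes it, so the 'default' path
      cannot skip the store.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store a 16-bit value in little-endian order one byte at a time, so
 * the destination needs no particular alignment.
 */
static void store_le16(uint16_t v, unsigned char *dst)
{
    dst[0] = (unsigned char)(v & 0xff);
    dst[1] = (unsigned char)(v >> 8);
}

/*
 * Convert up to len single-byte characters from src to UTF-16LE in
 * target, remapping ':' and '?' into a private range (hypothetical
 * offset). target must have room for 2 * (len + 1) bytes. Returns the
 * number of 16-bit units written, not counting the terminator.
 */
static size_t convert_mapchars(unsigned char *target, const char *src,
                               size_t len)
{
    size_t i;

    for (i = 0; i < len && src[i]; i++) {
        uint16_t temp;

        switch (src[i]) {
        case ':':
        case '?':
            temp = (uint16_t)(0xF000 + (unsigned char)src[i]);
            break;
        default:
            temp = (unsigned char)src[i];
            break;
        }
        /* One store shared by every case, including 'default'. */
        store_le16(temp, &target[2 * i]);
    }
    store_le16(0, &target[2 * i]);
    return i;
}
```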
      
      Cc: <stable@kernel.org> # .38.x: 581ade4d: cifs: clean up various nits in unicode routines (try #2)
      Reported-by: Stefan Metzmacher <metze@samba.org>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
      11379b5e
    • cifs: add fallback in is_path_accessible for old servers · 221d1d79
      Jeff Layton authored
      The is_path_accessible check uses a QPathInfo call, which isn't
      supported by ancient win9x era servers. Fall back to an older
      SMBQueryInfo call if it fails with the magic error codes.
      
      Cc: stable@kernel.org
      Reported-and-Tested-by: Sandro Bonazzola <sandro.bonazzola@gmail.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
      221d1d79
    • Merge branch 'timers-fixes-for-linus' of... · a085963a
      Linus Torvalds authored
      Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        tick: Clear broadcast active bit when switching to oneshot
        rtc: mc13xxx: Don't call rtc_device_register while holding lock
        rtc: rp5c01: Initialize drvdata before registering device
        rtc: pcap: Initialize drvdata before registering device
        rtc: msm6242: Initialize drvdata before registering device
        rtc: max8998: Initialize drvdata before registering device
        rtc: max8925: Initialize drvdata before registering device
        rtc: m41t80: Initialize clientdata before registering device
        rtc: ds1286: Initialize drvdata before registering device
        rtc: ep93xx: Initialize drvdata before registering device
        rtc: davinci: Initialize drvdata before registering device
        rtc: mxc: Initialize drvdata before registering device
        clocksource: Install completely before selecting
      a085963a
    • x86, AMD: Fix ARAT feature setting again · 14fb57dc
      Borislav Petkov authored
      Trying to enable the local APIC timer on early K8 revisions
      uncovers a number of other issues with it, in conjunction with
      the C1E enter path on AMD. Fixing those causes much more churn
      and troubles than the benefit of using that timer brings so
      don't enable it on K8 at all, falling back to the original
      functionality the kernel had wrt that.
      Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
      Cc: Boris Ostrovsky <Boris.Ostrovsky@amd.com>
      Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Cc: Nick Bowler <nbowler@elliptictech.com>
      Cc: Joerg-Volker-Peetz <jvpeetz@web.de>
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Link: http://lkml.kernel.org/r/1305636919-31165-3-git-send-email-bp@amd64.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      14fb57dc
    • Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors" · 328935e6
      Borislav Petkov authored
      This reverts commit e20a2d20, as it crashes
      certain boxes with specific AMD CPU models.
      
      Moving the lower endpoint of the Erratum 400 check to accommodate
      earlier K8 revisions (A-E) opens a can of worms which is simply
      not worth fixing properly by tweaking the errata checking
      framework:
      
      * missing IntPending MSR on revisions < CG causes #GP:
      
      http://marc.info/?l=linux-kernel&m=130541471818831
      
      * makes earlier revisions use the LAPIC timer instead of the C1E
      idle routine which switches to HPET, thus not waking up in
      deeper C-states:
      
      http://lkml.org/lkml/2011/4/24/20
      
      Therefore, leave the original boundary starting with K8-revF.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      328935e6
    • scsi: remove performance regression due to async queue run · 9937a5e2
      Jens Axboe authored
      Commit c21e6beb removed our queue request_fn re-enter
      protection, and defaulted to always running the queues from
      kblockd to be safe. This was a known potential slow down,
      but should be safe.
      
      Unfortunately this is causing big performance regressions for
      some, so we need to improve this logic. Looking into the details
      of the re-enter, the real issue is on requeue of requests.
      
      Requeue of requests upon seeing a BUSY condition from the device
      ends up re-running the queue, causing traces like this:
      
      scsi_request_fn()
              scsi_dispatch_cmd()
                      scsi_queue_insert()
                              __scsi_queue_insert()
                                      scsi_run_queue()
                                              scsi_request_fn()
                                                      ...
      
      potentially causing the issue we want to avoid. So special
      case the requeue re-run of the queue, but improve it to offload
      the entire run of local queue and starved queue from a single
      workqueue callback. This is a lot better than potentially
      kicking off a workqueue run for each device seen.
      
      This also fixes the issue of the local device going into recursion,
      since the above mentioned commit never moved that queue run out
      of line.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      9937a5e2
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · c1d10d18
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
        net: Change netdev_fix_features messages loglevel
        vmxnet3: Fix inconsistent LRO state after initialization
        sfc: Fix oops in register dump after mapping change
        IPVS: fix netns if reading ip_vs_* procfs entries
        bridge: fix forwarding of IPv6
      c1d10d18
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 477de0de
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
        Revert "mmc: fix a race between card-detect rescan and clock-gate work instances"
      477de0de
    • mm: fix kernel-doc warning in page_alloc.c · b5e6ab58
      Randy Dunlap authored
      Fix new kernel-doc warning in mm/page_alloc.c:
      
        Warning(mm/page_alloc.c:2370): No description found for parameter 'nid'
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b5e6ab58
    • PCI: Clear bridge resource flags if requested size is 0 · 93d2175d
      Yinghai Lu authored
      Found during pci remove/rescan testing:
      
        pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
        pci 0000:c0:03.0:   bridge window [io  0x1000-0x0fff]
        pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
        pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
        pci 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
        pci 0000:c0:03.0: Error enabling bridge (-22), continuing
        pci 0000:c0:03.0: enabling bus mastering
        pci 0000:c0:03.0: setting latency timer to 64
        pcieport 0000:c0:03.0: device not available (can't reserve [io  0x1000-0x0fff])
        pcieport: probe of 0000:c0:03.0 failed with error -22
      
      This bug was caused by commit c8adf9a3 ("PCI: pre-allocate
      additional resources to devices only after successful allocation of
      essential resources.")
      
      After that commit, pci_hotplug_io_size is used as additional_io_size
      instead of as the minimum size.  So the resource will not go through
      the resource_size(res) != 0 path, and will not be reset.
      
      The root cause is that pci_bridge_check_ranges will set the RESOURCE_IO
      flag for the pci bridge, and if the children later turn out not to need
      IO resources, those bridge resources do not need to be allocated, but
      the flags are still there.  That will confuse pci_enable_bridges later.
      
      related code:
      
         static void assign_requested_resources_sorted(struct resource_list *head,
                                          struct resource_list_x *fail_head)
         {
                 struct resource *res;
                 struct resource_list *list;
                 int idx;
      
                 for (list = head->next; list; list = list->next) {
                         res = list->res;
                         idx = res - &list->dev->resource[0];
                         if (resource_size(res) && pci_assign_resource(list->dev, idx)) {
         ...
                                 reset_resource(res);
                         }
                 }
         }
      
      Finally, we have to clear the flags in pbus_size_mem/io when the
      requested size == 0 and !add_head, because in this case the resource
      will not go through adjust_resources_sorted().
      
      Just make size1 = size0 when !add_head; that gets the flags cleared.
      
      At the same time, a resource with requested size == 0 and add_size != 0
      will still be kept in head and add_list, because we do not clear the
      flags for it.
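
      The decision described above can be sketched as follows. This is a
      simplified model, not the pbus_size_io() code: struct bridge_window,
      size_bridge_window(), the plain addition used for size1, and the
      fixed 0x1000 window start are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a bridge window resource. */
struct bridge_window {
    unsigned long flags;
    unsigned long start;
    unsigned long end;
};

/*
 * size0 is the size the children require, add_size is the optional
 * extra size, and have_add_list says whether the caller passed a list
 * for optional requests (add_head in the text above). Returns true if
 * the window is kept, false if it was disabled.
 */
static bool size_bridge_window(struct bridge_window *win,
                               unsigned long size0,
                               unsigned long add_size,
                               bool have_add_list)
{
    /* Without an add list nobody will handle add_size: size1 = size0. */
    unsigned long size1 = have_add_list ? size0 + add_size : size0;

    if (!size0 && !size1) {
        /* Nothing requested and nothing optional: disable the window. */
        win->flags = 0;
        return false;
    }

    /* Assumed start; size0 == 0 gives an empty 0x1000-0x0fff range. */
    win->start = 0x1000;
    win->end = win->start + size0 - 1;
    return true;
}
```

      In this model a window with size0 == 0 and no add list has its flags
      cleared, while the same window with an add list and add_size != 0
      keeps its flags so the optional request can still be processed.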
      
      After this, we will get right result:
      
        pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
        pci 0000:c0:03.0:   bridge window [io  disabled]
        pci 0000:c0:03.0:   bridge window [mem 0xf0000000-0xf00fffff]
        pci 0000:c0:03.0:   bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
        pci 0000:c0:03.0: enabling bus mastering
        pci 0000:c0:03.0: setting latency timer to 64
        pcieport 0000:c0:03.0: setting latency timer to 64
        pcieport 0000:c0:03.0: irq 160 for MSI/MSI-X
        pcieport 0000:c0:03.0: Signaling PME through PCIe PME interrupt
        pci 0000:c4:00.0: Signaling PME through PCIe PME interrupt
        pcie_pme 0000:c0:03.0:pcie01: service driver pcie_pme loaded
        aer 0000:c0:03.0:pcie02: service driver aer loaded
        pciehp 0000:c0:03.0:pcie04: Hotplug Controller:
      
      v3: simpler fix; also fix one typo in pbus_size_mem
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Reviewed-by: Ram Pai <linuxram@us.ibm.com>
      Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      93d2175d
  6. 16 May, 2011 3 commits
    • tick: Clear broadcast active bit when switching to oneshot · 07f4beb0
      Thomas Gleixner authored
      The first cpu which switches from periodic to oneshot mode switches
      also the broadcast device into oneshot mode. The broadcast device
      serves as a backup for per cpu timers which stop in deeper
      C-states. To avoid starvation of the cpus which might be in idle and
      depend on broadcast mode it marks the other cpus as broadcast active
      and sets the broadcast expiry value of those cpus to the next tick.
      
      The oneshot mode broadcast bit for the other cpus is sticky and gets
      only cleared when those cpus exit idle. If a cpu was not idle while
      the bit got set in consequence the bit prevents that the broadcast
      device is armed on behalf of that cpu when it enters idle for the
      first time after it switched to oneshot mode.
      
      In most cases that goes unnoticed as one of the other cpus has usually
      a timer pending which keeps the broadcast device armed with a short
      timeout. Now if the only cpu which has a short timer active has the
      bit set then the broadcast device will not be armed on behalf of that
      cpu and will fire way after the expected timer expiry. In the case of
      Christian's bug report it took ~145 seconds which is about half of the
      wrap around time of HPET (the limit for that device) due to the fact
      that all other cpus had no timers armed which expired before the 145
      seconds timeframe.
      
      The solution is simply to clear the broadcast active bit
      unconditionally when a cpu switches to oneshot mode after the first
      cpu switched the broadcast device over. It's not idle at that point
      otherwise it would not be executing that code.
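
      The bit handling can be modelled with a plain bitmask. This is a toy
      sketch rather than the tick broadcast code: the 64-cpu mask and the
      three function names are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* One bit per cpu (cpu < 64 assumed); set means "broadcast active". */
static uint64_t broadcast_oneshot_mask;

/*
 * The first cpu to switch to oneshot mode marks every other cpu that
 * is still in periodic mode as broadcast active.
 */
static void first_cpu_switch_to_oneshot(unsigned int cpu,
                                        uint64_t periodic_cpus)
{
    broadcast_oneshot_mask |= periodic_cpus & ~(1ULL << cpu);
}

/*
 * A later cpu switching to oneshot mode is running this code, so it
 * cannot be idle: clear its bit unconditionally.
 */
static void cpu_switch_to_oneshot(unsigned int cpu)
{
    broadcast_oneshot_mask &= ~(1ULL << cpu);
}

/*
 * Returns true if entering idle arms the broadcast device on behalf of
 * this cpu. A stale bit makes this return false, which is the failure
 * described above.
 */
static bool cpu_enter_idle(unsigned int cpu)
{
    uint64_t bit = 1ULL << cpu;

    if (broadcast_oneshot_mask & bit)
        return false;
    broadcast_oneshot_mask |= bit;
    return true;
}
```

      In this model, a cpu that went through cpu_switch_to_oneshot() gets
      the broadcast device armed on its first idle entry, while a cpu whose
      bit was left set does not.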
      
      [ I fundamentally hate that broadcast crap. Why the heck thought some
        folks that when going into deep idle it's a brilliant concept to
        switch off the last device which brings the cpu back from that
        state? ]
      
      Thanks to Christian for providing all the valuable debug information!
      Reported-and-tested-by: Christian Hoffmann <email@christianhoffmann.info>
      Cc: John Stultz <johnstul@us.ibm.com>
      Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
      Cc: stable@kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      07f4beb0
    • net: Change netdev_fix_features messages loglevel · 6f404e44
      Michał Mirosław authored
      Those reduced to DEBUG can possibly be triggered by unprivileged processes
      and are nothing exceptional. Illegal checksum combinations can only be
      caused by driver bug, so promote those messages to WARN.
      
      Since GSO without SG will now only cause DEBUG message from
      netdev_fix_features(), remove the workaround from register_netdevice().
      Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f404e44
    • vmxnet3: Fix inconsistent LRO state after initialization · ebde6f8a
      Thomas Jarosch authored
      During initialization of vmxnet3, the state of LRO
      gets out of sync with netdev->features.
      
      This leads to very poor TCP performance in an IP forwarding
      setup and is hitting many VMware users.
      
      Simplified call sequence:
      1. vmxnet3_declare_features() initializes "adapter->lro" to true.
      
      2. The kernel automatically disables LRO if IP forwarding is enabled,
      so vmxnet3_set_flags() gets called. This also updates netdev->features.
      
      3. Now vmxnet3_setup_driver_shared() is called. "adapter->lro" is still
      set to true and LRO gets enabled again, even though
      netdev->features shows it's disabled.
      
      Fix it by updating "adapter->lro", too.
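
      A minimal model of the fix, assuming hypothetical types and names
      (FEATURE_LRO, struct netdev_state, struct adapter_state, set_lro(),
      setup_shared_enables_lro()) rather than the real vmxnet3 structures:
      whenever the feature bit changes, the private flag changes with it,
      so a later setup step that consults the private flag sees the same
      state.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEATURE_LRO (1u << 0)   /* hypothetical feature bit */

/* Hypothetical stand-in for the generic device feature state. */
struct netdev_state {
    uint32_t features;
};

/* Hypothetical stand-in for the driver's private adapter state. */
struct adapter_state {
    bool lro;
    struct netdev_state *netdev;
};

/* Update the feature bit and the private flag together. */
static void set_lro(struct adapter_state *a, bool enable)
{
    if (enable)
        a->netdev->features |= FEATURE_LRO;
    else
        a->netdev->features &= ~FEATURE_LRO;

    a->lro = enable;    /* keep the private copy in sync */
}

/* The later setup step decides from the private flag only. */
static bool setup_shared_enables_lro(const struct adapter_state *a)
{
    return a->lro;
}
```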
      
      The private vmxnet3 adapter flags are scheduled for removal
      in net-next, see commit a0d2730c
      "net: vmxnet3: convert to hw_features".
      
      Patch applies to 2.6.37 / 2.6.38 and 2.6.39-rc6.
      
      Please CC: comments.
      Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
      Acked-by: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ebde6f8a