1. 17 Apr, 2015 2 commits
    • Roy Franz's avatar
      x86/efi: Store upper bits of command line buffer address in ext_cmd_line_ptr · 98b228f5
      Roy Franz authored
      Until now, the EFI stub was only setting the 32 bit cmd_line_ptr in
      the setup_header structure, so on 64 bit platforms this could be truncated.
      This patch adds setting the upper bits of the buffer address in
      ext_cmd_line_ptr.  This case was likely never hit, as the allocation
      for this buffer is done at the lowest available address.  Only
      x86_64 kernels have this problem, as the 1-1 mapping mandated
      by EFI ensures that all memory is 32 bit addressable on 32 bit
      platforms.  The EFI stub does not support mixed mode, so the
      32 bit kernel on 64 bit firmware case does not need to be handled.
      Signed-off-by: default avatarRoy Franz <roy.franz@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      98b228f5
    • Ross Lagerwall's avatar
      efivarfs: Ensure VariableName is NUL-terminated · c57dcb56
      Ross Lagerwall authored
      Some buggy firmware implementations update VariableNameSize on success
      such that it does not include the final NUL character which results in
      garbage in the efivarfs name entries.  Use kzalloc on the efivar_entry
      (as is done in efivars.c) to ensure that the name is always
      NUL-terminated.
      
      The buggy firmware is:
      BIOS Information
              Vendor: Intel Corp.
              Version: S1200RP.86B.02.02.0005.102320140911
              Release Date: 10/23/2014
              BIOS Revision: 4.6
      System Information
              Manufacturer: Intel Corporation
              Product Name: S1200RP_SE
      Signed-off-by: default avatarRoss Lagerwall <ross.lagerwall@citrix.com>
      Acked-by: default avatarMatthew Garrett <mjg59@coreos.com>
      Cc: Jeremy Kerr <jk@ozlabs.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      c57dcb56
  2. 27 Mar, 2015 1 commit
    • Jean Delvare's avatar
      firmware: dmi_scan: Prevent dmi_num integer overflow · bfbaafae
      Jean Delvare authored
      dmi_num is a u16, dmi_len is a u32, so this construct:
      
      	dmi_num = dmi_len / 4;
      
      would result in an integer overflow for a DMI table larger than
      256 kB. I've never see such a large table so far, but SMBIOS 3.0
      makes it possible so maybe we'll see such tables in the future.
      
      So instead of faking a structure count when the entry point does
      not provide it, adjust the loop condition in dmi_table() to properly
      deal with the case where dmi_num is not set.
      
      This bug was introduced with the initial SMBIOS 3.0 support in commit
      fc430262 ("dmi: add support for SMBIOS 3.0 64-bit entry point").
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Cc: Matt Fleming <matt.fleming@intel.com>
      Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Cc: <stable@vger.kernel.org>
      Acked-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      bfbaafae
  3. 24 Feb, 2015 2 commits
    • Ivan Khoronzhuk's avatar
      firmware: dmi_scan: Fix dmi_len type · 6d9ff473
      Ivan Khoronzhuk authored
      According to SMBIOSv3 specification the length of DMI table can be
      up to 32bits wide. So use appropriate type to avoid overflow.
      
      It's obvious that dmi_num theoretically can be more than u16 also,
      so it's can be changed to u32 or at least it's better to use int
      instead of u16, but on that moment I cannot imagine dmi structure
      count more than 65535 and it can require changing type of vars that
      work with it. So I didn't correct it.
      Acked-by: default avatarArd Biesheuvel <ard@linaro.org>
      Signed-off-by: default avatarIvan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      6d9ff473
    • Yinghai Lu's avatar
      efi/libstub: Fix boundary checking in efi_high_alloc() · 7ed620bb
      Yinghai Lu authored
      While adding support loading kernel and initrd above 4G to grub2 in legacy
      mode, I was referring to efi_high_alloc().
      That will allocate buffer for kernel and then initrd, and initrd will
      use kernel buffer start as limit.
      
      During testing found two buffers will be overlapped when initrd size is
      very big like 400M.
      
      It turns out efi_high_alloc() boundary checking is not right.
      end - size will be the new start, and should not compare new
      start with max, we need to make sure end is smaller than max.
      
      [ Basically, with the current efi_high_alloc() code it's possible to
        allocate memory above 'max', because efi_high_alloc() doesn't check
        that the tail of the allocation is below 'max'.
      
        If you have an EFI memory map with a single entry that looks like so,
      
         [0xc0000000-0xc0004000]
      
        And want to allocate 0x3000 bytes below 0xc0003000 the current code
        will allocate [0xc0001000-0xc0004000], not [0xc0000000-0xc0003000]
        like you would expect. - Matt ]
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Reviewed-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Tested-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      7ed620bb
  4. 18 Feb, 2015 2 commits
    • Ivan Khoronzhuk's avatar
      firmware: dmi_scan: Fix dmi scan to handle "End of Table" structure · ce204e9a
      Ivan Khoronzhuk authored
      The dmi-sysfs should create "End of Table" entry, that is type 127. But
      after adding initial SMBIOS v3 support fc430262 ("dmi: add support
      for SMBIOS 3.0 64-bit entry point") the 127-0 entry is not handled any
      more, as result it's not created in dmi sysfs for instance. This is
      important because the size of whole DMI table must correspond to sum of
      all DMI entry sizes.
      
      So move the end-of-table check after it's handled by dmi_table.
      Reviewed-by: default avatarArd Biesheuvel <ard@linaro.org>
      Signed-off-by: default avatarIvan Khoronzhuk <ivan.khoronzhuk@linaro.org>
      Cc: <stable@vger.kernel.org> # v3.19
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      ce204e9a
    • Matt Fleming's avatar
      Revert "efi/libstub: Call get_memory_map() to obtain map and desc sizes" · 43a9f696
      Matt Fleming authored
      This reverts commit d1a8d66b.
      
      Ard reported a boot failure when running UEFI under Qemu and Xen and
      experimenting with various Tianocore build options,
      
       "As it turns out, when allocating room for the UEFI memory map using
        UEFI's AllocatePool (), it may result in two new memory map entries
        being created, for instance, when using Tianocore's preallocated region
        feature. For example, the following region
      
        0x00005ead5000-0x00005ebfffff [Conventional Memory|   |  |  |  |  |WB|WT|WC|UC]
      
        may be split like this
      
        0x00005ead5000-0x00005eae2fff [Conventional Memory|   |  |  |  |  |WB|WT|WC|UC]
        0x00005eae3000-0x00005eae4fff [Loader Data        |   |  |  |  |  |WB|WT|WC|UC]
        0x00005eae5000-0x00005ebfffff [Conventional Memory|   |  |  |  |  |WB|WT|WC|UC]
      
        if the preallocated Loader Data region was chosen to be right in the
        middle of the original free space.
      
        After patch d1a8d66b ("efi/libstub: Call get_memory_map() to
        obtain map and desc sizes"), this is not being dealt with correctly
        anymore, as the existing logic to allocate room for a single additional
        entry has become insufficient."
      
      Mark requested to reinstate the old loop we had before commit
      d1a8d66b, which grows the memory map buffer until it's big enough to
      hold the EFI memory map.
      Acked-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      43a9f696
  5. 13 Feb, 2015 1 commit
    • Matt Fleming's avatar
      x86/efi: Avoid triple faults during EFI mixed mode calls · 96738c69
      Matt Fleming authored
      Andy pointed out that if an NMI or MCE is received while we're in the
      middle of an EFI mixed mode call a triple fault will occur. This can
      happen, for example, when issuing an EFI mixed mode call while running
      perf.
      
      The reason for the triple fault is that we execute the mixed mode call
      in 32-bit mode with paging disabled but with 64-bit kernel IDT handlers
      installed throughout the call.
      
      At Andy's suggestion, stop playing the games we currently do at runtime,
      such as disabling paging and installing a 32-bit GDT for __KERNEL_CS. We
      can simply switch to the __KERNEL32_CS descriptor before invoking
      firmware services, and run in compatibility mode. This way, if an
      NMI/MCE does occur the kernel IDT handler will execute correctly, since
      it'll jump to __KERNEL_CS automatically.
      
      However, this change is only possible post-ExitBootServices(). Before
      then the firmware "owns" the machine and expects for its 32-bit IDT
      handlers to be left intact to service interrupts, etc.
      
      So, we now need to distinguish between early boot and runtime
      invocations of EFI services. During early boot, we need to restore the
      GDT that the firmware expects to be present. We can only jump to the
      __KERNEL32_CS code segment for mixed mode calls after ExitBootServices()
      has been invoked.
      
      A liberal sprinkling of comments in the thunking code should make the
      differences in early and late environments more apparent.
      Reported-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Tested-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      96738c69
  6. 29 Jan, 2015 1 commit
    • Ingo Molnar's avatar
      Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/efi · 3c01b74e
      Ingo Molnar authored
      Pull EFI updates from Matt Fleming:
      
      " - Move efivarfs from the misc filesystem section to pseudo filesystem,
          since that's a more logical and accurate place - Leif Lindholm
      
        - Update efibootmgr URL in Kconfig help - Peter Jones
      
        - Improve accuracy of EFI guid function names - Borislav Petkov
      
        - Expose firmware platform size in sysfs for the benefit of EFI boot
          loader installers and other utilities - Steve McIntyre
      
        - Cleanup __init annotations for arm64/efi code - Ard Biesheuvel
      
        - Mark the UIE as unsupported for rtc-efi - Ard Biesheuvel
      
        - Fix memory leak in error code path of runtime map code - Dan Carpenter
      
        - Improve robustness of get_memory_map() by removing assumptions on the
          size of efi_memory_desc_t (which could change in future spec
          versions) and querying the firmware instead of guessing about the
          memmap size - Ard Biesheuvel
      
        - Remove superfluous guid unparse calls - Ivan Khoronzhuk
      
        - Delete unnecessary chosen@0 DT node FDT code since was duplicated
          from code in drivers/of and is entirely unnecessary - Leif Lindholm
      
         There's nothing super scary, mainly cleanups, and a merge from Ricardo who
         kindly picked up some patches from the linux-efi mailing list while I
         was out on annual leave in December.
      
         Perhaps the biggest risk is the get_memory_map() change from Ard, which
         changes the way that both the arm64 and x86 EFI boot stub build the
         early memory map. It would be good to have it bake in linux-next for a
         while.
      "
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      3c01b74e
  7. 28 Jan, 2015 1 commit
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · c59c961c
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "This feels larger than I'd like but its for three reasons.
      
         a) amdkfd finalising the API more, this is a new feature introduced
            last merge window, and I'd prefer to make the tweaks to the API
            before it first gets into a stable release.
      
         b) radeon regression required splitting an internal API to fix
            properly, so it just changed a few more lines
      
         c) vmwgfx fix changes a lock from a mutex->spin lock, this is fallout
            from the new sleep checking.
      
        Otherwise there is just some tda998x fixes"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/radeon: Remove rdev->gart.pages_addr array
        drm/radeon: Restore GART table contents after pinning it in VRAM v3
        drm/radeon: Split off gart_get_page_entry ASIC hook from set_page_entry
        drm/amdkfd: Fix bug in call to init_pipelines()
        drm/amdkfd: Fix bug in pipelines initialization
        drm/radeon: Don't increment pipe_id in kgd_init_pipeline
        drm/i2c: tda998x: set the CEC I2C address based on the slave I2C address
        drm/vmwgfx: Replace the hw mutex with a hw spinlock
        drm/amdkfd: Allow user to limit only queues per device
        drm/amdkfd: PQM handle queue creation fault
        drm: tda998x: Fix EDID read timeout on HDMI connect
        drm: tda998x: Protect the page register
      c59c961c
  8. 27 Jan, 2015 30 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 59343cd7
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Don't OOPS on socket AIO, from Christoph Hellwig.
      
       2) Scheduled scans should be aborted upon RFKILL, from Emmanuel
          Grumbach.
      
       3) Fix sleep in atomic context in kvaser_usb, from Ahmed S Darwish.
      
       4) Fix RCU locking across copy_to_user() in bpf code, from Alexei
          Starovoitov.
      
       5) Lots of crash, memory leak, short TX packet et al bug fixes in
          sh_eth from Ben Hutchings.
      
       6) Fix memory corruption in SCTP wrt.  INIT collitions, from Daniel
          Borkmann.
      
       7) Fix return value logic for poll handlers in netxen, enic, and bnx2x.
          From Eric Dumazet and Govindarajulu Varadarajan.
      
       8) Header length calculation fix in mac80211 from Fred Chou.
      
       9) mv643xx_eth doesn't handle highmem correctly in non-TSO code paths.
          From Ezequiel Garcia.
      
      10) udp_diag has bogus logic in it's hash chain skipping, copy same fix
          tcp diag used.  From Herbert Xu.
      
      11) amd-xgbe programs wrong rx flow control register, from Thomas
          Lendacky.
      
      12) Fix race leading to use after free in ping receive path, from Subash
          Abhinov Kasiviswanathan.
      
      13) Cache redirect routes otherwise we can get a heavy backlog of rcu
          jobs liberating DST_NOCACHE entries.  From Hannes Frederic Sowa.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
        net: don't OOPS on socket aio
        stmmac: prevent probe drivers to crash kernel
        bnx2x: fix napi poll return value for repoll
        ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too
        sh_eth: Fix DMA-API usage for RX buffers
        sh_eth: Check for DMA mapping errors on transmit
        sh_eth: Ensure DMA engines are stopped before freeing buffers
        sh_eth: Remove RX overflow log messages
        ping: Fix race in free in receive path
        udp_diag: Fix socket skipping within chain
        can: kvaser_usb: Fix state handling upon BUS_ERROR events
        can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT
        can: kvaser_usb: Send correct context to URB completion
        can: kvaser_usb: Do not sleep in atomic context
        ipv4: try to cache dst_entries which would cause a redirect
        samples: bpf: relax test_maps check
        bpf: rcu lock must not be held when calling copy_to_user()
        net: sctp: fix slab corruption from use after free on INIT collisions
        net: mv643xx_eth: Fix highmem support in non-TSO egress path
        sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers
        ...
      59343cd7
    • Christoph Hellwig's avatar
      net: don't OOPS on socket aio · 06539d30
      Christoph Hellwig authored
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      06539d30
    • Andy Shevchenko's avatar
      stmmac: prevent probe drivers to crash kernel · 9afec6ef
      Andy Shevchenko authored
      In the case when alloc_netdev fails we return NULL to a caller. But there is no
      check for NULL in the probe drivers. This patch changes NULL to an error
      pointer. The function description is amended to reflect what we may get
      returned.
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9afec6ef
    • Linus Torvalds's avatar
      Merge tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux · 7da323bb
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Two powerpc fixes"
      
      * tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
        powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared
        powerpc/xmon: Fix another endiannes issue in RTAS call from xmon
      7da323bb
    • Linus Torvalds's avatar
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · 41592e2f
      Linus Torvalds authored
      Pull one more module fix from Rusty Russell:
       "SCSI was using module_refcount() to figure out when the module was
        unloading: this broke with new atomic refcounting.  The code is still
        suspicious, but this solves the WARN_ON()"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
        scsi: always increment reference count
      41592e2f
    • Govindarajulu Varadarajan's avatar
      bnx2x: fix napi poll return value for repoll · 24e579c8
      Govindarajulu Varadarajan authored
      With the commit d75b1ade ("net: less interrupt masking in NAPI") napi
      repoll is done only when work_done == budget. When in busy_poll is we return 0
      in napi_poll. We should return budget.
      Signed-off-by: default avatarGovindarajulu Varadarajan <_govind@gmx.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      24e579c8
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · bf693f7b
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      ipsec 2015-01-26
      
      Just two small fixes for _decode_session6() where we
      might decode to wrong header information in some rare
      situations.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bf693f7b
    • Hannes Frederic Sowa's avatar
      ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too · 6e9e16e6
      Hannes Frederic Sowa authored
      Lubomir Rintel reported that during replacing a route the interface
      reference counter isn't correctly decremented.
      
      To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>:
      | [root@rhel7-5 lkundrak]# sh -x lal
      | + ip link add dev0 type dummy
      | + ip link set dev0 up
      | + ip link add dev1 type dummy
      | + ip link set dev1 up
      | + ip addr add 2001:db8:8086::2/64 dev dev0
      | + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20
      | + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10
      | + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20
      | + ip link del dev0 type dummy
      | Message from syslogd@rhel7-5 at Jan 23 10:54:41 ...
      |  kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2
      |
      | Message from syslogd@rhel7-5 at Jan 23 10:54:51 ...
      |  kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2
      
      During replacement of a rt6_info we must walk all parent nodes and check
      if the to be replaced rt6_info got propagated. If so, replace it with
      an alive one.
      
      Fixes: 4a287eba ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag")
      Reported-by: default avatarLubomir Rintel <lkundrak@v3.sk>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Tested-by: default avatarLubomir Rintel <lkundrak@v3.sk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6e9e16e6
    • David S. Miller's avatar
      Merge branch 'sh_eth' · 22577609
      David S. Miller authored
      Ben Hutchings says:
      
      ====================
      Fixes for sh_eth #3
      
      I'm continuing review and testing of Ethernet support on the R-Car H2
      chip.  This series fixes the last of the more serious issues I've found.
      
      These are not tested on any of the other supported chips.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      22577609
    • Ben Hutchings's avatar
      sh_eth: Fix DMA-API usage for RX buffers · 52b9fa36
      Ben Hutchings authored
      - Use the return value of dma_map_single(), rather than calling
        virt_to_page() separately
      - Check for mapping failue
      - Call dma_unmap_single() rather than dma_sync_single_for_cpu()
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      52b9fa36
    • Ben Hutchings's avatar
      sh_eth: Check for DMA mapping errors on transmit · aa3933b8
      Ben Hutchings authored
      dma_map_single() may fail if an IOMMU or swiotlb is in use, so
      we need to check for this.
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aa3933b8
    • Ben Hutchings's avatar
      sh_eth: Ensure DMA engines are stopped before freeing buffers · 740c7f31
      Ben Hutchings authored
      Currently we try to clear EDRRR and EDTRR and immediately continue to
      free buffers.  This is unsafe because:
      
      - In general, register writes are not serialised with DMA, so we still
        have to wait for DMA to complete somehow
      - The R8A7790 (R-Car H2) manual states that the TX running flag cannot
        be cleared by writing to EDTRR
      - The same manual states that clearing the RX running flag only stops
        RX DMA at the next packet boundary
      
      I applied this patch to the driver to detect DMA writes to freed
      buffers:
      
      > --- a/drivers/net/ethernet/renesas/sh_eth.c
      > +++ b/drivers/net/ethernet/renesas/sh_eth.c
      > @@ -1098,7 +1098,14 @@ static void sh_eth_ring_free(struct net_device *ndev)
      >  	/* Free Rx skb ringbuffer */
      >  	if (mdp->rx_skbuff) {
      >  		for (i = 0; i < mdp->num_rx_ring; i++)
      > +			memcpy(mdp->rx_skbuff[i]->data,
      > +			       "Hello, world", 12);
      > +		msleep(100);
      > +		for (i = 0; i < mdp->num_rx_ring; i++) {
      > +			WARN_ON(memcmp(mdp->rx_skbuff[i]->data,
      > +				       "Hello, world", 12));
      >  			dev_kfree_skb(mdp->rx_skbuff[i]);
      > +		}
      >  	}
      >  	kfree(mdp->rx_skbuff);
      >  	mdp->rx_skbuff = NULL;
      
      then ran the loop:
      
          while ethtool -G eth0 rx 128 ; ethtool -G eth0 rx 64; do echo -n .; done
      
      and 'ping -f' toward the sh_eth port from another machine.  The
      warning fired several times a minute.
      
      To fix these issues:
      
      - Deactivate all TX descriptors rather than writing to EDTRR
      - As there seems to be no way of telling when RX DMA is stopped,
        perform a soft reset to ensure that both DMA enginess are stopped
      - To reduce the possibility of the reset truncating a transmitted
        frame, disable egress and wait a reasonable time to reach a
        packet boundary before resetting
      - Update statistics before resetting
      
      (The 'reasonable time' does not allow for CS/CD in half-duplex
      mode, but half-duplex no longer seems reasonable!)
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      740c7f31
    • Ben Hutchings's avatar
      sh_eth: Remove RX overflow log messages · dc1d0e6d
      Ben Hutchings authored
      If RX traffic is overflowing the FIFO or DMA ring, logging every time
      this happens just makes things worse.  These errors are visible in the
      statistics anyway.
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dc1d0e6d
    • David S. Miller's avatar
      Merge tag 'linux-can-fixes-for-3.19-20150127' of... · 8d8d67f1
      David S. Miller authored
      Merge tag 'linux-can-fixes-for-3.19-20150127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2015-01-27
      
      this is another pull request for net/master which consists of 4 patches.
      
      All 4 patches are contributed by Ahmed S. Darwish, he fixes more problems in
      the kvaser_usb driver.
      
      David, please merge net/master to net-next/master, as we have more kvaser_usb
      patches in the queue, that target net-next.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8d8d67f1
    • subashab@codeaurora.org's avatar
      ping: Fix race in free in receive path · fc752f1f
      subashab@codeaurora.org authored
      An exception is seen in ICMP ping receive path where the skb
      destructor sock_rfree() tries to access a freed socket. This happens
      because ping_rcv() releases socket reference with sock_put() and this
      internally frees up the socket. Later icmp_rcv() will try to free the
      skb and as part of this, skb destructor is called and which leads
      to a kernel panic as the socket is freed already in ping_rcv().
      
      -->|exception
      -007|sk_mem_uncharge
      -007|sock_rfree
      -008|skb_release_head_state
      -009|skb_release_all
      -009|__kfree_skb
      -010|kfree_skb
      -011|icmp_rcv
      -012|ip_local_deliver_finish
      
      Fix this incorrect free by cloning this skb and processing this cloned
      skb instead.
      
      This patch was suggested by Eric Dumazet
      Signed-off-by: default avatarSubash Abhinov Kasiviswanathan <subashab@codeaurora.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fc752f1f
    • Herbert Xu's avatar
      udp_diag: Fix socket skipping within chain · 86f3cddb
      Herbert Xu authored
      While working on rhashtable walking I noticed that the UDP diag
      dumping code is buggy.  In particular, the socket skipping within
      a chain never happens, even though we record the number of sockets
      that should be skipped.
      
      As this code was supposedly copied from TCP, this patch does what
      TCP does and resets num before we walk a chain.
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Acked-by: default avatarPavel Emelyanov <xemul@parallels.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      86f3cddb
    • Ahmed S. Darwish's avatar
      can: kvaser_usb: Fix state handling upon BUS_ERROR events · e638642b
      Ahmed S. Darwish authored
      While being in an ERROR_WARNING state, and receiving further
      bus error events with error counters still in the ERROR_WARNING
      range of 97-127 inclusive, the state handling code erroneously
      reverts back to ERROR_ACTIVE.
      
      Per the CAN standard, only revert to ERROR_ACTIVE when the
      error counters are less than 96.
      
      Moreover, in certain Kvaser models, the BUS_ERROR flag is
      always set along with undefined bits in the M16C status
      register. Thus use bitwise operators instead of full equality
      for checking that register against bus errors.
      Signed-off-by: default avatarAhmed S. Darwish <ahmed.darwish@valeo.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      e638642b
    • Ahmed S. Darwish's avatar
      can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT · 14c10c2a
      Ahmed S. Darwish authored
      On some x86 laptops, plugging a Kvaser device again after an
      unplug makes the firmware always ignore the very first command.
      For such a case, provide some room for retries instead of
      completely exiting the driver init code.
      Signed-off-by: default avatarAhmed S. Darwish <ahmed.darwish@valeo.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      14c10c2a
    • Ahmed S. Darwish's avatar
      can: kvaser_usb: Send correct context to URB completion · 3803fa69
      Ahmed S. Darwish authored
      Send expected argument to the URB completion hander: a CAN
      netdevice instead of the network interface private context
      `kvaser_usb_net_priv'.
      
      This was discovered by having some garbage in the kernel
      log in place of the netdevice names: can0 and can1.
      Signed-off-by: default avatarAhmed S. Darwish <ahmed.darwish@valeo.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      3803fa69
    • Ahmed S. Darwish's avatar
      can: kvaser_usb: Do not sleep in atomic context · ded50066
      Ahmed S. Darwish authored
      Upon receiving a hardware event with the BUS_RESET flag set,
      the driver kills all of its anchored URBs and resets all of
      its transmit URB contexts.
      
      Unfortunately it does so under the context of URB completion
      handler `kvaser_usb_read_bulk_callback()', which is often
      called in an atomic context.
      
      While the device is flooded with many received error packets,
      usb_kill_urb() typically sleeps/reschedules till the transfer
      request of each killed URB in question completes, leading to
      the sleep in atomic bug. [3]
      
      In v2 submission of the original driver patch [1], it was
      stated that the URBs kill and tx contexts reset was needed
      since we don't receive any tx acknowledgments later and thus
      such resources will be locked down forever. Fortunately this
      is no longer needed since an earlier bugfix in this patch
      series is now applied: all tx URB contexts are reset upon CAN
      channel close. [2]
      
      Moreover, a BUS_RESET is now treated _exactly_ like a BUS_OFF
      event, which is the recommended handling method advised by
      the device manufacturer.
      
      [1] http://article.gmane.org/gmane.linux.network/239442
          http://www.webcitation.org/6Vr2yagAQ
      
      [2] can: kvaser_usb: Reset all URB tx contexts upon channel close
          889b77f7
      
      [3] Stacktrace:
      
       <IRQ>  [<ffffffff8158de87>] dump_stack+0x45/0x57
       [<ffffffff8158b60c>] __schedule_bug+0x41/0x4f
       [<ffffffff815904b1>] __schedule+0x5f1/0x700
       [<ffffffff8159360a>] ? _raw_spin_unlock_irqrestore+0xa/0x10
       [<ffffffff81590684>] schedule+0x24/0x70
       [<ffffffff8147d0a5>] usb_kill_urb+0x65/0xa0
       [<ffffffff81077970>] ? prepare_to_wait_event+0x110/0x110
       [<ffffffff8147d7d8>] usb_kill_anchored_urbs+0x48/0x80
       [<ffffffffa01f4028>] kvaser_usb_unlink_tx_urbs+0x18/0x50 [kvaser_usb]
       [<ffffffffa01f45d0>] kvaser_usb_rx_error+0xc0/0x400 [kvaser_usb]
       [<ffffffff8108b14a>] ? vprintk_default+0x1a/0x20
       [<ffffffffa01f5241>] kvaser_usb_read_bulk_callback+0x4c1/0x5f0 [kvaser_usb]
       [<ffffffff8147a73e>] __usb_hcd_giveback_urb+0x5e/0xc0
       [<ffffffff8147a8a1>] usb_hcd_giveback_urb+0x41/0x110
       [<ffffffffa0008748>] finish_urb+0x98/0x180 [ohci_hcd]
       [<ffffffff810cd1a7>] ? acct_account_cputime+0x17/0x20
       [<ffffffff81069f65>] ? local_clock+0x15/0x30
       [<ffffffffa000a36b>] ohci_work+0x1fb/0x5a0 [ohci_hcd]
       [<ffffffff814fbb31>] ? process_backlog+0xb1/0x130
       [<ffffffffa000cd5b>] ohci_irq+0xeb/0x270 [ohci_hcd]
       [<ffffffff81479fc1>] usb_hcd_irq+0x21/0x30
       [<ffffffff8108bfd3>] handle_irq_event_percpu+0x43/0x120
       [<ffffffff8108c0ed>] handle_irq_event+0x3d/0x60
       [<ffffffff8108ec84>] handle_fasteoi_irq+0x74/0x110
       [<ffffffff81004dfd>] handle_irq+0x1d/0x30
       [<ffffffff81004727>] do_IRQ+0x57/0x100
       [<ffffffff8159482a>] common_interrupt+0x6a/0x6a
      Signed-off-by: default avatarAhmed S. Darwish <ahmed.darwish@valeo.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      ded50066
    • David S. Miller's avatar
      Merge tag 'mac80211-for-davem-2015-01-23' of... · 7d63585b
      David S. Miller authored
      Merge tag 'mac80211-for-davem-2015-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Another set of last-minute fixes:
       * fix station double-removal when suspending while associating
       * fix the HT (802.11n) header length calculation
       * fix the CCK radiotap flag used for monitoring, a pretty
         old regression but a simple one-liner
       * fix per-station group-key handling
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7d63585b
    • Hannes Frederic Sowa's avatar
      ipv4: try to cache dst_entries which would cause a redirect · df4d9254
      Hannes Frederic Sowa authored
      Not caching dst_entries which cause redirects could be exploited by hosts
      on the same subnet, causing a severe DoS attack. This effect aggravated
      since commit f8864972 ("ipv4: fix dst race in sk_dst_get()").
      
      Lookups causing redirects will be allocated with DST_NOCACHE set which
      will force dst_release to free them via RCU.  Unfortunately waiting for
      RCU grace period just takes too long, we can end up with >1M dst_entries
      waiting to be released and the system will run OOM. rcuos threads cannot
      catch up under high softirq load.
      
      Attaching the flag to emit a redirect later on to the specific skb allows
      us to cache those dst_entries thus reducing the pressure on allocation
      and deallocation.
      
      This issue was discovered by Marcelo Leitner.
      
      Cc: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarMarcelo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      df4d9254
    • David S. Miller's avatar
      Merge branch 'bpf' · 412d2907
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      bpf: fix two bugs
      
      Michael Holzheu caught two issues (in bpf syscall and in the test).
      Fix them. Details in corresponding patches.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      412d2907
    • Alexei Starovoitov's avatar
      samples: bpf: relax test_maps check · ba1a68bf
      Alexei Starovoitov authored
      hash map is unordered, so get_next_key() iterator shouldn't
      rely on particular order of elements. So relax this test.
      
      Fixes: ffb65f27 ("bpf: add a testsuite for eBPF maps")
      Reported-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ba1a68bf
    • Alexei Starovoitov's avatar
      bpf: rcu lock must not be held when calling copy_to_user() · 8ebe667c
      Alexei Starovoitov authored
      BUG: sleeping function called from invalid context at mm/memory.c:3732
      in_atomic(): 0, irqs_disabled(): 0, pid: 671, name: test_maps
      1 lock held by test_maps/671:
       #0:  (rcu_read_lock){......}, at: [<0000000000264190>] map_lookup_elem+0xe8/0x260
      Call Trace:
      ([<0000000000115b7e>] show_trace+0x12e/0x150)
       [<0000000000115c40>] show_stack+0xa0/0x100
       [<00000000009b163c>] dump_stack+0x74/0xc8
       [<000000000017424a>] ___might_sleep+0x23a/0x248
       [<00000000002b58e8>] might_fault+0x70/0xe8
       [<0000000000264230>] map_lookup_elem+0x188/0x260
       [<0000000000264716>] SyS_bpf+0x20e/0x840
      
      Fix it by allocating temporary buffer to store map element value.
      
      Fixes: db20fd2b ("bpf: add lookup/update/delete/iterate methods to BPF maps")
      Reported-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8ebe667c
    • Daniel Borkmann's avatar
      net: sctp: fix slab corruption from use after free on INIT collisions · 600ddd68
      Daniel Borkmann authored
      When hitting an INIT collision case during the 4WHS with AUTH enabled, as
      already described in detail in commit 1be9a950 ("net: sctp: inherit
      auth_capable on INIT collisions"), it can happen that we occasionally
      still remotely trigger the following panic on server side which seems to
      have been uncovered after the fix from commit 1be9a950 ...
      
      [  533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff
      [  533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230
      [  533.940559] PGD 5030f2067 PUD 0
      [  533.957104] Oops: 0000 [#1] SMP
      [  533.974283] Modules linked in: sctp mlx4_en [...]
      [  534.939704] Call Trace:
      [  534.951833]  [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0
      [  534.984213]  [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0
      [  535.015025]  [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170
      [  535.045661]  [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0
      [  535.074593]  [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50
      [  535.105239]  [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp]
      [  535.138606]  [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0
      [  535.166848]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b
      
      ... or depending on the the application, for example this one:
      
      [ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff
      [ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0
      [ 1370.054568] PGD 633c94067 PUD 0
      [ 1370.070446] Oops: 0000 [#1] SMP
      [ 1370.085010] Modules linked in: sctp kvm_amd kvm [...]
      [ 1370.963431] Call Trace:
      [ 1370.974632]  [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960
      [ 1371.000863]  [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960
      [ 1371.027154]  [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170
      [ 1371.054679]  [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130
      [ 1371.080183]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b
      
      With slab debugging enabled, we can see that the poison has been overwritten:
      
      [  669.826368] BUG kmalloc-128 (Tainted: G        W     ): Poison overwritten
      [  669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b
      [  669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494
      [  669.826424]  __slab_alloc+0x4bf/0x566
      [  669.826433]  __kmalloc+0x280/0x310
      [  669.826453]  sctp_auth_create_key+0x23/0x50 [sctp]
      [  669.826471]  sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp]
      [  669.826488]  sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp]
      [  669.826505]  sctp_do_sm+0x29d/0x17c0 [sctp] [...]
      [  669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494
      [  669.826635]  __slab_free+0x39/0x2a8
      [  669.826643]  kfree+0x1d6/0x230
      [  669.826650]  kzfree+0x31/0x40
      [  669.826666]  sctp_auth_key_put+0x19/0x20 [sctp]
      [  669.826681]  sctp_assoc_update+0x1ee/0x2d0 [sctp]
      [  669.826695]  sctp_do_sm+0x674/0x17c0 [sctp]
      
      Since this only triggers in some collision-cases with AUTH, the problem at
      heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice
      when having refcnt 1, once directly in sctp_assoc_update() and yet again
      from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on
      the already kzfree'd memory, which is also consistent with the observation
      of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected
      at a later point in time when poison is checked on new allocation).
      
      Reference counting of auth keys revisited:
      
      Shared keys for AUTH chunks are being stored in endpoints and associations
      in endpoint_shared_keys list. On endpoint creation, a null key is being
      added; on association creation, all endpoint shared keys are being cached
      and thus cloned over to the association. struct sctp_shared_key only holds
      a pointer to the actual key bytes, that is, struct sctp_auth_bytes which
      keeps track of users internally through refcounting. Naturally, on assoc
      or enpoint destruction, sctp_shared_key are being destroyed directly and
      the reference on sctp_auth_bytes dropped.
      
      User space can add keys to either list via setsockopt(2) through struct
      sctp_authkey and by passing that to sctp_auth_set_key() which replaces or
      adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes
      with refcount 1 and in case of replacement drops the reference on the old
      sctp_auth_bytes. A key can be set active from user space through setsockopt()
      on the id via sctp_auth_set_active_key(), which iterates through either
      endpoint_shared_keys and in case of an assoc, invokes (one of various places)
      sctp_auth_asoc_init_active_key().
      
      sctp_auth_asoc_init_active_key() computes the actual secret from local's
      and peer's random, hmac and shared key parameters and returns a new key
      directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops
      the reference if there was a previous one. The secret, which where we
      eventually double drop the ref comes from sctp_auth_asoc_set_secret() with
      intitial refcount of 1, which also stays unchanged eventually in
      sctp_assoc_update(). This key is later being used for crypto layer to
      set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac().
      
      To close the loop: asoc->asoc_shared_key is freshly allocated secret
      material and independant of the sctp_shared_key management keeping track
      of only shared keys in endpoints and assocs. Hence, also commit 4184b2a7
      ("net: sctp: fix memory leak in auth key management") is independant of
      this bug here since it concerns a different layer (though same structures
      being used eventually). asoc->asoc_shared_key is reference dropped correctly
      on assoc destruction in sctp_association_free() and when active keys are
      being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount
      of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is
      to remove that sctp_auth_key_put() from there which fixes these panics.
      
      Fixes: 730fc3d0 ("[SCTP]: Implete SCTP-AUTH parameter processing")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      600ddd68
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew Morton) · 4adca1cb
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "Six fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        drivers/rtc/rtc-s5m.c: terminate s5m_rtc_id array with empty element
        printk: add dummy routine for when CONFIG_PRINTK=n
        mm/vmscan: fix highidx argument type
        memcg: remove extra newlines from memcg oom kill log
        x86, build: replace Perl script with Shell script
        mm: page_alloc: embed OOM killing naturally into allocation slowpath
      4adca1cb
    • Ezequiel Garcia's avatar
      net: mv643xx_eth: Fix highmem support in non-TSO egress path · 9e911414
      Ezequiel Garcia authored
      Commit 69ad0dd7
      Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
      Date:   Mon May 19 13:59:59 2014 -0300
      
          net: mv643xx_eth: Use dma_map_single() to map the skb fragments
      
      caused a nasty regression by removing the support for highmem skb
      fragments. By using page_address() to get the address of a fragment's
      page, we are assuming a lowmem page. However, such assumption is incorrect,
      as fragments can be in highmem pages, resulting in very nasty issues.
      
      This commit fixes this by using the skb_frag_dma_map() helper,
      which takes care of mapping the skb fragment properly. Additionally,
      the type of mapping is now tracked, so it can be unmapped using
      dma_unmap_page or dma_unmap_single when appropriate.
      
      This commit also fixes the error path in txq_init() to release the
      resources properly.
      
      Fixes: 69ad0dd7 ("net: mv643xx_eth: Use dma_map_single() to map the skb fragments")
      Reported-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarEzequiel Garcia <ezequiel.garcia@free-electrons.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9e911414
    • David S. Miller's avatar
      Merge branch 'sh_eth' · 9d08da96
      David S. Miller authored
      Ben Hutchings says:
      
      ====================
      Fixes for sh_eth #2
      
      I'm continuing review and testing of Ethernet support on the R-Car H2
      chip.  This series fixes more of the issues I've found, but it won't be
      the last set.
      
      These are not tested on any of the other supported chips.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9d08da96
    • Ben Hutchings's avatar
      sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers · 283e38db
      Ben Hutchings authored
      In order to stop the RX path accessing the RX ring while it's being
      stopped or resized, we clear the interrupt mask (EESIPR) and then call
      free_irq() or synchronise_irq().  This is insufficient because the
      interrupt handler or NAPI poller may set EESIPR again after we clear
      it.  Also, in sh_eth_set_ringparam() we currently don't disable NAPI
      polling at all.
      
      I could easily trigger a crash by running the loop:
      
         while ethtool -G eth0 rx 128 && ethtool -G eth0 rx 64; do echo -n .; done
      
      and 'ping -f' toward the sh_eth port from another machine.
      
      To fix this:
      - Add a software flag (irq_enabled) to signal whether interrupts
        should be enabled
      - In the interrupt handler, if the flag is clear then clear EESIPR
        and return
      - In the NAPI poller, if the flag is clear then don't set EESIPR
      - Set the flag before enabling interrupts in sh_eth_dev_init() and
        sh_eth_set_ringparam()
      - Clear the flag and serialise with the interrupt and NAPI
        handlers before clearing EESIPR in sh_eth_close() and
        sh_eth_set_ringparam()
      
      After this, I could run the loop for 100,000 iterations successfully.
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      283e38db