1. 29 Jan, 2016 40 commits
    • Willy Tarreau's avatar
      Linux 2.6.32.70 · 1a5b69df
      Willy Tarreau authored
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1a5b69df
    • Paolo Bonzini's avatar
      kvm: x86: only channel 0 of the i8254 is linked to the HPET · 24c14705
      Paolo Bonzini authored
      commit e5e57e7a upstream.
      
      While setting the KVM PIT counters in 'kvm_pit_load_count', if
      'hpet_legacy_start' is set, the function disables the timer on
      channel[0], instead of the respective index 'channel'. This is
      because channels 1-3 are not linked to the HPET.  Fix the caller
      to only activate the special HPET processing for channel 0.
      Reported-by: default avatarP J P <pjp@fedoraproject.org>
      Fixes: 0185604cSigned-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit ef90cf3d)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      24c14705
    • Andrew Honig's avatar
      KVM: x86: Reload pit counters for all channels when restoring state · 4d1805f9
      Andrew Honig authored
      commit 0185604c upstream.
      
      Currently if userspace restores the pit counters with a count of 0
      on channels 1 or 2 and the guest attempts to read the count on those
      channels, then KVM will perform a mod of 0 and crash.  This will ensure
      that 0 values are converted to 65536 as per the spec.
      
      This is CVE-2015-7513.
      Signed-off-by: default avatarAndy Honig <ahonig@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 08b8d1a6)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      4d1805f9
    • Andrew Banman's avatar
      mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone() · c7bde200
      Andrew Banman authored
      commit 5f0f2887 upstream.
      
      test_pages_in_a_zone() does not account for the possibility of missing
      sections in the given pfn range.  pfn_valid_within always returns 1 when
      CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
      sections to pass the test, leading to a kernel oops.
      
      Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
      for missing sections before proceeding into the zone-check code.
      
      This also prevents a crash from offlining memory devices with missing
      sections.  Despite this, it may be a good idea to keep the related patch
      '[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
      missing sections' because missing sections in a memory block may lead to
      other problems not covered by the scope of this fix.
      Signed-off-by: default avatarAndrew Banman <abanman@sgi.com>
      Acked-by: default avatarAlex Thorlton <athorlton@sgi.com>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 17f6a291)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      c7bde200
    • Daniel Kiper's avatar
      mm: add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro · 73734375
      Daniel Kiper authored
      commit a539f353 upstream.
      
      Add SECTION_ALIGN_UP() and SECTION_ALIGN_DOWN() macro which aligns given
      pfn to upper section and lower section boundary accordingly.
      
      Required for the latest memory hotplug support for the Xen balloon driver.
      Signed-off-by: default avatarDaniel Kiper <dkiper@net-space.pl>
      Reviewed-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      David Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [wt: only needed for next patch]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      73734375
    • Andrey Ryabinin's avatar
      ipv6/addrlabel: fix ip6addrlbl_get() · 5a7fbabb
      Andrey Ryabinin authored
      commit e459dfee upstream.
      
      ip6addrlbl_get() has never worked. If ip6addrlbl_hold() succeeded,
      ip6addrlbl_get() will exit with '-ESRCH'. If ip6addrlbl_hold() failed,
      ip6addrlbl_get() will use about to be free ip6addrlbl_entry pointer.
      
      Fix this by inverting ip6addrlbl_hold() check.
      
      Fixes: 2a8cc6c8 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.")
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reviewed-by: default avatarCong Wang <cwang@twopensource.com>
      Acked-by: default avatarYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 39b214ba)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5a7fbabb
    • Helge Deller's avatar
      parisc: Fix syscall restarts · 3bf5fe19
      Helge Deller authored
      commit 71a71fb5 upstream.
      
      On parisc syscalls which are interrupted by signals sometimes failed to
      restart and instead returned -ENOSYS which in the worst case lead to
      userspace crashes.
      A similiar problem existed on MIPS and was fixed by commit e967ef02
      ("MIPS: Fix restart of indirect syscalls").
      
      On parisc the current syscall restart code assumes that all syscall
      callers load the syscall number in the delay slot of the ble
      instruction. That's how it is e.g. done in the unistd.h header file:
      	ble 0x100(%sr2, %r0)
      	ldi #syscall_nr, %r20
      Because of that assumption the current code never restored %r20 before
      returning to userspace.
      
      This assumption is at least not true for code which uses the glibc
      syscall() function, which instead uses this syntax:
      	ble 0x100(%sr2, %r0)
      	copy regX, %r20
      where regX depend on how the compiler optimizes the code and register
      usage.
      
      This patch fixes this problem by adding code to analyze how the syscall
      number is loaded in the delay branch and - if needed - copy the syscall
      number to regX prior returning to userspace for the syscall restart.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 9f2dcffe)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      3bf5fe19
    • Ed Swierk's avatar
      MIPS: Fix restart of indirect syscalls · 912fcae9
      Ed Swierk authored
      commit e967ef02 upstream.
      
      When 32-bit MIPS userspace invokes a syscall indirectly via syscall(number,
      arg1, ..., arg7), the kernel looks up the actual syscall based on the given
      number, shifts the other arguments to the left, and jumps to the syscall.
      
      If the syscall is interrupted by a signal and indicates it needs to be
      restarted by the kernel (by returning ERESTARTNOINTR for example), the
      syscall must be called directly, since the number is no longer the first
      argument, and the other arguments are now staged for a direct call.
      
      Before shifting the arguments, store the syscall number in pt_regs->regs[2].
      This gets copied temporarily into pt_regs->regs[0] after the syscall returns.
      If the syscall needs to be restarted, handle_signal()/do_signal() copies the
      number back to pt_regs->reg[2], which ends up in $v0 once control returns to
      userspace.
      Signed-off-by: default avatarEd Swierk <eswierk@skyportsystems.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/8929/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 08f865bb)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      912fcae9
    • Alan Stern's avatar
      USB: fix invalid memory access in hub_activate() · 533030ca
      Alan Stern authored
      commit e50293ef upstream.
      
      Commit 8520f380 ("USB: change hub initialization sleeps to
      delayed_work") changed the hub_activate() routine to make part of it
      run in a workqueue.  However, the commit failed to take a reference to
      the usb_hub structure or to lock the hub interface while doing so.  As
      a result, if a hub is plugged in and quickly unplugged before the work
      routine can run, the routine will try to access memory that has been
      deallocated.  Or, if the hub is unplugged while the routine is
      running, the memory may be deallocated while it is in active use.
      
      This patch fixes the problem by taking a reference to the usb_hub at
      the start of hub_activate() and releasing it at the end (when the work
      is finished), and by locking the hub interface while the work routine
      is running.  It also adds a check at the start of the routine to see
      if the hub has already been disconnected, in which nothing should be
      done.
      Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Reported-by: default avatarAlexandru Cornea <alexandru.cornea@intel.com>
      Tested-by: default avatarAlexandru Cornea <alexandru.cornea@intel.com>
      Fixes: 8520f380 ("USB: change hub initialization sleeps to delayed_work")
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: add prototype for hub_release() before first use]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 10037421)
      [wt: made a few changes :
        - adjusted context due to some autopm code being added only in 2.6.33
        - no device_{lock,unlock}() in 2.6.32, use up/down(&->sem) instead]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      533030ca
    • Dan Carpenter's avatar
      USB: ipaq.c: fix a timeout loop · 00972bcd
      Dan Carpenter authored
      commit abdc9a3b upstream.
      
      The code expects the loop to end with "retries" set to zero but, because
      it is a post-op, it will end set to -1.  I have fixed this by moving the
      decrement inside the loop.
      
      Fixes: 014aa2a3 ('USB: ipaq: minor ipaq_open() cleanup.')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 53a68d3f)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      00972bcd
    • Michael Holzheu's avatar
      s390/dis: Fix handling of format specifiers · 22a1caa0
      Michael Holzheu authored
      commit 272fa59c upstream.
      
      The print_insn() function returns strings like "lghi %r1,0". To escape the
      '%' character in sprintf() a second '%' is used. For example "lghi %%r1,0"
      is converted into "lghi %r1,0".
      
      After print_insn() the output string is passed to printk(). Because format
      specifiers like "%r" or "%f" are ignored by printk() this works by chance
      most of the time. But for instructions with control registers like
      "lctl %c6,%c6,780" this fails because printk() interprets "%c" as
      character format specifier.
      
      Fix this problem and escape the '%' characters twice.
      
      For example "lctl %%%%c6,%%%%c6,780" is then converted by sprintf()
      into "lctl %%c6,%%c6,780" and by printk() into "lctl %c6,%c6,780".
      Signed-off-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      [bwh: Backported to 3.2: drop the OPERAND_VR case]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 45f32e35)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      22a1caa0
    • Johan Hovold's avatar
      spi: fix parent-device reference leak · 2bafc5d5
      Johan Hovold authored
      commit 157f38f9 upstream.
      
      Fix parent-device reference leak due to SPI-core taking an unnecessary
      reference to the parent when allocating the master structure, a
      reference that was never released.
      
      Note that driver core takes its own reference to the parent when the
      master device is registered.
      
      Fixes: 49dce689 ("spi doesn't need class_device")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 40ddf30c)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      2bafc5d5
    • Tilman Schmidt's avatar
      ser_gigaset: fix deallocation of platform device structure · 0e0eaf7e
      Tilman Schmidt authored
      commit 4c5e354a upstream.
      
      When shutting down the device, the struct ser_cardstate must not be
      kfree()d immediately after the call to platform_device_unregister()
      since the embedded struct platform_device is still in use.
      Move the kfree() call to the release method instead.
      Signed-off-by: default avatarTilman Schmidt <tilman@imap.cc>
      Fixes: 2869b23e ("drivers/isdn/gigaset: new M101 driver (v2)")
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 521e4101)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0e0eaf7e
    • Dan Carpenter's avatar
      mISDN: fix a loop count · 1fe6e687
      Dan Carpenter authored
      commit 40d24c4d upstream.
      
      There are two issue here.
      1)  cnt starts as maxloop + 1 so all these loops iterate one more time
          than intended.
      2)  At the end of the loop we test for "if (maxloop && !cnt)" but for
          the first two loops, we end with cnt equal to -1.  Changing this to
          a pre-op means we end with cnt set to 0.
      
      Fixes: cae86d4a ('mISDN: Add driver for Infineon ISDN chipset family')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 7eb2a015)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      1fe6e687
    • Peter Hurley's avatar
      tty: Fix GPF in flush_to_ldisc() · 0dbf4344
      Peter Hurley authored
      commit 9ce119f3 upstream.
      
      A line discipline which does not define a receive_buf() method can
      can cause a GPF if data is ever received [1]. Oddly, this was known
      to the author of n_tracesink in 2011, but never fixed.
      
      [1] GPF report
          BUG: unable to handle kernel NULL pointer dereference at           (null)
          IP: [<          (null)>]           (null)
          PGD 3752d067 PUD 37a7b067 PMD 0
          Oops: 0010 [#1] SMP KASAN
          Modules linked in:
          CPU: 2 PID: 148 Comm: kworker/u10:2 Not tainted 4.4.0-rc2+ #51
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
          Workqueue: events_unbound flush_to_ldisc
          task: ffff88006da94440 ti: ffff88006db60000 task.ti: ffff88006db60000
          RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
          RSP: 0018:ffff88006db67b50  EFLAGS: 00010246
          RAX: 0000000000000102 RBX: ffff88003ab32f88 RCX: 0000000000000102
          RDX: 0000000000000000 RSI: ffff88003ab330a6 RDI: ffff88003aabd388
          RBP: ffff88006db67c48 R08: ffff88003ab32f9c R09: ffff88003ab31fb0
          R10: ffff88003ab32fa8 R11: 0000000000000000 R12: dffffc0000000000
          R13: ffff88006db67c20 R14: ffffffff863df820 R15: ffff88003ab31fb8
          FS:  0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
          CR2: 0000000000000000 CR3: 0000000037938000 CR4: 00000000000006e0
          Stack:
           ffffffff829f46f1 ffff88006da94bf8 ffff88006da94bf8 0000000000000000
           ffff88003ab31fb0 ffff88003aabd438 ffff88003ab31ff8 ffff88006430fd90
           ffff88003ab32f9c ffffed0007557a87 1ffff1000db6cf78 ffff88003ab32078
          Call Trace:
           [<ffffffff8127cf91>] process_one_work+0x8f1/0x17a0 kernel/workqueue.c:2030
           [<ffffffff8127df14>] worker_thread+0xd4/0x1180 kernel/workqueue.c:2162
           [<ffffffff8128faaf>] kthread+0x1cf/0x270 drivers/block/aoe/aoecmd.c:1302
           [<ffffffff852a7c2f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
          Code:  Bad RIP value.
          RIP  [<          (null)>]           (null)
           RSP <ffff88006db67b50>
          CR2: 0000000000000000
          ---[ end trace a587f8947e54d6ea ]---
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit b23324ff)
      [wt: applied to drivers/char/tty_buffer.c instead]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0dbf4344
    • James Bottomley's avatar
      ses: fix additional element traversal bug · d4b6a10d
      James Bottomley authored
      commit 5e103356 upstream.
      
      KASAN found that our additional element processing scripts drop off
      the end of the VPD page into unallocated space.  The reason is that
      not every element has additional information but our traversal
      routines think they do, leading to them expecting far more additional
      information than is present.  Fix this by adding a gate to the
      traversal routine so that it only processes elements that are expected
      to have additional information (list is in SES-2 section 6.1.13.1:
      Additional Element Status diagnostic page overview)
      Reported-by: default avatarPavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Tested-by: default avatarPavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Signed-off-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 344d6d02)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d4b6a10d
    • James Bottomley's avatar
      ses: Fix problems with simple enclosures · f49fbe9e
      James Bottomley authored
      commit 3417c1b5 upstream.
      
      Simple enclosure implementations (mostly USB) are allowed to return only
      page 8 to every diagnostic query.  That really confuses our
      implementation because we assume the return is the page we asked for and
      end up doing incorrect offsets based on bogus information leading to
      accesses outside of allocated ranges.  Fix that by checking the page
      code of the return and giving an error if it isn't the one we asked for.
      This should fix reported bugs with USB storage by simply refusing to
      attach to enclosures that behave like this.  It's also good defensive
      practise now that we're starting to see more USB enclosures.
      Reported-by: default avatarAndrea Gelmini <andrea.gelmini@gelma.net>
      Reviewed-by: default avatarEwan D. Milne <emilne@redhat.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 25ef9385)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      f49fbe9e
    • Johannes Berg's avatar
      rfkill: copy the name into the rfkill struct · 8a90c757
      Johannes Berg authored
      commit b7bb1100 upstream.
      
      Some users of rfkill, like NFC and cfg80211, use a dynamic name when
      allocating rfkill, in those cases dev_name(). Therefore, the pointer
      passed to rfkill_alloc() might not be valid forever, I specifically
      found the case that the rfkill name was quite obviously an invalid
      pointer (or at least garbage) when the wiphy had been renamed.
      
      Fix this by making a copy of the rfkill name in rfkill_alloc().
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 6f23bc6f)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      8a90c757
    • Eric Dumazet's avatar
      af_unix: fix a fatal race with bit fields · 24ba3c53
      Eric Dumazet authored
      commit 60bc851a upstream.
      
      Using bit fields is dangerous on ppc64/sparc64, as the compiler [1]
      uses 64bit instructions to manipulate them.
      If the 64bit word includes any atomic_t or spinlock_t, we can lose
      critical concurrent changes.
      
      This is happening in af_unix, where unix_sk(sk)->gc_candidate/
      gc_maybe_cycle/lock share the same 64bit word.
      
      This leads to fatal deadlock, as one/several cpus spin forever
      on a spinlock that will never be available again.
      
      A safer way would be to use a long to store flags.
      This way we are sure compiler/arch wont do bad things.
      
      As we own unix_gc_lock spinlock when clearing or setting bits,
      we can use the non atomic __set_bit()/__clear_bit().
      
      recursion_level can share the same 64bit location with the spinlock,
      as it is set only with this spinlock held.
      
      [1] bug fixed in gcc-4.8.0 :
      http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080Reported-by: default avatarAmbrose Feinstein <ambrose@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 2ee9cbe7)
      [wt: adjusted context]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      24ba3c53
    • Marcelo Ricardo Leitner's avatar
      sctp: update the netstamp_needed counter when copying sockets · 7e0e67b0
      Marcelo Ricardo Leitner authored
      [ Upstream commit 01ce63c9 ]
      
      Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
      related to disabling sock timestamp.
      
      When SCTP accepts an association or peel one off, it copies sock flags
      but forgot to call net_enable_timestamp() if a packet timestamping flag
      was copied, leading to extra calls to net_disable_timestamp() whenever
      such clones were closed.
      
      The fix is to call net_enable_timestamp() whenever we copy a sock with
      that flag on, like tcp does.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: SK_FLAGS_TIMESTAMP is newly defined]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit d85242d9)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      7e0e67b0
    • Daniel Borkmann's avatar
      net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds · 224994fa
      Daniel Borkmann authored
      [ Upstream commit 6900317f ]
      
      David and HacKurx reported a following/similar size overflow triggered
      in a grsecurity kernel, thanks to PaX's gcc size overflow plugin:
      
      (Already fixed in later grsecurity versions by Brad and PaX Team.)
      
      [ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314
                     cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr;
      [ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7
      [ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, [...]
      [ 1002.296153]  ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8
      [ 1002.296162]  ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8
      [ 1002.296169]  ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60
      [ 1002.296176] Call Trace:
      [ 1002.296190]  [<ffffffff818129ba>] dump_stack+0x45/0x57
      [ 1002.296200]  [<ffffffff8121f838>] report_size_overflow+0x38/0x60
      [ 1002.296209]  [<ffffffff816a979e>] scm_detach_fds+0x2ce/0x300
      [ 1002.296220]  [<ffffffff81791899>] unix_stream_read_generic+0x609/0x930
      [ 1002.296228]  [<ffffffff81791c9f>] unix_stream_recvmsg+0x4f/0x60
      [ 1002.296236]  [<ffffffff8178dc00>] ? unix_set_peek_off+0x50/0x50
      [ 1002.296243]  [<ffffffff8168fac7>] sock_recvmsg+0x47/0x60
      [ 1002.296248]  [<ffffffff81691522>] ___sys_recvmsg+0xe2/0x1e0
      [ 1002.296257]  [<ffffffff81693496>] __sys_recvmsg+0x46/0x80
      [ 1002.296263]  [<ffffffff816934fc>] SyS_recvmsg+0x2c/0x40
      [ 1002.296271]  [<ffffffff8181a3ab>] entry_SYSCALL_64_fastpath+0x12/0x85
      
      Further investigation showed that this can happen when an *odd* number of
      fds are being passed over AF_UNIX sockets.
      
      In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)),
      where i is the number of successfully passed fds, differ by 4 bytes due
      to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary
      on 64 bit. The padding is used to align subsequent cmsg headers in the
      control buffer.
      
      When the control buffer passed in from the receiver side *lacks* these 4
      bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will
      overflow in scm_detach_fds():
      
        int cmlen = CMSG_LEN(i * sizeof(int));  <--- cmlen w/o tail-padding
        err = put_user(SOL_SOCKET, &cm->cmsg_level);
        if (!err)
          err = put_user(SCM_RIGHTS, &cm->cmsg_type);
        if (!err)
          err = put_user(cmlen, &cm->cmsg_len);
        if (!err) {
          cmlen = CMSG_SPACE(i * sizeof(int));  <--- cmlen w/ 4 byte extra tail-padding
          msg->msg_control += cmlen;
          msg->msg_controllen -= cmlen;         <--- iff no tail-padding space here ...
        }                                            ... wrap-around
      
      F.e. it will wrap to a length of 18446744073709551612 bytes in case the
      receiver passed in msg->msg_controllen of 20 bytes, and the sender
      properly transferred 1 fd to the receiver, so that its CMSG_LEN results
      in 20 bytes and CMSG_SPACE in 24 bytes.
      
      In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an
      issue in my tests as alignment seems always on 4 byte boundary. Same
      should be in case of native 32 bit, where we end up with 4 byte boundaries
      as well.
      
      In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving
      a single fd would mean that on successful return, msg->msg_controllen is
      being set by the kernel to 24 bytes instead, thus more than the input
      buffer advertised. It could f.e. become an issue if such application later
      on zeroes or copies the control buffer based on the returned msg->msg_controllen
      elsewhere.
      
      Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253).
      
      Going over the code, it seems like msg->msg_controllen is not being read
      after scm_detach_fds() in scm_recv() anymore by the kernel, good!
      
      Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg())
      and unix_stream_recvmsg(). Both return back to their recvmsg() caller,
      and ___sys_recvmsg() places the updated length, that is, new msg_control -
      old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen
      in the example).
      
      Long time ago, Wei Yongjun fixed something related in commit 1ac70e7a
      ("[NET]: Fix function put_cmsg() which may cause usr application memory
      overflow").
      
      RFC3542, section 20.2. says:
      
        The fields shown as "XX" are possible padding, between the cmsghdr
        structure and the data, and between the data and the next cmsghdr
        structure, if required by the implementation. While sending an
        application may or may not include padding at the end of last
        ancillary data in msg_controllen and implementations must accept both
        as valid. On receiving a portable application must provide space for
        padding at the end of the last ancillary data as implementations may
        copy out the padding at the end of the control message buffer and
        include it in the received msg_controllen. When recvmsg() is called
        if msg_controllen is too small for all the ancillary data items
        including any trailing padding after the last item an implementation
        may set MSG_CTRUNC.
      
      Since we didn't place MSG_CTRUNC for already quite a long time, just do
      the same as in 1ac70e7a to avoid an overflow.
      
      Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix
      error in SCM_RIGHTS code sample"). Some people must have copied this (?),
      thus it got triggered in the wild (reported several times during boot by
      David and HacKurx).
      
      No Fixes tag this time as pre 2002 (that is, pre history tree).
      Reported-by: default avatarDavid Sterba <dave@jikos.cz>
      Reported-by: default avatarHacKurx <hackurx@gmail.com>
      Cc: PaX Team <pageexec@freemail.hu>
      Cc: Emese Revfy <re.emese@gmail.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: Eric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 831a2a17)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      224994fa
    • Eric Dumazet's avatar
      tcp: initialize tp->copied_seq in case of cross SYN connection · 03bc31b9
      Eric Dumazet authored
      [ Upstream commit 142a2e7e ]
      
      Dmitry provided a syzkaller (http://github.com/google/syzkaller)
      generated program that triggers the WARNING at
      net/ipv4/tcp.c:1729 in tcp_recvmsg() :
      
      WARN_ON(tp->copied_seq != tp->rcv_nxt &&
              !(flags & (MSG_PEEK | MSG_TRUNC)));
      
      His program is specifically attempting a Cross SYN TCP exchange,
      that we support (for the pleasure of hackers ?), but it looks we
      lack proper tcp->copied_seq initialization.
      
      Thanks again Dmitry for your report and testings.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 6cfa9781)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      03bc31b9
    • Jan Stancek's avatar
      ipmi: move timer init to before irq is setup · ac5da7cf
      Jan Stancek authored
      commit 27f972d3 upstream.
      
      We encountered a panic on boot in ipmi_si on a dell per320 due to an
      uninitialized timer as follows.
      
      static int smi_start_processing(void       *send_info,
                                      ipmi_smi_t intf)
      {
              /* Try to claim any interrupts. */
              if (new_smi->irq_setup)
                      new_smi->irq_setup(new_smi);
      
       --> IRQ arrives here and irq handler tries to modify uninitialized timer
      
          which triggers BUG_ON(!timer->function) in __mod_timer().
      
       Call Trace:
         <IRQ>
         [<ffffffffa0532617>] start_new_msg+0x47/0x80 [ipmi_si]
         [<ffffffffa053269e>] start_check_enables+0x4e/0x60 [ipmi_si]
         [<ffffffffa0532bd8>] smi_event_handler+0x1e8/0x640 [ipmi_si]
         [<ffffffff810f5584>] ? __rcu_process_callbacks+0x54/0x350
         [<ffffffffa053327c>] si_irq_handler+0x3c/0x60 [ipmi_si]
         [<ffffffff810efaf0>] handle_IRQ_event+0x60/0x170
         [<ffffffff810f245e>] handle_edge_irq+0xde/0x180
         [<ffffffff8100fc59>] handle_irq+0x49/0xa0
         [<ffffffff8154643c>] do_IRQ+0x6c/0xf0
         [<ffffffff8100ba53>] ret_from_intr+0x0/0x11
      
              /* Set up the timer that drives the interface. */
              setup_timer(&new_smi->si_timer, smi_timeout, (long)new_smi);
      
      The following patch fixes the problem.
      
      To: Openipmi-developer@lists.sourceforge.net
      To: Corey Minyard <minyard@acm.org>
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarJan Stancek <jstancek@redhat.com>
      Signed-off-by: default avatarTony Camuso <tcamuso@redhat.com>
      Signed-off-by: default avatarCorey Minyard <cminyard@mvista.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 207ffa8c)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ac5da7cf
    • Sasha Levin's avatar
      sched/core: Remove false-positive warning from wake_up_process() · 788c44ff
      Sasha Levin authored
      commit 119d6f6a upstream.
      
      Because wakeups can (fundamentally) be late, a task might not be in
      the expected state. Therefore testing against a task's state is racy,
      and can yield false positives.
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: oleg@redhat.com
      Fixes: 9067ac85 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task")
      Link: http://lkml.kernel.org/r/1448933660-23082-1-git-send-email-sasha.levin@oracle.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 0e796c1b)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      788c44ff
    • Andrew Lunn's avatar
      ipv4: igmp: Allow removing groups from a removed interface · 5f487676
      Andrew Lunn authored
      commit 4eba7bb1 upstream.
      
      When a multicast group is joined on a socket, a struct ip_mc_socklist
      is appended to the sockets mc_list containing information about the
      joined group.
      
      If the interface is hot unplugged, this entry becomes stale. Prior to
      commit 52ad353a ("igmp: fix the problem when mc leave group") it
      was possible to remove the stale entry by performing a
      IP_DROP_MEMBERSHIP, passing either the old ifindex or ip address on
      the interface. However, this fix enforces that the interface must
      still exist. Thus with time, the number of stale entries grows, until
      sysctl_igmp_max_memberships is reached and then it is not possible to
      join and more groups.
      
      The previous patch fixes an issue where a IP_DROP_MEMBERSHIP is
      performed without specifying the interface, either by ifindex or ip
      address. However here we do supply one of these. So loosen the
      restriction on device existence to only apply when the interface has
      not been specified. This then restores the ability to clean up the
      stale entries.
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Fixes: 52ad353a "(igmp: fix the problem when mc leave group")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit b1fa8526)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5f487676
    • Peter Hurley's avatar
      wan/x25: Fix use-after-free in x25_asy_open_tty() · abffc10c
      Peter Hurley authored
      commit ee9159dd upstream.
      
      The N_X25 line discipline may access the previous line discipline's closed
      and already-freed private data on open [1].
      
      The tty->disc_data field _never_ refers to valid data on entry to the
      line discipline's open() method. Rather, the ldisc is expected to
      initialize that field for its own use for the lifetime of the instance
      (ie. from open() to close() only).
      
      [1]
          [  634.336761] ==================================================================
          [  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at addr ffff8800a743efd0
          [  634.339558] Read of size 4 by task syzkaller_execu/8981
          [  634.340359] =============================================================================
          [  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected
          ...
          [  634.405018] Call Trace:
          [  634.405277] dump_stack (lib/dump_stack.c:52)
          [  634.405775] print_trailer (mm/slub.c:655)
          [  634.406361] object_err (mm/slub.c:662)
          [  634.406824] kasan_report_error (mm/kasan/report.c:138 mm/kasan/report.c:236)
          [  634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279)
          [  634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 (discriminator 1))
          [  634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447)
          [  634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567)
          [  634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 drivers/tty/tty_io.c:2879)
          [  634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
          [  634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
          [  634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188)
      Reported-and-tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 485274cf)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      abffc10c
    • Jeff Layton's avatar
      nfs: if we have no valid attrs, then don't declare the attribute cache valid · 86ba5011
      Jeff Layton authored
      commit c812012f upstream.
      
      If we pass in an empty nfs_fattr struct to nfs_update_inode, it will
      (correctly) not update any of the attributes, but it then clears the
      NFS_INO_INVALID_ATTR flag, which indicates that the attributes are
      up to date. Don't clear the flag if the fattr struct has no valid
      attrs to apply.
      Reviewed-by: default avatarSteve French <steve.french@primarydata.com>
      Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit ddab0155)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      86ba5011
    • David Turner's avatar
      ext4: Fix handling of extended tv_sec · 74147da1
      David Turner authored
      commit a4dad1ae upstream.
      
      In ext4, the bottom two bits of {a,c,m}time_extra are used to extend
      the {a,c,m}time fields, deferring the year 2038 problem to the year
      2446.
      
      When decoding these extended fields, for times whose bottom 32 bits
      would represent a negative number, sign extension causes the 64-bit
      extended timestamp to be negative as well, which is not what's
      intended.  This patch corrects that issue, so that the only negative
      {a,c,m}times are those between 1901 and 1970 (as per 32-bit signed
      timestamps).
      
      Some older kernels might have written pre-1970 dates with 1,1 in the
      extra bits.  This patch treats those incorrectly-encoded dates as
      pre-1970, instead of post-2311, until kernel 4.20 is released.
      Hopefully by then e2fsck will have fixed up the bad data.
      
      Also add a comment explaining the encoding of ext4's extra {a,c,m}time
      bits.
      Signed-off-by: default avatarDavid Turner <novalis@novalis.org>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reported-by: default avatarMark Harris <mh8928@yahoo.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23732Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 6dfd5f6a)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      74147da1
    • Jan Kara's avatar
      vfs: Avoid softlockups with sendfile(2) · dce8179b
      Jan Kara authored
      commit c2489e07 upstream.
      
      The following test program from Dmitry can cause softlockups or RCU
      stalls as it copies 1GB from tmpfs into eventfd and we don't have any
      scheduling point at that path in sendfile(2) implementation:
      
              int r1 = eventfd(0, 0);
              int r2 = memfd_create("", 0);
              unsigned long n = 1<<30;
              fallocate(r2, 0, 0, n);
              sendfile(r1, r2, 0, n);
      
      Add cond_resched() into __splice_from_pipe() to fix the problem.
      
      CC: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 47ae562e)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      dce8179b
    • Al Viro's avatar
      fix sysvfs symlinks · 54146dba
      Al Viro authored
      commit 0ebf7f10 upstream.
      
      The thing got broken back in 2002 - sysvfs does *not* have inline
      symlinks; even short ones have bodies stored in the first block
      of file.  sysv_symlink() handles that correctly; unfortunately,
      attempting to look an existing symlink up will end up confusing
      them for inline symlinks, and interpret the block number containing
      the body as the body itself.
      
      Nobody has noticed until now, which says something about the level
      of testing sysvfs gets ;-/
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [bwh: Backported to 3.2:
       - Adjust context
       - Also delete unused sysv_fast_symlink_inode_operations]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 081c7697)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      54146dba
    • Roman Gushchin's avatar
      fuse: break infinite loop in fuse_fill_write_pages() · de743b3d
      Roman Gushchin authored
      commit 3ca8138f upstream.
      
      I got a report about unkillable task eating CPU. Further
      investigation shows, that the problem is in the fuse_fill_write_pages()
      function. If iov's first segment has zero length, we get an infinite
      loop, because we never reach iov_iter_advance() call.
      
      Fix this by calling iov_iter_advance() before repeating an attempt to
      copy data from userspace.
      
      A similar problem is described in 124d3b70 ("fix writev regression:
      pan hanging unkillable and un-straceable"). If zero-length segmend
      is followed by segment with invalid address,
      iov_iter_fault_in_readable() checks only first segment (zero-length),
      iov_iter_copy_from_user_atomic() skips it, fails at second and
      returns zero -> goto again without skipping zero-length segment.
      
      Patch calls iov_iter_advance() before goto again: we'll skip zero-length
      segment at second iteraction and iov_iter_fault_in_readable() will detect
      invalid address.
      
      Special thanks to Konstantin Khlebnikov, who helped a lot with the commit
      description.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Maxim Patlasov <mpatlasov@parallels.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: default avatarRoman Gushchin <klamm@yandex-team.ru>
      Signed-off-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
      Fixes: ea9b9907 ("fuse: implement perform_write")
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit a5b23416)
      [wt: adjusted context, as commit 478e0841 from 3.1 was never backported
       to call mark_page_accessed() eventhough it seems it should have been]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      de743b3d
    • lucien's avatar
      sctp: translate host order to network order when setting a hmacid · 0ee43751
      lucien authored
      commit ed5a377d upstream.
      
      now sctp auth cannot work well when setting a hmacid manually, which
      is caused by that we didn't use the network order for hmacid, so fix
      it by adding the transformation in sctp_auth_ep_set_hmacs.
      
      even we set hmacid with the network order in userspace, it still
      can't work, because of this condition in sctp_auth_ep_set_hmacs():
      
      		if (id > SCTP_AUTH_HMAC_ID_MAX)
      			return -EOPNOTSUPP;
      
      so this wasn't working before and thus it won't break compatibility.
      
      Fixes: 65b07e5d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 98d37e7f)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0ee43751
    • David S. Miller's avatar
      ba76e374
    • Hannes Frederic Sowa's avatar
      net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration · 073a63f1
      Hannes Frederic Sowa authored
      commit 7bbadd2d upstream.
      
      Docbook does not like the definition of macros inside a field declaration
      and adds a warning. Move the definition out.
      
      Fixes: 79462ad0 ("net: add validation for the socket syscall protocol argument")
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: keep open-coding U8_MAX]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 35da6d62)
      [wt: adjusted context]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      073a63f1
    • Hannes Frederic Sowa's avatar
      net: add validation for the socket syscall protocol argument · be9b6c29
      Hannes Frederic Sowa authored
      commit 79462ad0 upstream.
      
      郭永刚 reported that one could simply crash the kernel as root by
      using a simple program:
      
      	int socket_fd;
      	struct sockaddr_in addr;
      	addr.sin_port = 0;
      	addr.sin_addr.s_addr = INADDR_ANY;
      	addr.sin_family = 10;
      
      	socket_fd = socket(10,3,0x40000000);
      	connect(socket_fd , &addr,16);
      
      AF_INET, AF_INET6 sockets actually only support 8-bit protocol
      identifiers. inet_sock's skc_protocol field thus is sized accordingly,
      thus larger protocol identifiers simply cut off the higher bits and
      store a zero in the protocol fields.
      
      This could lead to e.g. NULL function pointer because as a result of
      the cut off inet_num is zero and we call down to inet_autobind, which
      is NULL for raw sockets.
      
      kernel: Call Trace:
      kernel:  [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
      kernel:  [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
      kernel:  [<ffffffff81645069>] SYSC_connect+0xd9/0x110
      kernel:  [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
      kernel:  [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
      kernel:  [<ffffffff81645e0e>] SyS_connect+0xe/0x10
      kernel:  [<ffffffff81779515>] tracesys_phase2+0x84/0x89
      
      I found no particular commit which introduced this problem.
      
      CVE: CVE-2015-8543
      Cc: Cong Wang <cwang@twopensource.com>
      Reported-by: default avatar郭永刚 <guoyonggang@360.cn>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 2.6.32: open-code U8_MAX; adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      be9b6c29
    • David Howells's avatar
      KEYS: Fix race between read and revoke · 67fbe958
      David Howells authored
      commit b4a1b4f5 upstream.
      
      This fixes CVE-2015-7550.
      
      There's a race between keyctl_read() and keyctl_revoke().  If the revoke
      happens between keyctl_read() checking the validity of a key and the key's
      semaphore being taken, then the key type read method will see a revoked key.
      
      This causes a problem for the user-defined key type because it assumes in
      its read method that there will always be a payload in a non-revoked key
      and doesn't check for a NULL pointer.
      
      Fix this by making keyctl_read() check the validity of a key after taking
      semaphore instead of before.
      
      I think the bug was introduced with the original keyrings code.
      
      This was discovered by a multithreaded test program generated by syzkaller
      (http://github.com/google/syzkaller).  Here's a cleaned up version:
      
      	#include <sys/types.h>
      	#include <keyutils.h>
      	#include <pthread.h>
      	void *thr0(void *arg)
      	{
      		key_serial_t key = (unsigned long)arg;
      		keyctl_revoke(key);
      		return 0;
      	}
      	void *thr1(void *arg)
      	{
      		key_serial_t key = (unsigned long)arg;
      		char buffer[16];
      		keyctl_read(key, buffer, 16);
      		return 0;
      	}
      	int main()
      	{
      		key_serial_t key = add_key("user", "%", "foo", 3, KEY_SPEC_USER_KEYRING);
      		pthread_t th[5];
      		pthread_create(&th[0], 0, thr0, (void *)(unsigned long)key);
      		pthread_create(&th[1], 0, thr1, (void *)(unsigned long)key);
      		pthread_create(&th[2], 0, thr0, (void *)(unsigned long)key);
      		pthread_create(&th[3], 0, thr1, (void *)(unsigned long)key);
      		pthread_join(th[0], 0);
      		pthread_join(th[1], 0);
      		pthread_join(th[2], 0);
      		pthread_join(th[3], 0);
      		return 0;
      	}
      
      Build as:
      
      	cc -o keyctl-race keyctl-race.c -lkeyutils -lpthread
      
      Run as:
      
      	while keyctl-race; do :; done
      
      as it may need several iterations to crash the kernel.  The crash can be
      summarised as:
      
      	BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      	IP: [<ffffffff81279b08>] user_read+0x56/0xa3
      	...
      	Call Trace:
      	 [<ffffffff81276aa9>] keyctl_read_key+0xb6/0xd7
      	 [<ffffffff81277815>] SyS_keyctl+0x83/0xe0
      	 [<ffffffff815dbb97>] entry_SYSCALL_64_fastpath+0x12/0x6f
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      [bwh: Backported to 2.6.32: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      67fbe958
    • Eric Dumazet's avatar
      udp: properly support MSG_PEEK with truncated buffers · 5ccf7b4d
      Eric Dumazet authored
      commit 197c949e upstream.
      
      Backport of this upstream commit into stable kernels :
      89c22d8c ("net: Fix skb csum races when peeking")
      exposed a bug in udp stack vs MSG_PEEK support, when user provides
      a buffer smaller than skb payload.
      
      In this case,
      skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
                                       msg->msg_iov);
      returns -EFAULT.
      
      This bug does not happen in upstream kernels since Al Viro did a great
      job to replace this into :
      skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
      This variant is safe vs short buffers.
      
      For the time being, instead reverting Herbert Xu patch and add back
      skb->ip_summed invalid changes, simply store the result of
      udp_lib_checksum_complete() so that we avoid computing the checksum a
      second time, and avoid the problematic
      skb_copy_and_csum_datagram_iovec() call.
      
      This patch can be applied on recent kernels as it avoids a double
      checksumming, then backported to stable kernels as a bug fix.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 18a6eba2)
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5ccf7b4d
    • Willy Tarreau's avatar
      Revert "net: add length argument to skb_copy_and_csum_datagram_iovec" · d699b47f
      Willy Tarreau authored
      This reverts commit c507639b.
      As reported by Michal Kubecek, this fix doesn't handle truncated
      reads correctly. Next patch from Eric fixes it better.
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d699b47f
    • Ben Hutchings's avatar
      ext4: Fix null dereference in ext4_fill_super() · 6e5577bf
      Ben Hutchings authored
      Fix failure paths in ext4_fill_super() that can lead to a null
      dereference.  This was designated CVE-2015-8324.
      
      Mostly extracted from commit 744692dc ("ext4: use
      ext4_get_block_write in buffer write").
      
      However there's one more incorrect goto to fix, removed upstream in
      commit cf40db13 ("ext4: remove failed journal checksum check").
      
      Reference: https://bugs.openvz.org/browse/OVZ-6541Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6e5577bf
    • Rainer Weikusat's avatar
      unix: avoid use-after-free in ep_remove_wait_queue · 60bc0106
      Rainer Weikusat authored
      commit 7d267278 upstream.
      
      Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
      An AF_UNIX datagram socket being the client in an n:1 association with
      some server socket is only allowed to send messages to the server if the
      receive queue of this socket contains at most sk_max_ack_backlog
      datagrams. This implies that prospective writers might be forced to go
      to sleep despite none of the message presently enqueued on the server
      receive queue were sent by them. In order to ensure that these will be
      woken up once space becomes again available, the present unix_dgram_poll
      routine does a second sock_poll_wait call with the peer_wait wait queue
      of the server socket as queue argument (unix_dgram_recvmsg does a wake
      up on this queue after a datagram was received). This is inherently
      problematic because the server socket is only guaranteed to remain alive
      for as long as the client still holds a reference to it. In case the
      connection is dissolved via connect or by the dead peer detection logic
      in unix_dgram_sendmsg, the server socket may be freed despite "the
      polling mechanism" (in particular, epoll) still has a pointer to the
      corresponding peer_wait queue. There's no way to forcibly deregister a
      wait queue with epoll.
      
      Based on an idea by Jason Baron, the patch below changes the code such
      that a wait_queue_t belonging to the client socket is enqueued on the
      peer_wait queue of the server whenever the peer receive queue full
      condition is detected by either a sendmsg or a poll. A wake up on the
      peer queue is then relayed to the ordinary wait queue of the client
      socket via wake function. The connection to the peer wait queue is again
      dissolved if either a wake up is about to be relayed or the client
      socket reconnects or a dead peer is detected or the client socket is
      itself closed. This enables removing the second sock_poll_wait from
      unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
      that no blocked writer sleeps forever.
      Signed-off-by: default avatarRainer Weikusat <rweikusat@mobileactivedefense.com>
      Fixes: ec0d215f ("af_unix: fix 'poll for write'/connected DGRAM sockets")
      Reviewed-by: default avatarJason Baron <jbaron@akamai.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 2.6.32:
       - Access sk_sleep directly, not through sk_sleep() function
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      60bc0106