1. 21 Apr, 2015 1 commit
  2. 13 Apr, 2015 28 commits
  3. 10 Apr, 2015 11 commits
    • D.S. Ljungmark's avatar
      ipv6: Don't reduce hop limit for an interface · 150193b9
      D.S. Ljungmark authored
      commit 6fd99094 upstream.
      
      A local route may have a lower hop_limit set than global routes do.
      
      RFC 3756, Section 4.2.7, "Parameter Spoofing"
      
      >   1.  The attacker includes a Current Hop Limit of one or another small
      >       number which the attacker knows will cause legitimate packets to
      >       be dropped before they reach their destination.
      
      >   As an example, one possible approach to mitigate this threat is to
      >   ignore very small hop limits.  The nodes could implement a
      >   configurable minimum hop limit, and ignore attempts to set it below
      >   said limit.
      Signed-off-by: default avatarD.S. Ljungmark <ljungmark@modio.se>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      150193b9
    • Tejun Heo's avatar
      writeback: fix possible underflow in write bandwidth calculation · 20a63e55
      Tejun Heo authored
      commit c72efb65 upstream.
      
      From 1ebf33901ecc75d9496862dceb1ef0377980587c Mon Sep 17 00:00:00 2001
      From: Tejun Heo <tj@kernel.org>
      Date: Mon, 23 Mar 2015 00:08:19 -0400
      
      2f800fbd ("writeback: fix dirtied pages accounting on redirty")
      introduced account_page_redirty() which reverts stat updates for a
      redirtied page, making BDI_DIRTIED no longer monotonically increasing.
      
      bdi_update_write_bandwidth() uses the delta in BDI_DIRTIED as the
      basis for bandwidth calculation.  While unlikely, since the above
      patch, the newer value may be lower than the recorded past value and
      nunderflow the bandwidth calculation leading to a wild result.
      
      Fix it by subtracing min of the old and new values when calculating
      delta.  AFAIK, there hasn't been any report of it happening but the
      resulting erratic behavior would be non-critical and temporary, so
      it's possible that the issue is happening without being reported.  The
      risk of the fix is very low, so tagged for -stable.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Greg Thelen <gthelen@google.com>
      Fixes: 2f800fbd ("writeback: fix dirtied pages accounting on redirty")
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      20a63e55
    • Vineet Gupta's avatar
      ARC: SA_SIGINFO ucontext regs off-by-one · 9084ed5e
      Vineet Gupta authored
      commit 6914e1e3 upstream.
      
      The regfile provided to SA_SIGINFO signal handler as ucontext was off by
      one due to pt_regs gutter cleanups in 2013.
      
      Before handling signal, user pt_regs are copied onto user_regs_struct and copied
      back later. Both structs are binary compatible. This was all fine until
      commit 2fa91904 (ARC: pt_regs update #2) which removed the empty stack slot
      at top of pt_regs (corresponding to first pad) and made the corresponding
      fixup in struct user_regs_struct (the pad in there was moved out of
      @scratch - not removed altogether as it is part of ptrace ABI)
      
       struct user_regs_struct {
      +       long pad;
              struct {
      -               long pad;
                      long bta, lp_start, lp_end,....
              } scratch;
       ...
       }
      
      This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and
      signal code needs to user_regs_struct.scratch to reflect it as pt_regs,
      which is what this commit does.
      
      This problem was hidden for 2 years, because both save/restore, despite
      using wrong location, were using the same location. Only an interim
      inspection (reproducer below) exposed the issue.
      
           void handle_segv(int signo, siginfo_t *info, void *context)
           {
       	ucontext_t *uc = context;
      	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);
      
      	printf("regs %x %x\n",               <=== prints 7 8 (vs. 8 9)
                     regs->scratch.r8, regs->scratch.r9);
           }
      
           int main()
           {
      	struct sigaction sa;
      
      	sa.sa_sigaction = handle_segv;
      	sa.sa_flags = SA_SIGINFO;
      	sigemptyset(&sa.sa_mask);
      	sigaction(SIGSEGV, &sa, NULL);
      
      	asm volatile(
      	"mov	r7, 7	\n"
      	"mov	r8, 8	\n"
      	"mov	r9, 9	\n"
      	"mov	r10, 10	\n"
      	:::"r7","r8","r9","r10");
      
      	*((unsigned int*)0x10) = 0;
           }
      
      Fixes: 2fa91904 "ARC: pt_regs update #2: Remove unused gutter at start of pt_regs"
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      9084ed5e
    • Sergei Antonov's avatar
      hfsplus: fix B-tree corruption after insertion at position 0 · c26a3928
      Sergei Antonov authored
      commit 98cf21c6 upstream.
      
      Fix B-tree corruption when a new record is inserted at position 0 in the
      node in hfs_brec_insert().  In this case a hfs_brec_update_parent() is
      called to update the parent index node (if exists) and it is passed
      hfs_find_data with a search_key containing a newly inserted key instead
      of the key to be updated.  This results in an inconsistent index node.
      The bug reproduces on my machine after an extents overflow record for
      the catalog file (CNID=4) is inserted into the extents overflow B-tree.
      Because of a low (reserved) value of CNID=4, it has to become the first
      record in the first leaf node.
      
      The resulting first leaf node is correct:
      
        ----------------------------------------------------
        | key0.CNID=4 | key1.CNID=123 | key2.CNID=456, ... |
        ----------------------------------------------------
      
      But the parent index key0 still contains the previous key CNID=123:
      
        -----------------------
        | key0.CNID=123 | ... |
        -----------------------
      
      A change in hfs_brec_insert() makes hfs_brec_update_parent() work
      correctly by preventing it from getting fd->record=-1 value from
      __hfs_brec_find().
      
      Along the way, I removed duplicate code with unification of the if
      condition.  The resulting code is equivalent to the original code
      because node is never 0.
      
      Also hfs_brec_update_parent() will now return an error after getting a
      negative fd->record value.  However, the return value of
      hfs_brec_update_parent() is not checked anywhere in the file and I'm
      leaving it unchanged by this patch.  brec.c lacks error checking after
      some other calls too, but this issue is of less importance than the one
      being fixed by this patch.
      Signed-off-by: default avatarSergei Antonov <saproj@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Reviewed-by: default avatarVyacheslav Dubeyko <slava@dubeyko.com>
      Acked-by: default avatarHin-Tak Leung <htl10@users.sourceforge.net>
      Cc: Anton Altaparmakov <aia21@cam.ac.uk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      c26a3928
    • Gu Zheng's avatar
      mm/memory hotplug: postpone the reset of obsolete pgdat · 9270aa9e
      Gu Zheng authored
      commit b0dc3a34 upstream.
      
      Qiu Xishi reported the following BUG when testing hot-add/hot-remove node under
      stress condition:
      
        BUG: unable to handle kernel paging request at 0000000000025f60
        IP: next_online_pgdat+0x1/0x50
        PGD 0
        Oops: 0000 [#1] SMP
        ACPI: Device does not support D3cold
        Modules linked in: fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod coretemp mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr microcode igb dca i2c_algo_bit ipv6 megaraid_sas iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support tg3 sg hwmon ptp lpc_ich pps_core mfd_core acpi_pad rtc_cmos button ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh ahci libahci libata scsi_mod [last unloaded: rasf]
        CPU: 23 PID: 238 Comm: kworker/23:1 Tainted: G           O 3.10.15-5885-euler0302 #1
        Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei N1, BIOS V100R001 03/02/2015
        Workqueue: events vmstat_update
        task: ffffa800d32c0000 ti: ffffa800d32ae000 task.ti: ffffa800d32ae000
        RIP: 0010: next_online_pgdat+0x1/0x50
        RSP: 0018:ffffa800d32afce8  EFLAGS: 00010286
        RAX: 0000000000001440 RBX: ffffffff81da53b8 RCX: 0000000000000082
        RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000000
        RBP: ffffa800d32afd28 R08: ffffffff81c93bfc R09: ffffffff81cbdc96
        R10: 00000000000040ec R11: 00000000000000a0 R12: ffffa800fffb3440
        R13: ffffa800d32afd38 R14: 0000000000000017 R15: ffffa800e6616800
        FS:  0000000000000000(0000) GS:ffffa800e6600000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000025f60 CR3: 0000000001a0b000 CR4: 00000000001407e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
          refresh_cpu_vm_stats+0xd0/0x140
          vmstat_update+0x11/0x50
          process_one_work+0x194/0x3d0
          worker_thread+0x12b/0x410
          kthread+0xc6/0xd0
          ret_from_fork+0x7c/0xb0
      
      The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of
      try_offline_node, which will reset all the content of pgdat to 0, as the
      pgdat is accessed lock-free, so that the users still using the pgdat
      will panic, such as the vmstat_update routine.
      
      process A:				offline node XX:
      
      vmstat_updat()
         refresh_cpu_vm_stats()
           for_each_populated_zone()
             find online node XX
           cond_resched()
      					offline cpu and memory, then try_offline_node()
      					node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
             zone = next_zone(zone)
               pg_data_t *pgdat = zone->zone_pgdat;  // here pgdat is NULL now
                 next_online_pgdat(pgdat)
                   next_online_node(pgdat->node_id);  // NULL pointer access
      
      So the solution here is postponing the reset of obsolete pgdat from
      try_offline_node() to hotadd_new_pgdat(), and just resetting
      pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset
      0 to avoid breaking pointer information in pgdat.
      Signed-off-by: default avatarGu Zheng <guz.fnst@cn.fujitsu.com>
      Reported-by: default avatarXishi Qiu <qiuxishi@huawei.com>
      Suggested-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      9270aa9e
    • Leon Yu's avatar
      mm: fix anon_vma->degree underflow in anon_vma endless growing prevention · 2ac1a108
      Leon Yu authored
      commit 3fe89b3e upstream.
      
      I have constantly stumbled upon "kernel BUG at mm/rmap.c:399!" after
      upgrading to 3.19 and had no luck with 4.0-rc1 neither.
      
      So, after looking into new logic introduced by commit 7a3ef208 ("mm:
      prevent endless growth of anon_vma hierarchy"), I found chances are that
      unlink_anon_vmas() is called without incrementing dst->anon_vma->degree
      in anon_vma_clone() due to allocation failure.  If dst->anon_vma is not
      NULL in error path, its degree will be incorrectly decremented in
      unlink_anon_vmas() and eventually underflow when exiting as a result of
      another call to unlink_anon_vmas().  That's how "kernel BUG at
      mm/rmap.c:399!" is triggered for me.
      
      This patch fixes the underflow by dropping dst->anon_vma when allocation
      fails.  It's safe to do so regardless of original value of dst->anon_vma
      because dst->anon_vma doesn't have valid meaning if anon_vma_clone()
      fails.  Besides, callers don't care dst->anon_vma in such case neither.
      
      Also suggested by Michal Hocko, we can clean up vma_adjust() a bit as
      anon_vma_clone() now does the work.
      
      [akpm@linux-foundation.org: tweak comment]
      Fixes: 7a3ef208 ("mm: prevent endless growth of anon_vma hierarchy")
      Signed-off-by: default avatarLeon Yu <chianglungyu@gmail.com>
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      2ac1a108
    • Joe Perches's avatar
      selinux: fix sel_write_enforce broken return value · d4af55bd
      Joe Perches authored
      commit 6436a123 upstream.
      
      Return a negative error value like the rest of the entries in this function.
      Signed-off-by: default avatarJoe Perches <joe@perches.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      [PM: tweaked subject line]
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      d4af55bd
    • Catalin Marinas's avatar
      arm64: Use the reserved TTBR0 if context switching to the init_mm · 69528942
      Catalin Marinas authored
      commit e53f21bc upstream.
      
      The idle_task_exit() function may call switch_mm() with next ==
      &init_mm. On arm64, init_mm.pgd cannot be used for user mappings, so
      this patch simply sets the reserved TTBR0.
      Reported-by: default avatarJon Medhurst (Tixy) <tixy@linaro.org>
      Tested-by: default avatarJon Medhurst (Tixy) <tixy@linaro.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      69528942
    • Sebastian Wicki's avatar
      ALSA: hda - Add dock support for Thinkpad T450s (17aa:5036) · a52ec488
      Sebastian Wicki authored
      commit 80b311d3 upstream.
      
      This model uses the same dock port as the previous generation.
      Signed-off-by: default avatarSebastian Wicki <gandro@gmx.net>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      a52ec488
    • Brian Silverman's avatar
      sched: Fix RLIMIT_RTTIME when PI-boosting to RT · 10b0c9d9
      Brian Silverman authored
      commit 746db944 upstream.
      
      When non-realtime tasks get priority-inheritance boosted to a realtime
      scheduling class, RLIMIT_RTTIME starts to apply to them. However, the
      counter used for checking this (the same one used for SCHED_RR
      timeslices) was not getting reset. This meant that tasks running with a
      non-realtime scheduling class which are repeatedly boosted to a realtime
      one, but never block while they are running realtime, eventually hit the
      timeout without ever running for a time over the limit. This patch
      resets the realtime timeslice counter when un-PI-boosting from an RT to
      a non-RT scheduling class.
      
      I have some test code with two threads and a shared PTHREAD_PRIO_INHERIT
      mutex which induces priority boosting and spins while boosted that gets
      killed by a SIGXCPU on non-fixed kernels but doesn't with this patch
      applied. It happens much faster with a CONFIG_PREEMPT_RT kernel, and
      does happen eventually with PREEMPT_VOLUNTARY kernels.
      Signed-off-by: default avatarBrian Silverman <brian@peloton-tech.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: austin@peloton-tech.com
      Link: http://lkml.kernel.org/r/1424305436-6716-1-git-send-email-brian@peloton-tech.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      10b0c9d9
    • Peter Zijlstra's avatar
      perf: Fix irq_work 'tail' recursion · 6552dfcd
      Peter Zijlstra authored
      commit d525211f upstream.
      
      Vince reported a watchdog lockup like:
      
      	[<ffffffff8115e114>] perf_tp_event+0xc4/0x210
      	[<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
      	[<ffffffff810b7f10>] lock_release+0x130/0x260
      	[<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40
      	[<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80
      	[<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0
      	[<ffffffff811f71ce>] send_sigio+0xae/0x100
      	[<ffffffff811f72b7>] kill_fasync+0x97/0xf0
      	[<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
      	[<ffffffff8115d103>] perf_pending_event+0x33/0x60
      	[<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
      	[<ffffffff8114e448>] irq_work_run+0x18/0x40
      	[<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
      	[<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80
      
      Which is caused by an irq_work generating new irq_work and therefore
      not allowing forward progress.
      
      This happens because processing the perf irq_work triggers another
      perf event (tracepoint stuff) which in turn generates an irq_work ad
      infinitum.
      
      Avoid this by raising the recursion counter in the irq_work -- which
      effectively disables all software events (including tracepoints) from
      actually triggering again.
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Tested-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20150219170311.GH21418@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      6552dfcd