1. 15 Oct, 2014 9 commits
    • irq_work: Force raised irq work to run on irq work interrupt · 6fd5de08
      Frederic Weisbecker authored
      commit 76a33061 upstream.
      
      The nohz full kick, which restarts the tick when any resource depends
      on it, can't be executed anywhere given the operation it does on timers.
      If it is called from the scheduler or timers code, chances are that
      we run into a deadlock.
      
      This is why we run the nohz full kick from an irq work. That way we make
      sure that the kick runs on a virgin context.
      
      However, while that holds when irq work runs in its own dedicated
      self-IPI, things are different for the many archs that don't support
      the self-triggered way. In order to support them, irq works are also
      handled by the timer interrupt as a fallback.
      
      Now when irq works run on the timer interrupt, the context isn't blank.
      More precisely, they can run in the context of the hrtimer that runs the
      tick. But the nohz kick cancels and restarts this hrtimer, and cancelling
      an hrtimer from itself isn't allowed. This is why we end up in an endless
      loop:
      
      	Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 2
      	CPU: 2 PID: 7538 Comm: kworker/u8:8 Not tainted 3.16.0+ #34
      	Workqueue: btrfs-endio-write normal_work_helper [btrfs]
      	 ffff880244c06c88 000000001b486fe1 ffff880244c06bf0 ffffffff8a7f1e37
      	 ffffffff8ac52a18 ffff880244c06c78 ffffffff8a7ef928 0000000000000010
      	 ffff880244c06c88 ffff880244c06c20 000000001b486fe1 0000000000000000
      	Call Trace:
      	 <NMI>  [<ffffffff8a7f1e37>] dump_stack+0x4e/0x7a
      	 [<ffffffff8a7ef928>] panic+0xd4/0x207
      	 [<ffffffff8a1450e8>] watchdog_overflow_callback+0x118/0x120
      	 [<ffffffff8a186b0e>] __perf_event_overflow+0xae/0x350
      	 [<ffffffff8a184f80>] ? perf_event_task_disable+0xa0/0xa0
      	 [<ffffffff8a01a4cf>] ? x86_perf_event_set_period+0xbf/0x150
      	 [<ffffffff8a187934>] perf_event_overflow+0x14/0x20
      	 [<ffffffff8a020386>] intel_pmu_handle_irq+0x206/0x410
      	 [<ffffffff8a01937b>] perf_event_nmi_handler+0x2b/0x50
      	 [<ffffffff8a007b72>] nmi_handle+0xd2/0x390
      	 [<ffffffff8a007aa5>] ? nmi_handle+0x5/0x390
      	 [<ffffffff8a0cb7f8>] ? match_held_lock+0x8/0x1b0
      	 [<ffffffff8a008062>] default_do_nmi+0x72/0x1c0
      	 [<ffffffff8a008268>] do_nmi+0xb8/0x100
      	 [<ffffffff8a7ff66a>] end_repeat_nmi+0x1e/0x2e
      	 [<ffffffff8a0cb7f8>] ? match_held_lock+0x8/0x1b0
      	 [<ffffffff8a0cb7f8>] ? match_held_lock+0x8/0x1b0
      	 [<ffffffff8a0cb7f8>] ? match_held_lock+0x8/0x1b0
      	 <<EOE>>  <IRQ>  [<ffffffff8a0ccd2f>] lock_acquired+0xaf/0x450
      	 [<ffffffff8a0f74c5>] ? lock_hrtimer_base.isra.20+0x25/0x50
      	 [<ffffffff8a7fc678>] _raw_spin_lock_irqsave+0x78/0x90
      	 [<ffffffff8a0f74c5>] ? lock_hrtimer_base.isra.20+0x25/0x50
      	 [<ffffffff8a0f74c5>] lock_hrtimer_base.isra.20+0x25/0x50
      	 [<ffffffff8a0f7723>] hrtimer_try_to_cancel+0x33/0x1e0
      	 [<ffffffff8a0f78ea>] hrtimer_cancel+0x1a/0x30
      	 [<ffffffff8a109237>] tick_nohz_restart+0x17/0x90
      	 [<ffffffff8a10a213>] __tick_nohz_full_check+0xc3/0x100
      	 [<ffffffff8a10a25e>] nohz_full_kick_work_func+0xe/0x10
      	 [<ffffffff8a17c884>] irq_work_run_list+0x44/0x70
      	 [<ffffffff8a17c8da>] irq_work_run+0x2a/0x50
      	 [<ffffffff8a0f700b>] update_process_times+0x5b/0x70
      	 [<ffffffff8a109005>] tick_sched_handle.isra.21+0x25/0x60
      	 [<ffffffff8a109b81>] tick_sched_timer+0x41/0x60
      	 [<ffffffff8a0f7aa2>] __run_hrtimer+0x72/0x470
      	 [<ffffffff8a109b40>] ? tick_sched_do_timer+0xb0/0xb0
      	 [<ffffffff8a0f8707>] hrtimer_interrupt+0x117/0x270
      	 [<ffffffff8a034357>] local_apic_timer_interrupt+0x37/0x60
      	 [<ffffffff8a80010f>] smp_apic_timer_interrupt+0x3f/0x50
      	 [<ffffffff8a7fe52f>] apic_timer_interrupt+0x6f/0x80
      
      To fix this we force non-lazy irq works to run on irq work self-IPIs
      when available. That ability of the arch to trigger irq work self IPIs
      is available with arch_irq_work_has_interrupt().
      Reported-by: Catalin Iacob <iacobcatalin@gmail.com>
      Reported-by: Dave Jones <davej@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
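      The dispatch rule described in this commit can be sketched as a small
      userspace model. This is not the kernel code: pick_run_ctx() and its
      enum are hypothetical names, the arch_has_self_ipi parameter stands in
      for the result of arch_irq_work_has_interrupt(), and the model assumes
      the tick is still running when the work is queued.

```c
#include <stdbool.h>

/* Where a queued irq work item ends up running in this simplified model. */
enum run_ctx { RUN_FROM_SELF_IPI, RUN_FROM_TICK };

/*
 * Hypothetical stand-in for the decision the commit describes:
 * - lazy work may wait for the next tick;
 * - raised (non-lazy) work uses the self-IPI whenever the architecture
 *   has one, so it never runs inside the tick hrtimer callback there.
 */
static enum run_ctx pick_run_ctx(bool lazy, bool arch_has_self_ipi)
{
	if (lazy)
		return RUN_FROM_TICK;
	return arch_has_self_ipi ? RUN_FROM_SELF_IPI : RUN_FROM_TICK;
}
```

      Only architectures without a self-IPI keep running raised work from the
      timer interrupt in this model, which matches the fallback the message
      mentions.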
    • irq_work: Introduce arch_irq_work_has_interrupt() · b677a767
      Peter Zijlstra authored
      commit c5c38ef3 upstream.
      
      The nohz full code needs irq work to trigger its own interrupt so that
      the subsystem can work even when the tick is stopped.
      
      Let's introduce arch_irq_work_has_interrupt(), which archs can override
      to report whether they support this ability.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net_sched: copy exts->type in tcf_exts_change() · 3bc0d335
      WANG Cong authored
      [ Upstream commit 5301e3e1 ]
      
      We need to copy exts->type when committing the change, otherwise
      it would always be 0. This is a quick fix for -net and -stable;
      for net-next, tcf_exts will be removed.
      
      Fixes: commit 33be6271 ("net_sched: act: use standard struct list_head")
      Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • 3c59x: fix bad split of cpu_to_le32(pci_map_single()) · f823cda9
      Sylvain "ythier" Hitier authored
      [ Upstream commit 88b09a6d ]
      
      In commit 6f2b6a30,
        # 3c59x: Add dma error checking and recovery
      the intent is to split out the mapping from the byte-swapping in order to
      insert a dma_mapping_error() check.
      
      Kinda this semantic patch:
      
          // See http://coccinelle.lip6.fr/
          //
          // Beware, grouik-and-dirty!
          @@
          expression DEV, X, Y, Z;
          @@
          -   cpu_to_le32(pci_map_single(DEV, X, Y, Z))
          +   dma_addr_t addr = pci_map_single(DEV, X, Y, Z);
          +   if (dma_mapping_error(&DEV->dev, addr))
          +       /* snip */;
          +   cpu_to_le32(addr)
      
      However, the #else part (of the #if DO_ZEROCOPY test) is changed this way:
      
          -   cpu_to_le32(pci_map_single(DEV, X, Y, Z))
          +   dma_addr_t addr = cpu_to_le32(pci_map_single(DEV, X, Y, Z));
          //                    ^^^^^^^^^^^
          //                    That mismatches the 3 other changes!
          +   if (dma_mapping_error(&DEV->dev, addr))
          +       /* snip */;
          +   cpu_to_le32(addr)
      
      Let's remove the leftover cpu_to_le32() for coherency.
      
      v2: Better changelog.
      v3: Add Acked-by
      
      Fixes: 6f2b6a30
        # 3c59x: Add dma error checking and recovery
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Sylvain "ythier" Hitier <sylvain.hitier@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
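      To see why the leftover cpu_to_le32() matters, here is a userspace
      sketch. All names are hypothetical, and the byte-swap helper models
      cpu_to_le32() as it would behave on a big-endian CPU (on little-endian
      CPUs the conversion is a no-op, which is presumably why the mismatch
      can go unnoticed there). Addresses are narrowed to 32 bits for
      illustration.

```c
#include <stdint.h>

/* Byte-swap helper; models cpu_to_le32() as it behaves on a big-endian CPU. */
static uint32_t model_cpu_to_le32_on_be(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Intended split: keep the raw address, swap once for the descriptor. */
static uint32_t desc_addr_fixed(uint32_t dma_addr)
{
	uint32_t addr = dma_addr;             /* raw handle, safe to error-check */
	return model_cpu_to_le32_on_be(addr); /* single swap */
}

/* Buggy split: swapped when stored, then swapped again for the descriptor. */
static uint32_t desc_addr_buggy(uint32_t dma_addr)
{
	uint32_t addr = model_cpu_to_le32_on_be(dma_addr); /* leftover swap */
	return model_cpu_to_le32_on_be(addr);              /* second swap undoes it */
}
```

      In this model the buggy variant hands the descriptor a CPU-order value
      instead of a little-endian one, and any error check on addr would
      inspect the swapped value rather than the real mapping result.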
    • sctp: handle association restarts when the socket is closed. · 404db050
      Vlad Yasevich authored
      [ Upstream commit bdf6fa52 ]
      
      Currently association restarts do not take into consideration the
      state of the socket.  When a restart happens, the current association
      simply transitions into established state.  This creates a condition
      where a remote system, through the restart procedure, may create a
      local association that is in no way reachable by the user.  The
      conditions to trigger this are as follows:
        1) Remote does not acknowledge some data causing data to remain
           outstanding.
        2) Local application calls close() on the socket.  Since data
           is still outstanding, the association is placed in SHUTDOWN_PENDING
           state.  However, the socket is closed.
        3) The remote tries to create a new association, triggering a restart
           on the local system.  The association moves from SHUTDOWN_PENDING
           to ESTABLISHED.  At this point, it is no longer reachable by
           any socket on the local system.
      
      This patch addresses the above situation by moving the newly ESTABLISHED
      association into SHUTDOWN-SENT state and bundling a SHUTDOWN after
      the COOKIE-ACK chunk.  This way, the restarted association immediately
      enters the shutdown procedure and forces the termination of the
      unreachable association.
      Reported-by: David Laight <David.Laight@aculab.com>
      Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
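      The state handling described in this commit can be sketched as a tiny
      transition function. This is an illustrative model only: the enum,
      struct and function names are hypothetical and do not come from the
      SCTP stack, and the socket_closed flag abstracts whatever check the
      real code performs.

```c
#include <stdbool.h>

/* Hypothetical subset of association states, named after the commit text. */
enum model_state {
	MODEL_ESTABLISHED,
	MODEL_SHUTDOWN_PENDING,
	MODEL_SHUTDOWN_SENT,
};

struct model_restart_result {
	enum model_state next;
	bool bundle_shutdown;   /* send SHUTDOWN bundled after COOKIE-ACK */
};

/* Decide where a restarted association goes in this simplified model. */
static struct model_restart_result
model_handle_restart(enum model_state cur, bool socket_closed)
{
	struct model_restart_result r = { MODEL_ESTABLISHED, false };

	/* A closed socket with a pending shutdown must not resurface. */
	if (cur == MODEL_SHUTDOWN_PENDING && socket_closed) {
		r.next = MODEL_SHUTDOWN_SENT;
		r.bundle_shutdown = true;
	}
	return r;
}
```

      Every other combination keeps the earlier behaviour in this model and
      moves to the established state without a bundled SHUTDOWN.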
    • hyperv: Fix a bug in netvsc_send() · ed54569e
      KY Srinivasan authored
      [ Upstream commit 3a67c9cc ]
      
      After the packet is successfully sent, we should not touch the packet
      as it may have been freed. This patch is based on the work done by
      Long Li <longli@microsoft.com>.
      
      David, please queue this up for stable.
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
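      The general pattern behind this kind of fix is to copy whatever is
      still needed out of the packet before handing it off. The sketch below
      is a userspace illustration under that assumption; the struct, the
      q_idx field, the counter array and both functions are hypothetical and
      are not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NUM_QUEUES 4

/* Hypothetical packet; q_idx is a field the sender still needs after sending. */
struct model_packet {
	unsigned int q_idx;
};

static unsigned int queue_sends[MODEL_NUM_QUEUES];

/* Models a lower layer that takes ownership and frees the packet on success. */
static bool model_lower_send(struct model_packet *pkt)
{
	free(pkt);
	return true;
}

/* Copy what is needed before the send; never dereference pkt afterwards. */
static bool model_send(struct model_packet *pkt)
{
	assert(pkt != NULL && pkt->q_idx < MODEL_NUM_QUEUES);

	unsigned int q_idx = pkt->q_idx;   /* saved while pkt is still valid */

	if (!model_lower_send(pkt))
		return false;
	queue_sends[q_idx]++;              /* uses the saved copy, not pkt->q_idx */
	return true;
}
```

      Reading pkt->q_idx after model_lower_send() returns would be a
      use-after-free in this model, which is the class of bug the message
      describes.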
    • team: avoid race condition in scheduling delayed work · c96de3dc
      Joe Lawrence authored
      [ Upstream commit 47549650 ]
      
      When team_notify_peers and team_mcast_rejoin are called, they both reset
      their respective .count_pending atomic variable. Then when the actual
      worker function is executed, the variable is atomically decremented.
      This pattern introduces a potential race condition where the
      .count_pending rolls over and the worker function keeps rescheduling
      until .count_pending decrements to zero again:
      
      THREAD 1                           THREAD 2
      ========                           ========
      team_notify_peers(teamX)
        atomic_set count_pending = 1
        schedule_delayed_work
                                         team_notify_peers(teamX)
                                         atomic_set count_pending = 1
      team_notify_peers_work
        atomic_dec_and_test
          count_pending = 0
        (return)
                                         schedule_delayed_work
                                         team_notify_peers_work
                                         atomic_dec_and_test
                                           count_pending = -1
                                         schedule_delayed_work
                                         (repeat until count_pending = 0)
      
      Instead of assigning a new value to .count_pending, use atomic_add to
      tack-on the additional desired worker function invocations.
      Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      Fixes: fc423ff0 ("team: add peer notification")
      Fixes: 492b200e ("team: add support for sending multicast rejoins")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
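      The interleaving in the diagram can be replayed deterministically in
      userspace with C11 atomics. This is a single-threaded sketch, not the
      driver: count_pending and the helper names are hypothetical stand-ins,
      and scheduling is modelled simply as the order of the calls.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's .count_pending counter. */
static atomic_int count_pending;

/* Models the worker: decrement, and ask to be rescheduled unless we hit zero. */
static bool worker_wants_reschedule(void)
{
	/* atomic_fetch_sub returns the old value; old == 1 means we reached 0. */
	return atomic_fetch_sub(&count_pending, 1) != 1;
}

/* Replays the diagram with "set" semantics; returns the final counter value. */
static int replay_with_set(bool *still_rescheduling)
{
	atomic_store(&count_pending, 0);
	atomic_store(&count_pending, 1);   /* thread 1 requests one run */
	atomic_store(&count_pending, 1);   /* thread 2 overwrites, still 1 */
	(void)worker_wants_reschedule();   /* thread 1's work: 1 -> 0, done */
	*still_rescheduling = worker_wants_reschedule(); /* thread 2's: 0 -> -1 */
	return atomic_load(&count_pending);
}

/* Same interleaving, but each request adds to the counter instead. */
static int replay_with_add(bool *still_rescheduling)
{
	atomic_store(&count_pending, 0);
	atomic_fetch_add(&count_pending, 1); /* thread 1: 0 -> 1 */
	atomic_fetch_add(&count_pending, 1); /* thread 2: 1 -> 2 */
	(void)worker_wants_reschedule();     /* 2 -> 1, would reschedule */
	*still_rescheduling = worker_wants_reschedule(); /* 1 -> 0, done */
	return atomic_load(&count_pending);
}
```

      With "set" semantics the second worker run leaves the counter at -1 and
      still asks to be rescheduled; with "add" semantics the counter reaches
      zero after exactly two runs.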
    • net: systemport: fix bcm_sysport_insert_tsb() · e0ff8275
      Florian Fainelli authored
      [ Upstream commit e87474a6 ]
      
      Similar to commit bc23333b ("net: bcmgenet: fix bcmgenet_put_tx_csum()"),
      we need to return the skb pointer in case we had to reallocate the SKB
      headroom.
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
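      The reason the pointer has to be returned can be shown with a plain
      userspace buffer. This is a sketch under assumptions: struct model_buf
      and model_insert_header() are hypothetical and only mimic the idea of
      a helper that may replace the buffer it was given when there is not
      enough headroom.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer with reserved space in front of the payload. */
struct model_buf {
	size_t headroom;
	size_t len;
	unsigned char *data;   /* payload starts at data + headroom */
};

/*
 * Ensures at least `need` bytes of headroom. May allocate a replacement and
 * free the original, so the caller must continue with the returned pointer.
 * Returns NULL on allocation failure (the original is freed in that case too).
 */
static struct model_buf *model_insert_header(struct model_buf *b, size_t need)
{
	if (b->headroom >= need)
		return b;

	struct model_buf *nb = malloc(sizeof(*nb));
	unsigned char *nd = nb ? malloc(need + b->len) : NULL;

	if (nd) {
		memcpy(nd + need, b->data + b->headroom, b->len);
		nb->headroom = need;
		nb->len = b->len;
		nb->data = nd;
	} else {
		free(nb);
		nb = NULL;
	}
	free(b->data);
	free(b);
	return nb;
}
```

      A caller that kept using its old pointer after the reallocating path
      would touch freed memory, so the call site in this model has to be
      written as b = model_insert_header(b, need).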
    • ip6_gre: fix flowi6_proto value in xmit path · 570f480a
      Nicolas Dichtel authored
      [ Upstream commit 3be07244 ]
      
      In the xmit path, we build a flowi6 which will be used for the output route
      lookup. We are sending a GRE packet, not an IPv4- or IPv6-encapsulated packet,
      so the protocol should be IPPROTO_GRE.
      
      Fixes: c12b395a ("gre: Support GRE over IPv6")
      Reported-by: Matthieu Ternisien d'Ouville <matthieu.tdo@6wind.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 05 Oct, 2014 2 commits
  3. 04 Oct, 2014 1 commit
  4. 03 Oct, 2014 13 commits
  5. 02 Oct, 2014 15 commits