1. 01 Oct, 2019 34 commits
  2. 26 Sep, 2019 6 commits
    • netfilter: conntrack: Use consistent ct id hash calculation · 3c54b129
      Dirk Morris authored
      BugLink: https://bugs.launchpad.net/bugs/1845036
      
      commit 656c8e9c upstream.
      
      Change ct id hash calculation to only use invariants.
      
      Currently the ct id hash calculation is based on some fields that can
      change in the lifetime of a conntrack entry in some corner cases. The
      current hash uses the whole tuple which contains an hlist pointer which
      will change when the conntrack is placed on the dying list resulting in
      a ct id change.
      
      This patch also removes the reply-side tuple and extension pointer from
      the hash calculation so that the ct id will not change from
      initialization until confirmation.
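
      As a rough illustration of hashing only invariants, the sketch below
      uses a hypothetical toy_conn struct and FNV-1a as a placeholder hash.
      The field names, the secret, and the choice of hash are assumptions made
      for the example; the kernel uses a keyed siphash and its own set of
      fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a conntrack entry; not the kernel's struct nf_conn. */
struct toy_conn {
    void *list_node;      /* mutable: changes when the entry moves between lists */
    uint32_t orig_src;    /* invariant: original-direction tuple */
    uint32_t orig_dst;
    uint16_t orig_sport;
    uint16_t orig_dport;
    uint32_t reply_src;   /* mutable: reply side may change before confirmation */
    void *ext;            /* mutable: extension area may be reallocated */
};

/* FNV-1a, used here only as a simple placeholder for a keyed hash. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Derive the id from the original-direction tuple only, skipping mutable fields. */
static uint32_t toy_ct_id(const struct toy_conn *ct, uint32_t secret)
{
    uint32_t h;

    assert(ct != NULL);
    h = 2166136261u ^ secret;
    h = fnv1a(h, &ct->orig_src, sizeof(ct->orig_src));
    h = fnv1a(h, &ct->orig_dst, sizeof(ct->orig_dst));
    h = fnv1a(h, &ct->orig_sport, sizeof(ct->orig_sport));
    h = fnv1a(h, &ct->orig_dport, sizeof(ct->orig_dport));
    return h;
}
```

      With this layout, moving the entry to another list, rewriting the reply
      side, or reallocating the extension area leaves toy_ct_id() unchanged.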
      
      Fixes: 3c791076 ("netfilter: ctnetlink: don't use conntrack/expect object addresses as id")
      Signed-off-by: Dirk Morris <dmorris@metaloft.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
      Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
    • netfilter: ctnetlink: don't use conntrack/expect object addresses as id · ae7b03f6
      Florian Westphal authored
      BugLink: https://bugs.launchpad.net/bugs/1845036
      
      commit 3c791076 upstream.
      
      Otherwise, we leak the addresses to userspace via ctnetlink events
      and dumps.
      
      Compute an ID on demand based on the immutable parts of nf_conn struct.
      
      Another advantage compared to using an address is that there is no
      immediate re-use of the same ID in case the conntrack entry is freed and
      reallocated again immediately.
      
      Fixes: 35832402 ("[NETFILTER]: nf_conntrack_expect: kill unique ID")
      Fixes: 7f85f914 ("[NETFILTER]: nf_conntrack: kill unique ID")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      [bwh: Backported to 4.4: adjust context]
      Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
      Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
    • siphash: implement HalfSipHash1-3 for hash tables · 70ef81e7
      Jason A. Donenfeld authored
      BugLink: https://bugs.launchpad.net/bugs/1845036
      
      commit 1ae2324f upstream.
      
      HalfSipHash, or hsiphash, is a shortened version of SipHash, which
      generates 32-bit outputs using a weaker 64-bit key. It has *much* lower
      security margins, and shouldn't be used for anything too sensitive, but
      it could be used as a hashtable key function replacement, if the output
      is never exposed, and if the security requirement is not too high.
      
      The goal is to make this something that performance-critical jhash users
      would be willing to use.
      
      On 64-bit machines, HalfSipHash1-3 is slower than SipHash1-3, so on those
      systems hsiphash() is aliased to SipHash1-3.
      
      64-bit x86_64:
      [    0.509409] test_siphash:     SipHash2-4 cycles: 4049181
      [    0.510650] test_siphash:     SipHash1-3 cycles: 2512884
      [    0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920
      [    0.512904] test_siphash:    JenkinsHash cycles:  978267
      So, we map hsiphash() -> SipHash1-3
      
      32-bit x86:
      [    0.509868] test_siphash:     SipHash2-4 cycles: 14812892
      [    0.513601] test_siphash:     SipHash1-3 cycles:  9510710
      [    0.515263] test_siphash: HalfSipHash1-3 cycles:  3856157
      [    0.515952] test_siphash:    JenkinsHash cycles:  1148567
      So, we map hsiphash() -> HalfSipHash1-3
      
      hsiphash() is roughly 3 times slower than jhash(), but comes with a
      considerable security improvement.
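
      The selection rule and the "roughly 3 times" figure can be checked
      against the cycle counts quoted above. The sketch below is only
      arithmetic on those numbers; the enum and function names are
      hypothetical and are not part of the siphash API.

```c
#include <assert.h>
#include <stdint.h>

enum { SIPHASH_2_4, SIPHASH_1_3, HALFSIPHASH_1_3, JENKINS, NR_HASHES };

/* Cycle counts copied from the test_siphash output quoted in this message. */
static const uint64_t cycles_x86_64[NR_HASHES] = {
    4049181, 2512884, 3429920, 978267
};
static const uint64_t cycles_x86_32[NR_HASHES] = {
    14812892, 9510710, 3856157, 1148567
};

/* Pick whichever of SipHash1-3 and HalfSipHash1-3 needed fewer cycles. */
static int pick_hsiphash_backend(const uint64_t c[NR_HASHES])
{
    assert(c[JENKINS] != 0);
    return c[SIPHASH_1_3] <= c[HALFSIPHASH_1_3] ? SIPHASH_1_3
                                                : HALFSIPHASH_1_3;
}

/* Cost of the chosen backend relative to JenkinsHash on the same machine. */
static double slowdown_vs_jhash(const uint64_t c[NR_HASHES])
{
    return (double)c[pick_hsiphash_backend(c)] / (double)c[JENKINS];
}
```

      On these numbers the chosen backend costs about 2.6 times jhash on
      x86_64 and about 3.4 times jhash on 32-bit x86.
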

      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      [bwh: Backported to 4.4 to avoid regression for WireGuard with only half
       the siphash API present]
      Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
      Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
    • net: arc_emac: fix koops caused by sk_buff free · ffe4cd53
      Alexander Kochetkov authored
      BugLink: https://bugs.launchpad.net/bugs/1845036
      
      commit c278c253 upstream.
      
      There is a race between arc_emac_tx() and arc_emac_tx_clean(): the
      sk_buff can be freed by arc_emac_tx_clean() while arc_emac_tx() is
      still submitting it.

      Before freeing an sk_buff, arc_emac_tx_clean() checks:
          if ((info & FOR_EMAC) || !txbd->data)
              break;
          ...
          dev_kfree_skb_irq(skb);
      
      If the condition is false, arc_emac_tx_clean() frees the sk_buff.
      
      To submit a txbd, arc_emac_tx() does:
          priv->tx_buff[*txbd_curr].skb = skb;
          ...
          priv->txbd[*txbd_curr].data = cpu_to_le32(addr);
          ...
          ...  <== arc_emac_tx_clean() check condition here
          ...  <== (info & FOR_EMAC) is false
          ...  <== !txbd->data is false
          ...
          *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
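
      The window can be replayed step by step in user space. The sketch below
      is a single-threaded model under stated assumptions: toy_slot,
      tx_advance() and clean_would_free() are hypothetical names, the FOR_EMAC
      value is illustrative, and FIRST_OR_LAST_MASK is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FOR_EMAC 0x80000000u /* illustrative ownership bit, assumed value */

struct toy_txbd { uint32_t info; uint32_t data; };
struct toy_slot { struct toy_txbd bd; void *skb; };

enum tx_step { TX_SET_SKB, TX_SET_DATA, TX_SET_INFO, TX_DONE };

/*
 * Mirrors the check quoted above: cleaning stops while the descriptor is
 * owned by the EMAC or has no buffer. Returns true when cleaning would go
 * on to free the skb.
 */
static bool clean_would_free(const struct toy_slot *s)
{
    assert(s != NULL);
    if ((s->bd.info & FOR_EMAC) || !s->bd.data)
        return false;
    return true;
}

/* Perform one step of the submit sequence, in the order quoted above. */
static enum tx_step tx_advance(struct toy_slot *s, enum tx_step step,
                               void *skb, uint32_t addr, uint32_t len)
{
    switch (step) {
    case TX_SET_SKB:
        s->skb = skb;
        return TX_SET_DATA;
    case TX_SET_DATA:
        s->bd.data = addr;
        return TX_SET_INFO;
    case TX_SET_INFO:
        s->bd.info = FOR_EMAC | len;
        return TX_DONE;
    default:
        return TX_DONE;
    }
}
```

      Calling clean_would_free() after the TX_SET_DATA step and before the
      TX_SET_INFO step returns true, which is the window marked in the
      listing above.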
      
      To reproduce the situation,
      run on the device:
          # iperf -s
      run on host:
          # iperf -t 600 -c <device-ip-addr>
      
      [   28.396284] ------------[ cut here ]------------
      [   28.400912] kernel BUG at .../net/core/skbuff.c:1355!
      [   28.414019] Internal error: Oops - BUG: 0 [#1] SMP ARM
      [   28.419150] Modules linked in:
      [   28.422219] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G    B           4.4.0+ #120
      [   28.429516] Hardware name: Rockchip (Device Tree)
      [   28.434216] task: c0665070 ti: c0660000 task.ti: c0660000
      [   28.439622] PC is at skb_put+0x10/0x54
      [   28.443381] LR is at arc_emac_poll+0x260/0x474
      [   28.447821] pc : [<c03af580>]    lr : [<c028fec4>]    psr: a0070113
      [   28.447821] sp : c0661e58  ip : eea68502  fp : ef377000
      [   28.459280] r10: 0000012c  r9 : f08b2000  r8 : eeb57100
      [   28.464498] r7 : 00000000  r6 : ef376594  r5 : 00000077  r4 : ef376000
      [   28.471015] r3 : 0030488b  r2 : ef13e880  r1 : 000005ee  r0 : eeb57100
      [   28.477534] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [   28.484658] Control: 10c5387d  Table: 8eaf004a  DAC: 00000051
      [   28.490396] Process swapper/0 (pid: 0, stack limit = 0xc0660210)
      [   28.496393] Stack: (0xc0661e58 to 0xc0662000)
      [   28.500745] 1e40:                                                       00000002 00000000
      [   28.508913] 1e60: 00000000 ef376520 00000028 f08b23b8 00000000 ef376520 ef7b6900 c028fc64
      [   28.517082] 1e80: 2f158000 c0661ea8 c0661eb0 0000012c c065e900 c03bdeac ffff95e9 c0662100
      [   28.525250] 1ea0: c0663924 00000028 c0661ea8 c0661ea8 c0661eb0 c0661eb0 0000001e c0660000
      [   28.533417] 1ec0: 40000003 00000008 c0695a00 0000000a c066208c 00000100 c0661ee0 c0027410
      [   28.541584] 1ee0: ef0fb700 2f158000 00200000 ffff95e8 00000004 c0662100 c0662080 00000003
      [   28.549751] 1f00: 00000000 00000000 00000000 c065b45c 0000001e ef005000 c0647a30 00000000
      [   28.557919] 1f20: 00000000 c0027798 00000000 c005cf40 f0802100 c0662ffc c0661f60 f0803100
      [   28.566088] 1f40: c0661fb8 c00093bc c000ffb4 60070013 ffffffff c0661f94 c0661fb8 c00137d4
      [   28.574267] 1f60: 00000001 00000000 00000000 c001ffa0 00000000 c0660000 00000000 c065a364
      [   28.582441] 1f80: c0661fb8 c0647a30 00000000 00000000 00000000 c0661fb0 c000ffb0 c000ffb4
      [   28.590608] 1fa0: 60070013 ffffffff 00000051 00000000 00000000 c005496c c0662400 c061bc40
      [   28.598776] 1fc0: ffffffff ffffffff 00000000 c061b680 00000000 c0647a30 00000000 c0695294
      [   28.606943] 1fe0: c0662488 c0647a2c c066619c 6000406a 413fc090 6000807c 00000000 00000000
      [   28.615127] [<c03af580>] (skb_put) from [<ef376520>] (0xef376520)
      [   28.621218] Code: e5902054 e590c090 e3520000 0a000000 (e7f001f2)
      [   28.627307] ---[ end trace 4824734e2243fdb6 ]---
      
      [   34.377068] Internal error: Oops: 17 [#1] SMP ARM
      [   34.382854] Modules linked in:
      [   34.385947] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.4.0+ #120
      [   34.392219] Hardware name: Rockchip (Device Tree)
      [   34.396937] task: ef02d040 ti: ef05c000 task.ti: ef05c000
      [   34.402376] PC is at __dev_kfree_skb_irq+0x4/0x80
      [   34.407121] LR is at arc_emac_poll+0x130/0x474
      [   34.411583] pc : [<c03bb640>]    lr : [<c028fd94>]    psr: 60030013
      [   34.411583] sp : ef05de68  ip : 0008e83c  fp : ef377000
      [   34.423062] r10: c001bec4  r9 : 00000000  r8 : f08b24c8
      [   34.428296] r7 : f08b2400  r6 : 00000075  r5 : 00000019  r4 : ef376000
      [   34.434827] r3 : 00060000  r2 : 00000042  r1 : 00000001  r0 : 00000000
      [   34.441365] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [   34.448507] Control: 10c5387d  Table: 8f25c04a  DAC: 00000051
      [   34.454262] Process ksoftirqd/0 (pid: 3, stack limit = 0xef05c210)
      [   34.460449] Stack: (0xef05de68 to 0xef05e000)
      [   34.464827] de60:                   ef376000 c028fd94 00000000 c0669480 c0669480 ef376520
      [   34.473022] de80: 00000028 00000001 00002ae4 ef376520 ef7b6900 c028fc64 2f158000 ef05dec0
      [   34.481215] dea0: ef05dec8 0000012c c065e900 c03bdeac ffff983f c0662100 c0663924 00000028
      [   34.489409] dec0: ef05dec0 ef05dec0 ef05dec8 ef05dec8 ef7b6000 ef05c000 40000003 00000008
      [   34.497600] dee0: c0695a00 0000000a c066208c 00000100 ef05def8 c0027410 ef7b6000 40000000
      [   34.505795] df00: 04208040 ffff983e 00000004 c0662100 c0662080 00000003 ef05c000 ef027340
      [   34.513985] df20: ef05c000 c0666c2c 00000000 00000001 00000002 00000000 00000000 c0027568
      [   34.522176] df40: ef027340 c003ef48 ef027300 00000000 ef027340 c003edd4 00000000 00000000
      [   34.530367] df60: 00000000 c003c37c ffffff7f 00000001 00000000 ef027340 00000000 00030003
      [   34.538559] df80: ef05df80 ef05df80 00000000 00000000 ef05df90 ef05df90 ef05dfac ef027300
      [   34.546750] dfa0: c003c2a4 00000000 00000000 c000f578 00000000 00000000 00000000 00000000
      [   34.554939] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [   34.563129] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff dfff7fff
      [   34.571360] [<c03bb640>] (__dev_kfree_skb_irq) from [<c028fd94>] (arc_emac_poll+0x130/0x474)
      [   34.579840] [<c028fd94>] (arc_emac_poll) from [<c03bdeac>] (net_rx_action+0xdc/0x28c)
      [   34.587712] [<c03bdeac>] (net_rx_action) from [<c0027410>] (__do_softirq+0xcc/0x1f8)
      [   34.595482] [<c0027410>] (__do_softirq) from [<c0027568>] (run_ksoftirqd+0x2c/0x50)
      [   34.603168] [<c0027568>] (run_ksoftirqd) from [<c003ef48>] (smpboot_thread_fn+0x174/0x18c)
      [   34.611466] [<c003ef48>] (smpboot_thread_fn) from [<c003c37c>] (kthread+0xd8/0xec)
      [   34.619075] [<c003c37c>] (kthread) from [<c000f578>] (ret_from_fork+0x14/0x3c)
      [   34.626317] Code: e8bd8010 e3a00000 e12fff1e e92d4010 (e59030a4)
      [   34.632572] ---[ end trace cca5a3d86a82249a ]---

      Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
      Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
    • cgroup: Disable IRQs while holding css_set_lock · 2a68165d
      Daniel Bristot de Oliveira authored
      BugLink: https://bugs.launchpad.net/bugs/1845036
      
      commit 82d6489d upstream.
      
      While testing the deadline scheduler + cgroup setup I hit this
      warning.
      
      [  132.612935] ------------[ cut here ]------------
      [  132.612951] WARNING: CPU: 5 PID: 0 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x80
      [  132.612952] Modules linked in: (a ton of modules...)
      [  132.612981] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc2 #2
      [  132.612981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
      [  132.612982]  0000000000000086 45c8bb5effdd088b ffff88013fd43da0 ffffffff813d229e
      [  132.612984]  0000000000000000 0000000000000000 ffff88013fd43de0 ffffffff810a652b
      [  132.612985]  00000096811387b5 0000000000000200 ffff8800bab29d80 ffff880034c54c00
      [  132.612986] Call Trace:
      [  132.612987]  <IRQ>  [<ffffffff813d229e>] dump_stack+0x63/0x85
      [  132.612994]  [<ffffffff810a652b>] __warn+0xcb/0xf0
      [  132.612997]  [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170
      [  132.612999]  [<ffffffff810a665d>] warn_slowpath_null+0x1d/0x20
      [  132.613000]  [<ffffffff810aba5b>] __local_bh_enable_ip+0x6b/0x80
      [  132.613008]  [<ffffffff817d6c8a>] _raw_write_unlock_bh+0x1a/0x20
      [  132.613010]  [<ffffffff817d6c9e>] _raw_spin_unlock_bh+0xe/0x10
      [  132.613015]  [<ffffffff811388ac>] put_css_set+0x5c/0x60
      [  132.613016]  [<ffffffff8113dc7f>] cgroup_free+0x7f/0xa0
      [  132.613017]  [<ffffffff810a3912>] __put_task_struct+0x42/0x140
      [  132.613018]  [<ffffffff810e776a>] dl_task_timer+0xca/0x250
      [  132.613027]  [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170
      [  132.613030]  [<ffffffff8111371e>] __hrtimer_run_queues+0xee/0x270
      [  132.613031]  [<ffffffff81113ec8>] hrtimer_interrupt+0xa8/0x190
      [  132.613034]  [<ffffffff81051a58>] local_apic_timer_interrupt+0x38/0x60
      [  132.613035]  [<ffffffff817d9b0d>] smp_apic_timer_interrupt+0x3d/0x50
      [  132.613037]  [<ffffffff817d7c5c>] apic_timer_interrupt+0x8c/0xa0
      [  132.613038]  <EOI>  [<ffffffff81063466>] ? native_safe_halt+0x6/0x10
      [  132.613043]  [<ffffffff81037a4e>] default_idle+0x1e/0xd0
      [  132.613044]  [<ffffffff810381cf>] arch_cpu_idle+0xf/0x20
      [  132.613046]  [<ffffffff810e8fda>] default_idle_call+0x2a/0x40
      [  132.613047]  [<ffffffff810e92d7>] cpu_startup_entry+0x2e7/0x340
      [  132.613048]  [<ffffffff81050235>] start_secondary+0x155/0x190
      [  132.613049] ---[ end trace f91934d162ce9977 ]---
      
      The warning is triggered by spin_(lock|unlock)_bh(&css_set_lock) being
      called in interrupt context. Convert the spin_lock_bh to
      spin_lock_irq(save) to avoid this problem, and other problems of
      sharing a spinlock with an interrupt.
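
      A toy user-space model shows why the _bh variant is the wrong tool
      here. It assumes the warning fires when bottom halves are re-enabled
      in hard-IRQ context or with IRQs disabled; struct toy_cpu and every
      toy_* function are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state for the model. */
struct toy_cpu {
    bool in_hardirq;
    bool irqs_disabled;
    int bh_disable_depth;
    int warnings;
};

static void toy_local_bh_disable(struct toy_cpu *cpu)
{
    cpu->bh_disable_depth++;
}

/* Assumed rule: re-enabling BHs from hard-IRQ context (or with IRQs off) warns. */
static void toy_local_bh_enable(struct toy_cpu *cpu)
{
    if (cpu->in_hardirq || cpu->irqs_disabled)
        cpu->warnings++;
    cpu->bh_disable_depth--;
}

static bool toy_irq_save(struct toy_cpu *cpu)
{
    bool flags = cpu->irqs_disabled;

    cpu->irqs_disabled = true;
    return flags;
}

static void toy_irq_restore(struct toy_cpu *cpu, bool flags)
{
    cpu->irqs_disabled = flags;
}

/* Lock/unlock pair in the style of spin_lock_bh()/spin_unlock_bh(). */
static void toy_put_css_set_bh(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    toy_local_bh_disable(cpu);
    /* critical section */
    toy_local_bh_enable(cpu);
}

/* Lock/unlock pair in the style of spin_lock_irqsave()/spin_unlock_irqrestore(). */
static void toy_put_css_set_irqsave(struct toy_cpu *cpu)
{
    bool flags;

    assert(cpu != NULL);
    flags = toy_irq_save(cpu);
    /* critical section */
    toy_irq_restore(cpu, flags);
}
```

      In the model, calling toy_put_css_set_bh() with in_hardirq set records
      a warning, while toy_put_css_set_irqsave() does not and restores the
      previous IRQ state on exit.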
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: cgroups@vger.kernel.org
      Cc: stable@vger.kernel.org # 4.5+
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Acked-by: Zefan Li <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Zubin Mithra <zsm@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
      Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
    • dm table: fix invalid memory accesses with too high sector number · da070c47
      Mikulas Patocka authored
      BugLink: https://bugs.launchpad.net/bugs/1845036
      
      commit 1cfd5d33 upstream.
      
      If the sector number is too high, dm_table_find_target() should return a
      pointer to a zeroed dm_target structure (the caller should test it with
      dm_target_is_valid).
      
      However, for some table sizes, the code in dm_table_find_target() that
      performs the btree lookup will access out-of-bounds memory.
      
      Fix this bug by testing the sector number at the beginning of
      dm_table_find_target(). Also, add an "inline" keyword to the function
      dm_table_get_size() because this is a hot path.
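
      The shape of the fix can be sketched in user space. This is a minimal
      model, not the dm code: toy_table, toy_target and the toy_* functions
      are hypothetical, validity is modelled as a non-zero length, and a
      linear scan over the per-target upper bounds stands in for the btree
      lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sector_t;

struct toy_target { sector_t begin; sector_t len; };

struct toy_table {
    unsigned int num_targets;
    const sector_t *highs;            /* last sector covered by each target */
    const struct toy_target *targets;
};

/* Zeroed target returned for out-of-range sectors. */
static const struct toy_target toy_invalid_target = { 0, 0 };

static bool toy_target_is_valid(const struct toy_target *ti)
{
    return ti->len != 0;
}

static inline sector_t toy_table_get_size(const struct toy_table *t)
{
    return t->num_targets ? t->highs[t->num_targets - 1] + 1 : 0;
}

static const struct toy_target *toy_find_target(const struct toy_table *t,
                                                sector_t sector)
{
    unsigned int i = 0;

    assert(t != NULL);

    /* Test the sector first, so the lookup below never runs out of range. */
    if (sector >= toy_table_get_size(t))
        return &toy_invalid_target;

    /* Linear scan as a stand-in for the btree lookup. */
    while (t->highs[i] < sector)
        i++;
    return &t->targets[i];
}
```

      Because the range test comes first, the scan only runs for sectors no
      greater than the last entry of highs, so it always stops inside the
      array.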
      
      Fixes: 512875bd ("dm: table detect io beyond device")
      Cc: stable@vger.kernel.org
      Reported-by: Zhang Tao <kontais@zoho.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
      Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>