1. 31 May, 2014 1 commit
    • Alexei Starovoitov's avatar
      net: filter: x86: fix JIT address randomization · 070a08cd
      Alexei Starovoitov authored
      [ Upstream commit 773cd38f ]
      
      bpf_alloc_binary() adds 128 bytes of room to JITed program image
      and rounds it up to the nearest page size. If image size is close
      to page size (like 4000), it is rounded to two pages:
      round_up(4000 + 4 + 128) == 8192
      then 'hole' is computed as 8192 - (4000 + 4) = 4188
      If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(*header)
      then kernel will crash during bpf_jit_free():
      
      kernel BUG at arch/x86/mm/pageattr.c:887!
      Call Trace:
       [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460
       [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50
       [<ffffffff810378ff>] set_memory_rw+0x2f/0x40
       [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60
       [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0
       [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0
       [<ffffffff8106c90c>] worker_thread+0x11c/0x370
      
      since bpf_jit_free() does:
        unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK;
        struct bpf_binary_header *header = (void *)addr;
      to compute start address of 'bpf_binary_header'
      and header->pages will pass junk to:
        set_memory_rw(addr, header->pages);
      
      Fix it by making sure that &header->image[prandom_u32() % hole] and &header
      are in the same page
      
      Fixes: 314beb9b
      
       ("x86: bpf_jit_comp: secure bpf jit against spraying attacks")
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      070a08cd
  2. 16 Jan, 2014 1 commit
  3. 08 Nov, 2013 1 commit
    • Andrey Vagin's avatar
      net: x86: bpf: don't forget to free sk_filter (v2) · 98bbc06a
      Andrey Vagin authored
      sk_filter isn't freed if bpf_func is equal to sk_run_filter.
      
      This memory leak was introduced by v3.12-rc3-224-gd45ed4a4
      "net: fix unsafe set_memory_rw from softirq".
      
      Before this patch sk_filter was freed in sk_filter_release_rcu,
      now it should be freed in bpf_jit_free.
      
      Here is output of kmemleak:
      unreferenced object 0xffff8800b774eab0 (size 128):
        comm "systemd", pid 1, jiffies 4294669014 (age 124.062s)
        hex dump (first 32 bytes):
          00 00 00 00 0b 00 00 00 20 63 7f b7 00 88 ff ff  ........ c......
          60 d4 55 81 ff ff ff ff 30 d9 55 81 ff ff ff ff  `.U.....0.U.....
        backtrace:
          [<ffffffff816444be>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff811845af>] __kmalloc+0xef/0x260
          [<ffffffff81534028>] sock_kmalloc+0x38/0x60
          [<ffffffff8155d4dd>] sk_attach_filter+0x5d/0x190
          [<ffffffff815378a1>] sock_setsockopt+0x991/0x9e0
          [<ffffffff81531bd6>] SyS_setsockopt+0xb6/0xd0
          [<ffffffff8165f3e9>] system_call_fastpath+0x16/0x1b
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      v2: add extra { } after else
      
      Fixes: d45ed4a4
      
       ("net: fix unsafe set_memory_rw from softirq")
      Acked-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: default avatarAndrey Vagin <avagin@openvz.org>
      Acked-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      98bbc06a
  4. 07 Oct, 2013 1 commit
    • Alexei Starovoitov's avatar
      net: fix unsafe set_memory_rw from softirq · d45ed4a4
      Alexei Starovoitov authored
      
      on x86 system with net.core.bpf_jit_enable = 1
      
      sudo tcpdump -i eth1 'tcp port 22'
      
      causes the warning:
      [   56.766097]  Possible unsafe locking scenario:
      [   56.766097]
      [   56.780146]        CPU0
      [   56.786807]        ----
      [   56.793188]   lock(&(&vb->lock)->rlock);
      [   56.799593]   <Interrupt>
      [   56.805889]     lock(&(&vb->lock)->rlock);
      [   56.812266]
      [   56.812266]  *** DEADLOCK ***
      [   56.812266]
      [   56.830670] 1 lock held by ksoftirqd/1/13:
      [   56.836838]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380
      [   56.849757]
      [   56.849757] stack backtrace:
      [   56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45
      [   56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
      [   56.882004]  ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007
      [   56.895630]  ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001
      [   56.909313]  ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001
      [   56.923006] Call Trace:
      [   56.929532]  [<ffffffff8175a145>] dump_stack+0x55/0x76
      [   56.936067]  [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208
      [   56.942445]  [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50
      [   56.948932]  [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150
      [   56.955470]  [<ffffffff810ccb52>] mark_lock+0x282/0x2c0
      [   56.961945]  [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50
      [   56.968474]  [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50
      [   56.975140]  [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90
      [   56.981942]  [<ffffffff810cef72>] lock_acquire+0x92/0x1d0
      [   56.988745]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   56.995619]  [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50
      [   57.002493]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   57.009447]  [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380
      [   57.016477]  [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380
      [   57.023607]  [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460
      [   57.030818]  [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10
      [   57.037896]  [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0
      [   57.044789]  [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0
      [   57.051720]  [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40
      [   57.058727]  [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40
      [   57.065577]  [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30
      [   57.072338]  [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0
      [   57.078962]  [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0
      [   57.085373]  [<ffffffff81058245>] run_ksoftirqd+0x35/0x70
      
      cannot reuse jited filter memory, since it's readonly,
      so use original bpf insns memory to hold work_struct
      
      defer kfree of sk_filter until jit completed freeing
      
      tested on x86_64 and i386
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d45ed4a4
  5. 20 May, 2013 1 commit
  6. 18 May, 2013 1 commit
  7. 21 Mar, 2013 1 commit
    • Daniel Borkmann's avatar
      filter: bpf_jit_comp: refactor and unify BPF JIT image dump output · 79617801
      Daniel Borkmann authored
      
      If bpf_jit_enable > 1, then we dump the emitted JIT compiled image
      after creation. Currently, only SPARC and PowerPC has similar output
      as in the reference implementation on x86_64. Make a small helper
      function in order to reduce duplicated code and make the dump output
      uniform across architectures x86_64, SPARC, PPC, ARM (e.g. on ARM
      flen, pass and proglen are currently not shown, but would be
      interesting to know as well), also for future BPF JIT implementations
      on other archs.
      
      Cc: Mircea Gherzan <mgherzan@gmail.com>
      Cc: Matt Evans <matt@ozlabs.org>
      Cc: Eric Dumazet <eric.dumazet@google.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      79617801
  8. 31 Jan, 2013 1 commit
  9. 31 Oct, 2012 1 commit
  10. 24 Sep, 2012 1 commit
  11. 10 Sep, 2012 1 commit
  12. 06 Jun, 2012 1 commit
  13. 03 Apr, 2012 1 commit
  14. 29 Mar, 2012 1 commit
  15. 19 Mar, 2012 1 commit
  16. 18 Jan, 2012 1 commit
    • Eric Dumazet's avatar
      net: bpf_jit: fix divide by 0 generation · d00a9dd2
      Eric Dumazet authored
      
      Several problems fixed in this patch :
      
      1) Target of the conditional jump in case a divide by 0 is performed
         by a bpf is wrong.
      
      2) Must 'generate' the full function prologue/epilogue at pass=0,
         or else we can stop too early in pass=1 if the proglen doesnt change.
         (if the increase of prologue/epilogue equals decrease of all
          instructions length because some jumps are converted to near jumps)
      
      3) Change the wrong length detection at the end of code generation to
         issue a more explicit message, no need for a full stack trace.
      Reported-by: default avatarPhil Oester <kernel@linuxace.com>
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d00a9dd2
  17. 19 Dec, 2011 1 commit
  18. 28 Apr, 2011 1 commit
    • Eric Dumazet's avatar
      net: filter: Just In Time compiler for x86-64 · 0a14842f
      Eric Dumazet authored
      
      In order to speedup packet filtering, here is an implementation of a
      JIT compiler for x86_64
      
      It is disabled by default, and must be enabled by the admin.
      
      echo 1 >/proc/sys/net/core/bpf_jit_enable
      
      It uses module_alloc() and module_free() to get memory in the 2GB text
      kernel range since we call helpers functions from the generated code.
      
      EAX : BPF A accumulator
      EBX : BPF X accumulator
      RDI : pointer to skb   (first argument given to JIT function)
      RBP : frame pointer (even if CONFIG_FRAME_POINTER=n)
      r9d : skb->len - skb->data_len (headlen)
      r8  : skb->data
      
      To get a trace of generated code, use :
      
      echo 2 >/proc/sys/net/core/bpf_jit_enable
      
      Example of generated code :
      
      # tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24
      
      flen=18 proglen=147 pass=3 image=ffffffffa00b5000
      JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60
      JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00
      JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00
      JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be
      JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0
      JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00
      JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00
      JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24
      JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31
      JIT code: ffffffffa00b5090: c0 c9 c3
      
      BPF program is 144 bytes long, so native program is almost same size ;)
      
      (000) ldh      [12]
      (001) jeq      #0x800           jt 2    jf 8
      (002) ld       [26]
      (003) and      #0xffffff00
      (004) jeq      #0xc0a81400      jt 16   jf 5
      (005) ld       [30]
      (006) and      #0xffffff00
      (007) jeq      #0xc0a81400      jt 16   jf 17
      (008) jeq      #0x806           jt 10   jf 9
      (009) jeq      #0x8035          jt 10   jf 17
      (010) ld       [28]
      (011) and      #0xffffff00
      (012) jeq      #0xc0a81400      jt 16   jf 13
      (013) ld       [38]
      (014) and      #0xffffff00
      (015) jeq      #0xc0a81400      jt 16   jf 17
      (016) ret      #65535
      (017) ret      #0
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0a14842f