1. 28 Jul, 2018 3 commits
    • Taehee Yoo's avatar
      bpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog() · 71eb5255
      Taehee Yoo authored
      bpf_parse_prog() is protected by rcu_read_lock().
      so that GFP_KERNEL is not allowed in the bpf_parse_prog().
      
      [51015.579396] =============================
      [51015.579418] WARNING: suspicious RCU usage
      [51015.579444] 4.18.0-rc6+ #208 Not tainted
      [51015.579464] -----------------------------
      [51015.579488] ./include/linux/rcupdate.h:303 Illegal context switch in RCU read-side critical section!
      [51015.579510] other info that might help us debug this:
      [51015.579532] rcu_scheduler_active = 2, debug_locks = 1
      [51015.579556] 2 locks held by ip/1861:
      [51015.579577]  #0: 00000000a8c12fd1 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x2e0/0x910
      [51015.579711]  #1: 00000000bf815f8e (rcu_read_lock){....}, at: lwtunnel_build_state+0x96/0x390
      [51015.579842] stack backtrace:
      [51015.579869] CPU: 0 PID: 1861 Comm: ip Not tainted 4.18.0-rc6+ #208
      [51015.579891] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
      [51015.579911] Call Trace:
      [51015.579950]  dump_stack+0x74/0xbb
      [51015.580000]  ___might_sleep+0x16b/0x3a0
      [51015.580047]  __kmalloc_track_caller+0x220/0x380
      [51015.580077]  kmemdup+0x1c/0x40
      [51015.580077]  bpf_parse_prog+0x10e/0x230
      [51015.580164]  ? kasan_kmalloc+0xa0/0xd0
      [51015.580164]  ? bpf_destroy_state+0x30/0x30
      [51015.580164]  ? bpf_build_state+0xe2/0x3e0
      [51015.580164]  bpf_build_state+0x1bb/0x3e0
      [51015.580164]  ? bpf_parse_prog+0x230/0x230
      [51015.580164]  ? lock_is_held_type+0x123/0x1a0
      [51015.580164]  lwtunnel_build_state+0x1aa/0x390
      [51015.580164]  fib_create_info+0x1579/0x33d0
      [51015.580164]  ? sched_clock_local+0xe2/0x150
      [51015.580164]  ? fib_info_update_nh_saddr+0x1f0/0x1f0
      [51015.580164]  ? sched_clock_local+0xe2/0x150
      [51015.580164]  fib_table_insert+0x201/0x1990
      [51015.580164]  ? lock_downgrade+0x610/0x610
      [51015.580164]  ? fib_table_lookup+0x1920/0x1920
      [51015.580164]  ? lwtunnel_valid_encap_type.part.6+0xcb/0x3a0
      [51015.580164]  ? rtm_to_fib_config+0x637/0xbd0
      [51015.580164]  inet_rtm_newroute+0xed/0x1b0
      [51015.580164]  ? rtm_to_fib_config+0xbd0/0xbd0
      [51015.580164]  rtnetlink_rcv_msg+0x331/0x910
      [ ... ]
      
      Fixes: 3a0af8fd ("bpf: BPF for lightweight tunnel infrastructure")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      71eb5255
    • Daniel Borkmann's avatar
      bpf: fix bpf_skb_load_bytes_relative pkt length check · 3eee1f75
      Daniel Borkmann authored
      The len > skb_headlen(skb) cannot be used as a maximum upper bound
      for the packet length since it does not have any relation to the full
      linear packet length when filtering is used from upper layers (e.g.
      in case of reuseport BPF programs) as by then skb->data, skb->len
      already got mangled through __skb_pull() and others.
      
      Fixes: 4e1ec56c ("bpf: add skb_load_bytes_relative helper")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      3eee1f75
    • Thomas Richter's avatar
      perf build: Build error in libbpf missing initialization · b611da43
      Thomas Richter authored
      In linux-next tree compiling the perf tool with additional make flags
      EXTRA_CFLAGS="-Wp,-D_FORTIFY_SOURCE=2 -O2" causes a compiler error.
      It is the warning 'variable may be used uninitialized' which is treated
      as error: I compile it using a FEDORA 28 installation, my gcc compiler
      version: gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20). The file that
      causes the error is tools/lib/bpf/libbpf.c.
      
        [root@p23lp27] # make V=1 EXTRA_CFLAGS="-Wp,-D_FORTIFY_SOURCE=2 -O2"
        [...]
        Makefile.config:849: No openjdk development package found, please
           install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel
        Warning: Kernel ABI header at 'tools/include/uapi/linux/if_link.h'
                differs from latest version at 'include/uapi/linux/if_link.h'
          CC       libbpf.o
        libbpf.c: In function ‘bpf_perf_event_read_simple’:
        libbpf.c:2342:6: error: ‘ret’ may be used uninitialized in this
        			function [-Werror=maybe-uninitialized]
          int ret;
              ^
        cc1: all warnings being treated as errors
        mv: cannot stat './.libbpf.o.tmp': No such file or directory
        /home6/tmricht/linux-next/tools/build/Makefile.build:96: recipe for target 'libbpf.o' failed
      Suggested-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      b611da43
  2. 27 Jul, 2018 2 commits
  3. 26 Jul, 2018 2 commits
  4. 25 Jul, 2018 4 commits
  5. 23 Jul, 2018 22 commits
    • Martin KaFai Lau's avatar
      bpf: btf: Ensure the member->offset is in the right order · 6283fa38
      Martin KaFai Lau authored
      This patch ensures the member->offset of a struct
      is in the correct order (i.e the later member's offset cannot
      go backward).
      
      The current "pahole -J" BTF encoder does not generate something
      like this.  However, checking this can ensure future encoder
      will not violate this.
      
      Fixes: 69b693f0 ("bpf: btf: Introduce BPF Type Format (BTF)")
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      6283fa38
    • David S. Miller's avatar
      Merge branch 'tcp-robust-ooo' · 1a4f14ba
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      Juha-Matti Tilli reported that malicious peers could inject tiny
      packets in out_of_order_queue, forcing very expensive calls
      to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
      every incoming packet.
      
      With tcp_rmem[2] default of 6MB, the ooo queue could
      contain ~7000 nodes.
      
      This patch series makes sure we cut cpu cycles enough to
      render the attack not critical.
      
      We might in the future go further, like disconnecting
      or black-holing proven malicious flows.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1a4f14ba
    • Eric Dumazet's avatar
      tcp: add tcp_ooo_try_coalesce() helper · 58152ecb
      Eric Dumazet authored
      In case skb in out_or_order_queue is the result of
      multiple skbs coalescing, we would like to get a proper gso_segs
      counter tracking, so that future tcp_drop() can report an accurate
      number.
      
      I chose to not implement this tracking for skbs in receive queue,
      since they are not dropped, unless socket is disconnected.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      58152ecb
    • Eric Dumazet's avatar
      tcp: call tcp_drop() from tcp_data_queue_ofo() · 8541b21e
      Eric Dumazet authored
      In order to be able to give better diagnostics and detect
      malicious traffic, we need to have better sk->sk_drops tracking.
      
      Fixes: 9f5afeae ("tcp: use an RB tree for ooo receive queue")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8541b21e
    • Eric Dumazet's avatar
      tcp: detect malicious patterns in tcp_collapse_ofo_queue() · 3d4bf93a
      Eric Dumazet authored
      In case an attacker feeds tiny packets completely out of order,
      tcp_collapse_ofo_queue() might scan the whole rb-tree, performing
      expensive copies, but not changing socket memory usage at all.
      
      1) Do not attempt to collapse tiny skbs.
      2) Add logic to exit early when too many tiny skbs are detected.
      
      We prefer not doing aggressive collapsing (which copies packets)
      for pathological flows, and revert to tcp_prune_ofo_queue() which
      will be less expensive.
      
      In the future, we might add the possibility of terminating flows
      that are proven to be malicious.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3d4bf93a
    • Eric Dumazet's avatar
      tcp: avoid collapses in tcp_prune_queue() if possible · f4a3313d
      Eric Dumazet authored
      Right after a TCP flow is created, receiving tiny out of order
      packets allways hit the condition :
      
      if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
      	tcp_clamp_window(sk);
      
      tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc
      (guarded by tcp_rmem[2])
      
      Calling tcp_collapse_ofo_queue() in this case is not useful,
      and offers a O(N^2) surface attack to malicious peers.
      
      Better not attempt anything before full queue capacity is reached,
      forcing attacker to spend lots of resource and allow us to more
      easily detect the abuse.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f4a3313d
    • Eric Dumazet's avatar
      tcp: free batches of packets in tcp_prune_ofo_queue() · 72cd43ba
      Eric Dumazet authored
      Juha-Matti Tilli reported that malicious peers could inject tiny
      packets in out_of_order_queue, forcing very expensive calls
      to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
      every incoming packet. out_of_order_queue rb-tree can contain
      thousands of nodes, iterating over all of them is not nice.
      
      Before linux-4.9, we would have pruned all packets in ofo_queue
      in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs
      truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB.
      
      Since we plan to increase tcp_rmem[2] in the future to cope with
      modern BDP, can not revert to the old behavior, without great pain.
      
      Strategy taken in this patch is to purge ~12.5 % of the queue capacity.
      
      Fixes: 36a6503f ("tcp: refine tcp_prune_ofo_queue() to not drop all packets")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarJuha-Matti Tilli <juha-matti.tilli@iki.fi>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      72cd43ba
    • Paolo Abeni's avatar
      ip: hash fragments consistently · 3dd1c9a1
      Paolo Abeni authored
      The skb hash for locally generated ip[v6] fragments belonging
      to the same datagram can vary in several circumstances:
      * for connected UDP[v6] sockets, the first fragment get its hash
        via set_owner_w()/skb_set_hash_from_sk()
      * for unconnected IPv6 UDPv6 sockets, the first fragment can get
        its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
        auto_flowlabel is enabled
      
      For the following frags the hash is usually computed via
      skb_get_hash().
      The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
      scenario the egress tx queue can be selected on a per packet basis
      via the skb hash.
      It may also fool flow-oriented schedulers to place fragments belonging
      to the same datagram in different flows.
      
      Fix the issue by copying the skb hash from the head frag into
      the others at fragmentation time.
      
      Before this commit:
      perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8"
      netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n &
      perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1
      perf script
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0
      
      After this commit:
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
      probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
      
      Fixes: b73c3d0e ("net: Save TX flow hash in sock and set in skbuf on xmit")
      Fixes: 67800f9b ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3dd1c9a1
    • Wei Wang's avatar
      ipv6: use fib6_info_hold_safe() when necessary · e873e4b9
      Wei Wang authored
      In the code path where only rcu read lock is held, e.g. in the route
      lookup code path, it is not safe to directly call fib6_info_hold()
      because the fib6_info may already have been deleted but still exists
      in the rcu grace period. Holding reference to it could cause double
      free and crash the kernel.
      
      This patch adds a new function fib6_info_hold_safe() and replace
      fib6_info_hold() in all necessary places.
      
      Syzbot reported 3 crash traces because of this. One of them is:
      8021q: adding VLAN 0 to HW filter on device team0
      IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready
      dst_release: dst:(____ptrval____) refcnt:-1
      dst_release: dst:(____ptrval____) refcnt:-2
      WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline]
      WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204
      dst_release: dst:(____ptrval____) refcnt:-1
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
       panic+0x238/0x4e7 kernel/panic.c:184
      dst_release: dst:(____ptrval____) refcnt:-2
      dst_release: dst:(____ptrval____) refcnt:-3
       __warn.cold.8+0x163/0x1ba kernel/panic.c:536
      dst_release: dst:(____ptrval____) refcnt:-4
       report_bug+0x252/0x2d0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:178 [inline]
       do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296
      dst_release: dst:(____ptrval____) refcnt:-5
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
      RIP: 0010:dst_hold include/net/dst.h:239 [inline]
      RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204
      Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 8b 85 d0 fe ff ff 48 8d 78
      RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293
      RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6
      RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005
      RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338
      R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720
      R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980
       ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768
       udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:641 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:651
       ___sys_sendmsg+0x51d/0x930 net/socket.c:2125
       __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220
       __do_sys_sendmmsg net/socket.c:2249 [inline]
       __se_sys_sendmmsg net/socket.c:2246 [inline]
       __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x446ba9
      Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9
      RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003
      RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843
      R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com
      Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com
      Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com
      Signed-off-by: default avatarWei Wang <weiwan@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e873e4b9
    • David S. Miller's avatar
      Merge tag 'linux-can-fixes-for-4.18-20180723' of... · 5302a84e
      David S. Miller authored
      Merge tag 'linux-can-fixes-for-4.18-20180723' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2018-07-23
      
      this is a pull request of 12 patches for net/master.
      
      The patch by Stephane Grosjean for the peak_canfd CAN driver fixes a problem
      with older firmware. The next patch is by Roman Fietze and fixes the setup of
      the CCCR register in the m_can driver. Nicholas Mc Guire's patch for the
      mpc5xxx_can driver adds missing error checking. The two patches by Faiz Abbas
      fix the runtime resume and clean up the probe function in the m_can driver. The
      last 7 patches by Anssi Hannula fix several problem in the xilinx_can driver.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5302a84e
    • Anssi Hannula's avatar
      can: xilinx_can: fix power management handling · 8ebd83bd
      Anssi Hannula authored
      There are several issues with the suspend/resume handling code of the
      driver:
      
      - The device is attached and detached in the runtime_suspend() and
        runtime_resume() callbacks if the interface is running. However,
        during xcan_chip_start() the interface is considered running,
        causing the resume handler to incorrectly call netif_start_queue()
        at the beginning of xcan_chip_start(), and on xcan_chip_start() error
        return the suspend handler detaches the device leaving the user
        unable to bring-up the device anymore.
      
      - The device is not brought properly up on system resume. A reset is
        done and the code tries to determine the bus state after that.
        However, after reset the device is always in Configuration mode
        (down), so the state checking code does not make sense and
        communication will also not work.
      
      - The suspend callback tries to set the device to sleep mode (low-power
        mode which monitors the bus and brings the device back to normal mode
        on activity), but then immediately disables the clocks (possibly
        before the device reaches the sleep mode), which does not make sense
        to me. If a clean shutdown is wanted before disabling clocks, we can
        just bring it down completely instead of only sleep mode.
      
      Reorganize the PM code so that only the clock logic remains in the
      runtime PM callbacks and the system PM callbacks contain the device
      bring-up/down logic. This makes calling the runtime PM callbacks during
      e.g. xcan_chip_start() safe.
      
      The system PM callbacks now simply call common code to start/stop the
      HW if the interface was running, replacing the broken code from before.
      
      xcan_chip_stop() is updated to use the common reset code so that it will
      wait for the reset to complete. Reset also disables all interrupts so do
      not do that separately.
      
      Also, the device_may_wakeup() checks are removed as the driver does not
      have wakeup support.
      
      Tested on Zynq-7000 integrated CAN.
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      8ebd83bd
    • Anssi Hannula's avatar
      can: xilinx_can: fix incorrect clear of non-processed interrupts · 2f4f0f33
      Anssi Hannula authored
      xcan_interrupt() clears ERROR|RXOFLV|BSOFF|ARBLST interrupts if any of
      them is asserted. This does not take into account that some of them
      could have been asserted between interrupt status read and interrupt
      clear, therefore clearing them without handling them.
      
      Fix the code to only clear those interrupts that it knows are asserted
      and therefore going to be processed in xcan_err_interrupt().
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      2f4f0f33
    • Anssi Hannula's avatar
      can: xilinx_can: fix RX overflow interrupt not being enabled · 83997997
      Anssi Hannula authored
      RX overflow interrupt (RXOFLW) is disabled even though xcan_interrupt()
      processes it. This means that an RX overflow interrupt will only be
      processed when another interrupt gets asserted (e.g. for RX/TX).
      
      Fix that by enabling the RXOFLW interrupt.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      83997997
    • Anssi Hannula's avatar
      can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting · 620050d9
      Anssi Hannula authored
      The xilinx_can driver assumes that the TXOK interrupt only clears after
      it has been acknowledged as many times as there have been successfully
      sent frames.
      
      However, the documentation does not mention such behavior, instead
      saying just that the interrupt is cleared when the clear bit is set.
      
      Similarly, testing seems to also suggest that it is immediately cleared
      regardless of the amount of frames having been sent. Performing some
      heavy TX load and then going back to idle has the tx_head drifting
      further away from tx_tail over time, steadily reducing the amount of
      frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
      interrupt always frees up space for 1 frame from the driver's
      perspective, so frames continue to be sent) and delaying the local echo
      frames.
      
      The TX FIFO tracking is also otherwise buggy as it does not account for
      TX FIFO being cleared after software resets, causing
        BUG!, TX FIFO full when queue awake!
      messages to be output.
      
      There does not seem to be any way to accurately track the state of the
      TX FIFO for local echo support while using the full TX FIFO.
      
      The Zynq version of the HW (but not the soft-AXI version) has watermark
      programming support and with it an additional TX-FIFO-empty interrupt
      bit.
      
      Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
      and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
      to detect whether 1 or 2 frames have been sent at interrupt processing
      time.
      
      Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
      was also tested.
      
      An alternative way to solve this would be to drop local echo support but
      keep using the full TX FIFO.
      
      v2: Add FIFO space check before TX queue wake with locking to
      synchronize with queue stop. This avoids waking the queue when xmit()
      had just filled it.
      
      v3: Keep local echo support and reduce the amount of frames in FIFO
      instead as suggested by Marc Kleine-Budde.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      620050d9
    • Anssi Hannula's avatar
      can: xilinx_can: fix recovery from error states not being propagated · 877e0b75
      Anssi Hannula authored
      The xilinx_can driver contains no mechanism for propagating recovery
      from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE.
      
      Add such a mechanism by factoring the handling of
      XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of
      xcan_err_interrupt and checking for recovery after RX and TX if the
      interface is in one of those states.
      
      Tested with the integrated CAN on Zynq-7000 SoC.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      877e0b75
    • Anssi Hannula's avatar
      can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK · 32852c56
      Anssi Hannula authored
      If the device gets into a state where RXNEMP (RX FIFO not empty)
      interrupt is asserted without RXOK (new frame received successfully)
      interrupt being asserted, xcan_rx_poll() will continue to try to clear
      RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
      not empty, the interrupt will not be cleared and napi_schedule() will
      just be called again.
      
      This situation can occur when:
      
      (a) xcan_rx() returns without reading RX FIFO due to an error condition.
      The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
      due to a frame still being in the FIFO. The frame will never be read
      from the FIFO as RXOK is no longer set.
      
      (b) A frame is received between xcan_rx_poll() reading interrupt status
      and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
      set as the new message is still in the FIFO.
      
      I'm able to trigger case (b) by flooding the bus with frames under load.
      
      There does not seem to be any benefit in using both RXNEMP and RXOK in
      the way the driver does, and the polling example in the reference manual
      (UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
      RXOK or RXNEMP can be used for detecting incoming messages.
      
      Fix the issue and simplify the RX processing by only using RXNEMP
      without RXOK.
      
      Tested with the integrated CAN on Zynq-7000 SoC.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      32852c56
    • Anssi Hannula's avatar
      can: xilinx_can: fix device dropping off bus on RX overrun · 2574fe54
      Anssi Hannula authored
      The xilinx_can driver performs a software reset when an RX overrun is
      detected. This causes the device to enter Configuration mode where no
      messages are received or transmitted.
      
      The documentation does not mention any need to perform a reset on an RX
      overrun, and testing by inducing an RX overflow also indicated that the
      device continues to work just fine without a reset.
      
      Remove the software reset.
      
      Tested with the integrated CAN on Zynq-7000 SoC.
      
      Fixes: b1201e44 ("can: xilinx CAN controller support")
      Signed-off-by: default avatarAnssi Hannula <anssi.hannula@bitwise.fi>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      2574fe54
    • Faiz Abbas's avatar
      can: m_can: Move accessing of message ram to after clocks are enabled · 54e4a0c4
      Faiz Abbas authored
      MCAN message ram should only be accessed once clocks are enabled.
      Therefore, move the call to parse/init the message ram to after
      clocks are enabled.
      Signed-off-by: default avatarFaiz Abbas <faiz_abbas@ti.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      54e4a0c4
    • Faiz Abbas's avatar
      can: m_can: Fix runtime resume call · 1675bee3
      Faiz Abbas authored
      pm_runtime_get_sync() returns a 1 if the state of the device is already
      'active'. This is not a failure case and should return a success.
      
      Therefore fix error handling for pm_runtime_get_sync() call such that
      it returns success when the value is 1.
      
      Also cleanup the TODO for using runtime PM for sleep mode as that is
      implemented.
      Signed-off-by: default avatarFaiz Abbas <faiz_abbas@ti.com>
      Cc: <stable@vger.kernel.org
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      1675bee3
    • Nicholas Mc Guire's avatar
      can: mpc5xxx_can: check of_iomap return before use · b5c1a23b
      Nicholas Mc Guire authored
      of_iomap() can return NULL so that return needs to be checked and NULL
      treated as failure. While at it also take care of the missing
      of_node_put() in the error path.
      Signed-off-by: default avatarNicholas Mc Guire <hofrat@osadl.org>
      Fixes: commit afa17a50 ("net/can: add driver for mscan family & mpc52xx_mscan")
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      b5c1a23b
    • Roman Fietze's avatar
      can: m_can.c: fix setup of CCCR register: clear CCCR NISO bit before checking can.ctrlmode · 393753b2
      Roman Fietze authored
      Inside m_can_chip_config(), when setting up the new value of the CCCR,
      the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON,
      CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for
      CAN_CTRLMODE_FD_NON_ISO.
      
      This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO,
      this mode could never be cleared again.
      
      This fix is only relevant for controllers with version 3.1.x or 3.2.x.
      Older versions do not support NISO.
      Signed-off-by: default avatarRoman Fietze <roman.fietze@telemotive.de>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      393753b2
    • Stephane Grosjean's avatar
      can: peak_canfd: fix firmware < v3.3.0: limit allocation to 32-bit DMA addr only · 5d4c94ed
      Stephane Grosjean authored
      The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards
      family is not capable of handling a mix of 32-bit and 64-bit logical
      addresses. If the board is equipped with 2 or 4 CAN ports, then such a
      situation might lead to a PCIe Bus Error "Malformed TLP" packet
      as well as "irq xx: nobody cared" issue.
      
      This patch adds a workaround that requests only 32-bit DMA addresses
      when these might be allocated outside of the 4 GB area.
      
      This issue has been fixed in firmware v3.3.0 and next.
      Signed-off-by: default avatarStephane Grosjean <s.grosjean@peak-system.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      5d4c94ed
  6. 22 Jul, 2018 7 commits
    • Randy Dunlap's avatar
      net: prevent ISA drivers from building on PPC32 · c9ce1fa1
      Randy Dunlap authored
      Prevent drivers from building on PPC32 if they use isa_bus_to_virt(),
      isa_virt_to_bus(), or isa_page_to_bus(), which are not available and
      thus cause build errors.
      
      ../drivers/net/ethernet/3com/3c515.c: In function 'corkscrew_open':
      ../drivers/net/ethernet/3com/3c515.c:824:9: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]
      
      ../drivers/net/ethernet/amd/lance.c: In function 'lance_rx':
      ../drivers/net/ethernet/amd/lance.c:1203:23: error: implicit declaration of function 'isa_bus_to_virt'; did you mean 'bus_to_virt'? [-Werror=implicit-function-declaration]
      
      ../drivers/net/ethernet/amd/ni65.c: In function 'ni65_init_lance':
      ../drivers/net/ethernet/amd/ni65.c:585:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]
      
      ../drivers/net/ethernet/cirrus/cs89x0.c: In function 'net_open':
      ../drivers/net/ethernet/cirrus/cs89x0.c:897:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Suggested-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c9ce1fa1
    • John Hurley's avatar
      nfp: flower: ensure dead neighbour entries are not offloaded · b809ec86
      John Hurley authored
      Previously only the neighbour state was checked to decide if an offloaded
      entry should be removed. However, there can be situations when the entry
      is dead but still marked as valid. This can lead to dead entries not
      being removed from fw tables or even incorrect data being added.
      
      Check the entry dead bit before deciding if it should be added to or
      removed from fw neighbour tables.
      
      Fixes: 8e6a9046 ("nfp: flower vxlan neighbour offload")
      Signed-off-by: default avatarJohn Hurley <john.hurley@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b809ec86
    • David S. Miller's avatar
      Merge branch 'vxlan-fix-default-fdb-entry-user-space-notify-ordering-race' · 0fb8d5a0
      David S. Miller authored
      Roopa Prabhu says:
      
      ====================
      vxlan: fix default fdb entry user-space notify ordering/race
      
      Problem:
      In vxlan_newlink, a default fdb entry is added before register_netdev.
      The default fdb creation function notifies user-space of the
      fdb entry on the vxlan device which user-space does not know about yet.
      (RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex).
      
      This series fixes the user-space netlink notification ordering issue
      with the following changes:
      - decouple fdb notify from fdb create.
      - Move fdb notify after register_netdev.
      - modify rtnl_configure_link to allow configuring a link early.
      - Call rtnl_configure_link in vxlan newlink handler to notify
      userspace about the newlink before fdb notify and
      hence avoiding the user-space race.
      ====================
      
      Fixes: afbd8bae ("vxlan: add implicit fdb entry for default destination")
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      0fb8d5a0
    • Roopa Prabhu's avatar
      vxlan: fix default fdb entry netlink notify ordering during netdev create · e99465b9
      Roopa Prabhu authored
      Problem:
      In vxlan_newlink, a default fdb entry is added before register_netdev.
      The default fdb creation function also notifies user-space of the
      fdb entry on the vxlan device which user-space does not know about yet.
      (RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex).
      
      This patch fixes the user-space netlink notification ordering issue
      with the following changes:
      - decouple fdb notify from fdb create.
      - Move fdb notify after register_netdev.
      - Call rtnl_configure_link in vxlan newlink handler to notify
      userspace about the newlink before fdb notify and
      hence avoiding the user-space race.
      
      Fixes: afbd8bae ("vxlan: add implicit fdb entry for default destination")
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e99465b9
    • Roopa Prabhu's avatar
      vxlan: make netlink notify in vxlan_fdb_destroy optional · f6e05385
      Roopa Prabhu authored
      Add a new option do_notify to vxlan_fdb_destroy to make
      sending netlink notify optional. Used by a later patch.
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f6e05385
    • Roopa Prabhu's avatar
      vxlan: add new fdb alloc and create helpers · 7431016b
      Roopa Prabhu authored
      - Add new vxlan_fdb_alloc helper
      - rename existing vxlan_fdb_create into vxlan_fdb_update:
              because it really creates or updates an existing
              fdb entry
      - move new fdb creation into a separate vxlan_fdb_create
      
      Main motivation for this change is to introduce the ability
      to decouple vxlan fdb creation and notify, used in a later patch.
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7431016b
    • Roopa Prabhu's avatar
      rtnetlink: add rtnl_link_state check in rtnl_configure_link · 5025f7f7
      Roopa Prabhu authored
      rtnl_configure_link sets dev->rtnl_link_state to
      RTNL_LINK_INITIALIZED and unconditionally calls
      __dev_notify_flags to notify user-space of dev flags.
      
      current call sequence for rtnl_configure_link
      rtnetlink_newlink
          rtnl_link_ops->newlink
          rtnl_configure_link (unconditionally notifies userspace of
                               default and new dev flags)
      
      If a newlink handler wants to call rtnl_configure_link
      early, we will end up with duplicate notifications to
      user-space.
      
      This patch fixes rtnl_configure_link to check rtnl_link_state
      and call __dev_notify_flags with gchanges = 0 if already
      RTNL_LINK_INITIALIZED.
      
      Later in the series, this patch will help the following sequence
      where a driver implementing newlink can call rtnl_configure_link
      to initialize the link early.
      
      makes the following call sequence work:
      rtnetlink_newlink
          rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes
                                                      link and notifies
                                                      user-space of default
                                                      dev flags)
          rtnl_configure_link (updates dev flags if requested by user ifm
                               and notifies user-space of new dev flags)
      Signed-off-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5025f7f7