- 01 Apr, 2020 40 commits
-
-
Qian Cai authored
BugLink: https://bugs.launchpad.net/bugs/1868629 [ Upstream commit 6c5d9112 ] journal_head::b_transaction and journal_head::b_next_transaction could be accessed concurrently as noticed by KCSAN, LTP: starting fsync04 /dev/zero: Can't open blockdev EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) ================================================================== BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2] write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70: __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2] __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569 jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2] (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034 kjournald2+0x13b/0x450 [jbd2] kthread+0x1cd/0x1f0 ret_from_fork+0x27/0x50 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68: jbd2_write_access_granted+0x1b2/0x250 [jbd2] jbd2_write_access_granted at fs/jbd2/transaction.c:1155 jbd2_journal_get_write_access+0x2c/0x60 [jbd2] __ext4_journal_get_write_access+0x50/0x90 [ext4] ext4_mb_mark_diskspace_used+0x158/0x620 [ext4] ext4_mb_new_blocks+0x54f/0xca0 [ext4] ext4_ind_map_blocks+0xc79/0x1b40 [ext4] ext4_map_blocks+0x3b4/0x950 [ext4] _ext4_get_block+0xfc/0x270 [ext4] ext4_get_block+0x3b/0x50 [ext4] __block_write_begin_int+0x22e/0xae0 __block_write_begin+0x39/0x50 ext4_write_begin+0x388/0xb50 [ext4] generic_perform_write+0x15d/0x290 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb05 entry_SYSCALL_64_after_hwframe+0x49/0xbe 5 locks held by fsync04/25724: #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260 #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4] #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2] #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4] #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2] irq event stamp: 1407125 hardirqs last enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790 softirqs last enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0 Reported by Kernel Concurrency Sanitizer on: CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 The plain reads are outside of jh->b_state_lock critical section which result in data races. Fix them by adding pairs of READ|WRITE_ONCE(). Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pwSigned-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Linus Torvalds authored
BugLink: https://bugs.launchpad.net/bugs/1868629 [ Upstream commit fda31c50 ] When queueing a signal, we increment both the users count of pending signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount of the user struct itself (because we keep a reference to the user in the signal structure in order to correctly account for it when freeing). That turns out to be fairly expensive, because both of them are atomic updates, and particularly under extreme signal handling pressure on big machines, you can get a lot of cache contention on the user struct. That can then cause horrid cacheline ping-pong when you do these multiple accesses. So change the reference counting to only pin the user for the _first_ pending signal, and to unpin it when the last pending signal is dequeued. That means that when a user sees a lot of concurrent signal queuing - which is the only situation when this matters - the only atomic access needed is generally the 'sigpending' count update. This was noticed because of a particularly odd timing artifact on a dual-socket 96C/192T Cascade Lake platform: when you get into bad contention, on that machine for some reason seems to be much worse when the contention happens in the upper 32-byte half of the cacheline. As a result, the kernel test robot will-it-scale 'signal1' benchmark had an odd performance regression simply due to random alignment of the 'struct user_struct' (and pointed to a completely unrelated and apparently nonsensical commit for the regression). Avoiding the double increments (and decrements on the dequeueing side, of course) makes for much less contention and hugely improved performance on that will-it-scale microbenchmark. Quoting Feng Tang: "It makes a big difference, that the performance score is tripled! bump from original 17000 to 54000. Also the gap between 5.0-rc6 and 5.0-rc6+Jiri's patch is reduced to around 2%" [ The "2% gap" is the odd cacheline placement difference on that platform: under the extreme contention case, the effect of which half of the cacheline was hot was 5%, so with the reduced contention the odd timing artifact is reduced too ] It does help in the non-contended case too, but is not nearly as noticeable. Reported-and-tested-by: Feng Tang <feng.tang@intel.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Huang, Ying <ying.huang@intel.com> Cc: Philip Li <philip.li@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Marek Vasut authored
BugLink: https://bugs.launchpad.net/bugs/1868629 [ Upstream commit 44343418 ] The KS8851 requires that packet RX and TX are mutually exclusive. Currently, the driver hopes to achieve this by disabling interrupt from the card by writing the card registers and by disabling the interrupt on the interrupt controller. This however is racy on SMP. Replace this approach by expanding the spinlock used around the ks_start_xmit() TX path to ks_irq() RX path to assure true mutual exclusion and remove the interrupt enabling/disabling, which is now not needed anymore. Furthermore, disable interrupts also in ks_net_stop(), which was missing before. Note that a massive improvement here would be to re-use the KS8851 driver approach, which is to move the TX path into a worker thread, interrupt handling to threaded interrupt, and synchronize everything with mutexes, but that would be a much bigger rework, for a separate patch. Signed-off-by: Marek Vasut <marex@denx.de> Cc: David S. Miller <davem@davemloft.net> Cc: Lukas Wunner <lukas@wunner.de> Cc: Petr Stetiar <ynezz@true.cz> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Kim Phillips authored
BugLink: https://bugs.launchpad.net/bugs/1868629 [ Upstream commit f967140d ] Enable the sampling check in kernel/events/core.c::perf_event_open(), which returns the more appropriate -EOPNOTSUPP. BEFORE: $ sudo perf record -a -e instructions,l3_request_g1.caching_l3_cache_accesses true Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (l3_request_g1.caching_l3_cache_accesses). /bin/dmesg | grep -i perf may provide additional information. With nothing relevant in dmesg. AFTER: $ sudo perf record -a -e instructions,l3_request_g1.caching_l3_cache_accesses true Error: l3_request_g1.caching_l3_cache_accesses: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' Fixes: c43ca509 ("perf/x86/amd: Add support for AMD NB and L2I "uncore" counters") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200311191323.13124-1-kim.phillips@amd.comSigned-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 A transmission scheduling for an interface which is currently dropped by batadv_iv_ogm_iface_disable could still be in progress. The B.A.T.M.A.N. V is simply cancelling the workqueue item in an synchronous way but this is not possible with B.A.T.M.A.N. IV because the OGM submissions are intertwined. Instead it has to stop submitting the OGM when it detect that the buffer pointer is set to NULL. Reported-by: syzbot+a98f2016f40b9cd3818a@syzkaller.appspotmail.com Reported-by: syzbot+ac36b6a33c28a491e929@syzkaller.appspotmail.com Fixes: c6c8fea2 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Cc: Hillf Danton <hdanton@sina.com> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 40e220b4 upstream. Each slave interface of an B.A.T.M.A.N. IV virtual interface has an OGM packet buffer which is initialized using data from netdevice notifier and other rtnetlink related hooks. It is sent regularly via various slave interfaces of the batadv virtual interface and in this process also modified (realloced) to integrate additional state information via TVLV containers. It must be avoided that the worker item is executed without a common lock with the netdevice notifier/rtnetlink helpers. Otherwise it can either happen that half modified/freed data is sent out or functions modifying the OGM buffer try to access already freed memory regions. Reported-by: syzbot+0cc629f19ccb8534935b@syzkaller.appspotmail.com Fixes: c6c8fea2 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit a15d56a6 upstream. Multiple batadv_ogm_packet can be stored in an skbuff. The functions batadv_iv_ogm_send_to_if()/batadv_iv_ogm_receive() use batadv_iv_ogm_aggr_packet() to check if there is another additional batadv_ogm_packet in the skb or not before they continue processing the packet. The length for such an OGM is BATADV_OGM_HLEN + batadv_ogm_packet->tvlv_len. The check must first check that at least BATADV_OGM_HLEN bytes are available before it accesses tvlv_len (which is part of the header. Otherwise it might try read outside of the currently available skbuff to get the content of tvlv_len. Fixes: ef261577 ("batman-adv: tvlv - basic infrastructure") Reported-by: syzbot+355cab184197dbbfa384@syzkaller.appspotmail.com Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit f131a568 upstream. The batadv_hash_remove is a function which searches the hashtable for an entry using a needle, a hashtable bucket selection function and a compare function. It will lock the bucket list and delete an entry when the compare function matches it with the needle. It returns the pointer to the hlist_node which matches or NULL when no entry matches the needle. The batadv_tt_global_free is not itself protected in anyway to avoid that any other function is modifying the hashtable between the search for the entry and the call to batadv_hash_remove. It can therefore happen that the entry either doesn't exist anymore or an entry was deleted which is not the same object as the needle. In such an situation, the reference counter (for the reference stored in the hashtable) must not be reduced for the needle. Instead the reference counter of the actually removed entry has to be reduced. Otherwise the reference counter will underflow and the object might be freed before all its references were dropped. The kref helpers reported this problem as: refcount_t: underflow; use-after-free. Fixes: 7683fdc1 ("batman-adv: protect the local and the global trans-tables with rcu") Reported-by: Martin Weinelt <martin@linuxlounge.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 3d65b9ac upstream. The batadv_hash_remove is a function which searches the hashtable for an entry using a needle, a hashtable bucket selection function and a compare function. It will lock the bucket list and delete an entry when the compare function matches it with the needle. It returns the pointer to the hlist_node which matches or NULL when no entry matches the needle. The batadv_tt_local_remove is not itself protected in anyway to avoid that any other function is modifying the hashtable between the search for the entry and the call to batadv_hash_remove. It can therefore happen that the entry either doesn't exist anymore or an entry was deleted which is not the same object as the needle. In such an situation, the reference counter (for the reference stored in the hashtable) must not be reduced for the needle. Instead the reference counter of the actually removed entry has to be reduced. Otherwise the reference counter will underflow and the object might be freed before all its references were dropped. The kref helpers reported this problem as: refcount_t: underflow; use-after-free. Fixes: ef72706a ("batman-adv: protect tt_local_entry from concurrent delete events") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 4ba104f4 upstream. The batadv_hash_remove is a function which searches the hashtable for an entry using a needle, a hashtable bucket selection function and a compare function. It will lock the bucket list and delete an entry when the compare function matches it with the needle. It returns the pointer to the hlist_node which matches or NULL when no entry matches the needle. The batadv_bla_del_claim is not itself protected in anyway to avoid that any other function is modifying the hashtable between the search for the entry and the call to batadv_hash_remove. It can therefore happen that the entry either doesn't exist anymore or an entry was deleted which is not the same object as the needle. In such an situation, the reference counter (for the reference stored in the hashtable) must not be reduced for the needle. Instead the reference counter of the actually removed entry has to be reduced. Otherwise the reference counter will underflow and the object might be freed before all its references were dropped. The kref helpers reported this problem as: refcount_t: underflow; use-after-free. Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit ae3cdc97 upstream. The function batadv_tvlv_handler_register is responsible for adding new tvlv_handler to the handler_list. It first checks whether the entry already is in the list or not. If it is, then the creation of a new entry is aborted. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: ef261577 ("batman-adv: tvlv - basic infrastructure") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit e7136e48 upstream. The function batadv_tt_global_orig_entry_add is responsible for adding new tt_orig_list_entry to the orig_list. It first checks whether the entry already is in the list or not. If it is, then the creation of a new entry is aborted. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: d657e621 ("batman-adv: add reference counting for type batadv_tt_orig_list_entry") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit fa122fec upstream. The function batadv_nc_get_nc_node is responsible for adding new nc_nodes to the in_coding_list and out_coding_list. It first checks whether the entry already is in the list or not. If it is, then the creation of a new entry is aborted. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: d56b1705 ("batman-adv: network coding - detect coding nodes and remove these after timeout") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit dff9bc42 upstream. The function batadv_gw_node_add is responsible for adding new gw_node to the gateway_list. It is expecting that the caller already checked that there is not already an entry with the same key or not. But the lock for the list is only held when the list is really modified. This could lead to duplicated entries because another context could create an entry with the same key between the check and the list manipulation. The check and the manipulation of the list must therefore be in the same locked code section. Fixes: c6c8fea2 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Linus Lüssing authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 4a519b83 upstream. Since commit 54e22f26 ("batman-adv: fix TT sync flag inconsistencies") TT sync flags and TT non-sync'd flags are supposed to be stored separately. The previous patch missed to apply this separation on a TT entry with only a single TT orig entry. This is a minor fix because with only a single TT orig entry the DDoS issue the former patch solves does not apply. Fixes: 54e22f26 ("batman-adv: fix TT sync flag inconsistencies") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 6da7be7d upstream. batman-adv is creating special debugfs directories in the init net_namespace for each created soft-interface (batadv net_device). But it is possible to rename a net_device to a completely different name then the original one. It can therefore happen that a user registers a new batadv net_device with the name "bat0". batman-adv is then also adding a new directory under $debugfs/batman-adv/ with the name "wlan0". The user then decides to rename this device to "bat1" and registers a different batadv device with the name "bat0". batman-adv will then try to create a directory with the name "bat0" under $debugfs/batman-adv/ again. But there already exists one with this name under this path and thus this fails. batman-adv will detect a problem and rollback the registering of this device. batman-adv must therefore take care of renaming the debugfs directories for soft-interfaces whenever it detects such a net_device rename. Fixes: c6c8fea2 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 36dc621c upstream. batman-adv is creating special debugfs directories in the init net_namespace for each valid hard-interface (net_device). But it is possible to rename a net_device to a completely different name then the original one. It can therefore happen that a user registers a new net_device which gets the name "wlan0" assigned by default. batman-adv is also adding a new directory under $debugfs/batman-adv/ with the name "wlan0". The user then decides to rename this device to "wl_pri" and registers a different device. The kernel may now decide to use the name "wlan0" again for this new device. batman-adv will detect it as a valid net_device and tries to create a directory with the name "wlan0" under $debugfs/batman-adv/. But there already exists one with this name under this path and thus this fails. batman-adv will detect a problem and rollback the registering of this device. batman-adv must therefore take care of renaming the debugfs directories for hard-interfaces whenever it detects such a net_device rename. Fixes: 5bc7c1eb ("batman-adv: add debugfs structure for information per interface") Reported-by: John Soros <sorosj@gmail.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Marek Lindner authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 16116dac upstream. A translation table TVLV changset sent with an OGM consists of a number of headers (one per VLAN) plus the changeset itself (addition and/or deletion of entries). The per-VLAN headers are used by OGM recipients for consistency checks. Said consistency check might determine that a full translation table request is needed to restore consistency. If the TT sender adds per-VLAN headers of empty VLANs into the OGM, recipients are led to believe to have reached an inconsistent state and thus request a full table update. The full table does not contain empty VLANs (due to missing entries) the cycle restarts when the next OGM is issued. Consequently, when the translation table TVLV headers are composed, empty VLANs are to be excluded. Fixes: 21a57f6e7a3b ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Linus Lüssing authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 7072337e upstream. The previous TT sync fix so far only fixed TT responses issued by the target node directly. So far, TT responses issued by intermediate nodes still lead to the wrong flags being added, leading to CRC mismatches. This behaviour was observed at Freifunk Hannover in a 800 nodes setup where a considerable amount of nodes were still infected with 'WI' TT flags even with (most) nodes having the previous TT sync fix applied. I was able to reproduce the issue with intermediate TT responses in a four node test setup and this patch fixes this issue by ensuring to use the per originator instead of the summarized, OR'd ones. Fixes: e9c00136 ("batman-adv: fix tt_global_entries flags update") Reported-by: Leonardo Mörlein <me@irrelefant.net> Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 8ba0f9bd upstream. The functions batadv_tt_prepare_tvlv_local_data and batadv_tt_prepare_tvlv_global_data are responsible for preparing a buffer which can be used to store the TVLV container for TT and add the VLAN information to it. This will be done in three phases: 1. count the number of VLANs and their entries 2. allocate the buffer using the counters from the previous step and limits from the caller (parameter tt_len) 3. insert the VLAN information to the buffer The step 1 and 3 operate on a list which contains the VLANs. The access to these lists must be protected with an appropriate lock or otherwise they might operate on on different entries. This could for example happen when another context is adding VLAN entries to this list. This could lead to a buffer overflow in these functions when enough entries were added between step 1 and 3 to the VLAN lists that the buffer room for the entries (*tt_change) is smaller then the now required extra buffer for new VLAN entries. Fixes: 7ea7b4a1 ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit fc04fdb2 upstream. batadv_check_unicast_ttvn may redirect a packet to itself or another originator. This involves rewriting the ttvn and the destination address in the batadv unicast header. These field were not yet pulled (with skb rcsum update) and thus any change to them also requires a change in the receive checksum. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Fixes: a73105b8 ("batman-adv: improved client announcement mechanism") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Matthias Schiffer authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit bc44b781 upstream. batadv_check_unicast_ttvn() calls skb_cow(), so pointers into the SKB data must be (re)set after calling it. The ethhdr variable is dropped altogether. Fixes: 78fc6bbe0aca ("batman-adv: add UNICAST_4ADDR packet type") Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit f22e0893 upstream. batman-adv uses internal indices for each enabled and active interface. It is currently used by the B.A.T.M.A.N. IV algorithm to identifify the correct position in the ogm_cnt bitmaps. The type for the number of enabled interfaces (which defines the next interface index) was set to char. This type can be (depending on the architecture) either signed (limiting batman-adv to 127 active slave interfaces) or unsigned (limiting batman-adv to 255 active slave interfaces). This limit was not correctly checked when an interface was enabled and thus an overflow happened. This was only catched on systems with the signed char type when the B.A.T.M.A.N. IV code tried to resize its counter arrays with a negative size. The if_num interface index was only a s16 and therefore significantly smaller than the ifindex (int) used by the code net code. Both &batadv_hard_iface->if_num and &batadv_priv->num_ifaces must be (unsigned) int to support the same number of slave interfaces as the net core code. And the interface activation code must check the number of active slave interfaces to avoid integer overflows. Fixes: c6c8fea2 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 5ba7dcfe upstream. The originator node object orig_neigh_node is used to when accessing the bcast_own(_sum) and real_packet_count information. The access to them has to be protected with the spinlock in orig_neigh_node. But the function uses the lock in orig_node instead. This is incorrect because they could be two different originator node objects. Fixes: 0ede9f41 ("batman-adv: protect bit operations to count OGMs with spinlock") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Linus Lüssing authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 54e22f26 upstream. This patch fixes an issue in the translation table code potentially leading to a TT Request + Response storm. The issue may occur for nodes involving BLA and an inconsistent configuration of the batman-adv AP isolation feature. However, since the new multicast optimizations, a single, malformed packet may lead to a mesh-wide, persistent Denial-of-Service, too. The issue occurs because nodes are currently OR-ing the TT sync flags of all originators announcing a specific MAC address via the translation table. When an intermediate node now receives a TT Request and wants to answer this on behalf of the destination node, then this intermediate node now responds with an altered flag field and broken CRC. The next OGM of the real destination will lead to a CRC mismatch and triggering a TT Request and Response again. Furthermore, the OR-ing is currently never undone as long as at least one originator announcing the according MAC address remains, leading to the potential persistency of this issue. This patch fixes this issue by storing the flags used in the CRC calculation on a a per TT orig entry basis to be able to respond with the correct, original flags in an intermediate TT Response for one thing. And to be able to correctly unset sync flags once all nodes announcing a sync flag vanish for another. Fixes: e9c00136 ("batman-adv: fix tt_global_entries flags update") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Acked-by: Antonio Quartulli <a@unstable.cc> [sw: typo in commit message] Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 36d4d68c upstream. The stats are generated by batadv_interface_stats and must not be stored directly in the net_device stats member variable. The batadv_priv bat_counters information is assembled when ndo_get_stats is called. The stats previously stored in net_device::stats is then overwritten. The batman-adv counters must therefore be increased when an ARP packet is answered locally via the distributed arp table. Fixes: c384ea3e ("batman-adv: Distributed ARP Table - add snooping functions for ARP messages") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Linus Lüssing authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 51c6b429 upstream. Trying to split and transmit a unicast packet in 16 parts will fail for the final fragment: After having sent the 15th one with a frag_packet.no index of 14, we will increase the the index to 15 - and return with an error code immediately, even though one more fragment is due for transmission and allowed. Fixing this issue by moving the check before incrementing the index. While at it, adding an unlikely(), because the check is actually more of an assertion. Fixes: ee75ed88 ("batman-adv: Fragment and send skbs larger than mtu") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 248e23b5 upstream. The function batadv_frag_skb_buffer was supposed not to consume the skbuff on errors. This was followed in the helper function batadv_frag_insert_packet when the skb would potentially be inserted in the fragment queue. But it could happen that the next helper function batadv_frag_merge_packets would try to merge the fragments and fail. This results in a kfree_skb of all the enqueued fragments (including the just inserted one). batadv_recv_frag_packet would detect the error in batadv_frag_skb_buffer and try to free the skb again. The behavior of batadv_frag_skb_buffer (and its helper batadv_frag_insert_packet) must therefore be changed to always consume the skbuff to have a common behavior and avoid the double kfree_skb. Fixes: 610bfc6b ("batman-adv: Receive fragmented packets and merge") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 93652344 upstream. batadv_find_router dereferences last_bonding_candidate from orig_node without making sure that it has a valid reference. This reference has to be retrieved by increasing the reference counter while holding neigh_list_lock. The lock is required to avoid that batadv_last_bonding_replace removes the current last_bonding_candidate, reduces the reference counter and maybe destroys the object in this process. Fixes: f3b3d901 ("batman-adv: add bonding again") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit d1fe176c upstream. Speedy join only works when the received packet is either broadcast or an 4addr unicast packet. Thus packets converted from broadcast to unicast via the gateway handling code have to be converted to 4addr packets to allow the receiving gateway server to add the sender address as temporary entry to the translation table. Not doing it will make the batman-adv gateway server drop the DHCP response in many situations because it doesn't yet have the TT entry for the destination of the DHCP response. Fixes: 37135173 ("batman-adv: change interface_rx to get orig node") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit cbef1e10 upstream. The orig_ifinfo reference counter for last_bonding_candidate in batadv_orig_node has to be reduced when an originator node is released. Otherwise the orig_ifinfo is leaked and the reference counter the netdevice is not reduced correctly. Fixes: f3b3d901 ("batman-adv: add bonding again") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 15c2ed75 upstream. The replacement of last_bonding_candidate in batadv_orig_node has to be an atomic operation. Otherwise it is possible that the reference counter of a batadv_orig_ifinfo is reduced which was no longer the last_bonding_candidate when the new candidate is added. This can either lead to an invalid memory access or to reference leaks which make it impossible to an interface which was added to batman-adv. Fixes: f3b3d901 ("batman-adv: add bonding again") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 3db0decf upstream. The pointer batadv_bla_claim::backbone_gw can be changed at any time. Therefore, access to it must be protected to ensure that two function accessing the same backbone_gw are actually accessing the same. This is especially important when the crc_lock is used or when the backbone_gw of a claim is exchanged. Not doing so leads to invalid memory access and/or reference leaks. Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code") Fixes: 5a1dd8a4 ("batman-adv: lock crc access in bridge loop avoidance") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Simon Wunderlich authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 5a1dd8a4 upstream. We have found some networks in which nodes were constantly requesting other nodes BLA claim tables to synchronize, just to ask for that again once completed. The reason was that the crc checksum of the asked nodes were out of sync due to missing locking and multiple writes to the same crc checksum when adding/removing entries. Therefore the asked nodes constantly reported the wrong crc, which caused repeating requests. To avoid multiple functions changing a backbone gateways crc entry at the same time, lock it using a spinlock. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Tested-by: Alfons Name <AlfonsName@web.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 33fbb1f3 upstream. batadv_orig_node_new uses batadv_orig_node_vlan_new to allocate a new batadv_orig_node_vlan and add it to batadv_orig_node::vlan_list. References to this list have also to be cleaned when the batadv_orig_node is removed. Fixes: 7ea7b4a1 ("batman-adv: make the TT CRC logic VLAN specific") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 60154a1e upstream. vlan_insert_tag can return NULL on errors. The distributed arp table code therefore has to check the return value of vlan_insert_tag for NULL before it can safely operate on this pointer. Fixes: be1db4f6 ("batman-adv: make the Distributed ARP Table vlan aware") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 10c78f58 upstream. vlan_insert_tag can return NULL on errors. The bridge loop avoidance code therefore has to check the return value of vlan_insert_tag for NULL before it can safely operate on this pointer. Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 420cb1b7 upstream. The untagged vlan object is only destroyed when the interface is removed via the legacy sysfs interface. But it also has to be destroyed when the standard rtnl-link interface is used. Fixes: 5d2c05b2 ("batman-adv: add per VLAN interface attribute framework") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 3b55e442 upstream. The skb_linearize may reallocate the skb. This makes the calculated pointer for ethhdr invalid. But it the pointer is used later to fill in the RR field of the batadv_icmp_packet_rr packet. Instead re-evaluate eth_hdr after the skb_linearize+skb_cow to fix the pointer and avoid the invalid read. Fixes: da6b8c20 ("batman-adv: generalize batman-adv icmp packet handling") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-
Sven Eckelmann authored
BugLink: https://bugs.launchpad.net/bugs/1868629 commit 9c4604a2 upstream. The tt_req_node is added and removed from a list inside a spinlock. But the locking is sometimes removed even when the object is still referenced and will be used later via this reference. For example batadv_send_tt_request can create a new tt_req_node (including add to a list) and later re-acquires the lock to remove it from the list and to free it. But at this time another context could have already removed this tt_req_node from the list and freed it. CPU#0 batadv_batman_skb_recv from net_device 0 -> batadv_iv_ogm_receive -> batadv_iv_ogm_process -> batadv_iv_ogm_process_per_outif -> batadv_tvlv_ogm_receive -> batadv_tvlv_ogm_receive -> batadv_tvlv_containers_process -> batadv_tvlv_call_handler -> batadv_tt_tvlv_ogm_handler_v1 -> batadv_tt_update_orig -> batadv_send_tt_request -> batadv_tt_req_node_new spin_lock(...) allocates new tt_req_node and adds it to list spin_unlock(...) return tt_req_node CPU#1 batadv_batman_skb_recv from net_device 1 -> batadv_recv_unicast_tvlv -> batadv_tvlv_containers_process -> batadv_tvlv_call_handler -> batadv_tt_tvlv_unicast_handler_v1 -> batadv_handle_tt_response spin_lock(...) tt_req_node gets removed from list and is freed spin_unlock(...) CPU#0 <- returned to batadv_send_tt_request spin_lock(...) tt_req_node gets removed from list and is freed MEMORY CORRUPTION/SEGFAULT/... spin_unlock(...) This can only be solved via reference counting to allow multiple contexts to handle the list manipulation while making sure that only the last context holding a reference will free the object. Fixes: a73105b8 ("batman-adv: improved client announcement mechanism") Signed-off-by: Sven Eckelmann <sven@narfation.org> Tested-by: Martin Weinelt <martin@darmstadt.freifunk.net> Tested-by: Amadeus Alfa <amadeus@chemnitz.freifunk.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com> Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
-