1. 14 Jul, 2023 2 commits
    • net: usbnet: Fix WARNING in usbnet_start_xmit/usb_submit_urb · 5e1627cb
      Alan Stern authored
      The syzbot fuzzer identified a problem in the usbnet driver:
      
      usb 1-1: BOGUS urb xfer, pipe 3 != type 1
      WARNING: CPU: 0 PID: 754 at drivers/usb/core/urb.c:504 usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504
      Modules linked in:
      CPU: 0 PID: 754 Comm: kworker/0:2 Not tainted 6.4.0-rc7-syzkaller-00014-g692b7dc8 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
      Workqueue: mld mld_ifc_work
      RIP: 0010:usb_submit_urb+0xed6/0x1880 drivers/usb/core/urb.c:504
      Code: 7c 24 18 e8 2c b4 5b fb 48 8b 7c 24 18 e8 42 07 f0 fe 41 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 a0 c9 fc 8a e8 5a 6f 23 fb <0f> 0b e9 58 f8 ff ff e8 fe b3 5b fb 48 81 c5 c0 05 00 00 e9 84 f7
      RSP: 0018:ffffc9000463f568 EFLAGS: 00010086
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
      RDX: ffff88801eb28000 RSI: ffffffff814c03b7 RDI: 0000000000000001
      RBP: ffff8881443b7190 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000003
      R13: ffff88802a77cb18 R14: 0000000000000003 R15: ffff888018262500
      FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000556a99c15a18 CR3: 0000000028c71000 CR4: 0000000000350ef0
      Call Trace:
       <TASK>
       usbnet_start_xmit+0xfe5/0x2190 drivers/net/usb/usbnet.c:1453
       __netdev_start_xmit include/linux/netdevice.h:4918 [inline]
       netdev_start_xmit include/linux/netdevice.h:4932 [inline]
       xmit_one net/core/dev.c:3578 [inline]
       dev_hard_start_xmit+0x187/0x700 net/core/dev.c:3594
      ...
      
      This bug is caused by the fact that usbnet trusts the bulk endpoint
      addresses its probe routine receives in the driver_info structure, and
      it does not check to see that these endpoints actually exist and have
      the expected type and directions.
      
      The fix is simply to add such a check.
      
      Reported-and-tested-by: syzbot+63ee658b9a100ffadbe2@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-usb/000000000000a56e9105d0cec021@google.com/
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      CC: Oliver Neukum <oneukum@suse.com>
      Link: https://lore.kernel.org/r/ea152b6d-44df-4f8a-95c6-4db51143dcc1@rowland.harvard.edu
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
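The kind of check described in the usbnet commit above can be illustrated with a minimal userspace sketch in C. This is not the code from the patch: `struct ep_desc`, `has_bulk_endpoint()` and `check_bulk_pair()` are hypothetical names invented for illustration. Only the descriptor bit layout (direction in bit 7 of the endpoint address, transfer type in the low two bits of the attributes, with 2 meaning bulk) is taken from the USB specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit layout from the USB endpoint descriptor format. */
#define EP_DIR_IN        0x80u  /* bit 7 of the endpoint address */
#define EP_XFERTYPE_MASK 0x03u  /* low two bits of the attributes */
#define EP_XFER_BULK     0x02u

/* Hypothetical, simplified stand-in for an endpoint descriptor. */
struct ep_desc {
    uint8_t address;     /* endpoint number | direction bit */
    uint8_t attributes;  /* transfer type in the low two bits */
};

/* Return true only if a bulk endpoint with exactly this address exists. */
static bool has_bulk_endpoint(const struct ep_desc *eps, size_t n,
                              uint8_t address)
{
    for (size_t i = 0; i < n; i++) {
        if (eps[i].address == address &&
            (eps[i].attributes & EP_XFERTYPE_MASK) == EP_XFER_BULK)
            return true;
    }
    return false;
}

/*
 * Model of a probe-time check: the IN and OUT endpoint numbers supplied
 * by the driver must both exist, be bulk, and have the right direction.
 * Returns 0 on success and -1 if either endpoint is missing or wrong.
 */
static int check_bulk_pair(const struct ep_desc *eps, size_t n,
                           uint8_t in_num, uint8_t out_num)
{
    if (!has_bulk_endpoint(eps, n, (uint8_t)(in_num | EP_DIR_IN)))
        return -1;
    if (!has_bulk_endpoint(eps, n, out_num))
        return -1;
    return 0;
}
```

With a check of this shape, a device that exposes an interrupt endpoint where the driver expects a bulk one is rejected at probe time instead of triggering the warning later at URB submission.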
    • dsa: mv88e6xxx: Do a final check before timing out · 95ce158b
      Linus Walleij authored
      I get sporadic timeouts from the driver when using the
      MV88E6352. Reading the status again after the loop fixes the
      problem: the operation is successful but goes undetected.
      
      Some added prints show things like this:
      
      [   58.356209] mv88e6085 mdio_mux-0.1:00: Timeout while waiting
          for switch, addr 1b reg 0b, mask 8000, val 0000, data c000
      [   58.367487] mv88e6085 mdio_mux-0.1:00: Timeout waiting for
          ATU op 4000, fid 0001
      (...)
      [   61.826293] mv88e6085 mdio_mux-0.1:00: Timeout while waiting
          for switch, addr 1c reg 18, mask 8000, val 0000, data 9860
      [   61.837560] mv88e6085 mdio_mux-0.1:00: Timeout waiting
          for PHY command 1860 to complete
      
      The reason is probably not the commands: I think those are
      mostly fine with the 50+50ms timeout, but the problem
      appears when OpenWrt brings up several interfaces in
      parallel on a system with 7 populated ports: if one of
      them takes more than 50 ms and waits, one or more of the
      others can get stuck on the mutex for the switch, and then
      this can easily multiply.
      
      As we sleep and wait, the function needs a final
      check after exiting the loop to see whether we were successful.
      
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Cc: Tobias Waldekranz <tobias@waldekranz.com>
      Fixes: 35da1dfd ("net: dsa: mv88e6xxx: Improve performance of busy bit polling")
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230712223405.861899-1-linus.walleij@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
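The "final check after the loop" idea from the mv88e6xxx commit above can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the driver code: `wait_mask()`, `read_reg_fn`, `struct fake_reg` and `fake_read()` are hypothetical names, and the loop is bounded by a poll count rather than by time to keep the example deterministic. The mask `0x8000` and expected value `0x0000` mirror the log lines quoted in the commit message.

```c
#include <stdint.h>

/* Hypothetical register-read callback; returns the current register value. */
typedef uint16_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (reg & mask) == val, giving up after max_polls attempts.
 * A real driver would sleep between polls and bound the loop by time.
 */
static int wait_mask(read_reg_fn read_reg, void *ctx,
                     uint16_t mask, uint16_t val, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if ((read_reg(ctx) & mask) == val)
            return 0;
        /* A sleep would go here; it may last far longer than requested. */
    }

    /* Final check: the operation may have completed during the last sleep. */
    if ((read_reg(ctx) & mask) == val)
        return 0;

    return -1; /* genuine timeout */
}

/* Test double: reports busy (bit 15 set) for the first N reads. */
struct fake_reg {
    int reads;
    int reads_until_ready;
};

static uint16_t fake_read(void *ctx)
{
    struct fake_reg *r = ctx;

    return (r->reads++ >= r->reads_until_ready) ? 0x0000 : 0x8000;
}
```

If the register becomes ready only after the last poll inside the loop, a version without the final read would report a timeout even though the operation succeeded, which matches the "successful but goes undetected" symptom described above.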
  2. 13 Jul, 2023 17 commits
  3. 12 Jul, 2023 21 commits
    • net: txgbe: fix eeprom calculation error · aa846677
      Jiawen Wu authored
      For some device types like TXGBE_ID_XAUI, *checksum computed in
      txgbe_calc_eeprom_checksum() is larger than TXGBE_EEPROM_SUM. Remove the
      limit on the size of *checksum.
      
      Fixes: 049fe536 ("net: txgbe: Add operations to interact with firmware")
      Fixes: 5e2ea780 ("net: txgbe: Fix unsigned comparison to zero in txgbe_calc_eeprom_checksum()")
      Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
      Link: https://lore.kernel.org/r/20230711063414.3311-1-jiawenwu@trustnetic.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'for-linus' of https://github.com/openrisc/linux · 0099852f
      Linus Torvalds authored
      Pull OpenRISC fix from Stafford Horne:
      
       - During the 6.4 cycle my fpu support work broke ABI compatibility in
         the sigcontext struct. This was noticed by musl libc developers after
         the release. This fix restores the ABI.
      
      * tag 'for-linus' of https://github.com/openrisc/linux:
        openrisc: Union fpcsr and oldmask in sigcontext to unbreak userspace ABI
    • tracing/histograms: Add histograms to hist_vars if they have referenced variables · 6018b585
      Mohamed Khalfella authored
      Hist triggers can have referenced variables without having direct
      variable fields. This can be the case if referenced variables are added
      for trigger actions. In this case the newly added references will not
      have field variables. Not taking such referenced variables into
      consideration can result in a bug where it would be possible to remove
      a hist trigger whose variables are still being referenced. This results
      in a bug that is easily reproducible like so:
      
      $ cd /sys/kernel/tracing
      $ echo 'synthetic_sys_enter char[] comm; long id' >> synthetic_events
      $ echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
      $ echo 'hist:keys=common_pid.execname,id.syscall:onmatch(raw_syscalls.sys_enter).synthetic_sys_enter($comm, id)' >> events/raw_syscalls/sys_enter/trigger
      $ echo '!hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
      
      [  100.263533] ==================================================================
      [  100.264634] BUG: KASAN: slab-use-after-free in resolve_var_refs+0xc7/0x180
      [  100.265520] Read of size 8 at addr ffff88810375d0f0 by task bash/439
      [  100.266320]
      [  100.266533] CPU: 2 PID: 439 Comm: bash Not tainted 6.5.0-rc1 #4
      [  100.267277] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
      [  100.268561] Call Trace:
      [  100.268902]  <TASK>
      [  100.269189]  dump_stack_lvl+0x4c/0x70
      [  100.269680]  print_report+0xc5/0x600
      [  100.270165]  ? resolve_var_refs+0xc7/0x180
      [  100.270697]  ? kasan_complete_mode_report_info+0x80/0x1f0
      [  100.271389]  ? resolve_var_refs+0xc7/0x180
      [  100.271913]  kasan_report+0xbd/0x100
      [  100.272380]  ? resolve_var_refs+0xc7/0x180
      [  100.272920]  __asan_load8+0x71/0xa0
      [  100.273377]  resolve_var_refs+0xc7/0x180
      [  100.273888]  event_hist_trigger+0x749/0x860
      [  100.274505]  ? kasan_save_stack+0x2a/0x50
      [  100.275024]  ? kasan_set_track+0x29/0x40
      [  100.275536]  ? __pfx_event_hist_trigger+0x10/0x10
      [  100.276138]  ? ksys_write+0xd1/0x170
      [  100.276607]  ? do_syscall_64+0x3c/0x90
      [  100.277099]  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      [  100.277771]  ? destroy_hist_data+0x446/0x470
      [  100.278324]  ? event_hist_trigger_parse+0xa6c/0x3860
      [  100.278962]  ? __pfx_event_hist_trigger_parse+0x10/0x10
      [  100.279627]  ? __kasan_check_write+0x18/0x20
      [  100.280177]  ? mutex_unlock+0x85/0xd0
      [  100.280660]  ? __pfx_mutex_unlock+0x10/0x10
      [  100.281200]  ? kfree+0x7b/0x120
      [  100.281619]  ? ____kasan_slab_free+0x15d/0x1d0
      [  100.282197]  ? event_trigger_write+0xac/0x100
      [  100.282764]  ? __kasan_slab_free+0x16/0x20
      [  100.283293]  ? __kmem_cache_free+0x153/0x2f0
      [  100.283844]  ? sched_mm_cid_remote_clear+0xb1/0x250
      [  100.284550]  ? __pfx_sched_mm_cid_remote_clear+0x10/0x10
      [  100.285221]  ? event_trigger_write+0xbc/0x100
      [  100.285781]  ? __kasan_check_read+0x15/0x20
      [  100.286321]  ? __bitmap_weight+0x66/0xa0
      [  100.286833]  ? _find_next_bit+0x46/0xe0
      [  100.287334]  ? task_mm_cid_work+0x37f/0x450
      [  100.287872]  event_triggers_call+0x84/0x150
      [  100.288408]  trace_event_buffer_commit+0x339/0x430
      [  100.289073]  ? ring_buffer_event_data+0x3f/0x60
      [  100.292189]  trace_event_raw_event_sys_enter+0x8b/0xe0
      [  100.295434]  syscall_trace_enter.constprop.0+0x18f/0x1b0
      [  100.298653]  syscall_enter_from_user_mode+0x32/0x40
      [  100.301808]  do_syscall_64+0x1a/0x90
      [  100.304748]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      [  100.307775] RIP: 0033:0x7f686c75c1cb
      [  100.310617] Code: 73 01 c3 48 8b 0d 65 3c 10 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 21 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 3c 10 00 f7 d8 64 89 01 48
      [  100.317847] RSP: 002b:00007ffc60137a38 EFLAGS: 00000246 ORIG_RAX: 0000000000000021
      [  100.321200] RAX: ffffffffffffffda RBX: 000055f566469ea0 RCX: 00007f686c75c1cb
      [  100.324631] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 000000000000000a
      [  100.328104] RBP: 00007ffc60137ac0 R08: 00007f686c818460 R09: 000000000000000a
      [  100.331509] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
      [  100.334992] R13: 0000000000000007 R14: 000000000000000a R15: 0000000000000007
      [  100.338381]  </TASK>
      
      We hit the bug because when the second hist trigger was created,
      has_hist_vars() returned false because the hist trigger did not have
      variables. As a result, save_hist_vars() was not called to add
      the trigger to trace_array->hist_vars. Later on, when we attempted to
      remove the first histogram, find_any_var_ref() failed to detect that it
      is being used because it did not find the second trigger in the
      hist_vars list.
      
      With this change we wait until trigger actions are created so we can take
      into consideration if hist trigger has variable references. Also, now we
      check the return value of save_hist_vars() and fail trigger creation if
      save_hist_vars() fails.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230712223021.636335-1-mkhalfella@purestorage.com
      
      Cc: stable@vger.kernel.org
      Fixes: 067fe038 ("tracing: Add variable reference handling to hist triggers")
      Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • net/sched: make psched_mtu() RTNL-less safe · 150e33e6
      Pedro Tammela authored
      Eric Dumazet says[1]:
      -------
      Speaking of psched_mtu(), I see that net/sched/sch_pie.c is using it
      without holding RTNL, so dev->mtu can be changed underneath.
      KCSAN could issue a warning.
      -------
      
      Annotate dev->mtu with READ_ONCE() so KCSAN doesn't issue a warning.
      
      [1] https://lore.kernel.org/all/CANn89iJoJO5VtaJ-2=_d2aOQhb0Xw8iBT_Cxqp2HyuS-zj6azw@mail.gmail.com/
      
      v1 -> v2: Fix commit message
      
      Fixes: d4b36210 ("net: pkt_sched: PIE AQM scheme")
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230711021634.561598-1-pctammela@mojatatu.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
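As a rough userspace illustration of the annotation described in the psched_mtu() commit above, the sketch below reads a concurrently modifiable field through a volatile access so that the compiler performs exactly one load. It is an approximation, not the kernel's READ_ONCE() macro: `read_once_uint()`, `struct net_device_model` and `psched_mtu_model()` are hypothetical names, and the sketch assumes that psched_mtu() returns the MTU plus the link-layer header length.

```c
/*
 * Userspace approximation of a READ_ONCE()-style load for one type:
 * the volatile access keeps the compiler from re-reading or caching
 * the value. This is a sketch, not the kernel macro.
 */
static inline unsigned int read_once_uint(const unsigned int *p)
{
    return *(const volatile unsigned int *)p;
}

/* Hypothetical, cut-down stand-in for the relevant device fields. */
struct net_device_model {
    unsigned int mtu;               /* may be changed by another context */
    unsigned short hard_header_len; /* link-layer header length */
};

/* Take a single snapshot of mtu, then use only that snapshot. */
static int psched_mtu_model(const struct net_device_model *dev)
{
    unsigned int mtu = read_once_uint(&dev->mtu);

    return (int)(mtu + dev->hard_header_len);
}
```

The annotation does not add locking. It only documents that the lockless read is intentional and ensures the value is loaded once.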
    • net: ena: fix shift-out-of-bounds in exponential backoff · 1e9cb763
      Krister Johansen authored
      The ENA adapters on our instances occasionally reset.  One recently
      logged a UBSAN failure to console in the process:
      
        UBSAN: shift-out-of-bounds in build/linux/drivers/net/ethernet/amazon/ena/ena_com.c:540:13
        shift exponent 32 is too large for 32-bit type 'unsigned int'
        CPU: 28 PID: 70012 Comm: kworker/u72:2 Kdump: loaded not tainted 5.15.117
        Hardware name: Amazon EC2 c5d.9xlarge/, BIOS 1.0 10/16/2017
        Workqueue: ena ena_fw_reset_device [ena]
        Call Trace:
        <TASK>
        dump_stack_lvl+0x4a/0x63
        dump_stack+0x10/0x16
        ubsan_epilogue+0x9/0x36
        __ubsan_handle_shift_out_of_bounds.cold+0x61/0x10e
        ? __const_udelay+0x43/0x50
        ena_delay_exponential_backoff_us.cold+0x16/0x1e [ena]
        wait_for_reset_state+0x54/0xa0 [ena]
        ena_com_dev_reset+0xc8/0x110 [ena]
        ena_down+0x3fe/0x480 [ena]
        ena_destroy_device+0xeb/0xf0 [ena]
        ena_fw_reset_device+0x30/0x50 [ena]
        process_one_work+0x22b/0x3d0
        worker_thread+0x4d/0x3f0
        ? process_one_work+0x3d0/0x3d0
        kthread+0x12a/0x150
        ? set_kthread_struct+0x50/0x50
        ret_from_fork+0x22/0x30
        </TASK>
      
      Apparently, the reset delays are getting so large they can trigger a
      UBSAN panic.
      
      Looking at the code, the current timeout is capped at 5000us.  Using a
      base value of 100us, the current code will overflow after (1<<29).  Even
      at values before 32, this function wraps around, perhaps
      unintentionally.
      
      Cap the value of the exponent used for this backoff at (1<<16) which is
      larger than currently necessary, but large enough to support bigger
      values in the future.
      
      Cc: stable@vger.kernel.org
      Fixes: 4bb7f4cf ("net: ena: reduce driver load time")
      Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Shay Agroskin <shayagr@amazon.com>
      Link: https://lore.kernel.org/r/20230711013621.GE1926@templeofstupid.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
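The capped backoff described in the ena commit above can be sketched as a small standalone C function. The function name `backoff_delay_us()` and the macro names are hypothetical. The three constants come from the commit text (100us base, 5000us cap, exponent capped at 16), and the code is an illustration rather than the driver's implementation.

```c
#include <stdint.h>

#define BACKOFF_BASE_US 100u   /* base delay named in the commit text */
#define BACKOFF_MAX_US  5000u  /* timeout cap named in the commit text */
#define BACKOFF_MAX_EXP 16u    /* exponent cap named in the commit text */

/*
 * Compute base * 2^exp, clamped to BACKOFF_MAX_US.
 * Clamping the exponent first keeps the shift well-defined for a
 * 32-bit type: 100 * (1 << 16) = 6,553,600 still fits in uint32_t.
 */
static uint32_t backoff_delay_us(uint32_t exp)
{
    if (exp > BACKOFF_MAX_EXP)
        exp = BACKOFF_MAX_EXP;

    uint32_t delay = BACKOFF_BASE_US * (UINT32_C(1) << exp);

    return delay > BACKOFF_MAX_US ? BACKOFF_MAX_US : delay;
}
```

Without the exponent clamp, a call with `exp == 32` would shift a 32-bit value by its full width, which is exactly the undefined behavior UBSAN reported.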
    • tracing: Stop FORTIFY_SOURCE complaining about stack trace caller · bec3c25c
      Steven Rostedt (Google) authored
      The stack_trace event is an event created by the tracing subsystem to
      store stack traces. It originally just contained a hard coded array of 8
      words to hold the stack, and a "size" to know how many entries are there.
      This is exported to user space as:
      
      name: kernel_stack
      ID: 4
      format:
      	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
      	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
      	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
      	field:int common_pid;	offset:4;	size:4;	signed:1;
      
      	field:int size;	offset:8;	size:4;	signed:1;
      	field:unsigned long caller[8];	offset:16;	size:64;	signed:0;
      
      print fmt: "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n",i
       (void *)REC->caller[0], (void *)REC->caller[1], (void *)REC->caller[2],
       (void *)REC->caller[3], (void *)REC->caller[4], (void *)REC->caller[5],
       (void *)REC->caller[6], (void *)REC->caller[7]
      
      Where the user space tracers could parse the stack. The library was
      updated for this specific event to only look at the size, and not the
      array. But some older users still look at the array (note, the older code
      still checks to make sure the array fits inside the event that it read.
      That is, if only 4 words were saved, the parser would not read the fifth
      word because it will see that it was outside of the event size).
      
      This event was changed a while ago to be more dynamic, and would save a
      full stack even if it was greater than 8 words. It does this by simply
      allocating more ring buffer to hold the extra words. Then it copies in the
      stack via:
      
      	memcpy(&entry->caller, fstack->calls, size);
      
      As the entry is struct stack_entry, that is created by a macro to both
      create the structure and export this to user space, it still had the caller
      field of entry defined as: unsigned long caller[8].
      
      When the stack is greater than 8, the FORTIFY_SOURCE code notices that the
      amount being copied is greater than the destination array and complains
      about it. It has no idea that the destination is pointing to the ring
      buffer with the required allocation.
      
      To hide this from the FORTIFY_SOURCE logic, pointer arithmetic is used:
      
      	ptr = ring_buffer_event_data(event);
      	entry = ptr;
      	ptr += offsetof(typeof(*entry), caller);
      	memcpy(ptr, fstack->calls, size);
      
      Link: https://lore.kernel.org/all/20230612160748.4082850-1-svens@linux.ibm.com/
      Link: https://lore.kernel.org/linux-trace-kernel/20230712105235.5fc441aa@gandalf.local.home
      
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Sven Schnelle <svens@linux.ibm.com>
      Tested-by: Sven Schnelle <svens@linux.ibm.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
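The pointer-arithmetic approach in the FORTIFY_SOURCE commit above can be mimicked in userspace with the sketch below. It is only a model: `struct stack_entry_model`, `save_stack()` and `saved_entry()` are hypothetical names, and `malloc()` stands in for the ring buffer reservation. The point it shows is that the copy goes through a byte pointer computed with `offsetof()`, so the compiler never sees a write into the fixed 8-entry `caller` field.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the exported layout: a fixed 8-entry array. */
struct stack_entry_model {
    int size;
    unsigned long caller[8];
};

/*
 * Allocate room for n entries (possibly more than 8) and copy them in
 * through a byte pointer, not through the caller[] field itself.
 * The caller owns the returned buffer and must free() it.
 */
static void *save_stack(const unsigned long *calls, int n)
{
    if (n < 0)
        return NULL;

    size_t bytes = (size_t)n * sizeof(unsigned long);
    char *buf = malloc(offsetof(struct stack_entry_model, caller) + bytes);

    if (!buf)
        return NULL;

    /* Record the entry count at the offset of the size field. */
    memcpy(buf + offsetof(struct stack_entry_model, size), &n, sizeof(n));
    /* Copy the stack past the header; the allocation covers all n entries. */
    memcpy(buf + offsetof(struct stack_entry_model, caller), calls, bytes);
    return buf;
}

/* Read entry i back using the same offset arithmetic. */
static unsigned long saved_entry(const void *buf, int i)
{
    unsigned long v;

    memcpy(&v, (const char *)buf
               + offsetof(struct stack_entry_model, caller)
               + (size_t)i * sizeof(v), sizeof(v));
    return v;
}
```

The design choice is the same trade-off the commit makes: the declared layout stays at 8 entries for user space, while the actual storage is sized by the allocation.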
    • ftrace: Fix possible warning on checking all pages used in ftrace_process_locs() · 26efd79c
      Zheng Yejian authored
      As the comments in ftrace_process_locs() note, there may be NULL
      pointers in the mcount_loc section:
       > Some architecture linkers will pad between
       > the different mcount_loc sections of different
       > object files to satisfy alignments.
       > Skip any NULL pointers.
      
      After commit 20e5227e ("ftrace: allow NULL pointers in mcount_loc"),
      NULL pointers are counted when allocating ftrace pages but skipped
      before being added to the ftrace pages, which may leave some pages
      unused. Then, after commit 706c81f8 ("ftrace: Remove extra helper
      functions"), a warning may occur at:
        WARN_ON(pg->next);
      
      To fix it, only warn in the case where no pointers were skipped but
      pages were still not used up, then free those unused pages after
      releasing ftrace_lock.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230712060452.3175675-1-zhengyejian1@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: 706c81f8 ("ftrace: Remove extra helper functions")
      Suggested-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write() · f72207a5
      Dan Carpenter authored
      The simple_write_to_buffer() function is designed to handle partial
      writes.  It returns negatives on error, otherwise it returns the number
      of bytes that were able to be copied.  This code doesn't check the
      return properly.  We only know that the first byte is written, the rest
      of the buffer might be uninitialized.
      
      There is no need to use the simple_write_to_buffer() function.
      Partial writes are prohibited by the "if (*ppos != 0)" check at the
      start of the function.  Just use memdup_user() and copy the whole
      buffer.
      
      Fixes: d3cbb907 ("netdevsim: add ACL trap reporting cookie as a metadata")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Link: https://lore.kernel.org/r/7c1f950b-3a7d-4252-82a6-876e53078ef7@moroto.mountain
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'platform-drivers-x86-v6.5-2' of... · eb26cbb1
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fixes from Hans de Goede:
       "Misc small fixes and hw-id additions"
      
      * tag 'platform-drivers-x86-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86: touchscreen_dmi: Add info for the Archos 101 Cesium Educ tablet
        platform/x86: dell-ddv: Fix mangled list in documentation
        platform/x86: dell-ddv: Improve error handling
        platform/x86/amd: pmf: Add new ACPI ID AMDI0103
        platform/x86/amd: pmc: Add new ACPI ID AMDI000A
        platform/x86/amd: pmc: Apply nvme quirk to HP 15s-eq2xxx
        platform/x86: Move s2idle quirk from thinkpad-acpi to amd-pmc
        platform/x86: int3472/discrete: set variable skl_int3472_regulator_second_sensor storage-class-specifier to static
        platform/x86/intel/tpmi: Prevent overflow for cap_offset
        platform/x86: wmi: Replace open coded guid_parse_and_compare()
        platform/x86: wmi: Break possible infinite loop when parsing GUID
    • Merge tag 'probes-fixes-v6.5-rc1' of... · 9a3236ce
      Linus Torvalds authored
      Merge tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull probes fixes from Masami Hiramatsu:
      
       - Fix fprobe's rethook release issues:
      
           - Release rethook after ftrace_ops is unregistered so that the
             rethook is not accessed after free.
      
           - Stop rethook before ftrace_ops is unregistered so that the
             rethook is NOT used after exiting unregister_fprobe()
      
       - Fix eprobe cleanup logic. If it attaches to multiple events and
         fails to enable one of them, roll back all enabled events correctly.
      
       - Fix fprobe to unlock ftrace recursion lock correctly when it is
         missed due to another running kprobe.
      
       - Cleanup kprobe to remove unnecessary NULL.
      
       - Cleanup kprobe to remove unnecessary 0 initializations.
      
      * tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free()
        kernel: kprobes: Remove unnecessary ‘0’ values
        kprobes: Remove unnecessary ‘NULL’ values from correct_ret_addr
        fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock
        kernel/trace: Fix cleanup logic of enable_trace_eprobe
        fprobe: Release rethook after the ftrace_ops is unregistered
    • Merge tag 'for-linus-2023071101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 1d754604
      Linus Torvalds authored
      Pull HID fixes from Benjamin Tissoires:
      
       - AMD SFH shift-out-of-bounds fix (Basavaraj Natikar)
      
       - avoid struct memcpy overrun warning in the hid-hyperv module (Arnd
         Bergmann)
      
       - a quick HID kselftests script fix for our CI to be happy (Benjamin
         Tissoires)
      
       - various fixes and additions of device IDs
      
      * tag 'for-linus-2023071101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: amd_sfh: Fix for shift-out-of-bounds
        HID: amd_sfh: Rename the float32 variable
        HID: input: fix mapping for camera access keys
        HID: logitech-hidpp: Add wired USB id for Logitech G502 Lightspeed
        HID: nvidia-shield: Pack inner/related declarations in HOSTCMD reports
        HID: hyperv: avoid struct memcpy overrun warning
        selftests: hid: fix vmtests.sh not running make headers
    • ring-buffer: Fix deadloop issue on reading trace_pipe · 7e42907f
      Zheng Yejian authored
      Soft lockup occurs when reading file 'trace_pipe':
      
        watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488]
        [...]
        RIP: 0010:ring_buffer_empty_cpu+0xed/0x170
        RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246
        RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb
        RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218
        RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f
        R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901
        R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000
        [...]
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         __find_next_entry+0x1a8/0x4b0
         ? peek_next_entry+0x250/0x250
         ? down_write+0xa5/0x120
         ? down_write_killable+0x130/0x130
         trace_find_next_entry_inc+0x3b/0x1d0
         tracing_read_pipe+0x423/0xae0
         ? tracing_splice_read_pipe+0xcb0/0xcb0
         vfs_read+0x16b/0x490
         ksys_read+0x105/0x210
         ? __ia32_sys_pwrite64+0x200/0x200
         ? switch_fpu_return+0x108/0x220
         do_syscall_64+0x33/0x40
         entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
      Through the vmcore, I found this happens because in tracing_read_pipe(),
      ring_buffer_empty_cpu() finds that some buffer is not empty, but then
      nothing can be read because "rb_num_of_entries() == 0" is always true.
      The procedure then loops infinitely because the user buffer is never
      filled; see the following code path:
      
        tracing_read_pipe() {
          ... ...
          waitagain:
            tracing_wait_pipe() // 1. find non-empty buffer here
            trace_find_next_entry_inc()  // 2. loop here try to find an entry
              __find_next_entry()
                ring_buffer_empty_cpu();  // 3. find non-empty buffer
                peek_next_entry()  // 4. but peek always return NULL
                  ring_buffer_peek()
                    rb_buffer_peek()
                      rb_get_reader_page()
                        // 5. because rb_num_of_entries() == 0 always true here
                        //    then return NULL
            // 6. user buffer has not been filled, so goto 'waitagain'
            //    and eventually this leads to a deadloop in the kernel!!!
        }
      
      After some analysis, I found that when resetting the ring buffer, the
      'entries' of its pages are not all cleared (see rb_reset_cpu()). Then,
      when the ring buffer is reduced, any removed pages that still hold stale
      'entries' data have it added to 'cpu_buffer->overrun' (see
      rb_remove_pages()), which causes a wrong 'overrun' count and eventually
      the deadloop issue.
      
      To fix it, we need to clear every page in rb_reset_cpu().
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230708225144.3785600-1-zhengyejian1@huawei.com
      
      Cc: stable@vger.kernel.org
      Fixes: a5fb8331 ("ring-buffer: Fix uninitialized read_stamp")
      Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
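The accounting problem described in the ring-buffer commit above can be modeled with a toy structure in C. Everything here is hypothetical and greatly simplified (`struct cpu_buffer_model`, `reset_all_pages()`, `remove_pages_model()`, a fixed array of four pages); it only shows why stale per-page `entries` counts inflate `overrun` when pages are removed, and why clearing every page on reset avoids that.

```c
#include <stddef.h>

#define MODEL_PAGES 4

/* Toy page: only the per-page entry counter matters here. */
struct page_model {
    unsigned long entries;
};

/* Toy per-CPU buffer with a fixed backing array of pages. */
struct cpu_buffer_model {
    struct page_model pages[MODEL_PAGES];
    size_t nr_pages;
    unsigned long overrun;
};

/* Reset in the spirit of the fix: clear the state of every page. */
static void reset_all_pages(struct cpu_buffer_model *b)
{
    for (size_t i = 0; i < b->nr_pages; i++)
        b->pages[i].entries = 0;
    b->overrun = 0;
}

/*
 * Shrink the buffer by up to nr pages. Entries still recorded on a
 * removed page are counted as overrun, so stale counters left behind
 * by an incomplete reset end up in the overrun total.
 */
static void remove_pages_model(struct cpu_buffer_model *b, size_t nr)
{
    for (size_t i = 0; i < nr && b->nr_pages > 0; i++) {
        b->nr_pages--;
        b->overrun += b->pages[b->nr_pages].entries;
    }
}
```

In this model, shrinking a buffer whose pages were fully reset leaves `overrun` at zero, while shrinking one with leftover counters reports overruns for data that no longer exists.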
    • tracing: arm64: Avoid missing-prototype warnings · 7d8b31b7
      Arnd Bergmann authored
      These are all tracing W=1 warnings in arm64 allmodconfig about missing
      prototypes:
      
      kernel/trace/trace_kprobe_selftest.c:7:5: error: no previous prototype for 'kprobe_trace_selftest_target' [-Werror=missing-prototypes]
      kernel/trace/ftrace.c:329:5: error: no previous prototype for '__register_ftrace_function' [-Werror=missing-prototypes]
      kernel/trace/ftrace.c:372:5: error: no previous prototype for '__unregister_ftrace_function' [-Werror=missing-prototypes]
      kernel/trace/ftrace.c:4130:15: error: no previous prototype for 'arch_ftrace_match_adjust' [-Werror=missing-prototypes]
      kernel/trace/fgraph.c:243:15: error: no previous prototype for 'ftrace_return_to_handler' [-Werror=missing-prototypes]
      kernel/trace/fgraph.c:358:6: error: no previous prototype for 'ftrace_graph_sleep_time_control' [-Werror=missing-prototypes]
      arch/arm64/kernel/ftrace.c:460:6: error: no previous prototype for 'prepare_ftrace_return' [-Werror=missing-prototypes]
      arch/arm64/kernel/ptrace.c:2172:5: error: no previous prototype for 'syscall_trace_enter' [-Werror=missing-prototypes]
      arch/arm64/kernel/ptrace.c:2195:6: error: no previous prototype for 'syscall_trace_exit' [-Werror=missing-prototypes]
      
      Move the declarations to an appropriate header where they can be seen
      by the caller and callee, and make sure the headers are included where
      needed.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20230517125215.930689-1-arnd@kernel.org
      
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Florent Revest <revest@chromium.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      [ Fixed ftrace_return_to_handler() to handle CONFIG_HAVE_FUNCTION_GRAPH_RETVAL case ]
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • selftests/user_events: Test struct size match cases · 769e6372
      Beau Belgrave authored
      The self tests for user_events currently do not ensure that the edge
      cases for struct types work properly with size differences.
      
      Add cases for mismatching struct names and sizes to ensure they work
      properly.
      
      Link: https://lkml.kernel.org/r/20230629235049.581-3-beaub@linux.microsoft.com
      
      Cc: Shuah Khan <skhan@linuxfoundation.org>
      Cc: linux-kselftest@vger.kernel.org
      Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • net/sched: flower: Ensure both minimum and maximum ports are specified · d3f87278
      Ido Schimmel authored
      The kernel does not currently validate that both the minimum and maximum
      ports of a port range are specified. This can lead user space to think
      that a filter matching on a port range was successfully added, when in
      fact it was not. For example, with a patched (buggy) iproute2 that only
      sends the minimum port, the following commands do not return an error:
      
       # tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 100-200 action pass
      
       # tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp dst_port 100-200 action pass
      
       # tc filter show dev swp1 ingress
       filter protocol ip pref 1 flower chain 0
       filter protocol ip pref 1 flower chain 0 handle 0x1
         eth_type ipv4
         ip_proto udp
         not_in_hw
               action order 1: gact action pass
                random type none pass val 0
                index 1 ref 1 bind 1
      
       filter protocol ip pref 1 flower chain 0 handle 0x2
         eth_type ipv4
         ip_proto udp
         not_in_hw
               action order 1: gact action pass
                random type none pass val 0
                index 2 ref 1 bind 1
      
      Fix by returning an error unless both ports are specified:
      
       # tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp src_port 100-200 action pass
       Error: Both min and max source ports must be specified.
       We have an error talking to the kernel
      
       # tc filter add dev swp1 ingress pref 1 proto ip flower ip_proto udp dst_port 100-200 action pass
       Error: Both min and max destination ports must be specified.
       We have an error talking to the kernel
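
The check described above can be sketched in user space as follows. This is a minimal model, not the kernel patch: `struct model_port_range` and `model_check_port_range()` are hypothetical names, and the real code works on netlink attributes and reports errors through extack rather than a plain string pointer.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: which ends of a port range user space supplied. */
struct model_port_range {
    bool has_min; /* the minimum-port attribute was sent */
    bool has_max; /* the maximum-port attribute was sent */
};

/*
 * Accept a range only if both ends are present, or neither is
 * (meaning no range match was requested). A half-specified range
 * is rejected so user space sees the failure immediately.
 */
static int model_check_port_range(const struct model_port_range *r,
                                  const char *both_required_msg,
                                  const char **errmsg)
{
    if (r->has_min != r->has_max) {
        *errmsg = both_required_msg;
        return -EINVAL;
    }
    return 0;
}
```

A caller would invoke it once for the source range and once for the destination range, passing the matching error text each time.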
      
      Fixes: 5c72299f ("net: sched: cls_flower: Classify packets using port ranges")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · b6c9ebde
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      igc: Fix corner cases for TSN offload
      
      Florian Kauer says:
      
      The igc driver supports several different offloading capabilities
      relevant in the TSN context. Recent patches in this area introduced
      regressions for certain corner cases that are fixed in this series.
      
      Each of the patches (except the first one) addresses a different
      regression that can be separately reproduced. Still, they have
      overlapping code changes so they should not be separately applied.
      
      Especially #4 and #6 address the same observation,
      but both need to be applied to avoid TX hang occurrences in
      the scenario described in the patches.
      ====================
      Signed-off-by: Florian Kauer <florian.kauer@linutronix.de>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: Add another mailing list for QUALCOMM ETHQOS ETHERNET DRIVER · e522c1bd
      Andrew Halaney authored
      linux-arm-msm is the list most people subscribe to in order to receive
      updates about Qualcomm related drivers. Make sure changes for the
      Qualcomm ethernet driver make it there.
      Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
      Acked-by: Vinod Koul <vkoul@kernel.org>
      Link: https://lore.kernel.org/r/20230710195240.197047-1-ahalaney@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • docs: netdev: update the URL of the status page · cf28792f
      Jakub Kicinski authored
      Move the status page from vger to the same server as mailbot.
      
      Link: https://lore.kernel.org/r/20230710174636.1174684-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wifi: iwlwifi: remove 'use_tfh' config to fix crash · 12a89f01
      Johannes Berg authored
      This is equivalent to 'gen2', and it was always confusing to have
      two identical config entries. The split config patch actually had
      been originally developed after removing 'use_tfh' and didn't add
      the use_tfh in the new configs as they'd later been copied to the
      new files. Thus the easiest way to fix the init crash here now is
      to just remove use_tfh (which is erroneously unset in most of the
      configs now) and use 'gen2' in the code instead.
      
      There's possibly still an unwind error in iwl_txq_gen2_init() as
      it crashes if TXQ 0 fails to initialize, but we can deal with it
      later since the original failure is due to the use_tfh confusion.
      
      Tested-by: Xi Ruoyao <xry111@xry111.site>
      Reported-and-tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
      Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
      Reported-and-tested-by: Zhang Rui <rui.zhang@intel.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=217622
      Link: https://lore.kernel.org/all/9274d9bd3d080a457649ff5addcc1726f08ef5b2.camel@xry111.site/
      Link: https://lore.kernel.org/all/CAAJw_Zug6VCS5ZqTWaFSr9sd85k%3DtyPm9DEE%2BmV%3DAKoECZM%2BsQ@mail.gmail.com/
      Fixes: 19898ce9 ("wifi: iwlwifi: split 22000.c into multiple files")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Link: https://lore.kernel.org/r/20230710145038.84186-2-johannes@sipsolutions.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • xdp: use trusted arguments in XDP hints kfuncs · 2e06c57d
      Larysa Zaremba authored
      Currently, the verifier does not reject XDP programs that pass a NULL
      pointer to hints functions. At the same time, this case is not handled in
      any driver implementation (including veth). For example, changing
      
      bpf_xdp_metadata_rx_timestamp(ctx, &timestamp);
      
      to
      
      bpf_xdp_metadata_rx_timestamp(ctx, NULL);
      
      in xdp_metadata test successfully crashes the system.
      
      Add KF_TRUSTED_ARGS flag to hints kfunc definitions, so driver code
      does not have to worry about getting invalid pointers.
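
As a rough illustration of why a load-time flag removes the need for per-driver NULL checks, here is a toy user-space model. It is not the BPF verifier: `MODEL_KF_TRUSTED_ARGS`, its bit value, `model_check_kfunc_call()` and the `-EINVAL` return are all assumptions made for this sketch, and the real verifier reasons about pointer state statically rather than inspecting runtime values.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a "trusted arguments" flag; the bit is arbitrary. */
#define MODEL_KF_TRUSTED_ARGS (1u << 0)

/*
 * Toy load-time check: when the callee is flagged as requiring trusted
 * arguments, refuse any call that passes a NULL pointer argument, so the
 * callee itself never has to test for NULL.
 */
static int model_check_kfunc_call(unsigned int kfunc_flags,
                                  const void *const *ptr_args,
                                  size_t nargs)
{
    if (!(kfunc_flags & MODEL_KF_TRUSTED_ARGS))
        return 0; /* unflagged callee: nothing is enforced here */

    for (size_t i = 0; i < nargs; i++) {
        if (ptr_args[i] == NULL)
            return -EINVAL; /* model's choice of error code */
    }
    return 0;
}
```

In this model the unflagged case lets the NULL argument through, which corresponds to the crash scenario described above; setting the flag turns it into a rejected program instead.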
      
      Fixes: 3d76a4d3 ("bpf: XDP metadata RX kfuncs")
      Reported-by: Stanislav Fomichev <sdf@google.com>
      Closes: https://lore.kernel.org/bpf/ZKWo0BbpLfkZHbyE@google.com/
      Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
      Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
      Acked-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/20230711105930.29170-1-larysa.zaremba@intel.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: cpumap: Fix memory leak in cpu_map_update_elem · 43690164
      Pu Lehui authored
      Syzkaller reported a memory leak as follows:
      
      BUG: memory leak
      unreferenced object 0xff110001198ef748 (size 192):
        comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
        hex dump (first 32 bytes):
          00 00 00 00 4a 19 00 00 80 ad e3 e4 fe ff c0 00  ....J...........
          00 b2 d3 0c 01 00 11 ff 28 f5 8e 19 01 00 11 ff  ........(.......
        backtrace:
          [<ffffffffadd28087>] __cpu_map_entry_alloc+0xf7/0xb00
          [<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
          [<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
          [<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
          [<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
          [<ffffffffb029cc80>] do_syscall_64+0x30/0x40
          [<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
      BUG: memory leak
      unreferenced object 0xff110001198ef528 (size 192):
        comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffffadd281f0>] __cpu_map_entry_alloc+0x260/0xb00
          [<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
          [<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
          [<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
          [<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
          [<ffffffffb029cc80>] do_syscall_64+0x30/0x40
          [<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
      BUG: memory leak
      unreferenced object 0xff1100010fd93d68 (size 8):
        comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
        hex dump (first 8 bytes):
          00 00 00 00 00 00 00 00                          ........
        backtrace:
          [<ffffffffade5db3e>] kvmalloc_node+0x11e/0x170
          [<ffffffffadd28280>] __cpu_map_entry_alloc+0x2f0/0xb00
          [<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
          [<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
          [<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
          [<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
          [<ffffffffb029cc80>] do_syscall_64+0x30/0x40
          [<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
      In the cpu_map_update_elem flow, if kthread_stop is called before the
      threadfn of rcpu->kthread has started, the KTHREAD_SHOULD_STOP bit is
      already set, so the threadfn will never be executed and rcpu->refcnt
      will never drop to 0. As a result, the allocated rcpu, rcpu->queue and
      rcpu->queue->queue can never be released.
      
      Calling kthread_stop before executing kthread's threadfn will return
      -EINTR. We can complete the release of memory resources in this state.
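
The teardown rule described above can be modelled in user space as below. This is a sketch under stated assumptions, not the cpumap code: `struct model_entry` and the `model_*` functions are hypothetical, a single heap buffer stands in for rcpu, rcpu->queue and rcpu->queue->queue, and no real threads are involved.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cpumap entry and its worker thread. */
struct model_entry {
    void *queue;         /* stands in for the entry's allocations */
    bool thread_started; /* has the thread function begun running? */
    int released;        /* how many times resources were released */
};

static void model_release(struct model_entry *e)
{
    free(e->queue);
    e->queue = NULL;
    e->released++;
}

/*
 * Model of stopping the worker: if the thread function never started,
 * its exit path cannot run, and -EINTR is returned instead.
 */
static int model_kthread_stop(struct model_entry *e)
{
    if (!e->thread_started)
        return -EINTR;
    model_release(e); /* the thread's own exit path does the cleanup */
    return 0;
}

/* Teardown that also covers the "never started" case. */
static void model_entry_teardown(struct model_entry *e)
{
    int err = model_kthread_stop(e);

    if (err == -EINTR)
        model_release(e); /* nobody else will free it, so do it here */
}
```

Either way the resources are released exactly once: by the thread's exit path when it ran, or by the caller when the stop request arrived first.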
      
      Fixes: 6710e112 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP")
      Signed-off-by: Pu Lehui <pulehui@huawei.com>
      Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
      Acked-by: Hou Tao <houtao1@huawei.com>
      Link: https://lore.kernel.org/r/20230711115848.2701559-1-pulehui@huaweicloud.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>