• Vladimir Oltean's avatar
    net/sched: taprio: avoid disabling offload when it was never enabled · db46e3a8
    Vladimir Oltean authored
    In an incredibly strange API design decision, qdisc->destroy() gets
    called even if qdisc->init() never succeeded, not exclusively since
    commit 87b60cfa ("net_sched: fix error recovery at qdisc creation"),
    but apparently also earlier (in the case of qdisc_create_dflt()).
    
    The taprio qdisc does not fully acknowledge this when it attempts full
    offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in
    taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS
    parsed from netlink (in taprio_change(), tail called from taprio_init()).
    
    But in taprio_destroy(), we call taprio_disable_offload(), and this
    determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).
    
    But looking at the implementation of FULL_OFFLOAD_IS_ENABLED()
    (a bitwise check of bit 1 in q->flags), it is invalid to call this macro
    on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set
    to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on
    an invalid set of flags.
    
    As a result, it is possible to crash the kernel if user space forces an
    error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling
    of taprio_enable_offload(). This is because drivers do not expect the
    offload to be disabled when it was never enabled.
    
    The error that we force here is to attach taprio as a non-root qdisc,
    but instead as child of an mqprio root qdisc:
    
    $ tc qdisc add dev swp0 root handle 1: \
    	mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
    	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
    $ tc qdisc replace dev swp0 parent 1:1 \
    	taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
    	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
    	sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
    	flags 0x0 clockid CLOCK_TAI
    Unable to handle kernel paging request at virtual address fffffffffffffff8
    [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    Call trace:
     taprio_dump+0x27c/0x310
     vsc9959_port_setup_tc+0x1f4/0x460
     felix_port_setup_tc+0x24/0x3c
     dsa_slave_setup_tc+0x54/0x27c
     taprio_disable_offload.isra.0+0x58/0xe0
     taprio_destroy+0x80/0x104
     qdisc_create+0x240/0x470
     tc_modify_qdisc+0x1fc/0x6b0
     rtnetlink_rcv_msg+0x12c/0x390
     netlink_rcv_skb+0x5c/0x130
     rtnetlink_rcv+0x1c/0x2c
    
    Fix this by keeping track of the operations we made, and undo the
    offload only if we actually did it.
    
    I've added "bool offloaded" inside a 4 byte hole between "int clockid"
    and "atomic64_t picos_per_byte". Now the first cache line looks like
    below:
    
    $ pahole -C taprio_sched net/sched/sch_taprio.o
    struct taprio_sched {
            struct Qdisc * *           qdiscs;               /*     0     8 */
            struct Qdisc *             root;                 /*     8     8 */
            u32                        flags;                /*    16     4 */
            enum tk_offsets            tk_offset;            /*    20     4 */
            int                        clockid;              /*    24     4 */
            bool                       offloaded;            /*    28     1 */
    
            /* XXX 3 bytes hole, try to pack */
    
            atomic64_t                 picos_per_byte;       /*    32     0 */
    
            /* XXX 8 bytes hole, try to pack */
    
            spinlock_t                 current_entry_lock;   /*    40     0 */
    
            /* XXX 8 bytes hole, try to pack */
    
            struct sched_entry *       current_entry;        /*    48     8 */
            struct sched_gate_list *   oper_sched;           /*    56     8 */
            /* --- cacheline 1 boundary (64 bytes) --- */
    
    Fixes: 9c66d156 ("taprio: Add support for hardware offloading")
    Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: default avatarVinicius Costa Gomes <vinicius.gomes@intel.com>
    Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
    db46e3a8
sch_taprio.c 51.9 KB