    x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR · ae28d1aa
    Fenghua Yu authored
    Currently, when moving a task to a resource group, the PQR_ASSOC MSR is
    updated with the new closid and rmid in a callback queued with
    task_work_add(). If the task is running, the work is run as soon as
    possible. If the task is not running, the work is executed later in the
    kernel exit path when the kernel returns to the task again.
    
    Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task
    is running on is the right thing to do. Queueing work for a task that
    is not running is unnecessary (the PQR_ASSOC MSR is already updated
    when the task is scheduled in) and, as implemented, wastes system
    resources: work to update the PQR_ASSOC register is queued every time
    the user writes a task id to the "tasks" file, even if the task already
    belongs to the resource group.
    
    This could result in multiple pending work items associated with a
    single task even if they are all identical and even though only a single
    update with the most recent values is needed. Specifically, if a task is
    moved between different resource groups while it is sleeping, only the
    last move is relevant, yet a work item is queued during each move.
    
    This unnecessary queueing of work items could result in significant
    system resource waste, especially for tasks sleeping for a long time.
    For example, as demonstrated by Shakeel Butt in [1], repeatedly writing
    the same task id to the "tasks" file can quickly consume significant
    memory. The same problem (wasted system resources) occurs when moving a
    task between different resource groups.
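
The growth in pending work can be illustrated with a small userspace model. This is only a sketch: `struct model_task`, `struct pending_work`, and the three helpers are hypothetical stand-ins, not the kernel's data structures or functions. It assumes a task that stays asleep for the whole run, so nothing drains its queue.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one queued callback (not a kernel type). */
struct pending_work {
	struct pending_work *next;
	int closid;
	int rmid;
};

/* Hypothetical, heavily reduced stand-in for a sleeping task. */
struct model_task {
	int closid;
	int rmid;
	struct pending_work *works;  /* runs only when the task resumes */
	size_t nr_pending;
};

/* Old scheme: every write to "tasks" queues one more work item. */
static void move_task_queued(struct model_task *t, int closid, int rmid)
{
	struct pending_work *w = malloc(sizeof(*w));

	assert(w != NULL);
	w->closid = closid;
	w->rmid = rmid;
	w->next = t->works;
	t->works = w;
	t->nr_pending++;

	t->closid = closid;
	t->rmid = rmid;
}

/* New scheme: a sleeping task only gets its IDs updated, nothing is queued. */
static void move_task_direct(struct model_task *t, int closid, int rmid)
{
	t->closid = closid;
	t->rmid = rmid;
}

/* Free all queued items and return the number of bytes they occupied. */
static size_t drain_pending(struct model_task *t)
{
	size_t bytes = 0;

	while (t->works) {
		struct pending_work *w = t->works;

		t->works = w->next;
		free(w);
		bytes += sizeof(*w);
	}
	t->nr_pending = 0;
	return bytes;
}
```

Calling `move_task_queued()` in a loop with identical arguments leaves one allocation per call on the task, while `move_task_direct()` leaves the task with the same closid and rmid and no allocations at all.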
    
    As pointed out by Valentin Schneider in [2], there is an additional
    issue with the way the work is queued: the task_struct update is
    currently done after the work is queued, so the register update can
    race with it and run before the data it needs is available.
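
The ordering problem can be shown with a deterministic userspace sketch. All names here (`struct race_task`, `struct race_cpu`, `load_pqr()`, `move_work_first()`, `move_ids_first()`) are hypothetical, and the model assumes the worst case in which the queued work runs immediately after being queued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced task: only the two IDs that matter here. */
struct race_task {
	int closid;
	int rmid;
};

/* Hypothetical per-CPU view of the PQR_ASSOC contents. */
struct race_cpu {
	int pqr_closid;
	int pqr_rmid;
};

/* What the work item does: copy the task's current IDs into the register. */
static void load_pqr(struct race_cpu *cpu, const struct race_task *t)
{
	assert(cpu != NULL && t != NULL);
	cpu->pqr_closid = t->closid;
	cpu->pqr_rmid = t->rmid;
}

/* Problematic order: the work may run before the new IDs are stored. */
static void move_work_first(struct race_cpu *cpu, struct race_task *t,
			    int closid, int rmid)
{
	load_pqr(cpu, t);      /* worst case: queued work executes right away */
	t->closid = closid;    /* the data the work needed arrives too late */
	t->rmid = rmid;
}

/* Safe order: store the IDs first, then update the register. */
static void move_ids_first(struct race_cpu *cpu, struct race_task *t,
			   int closid, int rmid)
{
	t->closid = closid;
	t->rmid = rmid;
	load_pqr(cpu, t);
}
```

With `move_work_first()`, the task ends up with the new IDs but the modeled register still holds the old ones. With `move_ids_first()`, both agree.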
    
    To solve these issues, update the PQR_ASSOC MSR synchronously, right
    after the new closid and rmid are ready during the task movement, and
    only if the task is running. If a moved task is not running, nothing
    is done since the PQR_ASSOC MSR will be updated the next time the task
    is scheduled. This is the same approach used to update the register
    when tasks are moved as part of resource group removal.
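
The approach described above can be modeled in userspace as follows. This is a sketch under stated assumptions, not the patch itself: the CPU array, `sched_in()`, `sched_out()`, `ipi_update_pqr()`, and `move_task_sync()` are hypothetical names, and the IPI is modeled as a direct function call on the target CPU's state.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Hypothetical reduced task. */
struct sync_task {
	int closid;
	int rmid;
	int cpu;   /* CPU the task is running on, or -1 while it sleeps */
};

/* Hypothetical per-CPU state. */
struct sync_cpu {
	const struct sync_task *curr;  /* task currently executing here */
	int pqr_closid;
	int pqr_rmid;
	unsigned int ipis;             /* number of simulated IPIs received */
};

static struct sync_cpu model_cpus[MODEL_NR_CPUS];

/* Runs "on" the target CPU: reload only if the task is still current. */
static void ipi_update_pqr(struct sync_cpu *cpu, const struct sync_task *t)
{
	cpu->ipis++;
	if (cpu->curr == t) {
		cpu->pqr_closid = t->closid;
		cpu->pqr_rmid = t->rmid;
	}
}

/* Context switch in: the register is always reloaded at this point. */
static void sched_in(struct sync_task *t, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	t->cpu = cpu;
	model_cpus[cpu].curr = t;
	model_cpus[cpu].pqr_closid = t->closid;
	model_cpus[cpu].pqr_rmid = t->rmid;
}

/* Context switch out: the task no longer occupies a CPU. */
static void sched_out(struct sync_task *t)
{
	if (t->cpu >= 0)
		model_cpus[t->cpu].curr = NULL;
	t->cpu = -1;
}

/* Move a task: store the new IDs first, then interrupt only if running. */
static void move_task_sync(struct sync_task *t, int closid, int rmid)
{
	t->closid = closid;
	t->rmid = rmid;
	if (t->cpu >= 0)
		ipi_update_pqr(&model_cpus[t->cpu], t);
}
```

Moving a running task costs one simulated IPI and takes effect immediately. Moving a sleeping task any number of times costs nothing, and the last values written are picked up by `sched_in()`.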
    
    [1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/
    [2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com
    
     [ bp: Massage commit message and drop the two update_task_closid_rmid()
       variants. ]
    
    Fixes: e02737d5 ("x86/intel_rdt: Add tasks files")
    Reported-by: Shakeel Butt <shakeelb@google.com>
    Reported-by: Valentin Schneider <valentin.schneider@arm.com>
    Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
    Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
    Signed-off-by: Borislav Petkov <bp@suse.de>
    Reviewed-by: Tony Luck <tony.luck@intel.com>
    Reviewed-by: James Morse <james.morse@arm.com>
    Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com