• Mengting Zhang's avatar
    perf evsel: Enable ignore_missing_thread for pid option · ca800068
    Mengting Zhang authored
    While monitoring a multithread process with pid option, perf sometimes
    may return sys_perf_event_open failure with 3(No such process) if any of
    the process's threads die before we open the event. However, we want
    perf continue monitoring the remaining threads and do not exit with
    error.
    
    Here, the patch enables perf_evsel::ignore_missing_thread for -p option
    to ignore complete failure if any of threads die before we open the event.
    But it may still return sys_perf_event_open failure with 22(Invalid) if we
    monitors several event groups.
    
            sys_perf_event_open: pid 28960  cpu 40  group_fd 118202  flags 0x8
            sys_perf_event_open: pid 28961  cpu 40  group_fd 118203  flags 0x8
            WARNING: Ignored open failure for pid 28962
            sys_perf_event_open: pid 28962  cpu 40  group_fd [118203]  flags 0x8
            sys_perf_event_open failed, error -22
    
    That is because when we ignore a missing thread, we change the thread_idx
    without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
    may return a wrong group_fd for the next thread and sys_perf_event_open()
    return with 22.
    
            sys_perf_event_open(){
               ...
               if (group_fd != -1)
                   perf_fget_light()//to get corresponding group_leader by group_fd
               ...
               if (group_leader)
                  if (group_leader->ctx->task != ctx->task)//should on the same task
                       goto err_context
               ...
            }
    
    This patch also fixes this bug by introducing perf_evsel__remove_fd() and
    update_fds to allow removing fds for the missing thread.
    
    Changes since v1:
    - Change group_fd__remove() into a more genetic way without changing code logic
    - Remove redundant condition
    
    Changes since v2:
    - Use a proper function name and add some comment.
    - Multiline comment style fixes.
    
    Committer testing:
    
    Before this patch the recently added 'perf stat --per-thread' for system
    wide counting would race while enumerating all threads using /proc:
    
      [root@jouet ~]# perf stat --per-thread
      failed to parse CPUs map: No such file or directory
    
       Usage: perf stat [<options>] [<command>]
    
          -C, --cpu <cpu>       list of cpus to monitor in system-wide
          -a, --all-cpus        system-wide collection from all CPUs
      [root@jouet ~]# perf stat --per-thread
      failed to parse CPUs map: No such file or directory
    
       Usage: perf stat [<options>] [<command>]
    
          -C, --cpu <cpu>       list of cpus to monitor in system-wide
          -a, --all-cpus        system-wide collection from all CPUs
      [root@jouet ~]#
    
    When, say, the kernel was being built, so lots of shortlived threads,
    after this patch this doesn't happen.
    Signed-off-by: default avatarMengting Zhang <zhangmengting@huawei.com>
    Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
    Cc: Cheng Jian <cj.chengjian@huawei.com>
    Cc: Li Bin <huawei.libin@huawei.com>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
    [ Remove one use 'evlist' alias variable ]
    Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
    ca800068
evsel.c 70.6 KB