nexedi / erp5 / Merge requests / !1648

Closed
Created Jul 05, 2022 by Vincent Pelletier (@vpelletier, Owner)

Products.CMFActivity: Fix poor performance with many family-bound activities

When there are many simultaneously-pending activities attached to any processing node family, the node>=0 subquery becomes dominant (taking hundreds of times longer than the other subqueries). As a consequence, processing nodes are starved of activities, and the mariadb process hosting the activity tables needs more CPU.

So, move this subquery out of the regular codepath, and only run it if no other subquery found any activity:

  • there is no activity preferentially targeting the current node
  • there is no activity bound to any of the current node's families
  • there is no activity without any node preference at all
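The control flow described above can be sketched as follows. This is only an illustration, not the code in this merge request: `select_candidates`, `run_subquery` and the subquery labels are hypothetical names, and the real implementation issues SQL against the activity tables.

```python
def select_candidates(run_subquery, preferred_subqueries, fallback_subquery):
    """Return candidate rows, running the expensive fallback only as a last resort.

    All names are hypothetical; run_subquery stands for whatever executes one
    subquery and returns a list of rows.
    """
    candidates = []
    for subquery in preferred_subqueries:
        candidates.extend(run_subquery(subquery))
    if candidates:
        # At least one cheap subquery matched: the fallback is never executed.
        return candidates
    # Nothing for this node, its families, or "no preference":
    # only now look at activities preferring other nodes.
    return list(run_subquery(fallback_subquery))


# Toy stand-in for the database: maps a subquery label to the rows it returns.
fake_results = {
    "node = current": [],
    "node = family": [("uid-1",)],
    "no preference": [],
    "other nodes": [("uid-2",)],
}
executed = []


def run_subquery(label):
    executed.append(label)
    return fake_results[label]


found = select_candidates(
    run_subquery,
    ["node = current", "node = family", "no preference"],
    "other nodes",
)
```

In this toy run, the "other nodes" subquery is never executed because a family-bound row was found first.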

Also, simplify the content of that subquery: the effective priority can only be 3 * priority + 1 when this query is run, and node=0 rows can be excluded (they should not exist in the current database view).
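To see why a constant offset is enough here, note that multiplying by 3 leaves room between consecutive priorities. A small check, comparing against a baseline of 3 * priority with no offset (the baseline and both function names are assumptions made for illustration; only the 3 * priority + 1 value comes from the description above):

```python
def fallback_effective_priority(priority):
    # From the description: the only value the fallback subquery can produce.
    return 3 * priority + 1


def baseline_effective_priority(priority):
    # Assumption for illustration: an activity with no offset applied.
    return 3 * priority


checks = []
for priority in range(-3, 4):
    checks.append(
        baseline_effective_priority(priority)
        < fallback_effective_priority(priority)
        < baseline_effective_priority(priority + 1)
    )
```

Every entry of `checks` is true: a fallback row sorts after a same-priority baseline row, but still before any baseline row of the next priority value.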

Also, factorise the logic producing "node=processing_node" and "node IN node_set" subqueries, for simplicity. In turn, this makes all family-dependent subqueries use a simple equality test, ensuring a stable query plan regardless of the number of families the current node is a member of.
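A minimal sketch of what such a factorised builder could look like. It is not the code from this merge request: the helper names, the `message` table name, the selected columns, the offset and the family identifiers are all assumptions chosen for illustration.

```python
def build_node_subquery(table, offset, node):
    """Build one subquery testing a single node value with plain equality.

    Hypothetical helper: table and column names are assumptions.
    """
    return (
        "(SELECT 3 * priority {offset:+d} AS effective_priority, date"
        " FROM {table}"
        " WHERE node = {node:d})"
    ).format(offset=int(offset), table=table, node=int(node))


def build_family_subqueries(table, offset, family_id_list):
    # One equality-based subquery per family, instead of one "node IN (...)".
    return [
        build_node_subquery(table, offset, family_id)
        for family_id in family_id_list
    ]


# Hypothetical values: two families, same offset for both.
subquery_list = build_family_subqueries("message", -1, [-1, -2])
query = " UNION ALL ".join(subquery_list)
```

With this shape, the current node's own subquery and each family subquery come from the same helper and differ only in the node value and the offset.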

Also, use "UNION ALL" always, as now:

  • all subqueries have strictly distinct result sets
  • as per mariadb documentation, "UNION [DISTINCT] applies to all UNIONs on the left", so the original comment about where ALL is used was incorrect in assuming it was improving the effective query performance
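The left-applying behaviour of a DISTINCT union can be observed with a toy query. The sketch below uses SQLite (from the Python standard library) as a stand-in, on the assumption that it groups compound selects left to right in the same way; it does not exercise MariaDB itself, and the `run` helper is a hypothetical name.

```python
import sqlite3

connection = sqlite3.connect(":memory:")


def run(sql):
    # Return the first column of every result row.
    return [row[0] for row in connection.execute(sql)]


# The DISTINCT union on the right also deduplicates what UNION ALL
# had kept on its left.
mixed = run("SELECT 1 UNION ALL SELECT 1 UNION SELECT 2")

# With UNION ALL everywhere, no deduplication happens.
all_only = run("SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 2")
```

Here `mixed` holds each value once, while `all_only` keeps the duplicate, so mixing ALL with a later DISTINCT union does not avoid deduplication.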

Also, line-split SQL queries as visible in the python source to be more readable, without effect on the produced SQL.

Also, line-split a few non-trivial python expressions to make their internal structure immediately apparent.

Another effect of this change is to reduce activity theft (activities to be preferentially executed by one node being executed by another), potentially improving object cache hit-rate and hence decreasing I/O pressure on the ZODB.

/cc @jm

Source branch: cmfactivity_family_efficiency