Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in / Register
erp5 erp5
  • Project overview
    • Project overview
    • Details
    • Activity
    • Releases
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Labels
    • Labels
  • Merge requests 136
    • Merge requests 136
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Operations
    • Operations
    • Environments
  • Analytics
    • Analytics
    • CI/CD
    • Repository
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Activity
  • Graph
  • Jobs
  • Commits
Collapse sidebar
  • nexedi
  • erp5erp5
  • Merge requests
  • !616

Merged
Created Mar 27, 2018 by Vincent Pelletier@vpelletierOwner

Remove recursiveImmediateReindexObject, simplify recursiveReindexObject

  • Overview 30
  • Commits 12
  • Changes 55

@arnau @jerome @jm @romain @seb @tb (please invite more if applicable)

recursiveImmediateReindexObject

recursiveImmediateReindexObject is a generic method (implemented on Base and Folder) which, by contract, walks its entire subtree withing one transaction.

These 3 facts together (generic, recursive and immediate) make this method a risk of memory exhaustion as it can be called on unexpected documents, in turn threatening availability.

Genericity and recusion can hardly be avoided: we want to have a common way of reindexing sets of documents based on a common ancestor (ex: reindexing an entire module). So it is immediateness must go away: we should not need reindexations or arbitrarily-large trees to happen in a single transaction. So this patch set removed recursiveImmediateReindexObject method, and all uses shouldgo through recursiveReindexObject instead.

There is one case though where recursion and immediateness are needed: the (normally rare) cases where immediate indexation of a newly-created document is desired. Such newly-created document could spawn a (limited) number of subdocuments, so I made such method purpose-specific: reindexOnCreation. Until a few days ago, I thought this would not be needed, but it seem it is. So we still have a dangerous-ish method, only now with a more specific name - so misuses would be harder to justify. Ideas on how to make misuses even harder are very welcome.

recursiveReindexObject

recursiveReindexObject is a generic method (declared on Base and Folder again) which calls recursiveImmediateReindexObject in some cases (based on the number of immediate subdocuments). As described above, recursiveImmediateReindexObject is now gone, so another solution is needed.

So I change recursiveReindexObject so that it calls immediateReindexObject instead (immediate, generic but non-recursive), and to do that I fixed the recursion limits of recursiveReindexObject by making it call the existing recurseCallMethod (_recurseCallMethod actually, the unrestricted variant). This way, we have precise size-limiting capabilities, independently from tree depth.

In turn, this means recursiveReindexObjectcan now be used on any document: cells, lines, top-level documents but also modules. So as a bonus, we can get rid of more complexity by dropping Folder_reindex{All,Objectlist,TreeObjectList} and simplifying ERP5Site_reindexAll.

Other commits

Most of the commits are preparatory work to reach the above. These are touching other important ERP5 components:

  • ActivityTool becomes itself indexable while only its content was previously indexable. This was inconsistent (it means parent_uid cannot be used on immediate portal_activities subdocuments) and in turn hurt catalog reproductibility.
  • CategoryTool becomes a BaseTool subclass so it inherits from ERP5Type/Core/Folder so it can be reindex the same way as everything else
  • ActivityTool becomes a BaseTool subclass just to simplify inheritance) so that they behave consistently with other tools

A big discovery while working on this patch set was that our indexation tests (intended to check that a site is properly indexed right after its creation) were utterly broken: the test would check that after a full-site reindexation the catalog was a superset of its initial state. Why ? Because initial state was already tainted by previous tests in the same test class... Which basically cleared the catalog, indexing only a handful of document (~10) out of the thousand expected. Fixing this test is also what these preparatory commits are for: I needed a reliable way of testing catalog completeness.

One change I'm especially not sure about is ERP5Site: Do reindex the whole site after creation.. The rationale is only that for whatever reason(s) we make site object non-indexable during bootstrap. As a result, many indexation activities are not even spawned. I believe there should be no fundamental reason to make it non-indexable, "only" bootstrap issues. Which I did not investigate, falling back to such easy change. Insights on bootstrap considerations is very welcome, as I would prefer "natural" on-installation indexations to be relied upon rather than such "reindex-all-after" hack (which , modulo the fact we do not clear catalog,(EDIT: we do clear catalog in both codes) reduces the relevance of catalog reproducibility tests).

Desired review process

I would like this merge request, if it does not achieve unanimous approval (which seems unlikely), to happen in progressive steps: all commits up to CategoryTool: Become a BaseTool subclass. could conceivable go in master while the rest is still discussed, and in about any order. So please ACK/NACK individual commits, to help with this approach. I will then reorder commits to push ACK'ed ones so they get out of the way ASAP.

Assignee
Assign to
Reviewer
Request review from
None
Milestone
None
Assign milestone
Time tracking
Source branch: Folder_scalable_recursiveReindexObject
GitLab Nexedi Edition | About GitLab | About Nexedi | 沪ICP备2021021310号-2 | 沪ICP备2021021310号-7