Draft: erp5: Introduce mariadb replication at SlapOS level
EDIT: I rewrote the description to focus on the key points because the previous description had gotten way too long and technical. Everything is described in the commit messages. I invite you to read them in order for a detailed understanding.
Motivation
Despite its name and its strong focus on mariadb replication, the overall concern of this MR is the wider question of ERP5 resiliency. Not resiliency with ERP5 inside Theia, but "native" resiliency of ERP5. This involves replicating ERP5's object database (ZODB) and SQL database (index catalog, activities, ...). ZODB replication is already well implemented using Neo, so this MR focuses mostly on mariadb replication, but it does bring some improvements to Neo.
Before this MR, some cutting-edge ERP5 projects already used Neo + mariadb replication to ensure ERP5 resiliency, but this was mostly set up and maintained manually outside of SlapOS. The goal is to mainstream this technique by automating it and integrating it inside SlapOS. Ultimately, ERP5 replication inside Theia should be replaced by "native" ERP5 replication everywhere.
This MR does not aim to complete this transition in a single step. Instead, it makes a significant step in this direction.
Overview of tasks and future todos
Some of these will not be implemented in the current MR.
- neo
  - Request a neo replica without needing to manually access the partition to set its state to BACKINGUP; powered by neoppod!25 (merged)
  - Make check_neo_health promise assert that state is BACKINGUP when it should be; powered by slapos.toolbox!139 (merged) (backported in slapos.toolbox==0.128.2)
  - Make check_neo_health promise not bang when it fails; powered by slapos.core!786 (merged)
- zope
  - Deactivate zope promises when the neo is expected to be BACKINGUP (temporary solution)
  - Adapt the zope service so that it detects when neo is in BACKINGUP state and goes on standby until neo is RUNNING (instead of crashing)
  - Support starting only select zope processes, e.g. to disable external interfaces when creating a dev or test clone of an ERP5
- mariadb
  - Remove mariadb_update service that could break replication; instead, users are only created on database creation, and the updater is run on every mariadb restart
  - Create a replication_user with REPLICATION SLAVE grant and a randomly generated password (or the same password as the primary)
  - Support mariabackups in addition to sqldumps; introduce new backup parameter dict to control backups (£: !1792)
  - Optimize mariabackups size and speed by mixing full and incremental mariabackups as introduced in !1792 (£: !1792)
  - Serve mariadb backups statically with simplehttpserver so that another mariadb can fetch them to bootstrap replication (££: binlogs retention)
  - Introduce replication parameters to make a mariadb replicate & bootstrap from another mariadb instance (£££: usage)
  - Make mariadb_replication promise not bang when it fails; powered by slapos.core!786 (merged)
  - Support disabling TCP access on mariadb replica
  - Enable TLS IPv6 https:// access to bootstrap and TLS IPv6 mysql:// access to replication_user of mariadb (££££: example)
    - caucased
      - Introduce an embedded caucased server and autoapprove a caucase user (=admin) certificate; publish the embedded caucased url.
      - Request and renew locally the autoapproved caucase user certificate
      - Request a local caucase service certificate for mariadb & bootstrap TLS access
      - Automatically sign the local service certificate using the user certificate
      - Allow external certificate requests to this embedded caucased to be signed by passing the CSR via new csr-to-sign parameter
      - Support passing an external-caucased-url instead of launching an embedded caucased; in that case there is no user certificate and nothing is automatically approved
    - reverse-proxy: haproxy & proxysql
      - Use haproxy to serve backups for bootstrap over IPv6 https://
      - Use proxysql to give access to replication_user over IPv6 mysql://
      - Decide whether ProxySQL's lack of CRL support is an issue, and find a workaround or another solution if it is
    - replica mTLS
      - Pass the primary's caucased-url to mariadb replicas so that each replica can request and renew a replica caucased service certificate
      - Make the replica publish the corresponding CSR
      - Make the replica connect with mTLS to the primary's bootstrap https server (behind haproxy) and mysql:// mariadb (behind proxysql)
- takeover
  - Provide a takeover script (mariadb-replica-become-primary) in the mariadb partition that can be called manually by logging in as compute node administrator into the partition
  - Allow a replica mariadb to stop replicating and become a primary without requiring manual login to the instance and manual operations on the DB (e.g. by providing a url where the user can click to perform this action); this will be a necessary step of an eventual automated takeover procedure
  - Streamline the takeover steps on neo in a single script: change state to RUNNING, truncate; this will be a necessary step of an eventual automated takeover procedure
  - Provide a non-manual way for a replica neo to become a primary (e.g. by providing a url where the user can click to perform this action); this will be a necessary step of an eventual automated takeover procedure
  - Integrate the procedure to make mariadb coherent with a truncated neo using ERP5Site_resynchroniseCatalogSince (£££££)
  - Provide a comprehensive "one-click" takeover method for a whole replica ERP5: mariadb takeover + neo takeover + neo truncation + ERP5Site_resynchroniseCatalogSince + zope management
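As an illustration of the replication_user item in the mariadb list above, here is a minimal sketch of how such an account could be provisioned. It is not the MR's implementation: the function name, the '%' host wildcard and the way the statements are assembled are assumptions made for this example. Only the REPLICATION SLAVE grant and the "random password, or same password as the primary" rule come from the description.

```python
import secrets


def replication_user_statements(user="replication_user", host="%", password=None):
    """Return (password, SQL statements) for a replication-only account.

    Hypothetical helper: pass the primary's password to reuse it,
    or leave it empty to generate a random one.
    """
    if password is None:
        # token_urlsafe only emits [A-Za-z0-9_-], so it is safe to quote.
        password = secrets.token_urlsafe(32)
    elif "'" in password or "\\" in password:
        raise ValueError("password must not contain quotes or backslashes")
    account = f"'{user}'@'{host}'"
    statements = [
        f"CREATE USER IF NOT EXISTS {account} IDENTIFIED BY '{password}'",
        f"GRANT REPLICATION SLAVE ON *.* TO {account}",
    ]
    return password, statements


if __name__ == "__main__":
    _, statements = replication_user_statements()
    for statement in statements:
        print(statement + ";")
```

The account receives no privilege other than REPLICATION SLAVE, which is what a replica needs to fetch binlogs from the primary.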
Footnotes
£: !1792 proposes a much more advanced way to generate and store mariabackups, using frequent incremental mariabackups combined with infrequent full mariabackups, and storing them with restic. This makes for faster and smaller backups. Restic stores the backups as content-defined chunks, so the backups are not available as a single file without asking restic to reconstitute it. Thus using restic will imply serving the bootstrap backups with something like rest server that will reconstitute and serve the backup files on demand. UPDATE: The full + incremental mariabackups feature has now been included here without restic.
££: Replication works by fetching mariadb binlogs. Binlogs are retained on the primary only for a few days (by default). So if, at the time a replica is created, the primary is older than the binlog retention time, the replica must first restore a recent backup of the primary to bootstrap replication.
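The rule in footnote ££ can be written down as a small decision function. This is only a sketch of the reasoning: the function name and the 3-day retention used in the example are hypothetical, and the actual retention depends on the primary's configuration.

```python
from datetime import datetime, timedelta


def needs_bootstrap(primary_created_at, now, binlog_retention):
    """Return True when the primary has outlived its binlog retention.

    In that case the oldest binlogs are gone, so a new replica cannot
    replay history from the start and must restore a backup first.
    """
    return now - primary_created_at > binlog_retention


if __name__ == "__main__":
    retention = timedelta(days=3)  # hypothetical retention window
    now = datetime(2024, 1, 10)
    print(needs_bootstrap(datetime(2024, 1, 9), now, retention))
    print(needs_bootstrap(datetime(2023, 12, 1), now, retention))
```

A primary created one day ago still has all of its binlogs, so bootstrap can be skipped; a primary created weeks ago requires it.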
£££: To request a mariadb replica — either standalone or as a sub-instance of ERP5 (§):
'replication': {
'upstream-mariadb-url': 'mysql://<user>:<password>@<ip>:<port>',
'upstream-mariabackup-url': 'http(s)://<recent-mariabackup-of-primary>',
}
or
'replication': {
'upstream-mariadb-url': 'mysql://<user>:<password>@<ip>:<port>',
'upstream-bootstrap-url': 'http(s)://<recent-sqldump-backup-of-primary>',
}
This takes effect on mariadb database creation - when no data exists yet. That way existing data cannot be deleted by setting or changing the replication parameters after the fact.
A promise checks that the state of the running mariadb matches the requested state (replica/primary, replication source); if it does not, the mariadb database will not converge automatically without human intervention once the ~/srv/mariadb directory exists.
The bootstrap-url or mariabackup-url may be omitted: this skips replication bootstrap and requires that all binlogs still be available on the primary. This is useful when the primary is recent and may not have a ready backup for bootstrap yet.
The primary mariadb publishes the needed parameters under replication-primary-url, replication-bootstrap-url, and replication-mariabackup-url. They can then be plugged directly into the replica request.
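Plugging the published parameters into a replica request could look like the sketch below. The helper name is hypothetical, and the mapping from published keys to request keys (replication-primary-url to upstream-mariadb-url, and so on) is inferred from the names used in this description rather than confirmed by the code.

```python
def replica_request(published, use_mariabackup=True):
    """Build the 'replication' request dict from a primary's published parameters."""
    replication = {"upstream-mariadb-url": published["replication-primary-url"]}
    if use_mariabackup:
        request_key, published_key = "upstream-mariabackup-url", "replication-mariabackup-url"
    else:
        request_key, published_key = "upstream-bootstrap-url", "replication-bootstrap-url"
    # The bootstrap source may be omitted, e.g. when the primary is recent.
    if published.get(published_key):
        replication[request_key] = published[published_key]
    return {"replication": replication}


if __name__ == "__main__":
    published = {
        "replication-primary-url": "mysql://<user>:<password>@<ip>:<port>",
        "replication-mariabackup-url": "https://<recent-mariabackup-of-primary>",
    }
    print(replica_request(published))
```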
££££: If the replica is accessed over TLS IPv6, the caucased-url of the primary on which the replica will request a certificate must be passed as well:
'replication': {
'upstream-mariadb-url': 'mysql://<user>:<password>@<ipv6>:<port>',
'upstream-mariabackup-url': 'http(s)://<recent-mariabackup-of-primary>',
'upstream-caucased-url': 'http://[<ipv6>]:<port>',
}
The replica will then publish a CSR under caucased-csr-to-sign; the ERP5 root instance (if there is one) will republish it (§§). To make the primary caucased sign it, it can be passed back to the primary:
'caucased': {
'csr-to-sign': '<PEM-content>',
}
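The CSR round trip described above amounts to copying one published value from the replica into the primary's request. A minimal sketch for the standalone mariadb case follows; the helper name and the PEM sanity check are assumptions added for illustration.

```python
def csr_signing_parameters(replica_published):
    """Build the primary's 'caucased' request dict from what the replica published."""
    csr = replica_published["caucased-csr-to-sign"]
    # Cheap sanity check (an assumption of this sketch, not of the MR):
    # a PEM-encoded CSR carries a "CERTIFICATE REQUEST" marker line.
    if "CERTIFICATE REQUEST" not in csr:
        raise ValueError("published value does not look like a PEM CSR")
    return {"caucased": {"csr-to-sign": csr}}


if __name__ == "__main__":
    pem = "-----BEGIN CERTIFICATE REQUEST-----\n...\n-----END CERTIFICATE REQUEST-----\n"
    print(csr_signing_parameters({"caucased-csr-to-sign": pem}))
```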
£££££: For many ERP5 use cases to work correctly (accurate stock evaluation, activities, ...), the ZODB (neo) and the index catalog (mariadb) must be coherent with each other. This coherence is maintained by the zope processes and the activity queue. At the time a takeover is needed, the replica mariadb and the replica neo will most likely not be coherent with each other. One way to restore coherence is to regenerate the mariadb catalog from scratch by re-indexing the whole ZODB; this is a very lengthy process that can take days or weeks, which makes it unsuitable in practice. Our practical "state-of-the-art" solution is to truncate the neo to its state a few minutes back in time: enough minutes to be certain that all the ZODB objects created and modified prior to that truncation point are correctly indexed in the non-truncated mariadb. Then it is only a matter of examining the indexations in mariadb that occurred in the interval between the truncation time and the most recent state of mariadb to determine which remain valid. This is done by ERP5Site_resynchroniseCatalogSince. Given that only a few minutes need to be examined, this process is very fast. Thus this technique trades a few minutes of data in the past for the ability to be up and running again a short time in the future.
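The time arithmetic behind this technique can be sketched as follows. This is a hypothetical illustration, not the actual procedure: the function name, the 10-minute safety margin, and the choice of taking the older of the two replicas' last known states as the reference point are all assumptions.

```python
from datetime import datetime, timedelta


def takeover_window(neo_last_state, mariadb_last_state,
                    safety_margin=timedelta(minutes=10)):
    """Return (truncate_at, resync_until).

    truncate_at:  point in time the neo is truncated to, a safety margin
                  before the older of the two replicas' last known states.
    resync_until: most recent state of mariadb; indexations between
                  truncate_at and resync_until must be re-examined.
    """
    truncate_at = min(neo_last_state, mariadb_last_state) - safety_margin
    return truncate_at, mariadb_last_state


if __name__ == "__main__":
    neo = datetime(2024, 1, 1, 12, 0)
    mariadb = datetime(2024, 1, 1, 12, 5)
    truncate_at, resync_until = takeover_window(neo, mariadb)
    print("truncate neo to:", truncate_at)
    print("resynchronise catalog from", truncate_at, "to", resync_until)
```

The interval to re-examine stays a matter of minutes regardless of database size, which is why this is so much faster than re-indexing the whole ZODB.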
§: To request an ERP5 with a mariadb replica sub-instance, the same parameters can be forwarded from the ERP5 root instance to mariadb by wrapping them in a 'mariadb' dict:
'mariadb': {
'replication': { '...' },
'caucased': { '...' }
}
§§: The ERP5 root instance (when mariadb is not standalone) will republish the needed parameters by prefixing them with 'mariadb-', e.g. mariadb-replication-primary-url, mariadb-caucased-url, mariadb-caucased-csr-to-sign.
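The two conventions in § and §§ (wrapping on the way in, prefixing on the way out) can be summarised with two small helpers. Both function names are hypothetical and only mirror the naming rules stated above; they are not taken from the ERP5 software release.

```python
def wrap_for_erp5(mariadb_parameters):
    """Wrap standalone mariadb request parameters for an ERP5 root instance (§)."""
    return {"mariadb": dict(mariadb_parameters)}


def republish(mariadb_published, prefix="mariadb-"):
    """Prefix the parameters published by mariadb, as the ERP5 root does (§§)."""
    return {prefix + key: value for key, value in mariadb_published.items()}


if __name__ == "__main__":
    request = wrap_for_erp5({
        "replication": {"upstream-mariadb-url": "mysql://<user>:<password>@<ip>:<port>"},
    })
    print(request)
    print(republish({"caucased-url": "http://[<ipv6>]:<port>"}))
```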