Base resilient stack ==================== This stack is meant to be extended by SR profiles, or other stacks, that need to provide automated backup/restore, election of backup candidates, and instance failover. As reference implementations, both stack/lamp and stack/lapp define resilient behavior for MySQL and Postgres respectively. This involves three different software_types: * pull-backup * {mysoftware}_export * {mysoftware}_import where 'mysoftware' is the component that needs resiliency (can be postgres, mysql, erp5, and so on). pull-backup ----------- This software type is defined in http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/stack/resilient/instance-pull-backup.cfg.in?js=1 and there should be no reason to modify or extend it. An instance of type 'pull-backup' will receive data from an 'export' instance and immediately populate an 'import' instance. The backup data is automatically used to build an historical, incremental archive in srv/backup/pbs. export ------ example: http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/stack/lapp/postgres/instance-postgres-export.cfg.in?js=1 This is the *active* instance - the one providing live data to the application. A backup is run via the bin/exporter script: it will 1) run bin/{mysoftware}-backup and 2) notify the pull-backup instance that data is ready. The pull-backup, upon receiving the notification, will make a copy of the data and transmit it to the 'import' instances. You should provide the bin/{mysoftware}-exporter script, see for instance http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/slapos/recipe/postgres/__init__.py?js=1#l207 http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/slapos/recipe/mydumper.py?js=1#l71 By default, as defined in http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/stack/resilient/pbsready-export.cfg.in?js=1#l27 the bin/exporter script is run every 60 minutes. import ------ example: http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/stack/lapp/postgres/instance-postgres-import.cfg.in?js=1 This is the *fallback* instance - the one that can be activated and thus become active. Any number of import instances can be used. Deciding which one should take over can be done manually or through a monitoring + election script. You should provide the bin/{mysoftware}-importer script, see for instance http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/slapos/recipe/postgres/__init__.py?js=1#l233 http://git.erp5.org/gitweb/slapos.git/blob/HEAD:/slapos/recipe/mydumper.py?js=1#l71 In practice ----------- Add resilience to your software Let's say you already have a file instance-mysoftware.cfg.in that instantiates your software. In which there is a part [mysoftware] where there is the main recipe that instantiates the program. You need to create two new files, instance-mysoftware-import.cfg.in and instance-mysoftware-export.cfg.in, following this layout: IMPORT: [buildout] extends = ${instance-mysoftware:output} ${pbsready-import:output} parts += mysoftware import-on-notification [importer] recipe = YourImportRecipe wrapper = $${rootdirectory:bin}/$${slap-parameter:namebase}-importer backup-directory = $${directory:backup} ... EXPORT: [buildout] extends = ${instance-mysoftware:output} ${pbsready-export:output} parts += mysoftware cron-entry-backup [exporter] recipe = YourExportRecipe wrapper = $${rootdirectory:bin}/$${slap-parameter:namebase}-exporter backup-directory = $${directory:backup} ... In the [exporter] / [importer] part, you are free to do whatever you want, but you need to dump / import your data from $${directory:backup} and specify a wrapper. I suggest you only add options and specify your export/import recipe. Checking that it works ---------------------- To check that your software instance is resilient you can proceed this way: Once all instances are successfully deployed, go to your export instance, connect as the instance user and run: $ ~/bin/exporter It is the script responsible for triggering the resiliency stack on your instance. After doing a backup of your data, it will notify the pull-backup instances of a new backup, triggering the transfer of this data to the import instances. Once this script is run successfully, go to your import instance, connect as its instance user and check ~/srv/backup/"your sofwtare"/, the location of the data you wanted to receive. The last part of the resiliency is up to your import script. DEBUGGING: Here is a partial list of things you can check to understand what is causing the problem: - Check that your import script does not fail and successfully places your data in ~/srv/backup/"your software" (as the import instance user) by runnig: $ ~/bin/"your software"-exporter - Check the export instance script is run successfully as this instance user by running: $ ~/bin/exporter - Check the pull-instance system did its job by going to one of your pull-backup instance, connect as its user and check the log : ~/var/log/equeue.log ----------------------------------------------------------------------------------------- Finally, instance-mysoftware-import.cfg.in and instance-mysoftware-export.cfg.in need to be downloaded and accessible by switch_softwaretype, and you need to extend stack/resilient/buildout.cfg and stack/resilient/switchsoftware.cfg to download the whole resiliency bundle. Here is how it's done in the mariadb case for the lamp stack: ** buildout.cfg ** extends = ../resilient/buildout.cfg [instance-mariadb-import] recipe = slapos.recipe.template url = ${:_profile_base_location_}/mariadb/instance-mariadb-import.cfg.in output = ${buildout:directory}/instance-mariadb-import.cfg md5sum = ... mode = 0644 [instance-mariadb-export] recipe = slapos.recipe.template url = ${:_profile_base_location_}/mariadb/instance-mariadb-export.cfg.in output = ${buildout:directory}/instance-mariadb-export.cfg md5sum = ... mode = 0644 ** instance.cfg.in ** extends = ../resilient/switchsoftware.cfg [switch-softwaretype] ... mariadb = ${instance-mariadb:output} mariadb-import = ${instance-mariadb-import:output} mariadb-export = ${instance-mariadb-export:output} ... Then, in the .cfg file where you want to instantiate your software, you can do, instead of requesting your software * template-resilient.cfg.in * [buildout] ... parts += {{ parts.replicate("Name","3") }} ... [...] ... [ArgLeader] ... [ArgBackup] ... {{ replicated.replicate("Name", "3", "mysoftware-export", "mysoftware-import", "ArgLeader","ArgBackup") }} and it'll expend into the sections require to request Name0, Name1 and Name2, backuped and resilient. The leader will expend the section [ArgLeader], backups will expend [ArgBackup]. If you don't need to specify any options, you can omit the last two arguments in replicate(). Since you will compile your template with jinja2, there should be no $${}, because it is not yet possible to use jinja2 -> buildout template. To compile with jinja2, see jinja2's recipe.